What is Machine Learning?
We generally think of machine learning as training a computer on a set of data so that it can recognize patterns in that data. More specifically, machine learning can be defined as a set of methods that automatically detect patterns in data and use the discovered patterns to predict future data or to support decision-making under uncertainty.
Artificial Intelligence (AI), also referred to as Machine Intelligence, is a set of computer programs and algorithms that can perform tasks at a human-like level without direct human involvement. However, AI algorithms need some level of human intervention while they learn, before they become capable of performing future tasks with minimal or no human supervision. The main categories of AI include Natural Language Processing (NLP), Knowledge Representation, Automated Reasoning and Machine Learning (ML). Figure 1-1 illustrates the hierarchy of AI, ML and other artificially intelligent algorithms. Some recent studies also use the term Artificial General Intelligence (AGI) for AI that is intended to apply to most human-like tasks; self-driving vehicles are a frequently cited example of AI applied to such tasks.
Machine Learning is the science of allowing computer programs to learn from data and experience rather than following explicitly programmed rules. An early milestone was Alan Turing's 1950 proposal of the ‘Turing Test’, a method to determine whether a computer program is capable of human-like thinking; it is a test of machine intelligence rather than a learning algorithm. Over the following decades, with the gradual advancement of algorithms and computational power, convolutional neural networks (CNNs), a specific type of Artificial Neural Network, were developed, with influential work appearing in the late 1980s and the 1990s. From around 2011, many studies began focusing on Deep Neural Networks, also known as Deep Learning (DL), which is the foundation of most recent advancements in AI and ML.
Figure 1-1 General definition of deep learning in the context of Machine Learning and Artificial Intelligence (source: IBM)
Figure 1-2 illustrates the historical evolution of machine learning and the transformation of conventional neural networks into the current definition of deep learning. Conventional neural networks consist of a group of nodes interconnected through weighted links, loosely modeled on the structure of human neurons. Although the most recent deep learning algorithms grew out of the basic structure of neural nets, the internal connections and the configuration of layers are quite different in each case. Figure 1-3 shows the basic structural difference between a simple neural network and a deep learning neural network: in a deep learning neural net, the hidden layers between the input parameters and the target values are more numerous and their interconnections more complex. Another main advantage of deep learning algorithms over traditional neural nets is that their performance keeps improving as larger training datasets are introduced. The performance of traditional ML algorithms tends to plateau beyond a certain amount of data, while deep learning algorithms continue to improve. Figure 1-4 illustrates this comparison of traditional and deep learning neural nets in terms of the performance of the trained model.
Figure 1-2 History and evolution of machine learning over the past decades (source: SAS)
Figure 1-3 Comparison of a simple and deep learning neural network
Figure 1-4 Performance of traditional versus deep learning algorithms
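The idea of nodes connected through weighted links, and of a deep network as one with more hidden layers, can be sketched in a few lines of Python. This is only an illustrative sketch, not the architecture shown in the figures: the layer sizes, the `tanh` activation, the random weights, and the helper names `forward`, `random_layer` and `build_network` are all assumptions made for the example.

```python
import math
import random


def forward(x, layers):
    """Propagate the input vector x through a list of (weights, biases) layers."""
    activation = x
    for weights, biases in layers:
        # Each node sums its weighted inputs, adds a bias, and applies tanh.
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


def random_layer(n_in, n_out, rng):
    """Create one fully connected layer with random weights (hypothetical helper)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(sizes, seed=0):
    """Build an untrained network; sizes lists the number of nodes per layer."""
    rng = random.Random(seed)
    return [random_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


simple_net = build_network([3, 4, 1])        # 3 inputs, one hidden layer, 1 output
deep_net = build_network([3, 8, 8, 8, 1])    # 3 inputs, three hidden layers, 1 output

x = [0.5, -1.2, 0.3]
simple_out = forward(x, simple_net)
deep_out = forward(x, deep_net)
```

Both networks here are untrained, so their outputs carry no meaning; the sketch only shows that the same forward computation applies to each and that the deep variant stacks more hidden layers between input and output.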
In supervised learning, a machine learning algorithm is trained on a dataset that includes input parameters, also known as features, together with the desired outputs. The algorithm learns from the features in the training dataset and produces a model that generates outputs for new data with minimum error. Classification and regression are examples of supervised learning. Unsupervised learning algorithms are trained on data that includes only input parameters, without any specific outputs. Such algorithms are employed for clustering and grouping of the data. A third set of ML algorithms learns interactively. In active learning, the algorithm queries a human during training, for example to label selected examples, and uses the responses to adjust the model. A related but distinct interactive approach is Reinforcement Learning, in which the algorithm learns from feedback it receives from its environment; it is employed, for example, in autonomous vehicles. Other types of learning algorithms include feature learning, anomaly detection, association rule learning and sparse dictionary learning, which will be discussed in the following chapters.
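The difference between supervised and unsupervised learning can be illustrated with a minimal sketch. The example below assumes a tiny made-up dataset and uses a nearest-neighbor classifier and a basic k-means loop as stand-ins for each family; the data values, labels, and function names are hypothetical and chosen only for illustration.

```python
import math


def nearest_neighbor_predict(train_X, train_y, query):
    """Supervised: return the label of the training point closest to the query."""
    distances = [math.dist(p, query) for p in train_X]
    return train_y[distances.index(min(distances))]


def k_means(points, centroids, iterations=10):
    """Unsupervised: group points around centroids without using any labels."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up features; the labels are only used by the supervised method.
features = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
labels = ["small", "small", "small", "large", "large", "large"]

predicted = nearest_neighbor_predict(features, labels, (4.9, 5.0))
centroids, clusters = k_means(features, [features[0], features[3]])
```

The supervised call needs `labels` to make a prediction, while `k_means` sees only `features` and discovers the two groups on its own.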
One fundamental difference between traditional programming and machine learning is that in traditional programming, data and a program are the inputs from which a computer generates an output, while in machine learning, data and outputs are given to the computer so that it generates a program (Domingos, 2010). Machine learning algorithms are used when human expertise is not available (such as navigation problems), when models must be built from large datasets (such as a taxi database in a city), when humans cannot explain their expertise (such as natural language processing), and when models need to be customized (such as personalized search).
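The contrast between the two approaches can be made concrete with a small sketch. In the example below, the temperature conversion task, the sample values, and the names `fahrenheit_rule` and `fit_line` are assumptions chosen for illustration; ordinary least-squares line fitting stands in for "machine learning" in its simplest form.

```python
def fahrenheit_rule(celsius):
    """Traditional programming: a human writes the rule explicitly."""
    return celsius * 9 / 5 + 32


def fit_line(xs, ys):
    """Machine learning: derive the rule from example inputs and outputs
    using ordinary least squares for a straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # The returned function is the "program" produced from the data.
    return lambda x: slope * x + intercept


# Example data (inputs) and the desired outputs.
celsius = [-10.0, 0.0, 10.0, 20.0, 30.0]
fahrenheit = [14.0, 32.0, 50.0, 68.0, 86.0]

learned_rule = fit_line(celsius, fahrenheit)
```

Here `fahrenheit_rule` takes data and produces an output from a hand-written program, whereas `fit_line` takes data and outputs and returns a program, `learned_rule`, that can then be applied to new inputs.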