Regression and Classification Problems with Neural Networks
Neural networks (NNs) learn to approximate the mapping between pairs of input and output data through an iterative training process, in a supervised learning approach. Depending on the type or format of the output, regression and classification are the two main types of supervised learning problems. Classification refers to problems in which the output is a finite set of discrete labels or categories. For example, in traffic prediction, if the prediction takes the form of explicit classes such as congested and uncongested states, the task is a classification problem. Regression, on the other hand, refers to problems in which the output is numerical and continuous. When an NN is applied to predict the traffic state as continuous quantities such as flow, density, or average speed, the prediction is considered a regression problem.
Knowing the differences between these two types of supervised learning problems helps with the design of the output layer and the selection of the loss function for training. For regression, the output of the network consists of continuous values, and a linear activation function is commonly used for the output layer. For classification, the output of the network is usually a probability for each class or category. In this case, the softmax activation function is used for the last layer, converting the outputs of the previous layers into probabilities that sum to one.
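The softmax conversion described above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of any specific network in the text; the logit values are made up for demonstration:

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    shifted = np.exp(logits - np.max(logits))
    # Normalize so the outputs form a probability distribution.
    return shifted / shifted.sum()

# Hypothetical raw outputs (logits) from the previous layer for three classes.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
# probs are all in (0, 1) and sum to one, with the largest logit
# mapped to the largest probability.
```

The max-subtraction step does not change the result mathematically; it only prevents overflow when logits are large.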
Different loss functions are adopted when training for regression and classification problems. Training an NN is an iterative process in which the network parameters are adjusted at every step based on the network's performance (e.g., error) as estimated by the loss function. For classification networks with probability outputs, the loss function is usually drawn from the cross-entropy family. A cross-entropy loss drives the optimization to assign a higher probability to the correct class and lower probabilities to the other classes. For regression, the loss function measures the difference between the actual and predicted values; mean squared error is a popular choice. The network parameters are adjusted to reduce the residuals between the predictions and the actual output variables.
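The two loss functions mentioned above can be sketched directly in NumPy. The traffic values below (speeds and class probabilities) are hypothetical, chosen only to illustrate the computations:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: average of squared residuals (regression).
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_onehot, p_pred):
    # Cross-entropy between a one-hot label and predicted probabilities
    # (classification); eps guards against log(0).
    eps = 1e-12
    return -np.sum(y_onehot * np.log(p_pred + eps))

# Regression: hypothetical observed vs. predicted average speeds (km/h).
speeds_true = np.array([55.0, 40.0, 30.0])
speeds_pred = np.array([50.0, 42.0, 33.0])
reg_loss = mse(speeds_true, speeds_pred)  # (25 + 4 + 9) / 3 ≈ 12.67

# Classification: hypothetical congested/uncongested prediction,
# where the true state is "congested" (first class).
label = np.array([1.0, 0.0])
probs = np.array([0.8, 0.2])
cls_loss = cross_entropy(label, probs)  # -log(0.8) ≈ 0.223
```

Note how the cross-entropy loss shrinks toward zero as the probability assigned to the correct class approaches one, which is exactly the behavior the training process exploits.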