
The Underfitting and Overfitting Problems

Underfitting and overfitting are serious problems in model development (Ref-DD). The underfitting problem results from using limited data and/or a model structure that is inadequate for that data; underfitted models fail to accurately replicate the data points in both the training and the testing datasets. The overfitting problem is encountered when the model replicates the training data points with high accuracy but fails to replicate unseen data points (i.e., testing data points) with comparable accuracy. In other words, the model captures the noise in the data rather than the true relationship between the dependent and independent variables.

Figure 4: Example of underfitting and overfitting of the training data

Figure 4 shows the model fit for three different cases. In case I, an underfitted model is obtained: the model structure is too simple and inflexible to learn the important features in the dataset. Here, a linear model structure is assumed, so the model fails to capture the nonlinear relationship between the variables. In case II, an adequate model is obtained that reasonably fits the data and captures the general trend between the variables. Case III shows an example of an overfitted model. Adopting a complex model structure with too many parameters relative to the size of the dataset is a common cause of overfitting; in the extreme case, when there are as many model parameters as data points, the model can simply memorize the training data. Thus, as mentioned above, reserving a portion of the data exclusively for testing is crucial for detecting whether the model suffers from overfitting, as the sketch below illustrates.
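
To make the three cases concrete, the following sketch fits polynomials of increasing degree to noisy nonlinear data and compares training and testing errors. It is an illustration rather than part of the primer: the data-generating function, the noise level, the polynomial degrees, and the use of scikit-learn are all assumptions chosen for clarity.

```python
# A minimal sketch of the three cases in Figure 4: an underfitted linear model
# (case I), an adequate cubic model (case II), and an overfitted high-degree
# polynomial (case III). All settings are illustrative assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
# Assumed nonlinear ground truth plus observation noise.
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, size=60)

# Reserve a portion of the data exclusively for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

for degree in (1, 3, 15):  # case I, case II, case III
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    mse_train = mean_squared_error(y_train, model.predict(X_train))
    mse_test = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={mse_train:.3f}  test MSE={mse_test:.3f}")
```

For the underfitted degree-1 model, both errors are high; for the overfitted degree-15 model, the training error is far lower than the testing error. That gap is exactly what the held-out test set exposes.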

Based on the above discussion, one may recognize that the ultimate remedy to the underfitting and overfitting problems is to select a model structure suited to the problem under study while ensuring that an adequate amount of data is available. In addition, exploratory analysis of the data, as mentioned above, helps to avoid these problems by guiding adequate feature engineering and by eliminating or consolidating features with limited observations, as in the sketch after this paragraph.
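
As a hypothetical illustration of the consolidation step, the sketch below groups sparsely observed category levels of a single feature into one "other" level so a model cannot fit noise in those levels. The feature name, the counts, and the frequency threshold are invented for the example.

```python
# Hypothetical feature-engineering sketch: consolidate rarely observed
# category levels into a single "other" level. Values are illustrative.
import pandas as pd

vehicle_type = pd.Series(
    ["car"] * 50 + ["truck"] * 30 + ["bus"] * 3 + ["tractor"] * 2
)

counts = vehicle_type.value_counts()
rare = counts[counts < 5].index                      # levels with few observations
consolidated = vehicle_type.where(~vehicle_type.isin(rare), "other")
print(consolidated.value_counts())                   # car, truck, other
```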

Nonetheless, several other techniques have been reported to deal with these problems. For example, the ensemble learning technique enhances the performance of models that suffer from underfitting or overfitting (Ref-EE); it works by combining predictions from multiple separate models, the idea being to obtain a strong learner from multiple weak learners. The early-stopping technique addresses overfitting by halting the optimization procedure before it reaches its final convergence point (Ref-FF). Finally, the regularization technique artificially forces the model to be simpler (Ref-JJ); pruning and dropout are examples of regularization in decision trees and neural networks, respectively. More discussion on model regularization is provided in section 1.4.7. Two of these remedies are sketched below.
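
The sketch below illustrates two of these remedies under stated assumptions: a bagging ensemble of shallow decision trees (many weak learners combined into a stronger one) and early stopping in gradient boosting, which halts training once an internal validation score stops improving. The dataset, the library (scikit-learn), and all hyperparameter values are illustrative assumptions, not prescriptions from the text.

```python
# Illustrative sketch of ensemble learning and early stopping.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Ensemble learning: average the predictions of many shallow (weak) trees.
ensemble = BaggingRegressor(
    DecisionTreeRegressor(max_depth=3), n_estimators=100, random_state=0
)
ensemble.fit(X_train, y_train)
print("bagging R^2:", ensemble.score(X_test, y_test))

# Early stopping: hold out part of the training data internally and stop
# adding boosting stages once the validation score stops improving.
boosted = GradientBoostingRegressor(
    n_estimators=1000,          # upper bound; early stopping ends sooner
    validation_fraction=0.2,
    n_iter_no_change=10,        # stop after 10 stages without improvement
    random_state=0,
)
boosted.fit(X_train, y_train)
print("boosting stopped after", boosted.n_estimators_, "stages;",
      "R^2:", boosted.score(X_test, y_test))
```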