
Bias-Variance Tradeoff in Overfitting



Choosing the right model structure and developing reasonable features are important steps in preventing overfitting. This can be relatively straightforward if many similar models have already been developed for the domain of interest, or if the researcher is simply trying to replicate or refute models from the past literature. However, similar models in your area of research are not always readily available, and the correct model structure, features, and model parameters are not always obvious. It is therefore important to be able to identify overfitting in your own models and correct for it accordingly.

Overfitting can be caused by data sampling bias or by model variance. Bias refers to the systematic error of a model trained on data that poorly represent the out-of-sample observations, and it is closely related to the concept of accuracy. A biased model that has minimized the error on the training sample will produce predictions that systematically underpredict or overpredict because the model cannot generalize. To reduce bias, one can use model validation techniques to reduce training-sample selection bias, or collect more data if the available sample is too small to capture the complexity of the underlying process.
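Systematic error of this kind shows up directly in the residuals on held-out data. The sketch below is purely illustrative (synthetic data, arbitrary values): it fits a straight line to data with a quadratic trend and shows that the held-out residuals are systematically signed rather than centered noise.

```python
import numpy as np

# Illustrative sketch: a linear model fit to quadratic data produces
# systematically signed residuals on held-out points -- the signature
# of bias rather than random noise. All data here are synthetic.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = x**2 + rng.normal(0.0, 0.01, size=x.size)  # true relation is quadratic

# Hold out every other point as an out-of-sample set.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

# Fit a straight line -- too simple for the data.
coef = np.polyfit(x_train, y_train, deg=1)
residuals = y_test - np.polyval(coef, x_test)

# The line overpredicts in the middle of the range and underpredicts
# at the low end, so the residual means are systematically signed.
low_end = residuals[x_test < 0.2].mean()
middle = residuals[(x_test > 0.4) & (x_test < 0.6)].mean()
print(low_end, middle)
```

Because the mismatch is structural, collecting more data from the same process would not remove these signed residuals; only a more appropriate model structure would.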

While bias can be addressed by correcting sample selection bias or collecting more data, variance is often the result of unnecessarily high model complexity. An appropriate level of complexity can be selected through parameter tuning. In regression, for example, model variance can be reduced by adding regularization terms that penalize models with an unnecessarily high degree of nonlinearity. Regularization often has a smoothing effect, making the fitted model less wavy and erratic. Beyond classical regression models, hyperparameter tuning with grid search or random search is important for finding an appropriate level of model complexity.
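One common form of such a penalty is ridge regression, which adds an L2 penalty on the coefficients. The minimal sketch below is an assumption-laden illustration, not a prescribed method: the data, the polynomial degree, and the penalty strength `lam` are arbitrary choices made only to show the shrinking effect.

```python
import numpy as np

# Minimal sketch of L2 regularization (ridge regression) applied to a
# deliberately over-complex polynomial fit. Values are illustrative.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

def design(x, degree):
    # Polynomial feature matrix [1, x, x^2, ..., x^degree].
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X'X + lam * I)^{-1} X'y.
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

X = design(x, degree=12)            # unnecessarily high complexity
w_unreg = ridge_fit(X, y, lam=0.0)  # ordinary least squares
w_ridge = ridge_fit(X, y, lam=1.0)  # penalized fit

# The penalty shrinks the coefficient vector, which smooths the
# fitted curve and reduces its variance.
print(np.linalg.norm(w_unreg), np.linalg.norm(w_ridge))
```

The larger `lam` is, the more the coefficients are shrunk toward zero; tuning `lam` on held-out data is one instance of the hyperparameter search described above.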

The idea of the bias-variance tradeoff is key to measuring and reducing overfitting, and it allows one to train a model with both low bias and low variance. Generally, as model complexity increases, the within-sample prediction error, labeled “Training sample” in Figure BWX1, should decrease. Beyond a certain “optimal” point, however, the within-sample error continues to decrease at the expense of the out-of-sample error, labeled “Test sample” in Figure BWX1. To minimize overfitting, one needs to balance the tradeoff between minimizing within-sample prediction error and minimizing out-of-sample prediction error. The process of optimizing this tradeoff is known as model validation, and it requires repeatedly evaluating the out-of-sample prediction error while increasing model complexity. The model least susceptible to overfitting, given the chosen model structure, selected features, and available data, is the one that yields the minimum out-of-sample prediction error.
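This procedure can be sketched concretely on synthetic data: sweep model complexity (here, polynomial degree, an arbitrary choice), record both error curves, and pick the complexity that minimizes the out-of-sample error, mirroring the curves in Figure BWX1.

```python
import numpy as np

# Sketch of the tradeoff in Figure BWX1 on synthetic data: training
# error keeps falling as polynomial degree (model complexity) grows,
# while out-of-sample error typically turns back up past some point.
rng = np.random.default_rng(42)

x_train = rng.uniform(-1.0, 1.0, 40)
y_train = np.sin(np.pi * x_train) + rng.normal(0.0, 0.25, size=40)
x_test = rng.uniform(-1.0, 1.0, 200)
y_test = np.sin(np.pi * x_test) + rng.normal(0.0, 0.25, size=200)

def mse(x, y, coef):
    return np.mean((y - np.polyval(coef, x)) ** 2)

degrees = range(1, 13)
train_err, test_err = [], []
for degree in degrees:
    coef = np.polyfit(x_train, y_train, degree)
    train_err.append(mse(x_train, y_train, coef))   # within-sample
    test_err.append(mse(x_test, y_test, coef))      # out-of-sample

# Model validation selects the complexity that minimizes the
# out-of-sample error, not the within-sample error.
best_degree = list(degrees)[int(np.argmin(test_err))]
print("degree minimizing out-of-sample error:", best_degree)
```

Within-sample error alone would always favor the most complex model; the selected degree comes from the out-of-sample curve, which is exactly the balancing act the text describes.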

Figure BWX1: The behavior of the test-sample error (red) and training-sample error (blue) demonstrates the need to balance model bias and variance to prevent overfitting (Hastie et al., 2009).