
Evaluating the Model’s Prediction Accuracy


Numerous performance measures are used to evaluate the prediction accuracy of machine learning models (Ref-BB). Consider the simple case of a typical binary classifier: Figure 2XX shows the contingency table and the common ratios used to evaluate such a classifier.

As shown in the figure, four key numbers are first determined: the number of true positives, the number of false positives, the number of false negatives, and the number of true negatives. A true positive means that the model classifies a positive data point as positive, while a false positive means that the model classifies a negative data point as positive. Similarly, a true negative means that the model classifies a negative data point as negative, while a false negative means that the model classifies a positive data point as negative. Several metrics are derived from these four numbers. Note that these metrics are ratios, which makes them independent of the dataset size. While each metric provides a different insight into the model’s prediction performance, we review a few that are widely used in model evaluation exercises.

Figure: The contingency table and common ratios used for evaluating a classifier (Source: Wikipedia https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers)

Accuracy: It measures the overall agreement between the predicted and actual classes. It is calculated as the ratio of the number of cases in which the model made correct predictions (true positives and true negatives) to the total number of cases.

Precision: It measures how many of the cases the model predicts as positive are truly positive. It is calculated as the ratio of the number of true positive cases to the total number of cases the model predicts as positive.

Recall: It measures the model’s ability to detect truly positive cases. It is calculated as the ratio of the number of true positive cases to the total number of actually positive cases.

Specificity: It measures the model’s ability to detect truly negative cases. It is calculated as the ratio of the number of true negative cases to the total number of actually negative cases.
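
To make these ratios concrete, the short sketch below computes them from the four contingency-table counts; the counts used here are hypothetical and purely illustrative.

```python
# A minimal sketch: computing the common evaluation ratios from the four
# entries of a binary classifier's contingency table (hypothetical counts).
tp, fp, fn, tn = 90, 10, 20, 880  # true/false positives and negatives

accuracy    = (tp + tn) / (tp + fp + fn + tn)  # fraction of all cases predicted correctly
precision   = tp / (tp + fp)                   # fraction of predicted positives that are truly positive
recall      = tp / (tp + fn)                   # fraction of actual positives that are detected (TPR)
specificity = tn / (tn + fp)                   # fraction of actual negatives that are detected (TNR)

print(f"Accuracy={accuracy:.3f}, Precision={precision:.3f}, "
      f"Recall={recall:.3f}, Specificity={specificity:.3f}")
```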

Another important measure is the Receiver Operating Characteristic (ROC) plot (REF-CC). The ROC plot is used to examine how well a classifier can separate positive and negative cases. Figure 3XX gives a sketch of an ROC curve for a typical classifier. On the ROC plot, the y-axis is the true positive rate (TPR), which is the same as recall, and the x-axis is the false positive rate (FPR), which equals one minus the specificity. The diagonal line on this plot represents the random classification scenario. An ROC curve that lies well above this diagonal line indicates a model with strong classification ability.

Figure: A sketch of the ROC plot for a typical classifier
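
As an illustrative sketch (assuming the scikit-learn library is available), the ROC curve and the area under it (AUC) can be computed from a classifier’s predicted scores; the labels and scores below are hypothetical.

```python
# A minimal sketch: tracing the ROC curve and computing the AUC
# from hypothetical true labels and predicted scores.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])                    # hypothetical true labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9])  # hypothetical predicted scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points along the ROC curve
auc = roc_auc_score(y_true, y_score)               # area under the ROC curve

# An AUC near 1.0 indicates strong separation of positive and negative cases;
# an AUC near 0.5 corresponds to the diagonal (random) classifier.
print(f"AUC = {auc:.3f}")
```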

Regression-based models tend to use measures such as the Mean Absolute Error and the Mean Squared Error. The Mean Absolute Error (MAE) is the average of the absolute differences between the true values and the predicted values. It measures how far the predictions are from the actual output; however, it does not provide information on the direction of the error (whether the model under-predicts or over-predicts the data). Mathematically, it is represented as:
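
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$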

The Mean Squared Error (MSE) is the average of the squared differences between the actual values and the predicted values. An advantage of the MSE is that its gradient is easier to compute than that of the Mean Absolute Error, which requires more complicated linear programming tools to compute the gradient. Because the error is squared rather than taken in absolute value, the effect of larger errors becomes more pronounced than that of smaller ones, which guides the training toward reducing these larger errors. The MSE is written mathematically as follows.
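
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$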

where $y_i$ and $\hat{y}_i$ are the actual and predicted values, respectively, and $n$ is the number of evaluated data points.
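
Both measures are straightforward to compute; the sketch below evaluates them with NumPy on a small set of hypothetical actual and predicted values.

```python
# A minimal sketch: computing MAE and MSE with NumPy (hypothetical values).
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # actual values
y_pred = np.array([2.5, 5.0, 4.0, 8.0])  # predicted values

mae = np.mean(np.abs(y_true - y_pred))   # average absolute error
mse = np.mean((y_true - y_pred) ** 2)    # average squared error; penalizes large errors more heavily

print(f"MAE = {mae:.3f}, MSE = {mse:.3f}")
```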