
Semi-supervised Learning

Semi-supervised learning is a third type of machine learning in which both labeled and unlabeled data are used to train the model. These models are often expected to outperform purely supervised models trained on the same labeled data, because they also exploit the unlabeled examples, and they require far less human annotation effort than supervised learning. Suppose a researcher wants to build a semi-supervised image classification model to classify human social interactions: only a small set of human-annotated images is needed, and the model can then be trained further on any publicly available open image dataset without additional labeling.

The natural question is: how do semi-supervised models actually use the unlabeled data? Since no labels are available for those instances, how can the model relate them to the prediction task? To do so, semi-supervised learning methods make assumptions about the underlying data distribution. Common assumptions include: nearby data points, or points in the same cluster, share the same label, and decision boundaries lie in regions of low data density. For example, consider a binary classification dataset with two classes, Class A and Class B, containing both labeled and unlabeled examples. One assumption might be that each class follows a Gaussian distribution; under this assumption, the unlabeled points can be assigned labels based on how they fall relative to the labeled points.
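
To make the Gaussian assumption concrete, here is a minimal Python sketch (not taken from the text): a Gaussian is fitted to the labeled points of each class, and each unlabeled point is assigned to whichever class gives it the higher density. The toy data, class names, and variable names are illustrative only.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.RandomState(42)

# Toy data: two classes, each roughly Gaussian (Class A and Class B).
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
class_b = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(100, 2))

# Only five points per class carry labels; everything else is unlabeled.
labeled_a, labeled_b = class_a[:5], class_b[:5]
unlabeled = np.vstack([class_a[5:], class_b[5:]])

# Fit one Gaussian per class using the labeled points alone.
dist_a = multivariate_normal(labeled_a.mean(axis=0), np.cov(labeled_a.T))
dist_b = multivariate_normal(labeled_b.mean(axis=0), np.cov(labeled_b.T))

# Assign each unlabeled point to the class whose Gaussian is more likely.
pseudo_labels = np.where(dist_a.pdf(unlabeled) > dist_b.pdf(unlabeled), "A", "B")
print("first five pseudo-labels:", pseudo_labels[:5])
print("last five pseudo-labels:", pseudo_labels[-5:])
```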

Two kinds of learning can take place in a semi-supervised setting. The first is inductive learning, where the model learns a general rule that can make predictions on new, unseen data. The second is transductive learning, where the goal is only to infer labels for the specific unlabeled data at hand.

Types of Semi-supervised Learning

  1. Self-Training Model: A learning algorithm is first trained using only the labeled data. The resulting model is then used to label the unlabeled data, and, iteratively, an unlabeled example is added to the training set only when the model is sufficiently confident in the label it has assigned (a code sketch of self-training and the graph-based model follows this list).

  2. Graph-based Model: A graph is constructed by connecting similar data points. The labels of the labeled points are known at the start and are propagated along the edges of the graph to infer labels for the unlabeled points.

  3. Co-Training Model: The assumption is that each example can be described by two different, complementary feature views. A separate model is trained on each view, and the most confident predictions each model makes on the unlabeled data are used to enlarge the training set for the other.

  4. Semi-supervised Support Vector Machine: The semi-supervised SVM seeks a maximum-margin boundary that separates the labeled data while passing through a low-density region of the unlabeled data, so that the resulting classification of the unlabeled points has low generalization error.
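
As a rough illustration of the first two approaches above, the sketch below uses scikit-learn's SelfTrainingClassifier (self-training with a confidence threshold) and LabelPropagation (a graph-based method). The toy dataset and the threshold value are arbitrary choices for the example; in scikit-learn's semi_supervised module, unlabeled samples are marked with the label -1.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import LabelPropagation, SelfTrainingClassifier

# Toy dataset: 300 points, of which only 10 (5 per class) keep their labels.
X, y_true = make_moons(n_samples=300, noise=0.1, random_state=0)
y = np.full_like(y_true, -1)                      # -1 marks "unlabeled"
labeled_idx = np.concatenate([np.where(y_true == c)[0][:5] for c in (0, 1)])
y[labeled_idx] = y_true[labeled_idx]

# 1. Self-training: the base classifier pseudo-labels the points it is
#    confident about (predicted probability above `threshold`) and is refit.
self_training = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
self_training.fit(X, y)
print("self-training accuracy:", self_training.score(X, y_true))

# 2. Graph-based: LabelPropagation spreads labels along a similarity graph;
#    `transduction_` holds the inferred label for every training point.
graph_model = LabelPropagation()
graph_model.fit(X, y)
print("label propagation accuracy:", (graph_model.transduction_ == y_true).mean())
```

The self-training model is inductive (it can call predict on new data), while transduction_ exposes the transductive labels inferred for the training points themselves, echoing the distinction drawn earlier.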

Applications of Semi-supervised Learning

Compared with supervised and unsupervised learning, which are used in a wide range of transportation applications, the use of semi-supervised learning in transportation has so far been limited. Reported applications include pedestrian counting, driver distraction detection, and incident detection.
