Convolution Operation



Convolution is a mathematical operation on two functions, f and g, that produces a third function expressing how the shape of one is modified by the other. It is defined as the integral of the product of the two functions after one is reversed and shifted, and can be written as Eq. 1 [37]:

(f ∗ g)(t) = ∫ f(τ) g(t − τ) dτ    (Eq. 1)

Since we mostly deal with discrete inputs, such as the pixels of an image, the convolution operation reduces to an element-wise multiplication of the filter with the image patch it covers, followed by a sum (i.e., a dot product), as shown in Figure 2-24.

Figure 2-24 Pixel-wise convolution

As illustrated in Figure 2-24, the pixel value O55 in the original image becomes the pixel value P55 in the filtered image after convolution. To see the convolution operation in action, an animation is viewable at [38].
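A minimal sketch of this single step, with illustrative pixel values (the patch and kernel below are not from the figure): one output pixel is the element-wise product of the filter and the patch it covers, summed.

```python
import numpy as np

# A hypothetical 3x3 patch of pixel values centered on O55,
# and a 3x3 filter; the specific numbers are illustrative only.
patch = np.array([[10, 10, 10],
                  [10, 50, 10],
                  [10, 10, 10]])
kernel = np.array([[0, 0, 0],
                   [0, 1, 0],
                   [0, 0, 0]])  # identity filter: keeps only the center pixel

# One convolution step: element-wise multiply, then sum (a dot product).
p55 = np.sum(patch * kernel)
print(p55)  # 50: the center value O55 passes through unchanged
```

With a non-trivial kernel (e.g., an edge filter), the same multiply-and-sum step would instead measure how strongly the patch matches the filter's pattern.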

Given the 3x3 filters in Figure 2-23, padding is needed to keep the feature map the same size as the input image. Normally zero padding is used, with zero-valued pixels added around the edges of the image, as shown in Figure 2-25.

The same filter "scans" the image by sliding one pixel at a time (referred to as a stride of 1), horizontally and vertically, until it has covered the whole image. One step of horizontal scanning along the top row of the image is illustrated in Figure 2-26.

The filtered image in Figure 2-27 shows that the vertical edge has been extracted.

Similarly, Figure 2-28 shows the result of applying the horizontal filter.
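The padding, stride-1 scan, and edge filtering described above can be sketched as follows. This is an illustrative implementation, not code from the text; note that, like most deep learning libraries, it computes cross-correlation (the kernel is not flipped), which is what "convolution" conventionally means in CNNs.

```python
import numpy as np

def conv2d(image, kernel, stride=1, pad=1):
    """CNN-style convolution (cross-correlation) with zero padding."""
    image = np.pad(image, pad, mode="constant")      # zero padding at the edges
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):                              # slide vertically
        for j in range(ow):                          # slide horizontally
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)       # dot product at each step
    return out

# A 6x6 image with a vertical edge: bright left half, dark right half.
img = np.zeros((6, 6))
img[:, :3] = 10.0

# A simple vertical-edge filter (a Prewitt-style choice, for illustration).
vertical = np.array([[1, 0, -1],
                     [1, 0, -1],
                     [1, 0, -1]])

edges = conv2d(img, vertical, stride=1, pad=1)
print(edges.shape)  # (6, 6): zero padding keeps the feature map input-sized
```

The response is largest in the columns where the bright and dark halves meet, which is exactly the "vertical edge extracted" behavior shown in Figure 2-27.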

Note that the purpose of the example above is to illustrate how convolution and feature extraction work in general. Several variations have been proposed, including transposed convolution (deconvolution), separable convolution, deformable convolution [39], and dilated (atrous) convolution [40].

In a CNN, the filters (i.e., the weights) are parameters learned from data during the training stage. A convolution layer often consists of a stack of filters, producing a stack of feature maps. This can be visualized in Figure 2-29: each convolutional layer transforms a 3D input volume into a 3D output volume of neuron activations.

As shown in Figure 2-29, the input layer is an image whose width and height are the image's dimensions and whose depth is 3 (the Red, Green, and Blue channels). Successive convolution operations commonly reduce the width and height of the subsequent 3D volumes. The depth of each subsequent volume corresponds to the number of filters used, which is a hyperparameter of the CNN: the number of filters determines how many feature maps are generated from the 3D input volume of the previous layer. It should be pointed out that each filter's connectivity along the depth axis always equals the depth of the input volume. In other words, each filter is itself a 3D shape whose depth equals the depth of the input volume (see the square-based prism drawn in dashed lines in Figure 2-29). For classification problems, the last layer is typically a fully connected layer producing a 1x1xN volume, where N is the number of classes.
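The spatial size of each output volume follows the standard relation O = (W − F + 2P) / S + 1, where W is the input width (or height), F the filter size, P the padding, and S the stride; the output depth equals the number of filters. A small helper (illustrative, not from the text) makes this concrete:

```python
def conv_output_size(w, f, p, s):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    assert (w - f + 2 * p) % s == 0, "filter does not tile the input evenly"
    return (w - f + 2 * p) // s + 1

# 6x6 input, 3x3 filter, zero padding of 1, stride 1 -> same-size output.
print(conv_output_size(6, 3, 1, 1))  # 6
# Without padding, the feature map shrinks at each layer.
print(conv_output_size(6, 3, 0, 1))  # 4
```

Stacking K filters on a W x H x D input thus yields an O x O x K output volume, with each filter spanning the full input depth D.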

Figure 2-25 Illustration of zero padding
Figure 2-26 Filter sliding horizontally with stride 1
Figure 2-27 Illustration of applying the vertical edge filter
Figure 2-28 Illustration of applying the horizontal edge filter
Figure 2-29 Illustration of convolution layers and feature extraction