Exemplar Applications in Transportation
Last updated
Last updated
Given the amazing capability of CNNs for feature extraction and visual recognition, they have been widely applied in the transportation field. The most popular applications are autonomous or self-driving vehicles that use vision sensors to sense their surroundings (fixed objects, traffic signals, signs, pavement markings, other vehicles, motorcycles, bicyclists, pedestrians, etc.) to control themselves and avoid potential collisions (e.g, Tesla, Mobileye). Other applications include, but not limited to, traffic condition monitoring (e.g., congestion, incidents), truck taxonomy, etc.
YOLO models are among the most popular network architectures for transportation applications owning to their relatively high accuracy and real-time performance. For example, Sharma, et al. [67] used YOLOv3 model for vehicle detection (Figure 2-54) and based on the detections, SORT algorithm was applied to track each vehicle (Figure 2-55). In another study, Sharma, et al. [68] used both traditional deep convolutional neural network and YOLO models to detect traffic congestion from camera images, which performed well in nighttime and poor weather conditions. The study also explored a solution to the incident detection problem using semi-supervised techniques based on trajectory classification.
Figure 2-55 Detection results using YOLOv3 [67].
Figure 2-56 Tracking results [67].
In a different application, He, et al. [69] used YOLO object detector to detecting potential trucks shown in image frames (Figure 2-56), followed by the popular DeepLabV2 model (which introduces multiple parallel atrous convolutional layers[40]) to estimate the vehicle shape. The decision tree classifier was used to classify tractors and trailers.
Figure 2-57 truck detection using YOLO [69]
Besides YOLO detectors, Yang, et al. [70] used Inception v2 architecture to detect vehicles (Figure 2-57) and then track them using an IoU-based method together with smoothing techniques (Figure 2-58). The extracted trajectories were further analyzed to identify potential conflict events (Figure 2-59).
Figure 2-58 Detection and classification of vehicles from a traffic camera [70].
Figure 2-59 Vehicle Tracking by IoU followed by a structured smoothing algorithm [70].
Figure 2-60 Detection and Quantification of Live Traffic Conflict Events from a Traffic Camera. An example of rear-end conflict on the northbound approach of the intersection. Left: live camera view; Center: projected trajectories in the plan view; Right: quantified conflict event [70].
CNN has also been used for learning and predicting network-level traffic conditions. For example, Ma, et al. [71] proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with a high accuracy. First, spatiotemporal traffic dynamics are converted to images describing the time and space relations of traffic flow via a two-dimensional time-space matrix (Figure 2-60). Then, a CNN is applied to the image following two consecutive steps: abstract traffic feature extraction and network-wide traffic speed prediction.
In addition to traffic applications previously described, another popular area for CNN applications is pavement condition assessment using pavement surface images ([72], [73], [74], etc.). For example, Li & Zhao [74] designed A CNN by modifying AlexNet and then trained and validated it using a built database with 60000 images. Figure 1 shows the flow chart of developing a CNN for cracks detection.
Dabiri & Heaslip, [75] used CNN architectures to predict travel modes based on only raw GPS trajectories, where the modes are labeled as walk, bike, bus, driving, and train. Four fundamental motion characteristics of a moving object, including speed, acceleration, jerk, and bearing rate, represent four channels in the input layer. Each channel has the shape of (1×M), where M is the segment length (i.e., the number of GPS points that forms the segment). The input structure is shown in Figure 2-62. A variety of CNN configurations are evaluated and the highest accuracy of 84.8% has been achieved through the ensemble of the best CNN configuration.