Applications in Transportation
The best-known use of PCA is dimensionality reduction. Most real-world datasets have many dimensions, which makes them hard to visualize. By keeping only the first two or three principal components, the samples can be represented in a 2D or 3D space. PCA can also compress the original dataset through the same reduction. For example, Li et al. (2007) used PCA to compress traffic flow data. By keeping a certain number of principal components, the compressed data retain most of the information in the original dataset at an acceptable compression ratio.
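The compression idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the method of Li et al. (2007): the function names `pca_compress` and `pca_reconstruct` are hypothetical, and the "traffic flow" matrix is synthetic data (200 days by 24 hourly counts) invented only for the example.

```python
import numpy as np

def pca_compress(X, k):
    """Project the rows of X onto the top-k principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center each column
    # SVD of the centered data: the rows of Vt are the principal directions
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                            # shape (k, n_features)
    scores = Xc @ components.T                     # shape (n_samples, k)
    explained = (S[:k] ** 2).sum() / (S ** 2).sum()
    return scores, components, mean, explained

def pca_reconstruct(scores, components, mean):
    """Map the compressed scores back to the original feature space."""
    return scores @ components + mean

# Synthetic stand-in for traffic flow data (an assumption, not real data):
# each day mixes a morning-peak and an evening-peak profile, plus noise.
rng = np.random.default_rng(0)
hours = np.arange(24)
profile_am = np.exp(-0.5 * ((hours - 8) / 2.0) ** 2)
profile_pm = np.exp(-0.5 * ((hours - 17) / 2.0) ** 2)
weights = rng.uniform(50, 150, size=(200, 2))
X = weights @ np.vstack([profile_am, profile_pm]) + rng.normal(0, 1.0, size=(200, 24))

scores, components, mean, explained = pca_compress(X, k=2)
X_hat = pca_reconstruct(scores, components, mean)

# Values stored after compression: scores, components, and the mean vector
stored = scores.size + components.size + mean.size
compression_ratio = X.size / stored
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"explained variance: {explained:.3f}")
print(f"compression ratio:  {compression_ratio:.1f}")
print(f"relative error:     {rel_error:.3f}")
```

Increasing `k` lowers the reconstruction error but also lowers the compression ratio, which is the trade-off behind "an acceptable compression ratio".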
Another application in transportation is traffic flow prediction (Tsekeris and Stathopoulos, 2006; Li et al., 2007; Xing et al., 2015). Xing et al. (2015) used robust principal component analysis to predict traffic on a real highway, and the prediction results outperformed those of other prediction methods. Li et al. (2014) also demonstrated that a PCA-based detrending method can be used for traffic prediction by decomposing traffic time series data into independent components.
Because the top principal components of a dataset capture the most important information rather than background noise, PCA is also a useful tool for feature extraction and data denoising. Zhang et al. (2006) used a PCA-SVM method for vehicle classification. PCA was used to extract the characteristics shared across a set of vehicle data, and the extracted features were then fed into a Support Vector Machine (SVM) for classification. Jonsson (2011) used PCA to cluster different road conditions using traffic camera and weather data.
The principal directions are very sensitive to anomalous data, which makes PCA an effective tool for anomaly detection. Lee et al. (2012) noted that adding a normal sample to a dataset, or removing one from it, changes the principal direction very little. However, if an abnormal sample is added or removed, the principal direction changes significantly. For traffic networks, Huang et al. (2007) used a related PCA-based approach for traffic network anomaly detection that considers both spatial and temporal relationships.
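The sensitivity of the principal direction to a single abnormal sample can be checked with a leave-one-out experiment. The sketch below is a simplified illustration of the idea, not the algorithm of Lee et al. (2012): the helper names `first_direction` and `loo_direction_scores` are hypothetical, the data are synthetic, and the score is simply one minus the absolute cosine between the first principal direction computed with and without each sample.

```python
import numpy as np

def first_direction(X):
    """Return the first principal direction (unit vector) of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

def loo_direction_scores(X):
    """Score each sample by how much removing it rotates the first direction."""
    u = first_direction(X)
    scores = np.empty(len(X))
    for i in range(len(X)):
        u_i = first_direction(np.delete(X, i, axis=0))
        # The sign of a principal direction is arbitrary, so compare |cosine|
        scores[i] = 1.0 - abs(u @ u_i)
    return scores

# Synthetic data (an assumption for illustration): 50 normal samples lying
# close to the line y = 0.5 x, plus one abnormal sample far off that line.
rng = np.random.default_rng(1)
t = rng.normal(0, 3.0, size=50)
normal = np.column_stack([t, 0.5 * t]) + rng.normal(0, 0.2, size=(50, 2))
outlier = np.array([[-15.0, 30.0]])
X = np.vstack([normal, outlier])       # the outlier is the last row (index 50)

scores = loo_direction_scores(X)
suspect = int(np.argmax(scores))
print("most influential sample index:", suspect)
print("its score:", scores[suspect])
print("median score of all samples:", np.median(scores))
```

Recomputing the decomposition once per sample is expensive for large datasets, so this brute-force loop is only meant to make the idea concrete.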
Limitations
There are two main limitations of PCA:
First, PCA relies on a linearity assumption. In other words, PCA performs best when the variables in the dataset are linearly correlated. When the relationships among the variables are nonlinear, the principal components may fail to capture the structure of the data.
Second, PCA requires all principal components to be orthogonal to one another. This constraint may prevent the projections from finding the true directions of highest variance when those directions are not themselves orthogonal.
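The first limitation can be illustrated with two synthetic 2D datasets, both of which are governed by a single underlying variable. In the sketch below (the helper `explained_variance_ratio` and both datasets are assumptions made for illustration), one principal component captures almost all the variance of linearly related data, but only about half the variance of points on a circle, where the relationship between the two coordinates is nonlinear.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2
    return var / var.sum()

rng = np.random.default_rng(2)

# Linearly related variables: y is roughly 2x
x = rng.uniform(-1, 1, size=500)
linear = np.column_stack([x, 2 * x + rng.normal(0, 0.05, size=500)])

# Nonlinearly related variables: points on the unit circle, driven by theta
theta = rng.uniform(0, 2 * np.pi, size=500)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

ratio_linear = explained_variance_ratio(linear)[0]
ratio_circle = explained_variance_ratio(circle)[0]
print(f"first component, linear data: {ratio_linear:.3f}")
print(f"first component, circle data: {ratio_circle:.3f}")
```

For the circle, dropping the second component would discard roughly half the variance, even though a single angle describes every point.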
Summary
In summary, this section explained the principal components, the principal directions, and the calculation process of PCA. The main advantage of PCA is that it can efficiently separate different classes in a lower-dimensional space. The meaning of PCA is straightforward, and its results are interpretable. The examples of traffic data compression, traffic prediction, vehicle feature extraction, and anomaly detection give a first impression of how PCA is applied in transportation. The limitations of the method were also discussed.