In recent years, advances in ITS, IoT, and wireless communication technology, together with the government's positive attitude toward open data, have made large amounts of traffic data available. These data contain detailed spatial and temporal information, and their features can have complicated data dimensions. Extracting the useful information hidden in such data requires a multi-dimensional data analysis technique. This study designs an unsupervised machine learning approach for multi-dimensional network data. The algorithm adopts the concepts of a network weight matrix and a space-time matrix to calculate multi-dimensional distances in network space. Combining these distances with the K-Medoids algorithm, which can handle discrete data, yields a clustering algorithm. Two methods are adopted to address the sensitivity of K-Medoids to its initial seeds and to the value of K. First, a systematic sampling approach to seed generation reduces the randomness of the algorithm. Second, a cluster splitting and merging method compensates for poor seed selection in the initial phase. In a case study of highway traffic clustering, the algorithm demonstrates several advantages. First, it is consistent and robust: because systematic sampling removes the randomness from seed generation, the results are reproducible across repeated experiments given the same inputs and parameters. Second, it respects the topology of the highway network: features that are close in Euclidean space but distant in network space are not assigned to the same cluster. Third, it can recognize cross-system traffic patterns. Finally, the clustering results show that the algorithm can distinguish differences in both the temporal dimension and the data dimension of traffic, so that features sharing a distinctive temporal and traffic pattern are grouped together. This study provides an approach for systematically analysing space-time or multi-dimensional network data, which can be used in research fields such as transportation management, logistics, and transportation geography. The medoids of the clusters can serve as representative rules for traffic patterns, and the clusters can be used as operational units for further decision making.
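The clustering procedure described above can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the toy six-segment network, the traffic values, and the equal weighting of network distance and attribute distance are all assumptions made for illustration, and the cluster splitting and merging step is omitted. The sketch computes shortest-path distances from a network weight matrix, combines them with a difference in a traffic attribute, and runs K-Medoids from evenly spaced (systematically sampled) seeds so that repeated runs give the same result.

```python
import numpy as np

def network_distances(weights):
    """All-pairs shortest-path distances (Floyd-Warshall).
    weights[i, j] > 0 is the length of the link i-j; 0 means no link."""
    d = np.where(weights > 0, weights, np.inf).astype(float)
    np.fill_diagonal(d, 0.0)
    for m in range(d.shape[0]):
        # Allow paths that pass through intermediate node m.
        d = np.minimum(d, d[:, [m]] + d[[m], :])
    return d

def systematic_seeds(n_items, k):
    """Pick k evenly spaced indices; deterministic, no random draws."""
    step = n_items / k
    return np.array([int(step * i + step / 2) for i in range(k)])

def k_medoids(dist, k, max_iter=100):
    """Plain K-Medoids on a precomputed distance matrix."""
    medoids = systematic_seeds(dist.shape[0], k)
    for _ in range(max_iter):
        # Assign every item to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # The new medoid minimises total distance to its cluster members.
            costs = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

# Hypothetical toy data: six road segments connected in a line (0-1-2-3-4-5).
n = 6
weights = np.zeros((n, n))
for i in range(n - 1):
    weights[i, i + 1] = weights[i + 1, i] = 1.0

# Hypothetical traffic attribute per segment (for example, mean flow).
flow = np.array([10.0, 12.0, 11.0, 50.0, 52.0, 51.0])

net = network_distances(weights)
attr = np.abs(flow[:, None] - flow[None, :])

# Assumed combination: each component scaled to [0, 1], then summed equally.
dist = net / net.max() + attr / attr.max()

medoids, labels = k_medoids(dist, k=2)
print(medoids, labels)
```

Because the seeds come from a fixed sampling rule rather than a random draw, calling `k_medoids` again with the same distance matrix and `k` returns the same medoids and labels.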
Effective start/end date: 2018/08/01 → 2019/07/31
- network analysis
- unsupervised machine learning