Open Access



Impact of Distance Measures on the Performance of AIS Data Clustering

Marta Mieczyńska1,*, Ireneusz Czarnowski2

1 Department of Maritime Telecommunications, Gdynia Maritime University, Morska 81-87, 81-225, Gdynia, Poland
2 Department of Information Systems, Gdynia Maritime University, Morska 81-87, 81-225, Gdynia, Poland

* Corresponding Author: Marta Mieczyńska.

Computer Systems Science and Engineering 2021, 36(1), 69-82.


Automatic Identification System (AIS) data stream analysis relies on AIS data describing the behaviour of different vessels, including their routes. When the AIS data contain outliers or noise, or are incomplete, analysis of vessel behaviour is limited or not possible at all. In particular, when the data contain outliers, the AIS data cannot be assigned automatically to a particular vessel. In this paper, a clustering method is proposed to support AIS data analysis, to qualify noise and outliers with respect to their suitability, and finally to aid the reconstruction of the vessel’s trajectory. Clustering results have been obtained using selected algorithms, including k-means, k-medoids, and fuzzy c-means. Based on the clustering results, it is possible to decide how data with outliers should be qualified and whether they are useful in reconstructing the vessel trajectory. The main aim of this paper is to determine how the distance measure used during clustering influences AIS data clustering quality, and whether it has an impact on the reconstruction of vessel trajectories when the data are damaged. The computational experiments have been carried out using original AIS data. In general, the experiments and their results confirm the usefulness of cluster-based analysis when the data include outliers that are derived from the natural environment, and show that AIS data can be monitored and analysed using clustering even when outliers are present. The results of the computational experiments confirm that k-means with Euclidean distance has the best performance.
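To make the role of the distance measure concrete, the following is a minimal sketch of a Lloyd-style clustering loop in which the distance measure can be swapped. It is not the authors' implementation: the function names (`pairwise_distance`, `cluster`), the two metrics shown, and the synthetic two-dimensional points standing in for the positions of two vessels are all assumptions made for illustration. With the Euclidean distance the centre update is the usual k-means mean; with the Manhattan distance the sketch uses the coordinate-wise median, which is the matching update for that measure (often called k-medians).

```python
import numpy as np


def pairwise_distance(points, centres, metric):
    """Distance from every point to every centre, shape (n_points, n_centres)."""
    diff = points[:, None, :] - centres[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        return np.abs(diff).sum(axis=2)
    raise ValueError(f"unknown metric: {metric}")


def cluster(points, k, metric="euclidean", n_iter=100, seed=0):
    """Lloyd-style clustering with a selectable distance measure (illustrative)."""
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct points drawn from the data.
    centres = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre under `metric`.
        labels = pairwise_distance(points, centres, metric).argmin(axis=1)
        new_centres = centres.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) == 0:
                continue  # an empty cluster keeps its previous centre
            if metric == "euclidean":
                new_centres[j] = members.mean(axis=0)        # k-means update
            else:
                new_centres[j] = np.median(members, axis=0)  # k-medians update
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    labels = pairwise_distance(points, centres, metric).argmin(axis=1)
    return labels, centres


# Synthetic stand-in data (an assumption, not real AIS messages):
# two well-separated groups of 2-D points, 50 points each.
data_rng = np.random.default_rng(42)
group_a = data_rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = data_rng.normal(loc=[10.0, 10.0], scale=0.5, size=(50, 2))
points = np.vstack([group_a, group_b])

results = {}
for metric in ("euclidean", "manhattan"):
    labels, centres = cluster(points, k=2, metric=metric)
    results[metric] = labels
    print(metric, "cluster sizes:", np.bincount(labels, minlength=2))
```

On clean, well-separated data such as this, both distance measures would be expected to recover the same grouping; differences between measures are more likely to appear on noisy or damaged data of the kind the paper studies.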


Cite This Article

M. Mieczyńska and I. Czarnowski, "Impact of distance measures on the performance of AIS data clustering," Computer Systems Science and Engineering, vol. 36, no. 1, pp. 69–82, 2021.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.