Computer Systems Science & Engineering
DOI:10.32604/csse.2023.028309
Article

An Ontology Based Cyclone Tracks Classification Using SWRL Reasoning and SVM

N. Vanitha1,*, C. R. Rene Robin1 and D. Doreen Hephzibah Miriam2

1Sri Sai Ram Engineering College, Chennai, 600044, Tamilnadu, India
2Director of Computational Intelligence Research Foundation, Chennai, 600023, Tamilnadu, India
*Corresponding Author: N. Vanitha. Email: vanithanphd1@gmail.com
Received: 07 February 2022; Accepted: 16 March 2022

Abstract: Tropical cyclones (TC) are often associated with severe weather conditions which cause great losses to lives and property. The precise classification of cyclone tracks is highly important in the field of weather forecasting. In this paper we propose a novel hybrid model that integrates an ontology and a Support Vector Machine (SVM) to classify tropical cyclone tracks into four classes, namely straight, quasi-straight, curving and sinuous, based on the track shape. The Tropical Cyclone TRacks Ontology (TCTRO) described in this paper is a knowledge base which comprises classes, object properties and data properties that represent the interaction among the TC characteristics. A set of SWRL (Semantic Web Rule Language) rules is inserted directly into the TCTRO ontology for reasoning and inferring new knowledge from the ontology. Furthermore, we propose a learning algorithm which utilizes the inferred knowledge for optimizing the feature subset. According to experiments on the IBTrACS dataset, the proposed ontology based SVM classifier achieves an accuracy of 98.3% with reduced classification error rates.

Keywords: Tropical cyclones classification; support vector machine; ontology; SWRL reasoning; SVM classification

1  Introduction

Tropical cyclones are extreme weather phenomena that cause heavy destruction during landfall. A TC is characterized by a warm-core vortex in the atmosphere consisting of cyclonic circulation in the lower troposphere and anticyclonic circulation in the upper troposphere. The vortex develops from a low-pressure system over a warm tropical ocean with high sea surface temperature (SST) [1]. However, out of the many lows that exist in the intertropical convergence zone (ITCZ), only a few develop into TCs. The subtle shift of the ITCZ north and south of the equator leads to the change of season. The intensification and movement of the vortex are the significant aspects of a TC. A TC weakens over oceanic areas for reasons such as travelling over regions of colder SST or the intrusion of cold air. 4-DVar data assimilation [2] has been used to represent the initial structure of the vortex in the simulation of Indian Ocean tropical cyclones. Thus, there is a need for exploratory analysis of TC tracks. Hence, this paper aims to develop a hybrid system consisting of two parts:

i)   design and development of the TC tracks ontology namely TCTRO

ii)   using the TCTRO ontology for the classification of tracks.

In the first part, the ontology is developed by creating the concepts, the instances, and the relations between them. The concepts and sub-concepts of the ontology are also weighted by generating scores based on the concept hierarchy. In the second part, this paper uses an SVM classifier to classify the TCs based on track types [3]. When applying classification, it is of great importance to determine the optimal combination of weights of the different features. Therefore, to achieve classification effectiveness, this paper selects features based on weights. We therefore believe that this study contributes to semantic web intelligence as well as to TC tracks analysis.

The rest of this article is organized as follows. Section 2 reviews the literature. Section 3 presents the details of the proposed Hybrid Ontology-SVM model, in which reasoning over the TCTRO ontology is performed by inserting SWRL rules. Section 4 presents the performance evaluation of the proposed work. Finally, Section 5 draws conclusions and outlines our future work.

2  Related Work

Since TCs are destructive in nature, there is always a need for an improved understanding of the behaviour of TC tracks and their landfall locations. The synthetic track is the most probable track constructed using the historical database of cyclone tracks. A spatial and temporal pattern discovery scheme for TCs is discussed in [4]. Information on the track of a storm is necessary to determine the landfall point on the coast and is therefore very important in disaster management. Spatio-temporal pyramid models have been presented for 3D action recognition systems [5,6]. Due to changes in climate, severe weather events are becoming more frequent and more intense, and simulation experiments have shown an increase in the frequency of tropical disturbances. To reveal the spatial and temporal characteristics of tropical cyclone forecasts, an ensemble of tracks is constructed from a smaller representative sample [7]. From a climatic perspective, cyclone tracks exhibit large variation in behaviour and are influenced by many factors such as atmospheric patterns, season and cyclogenesis location. The categorization of track shapes into four categories, namely straight, quasi-straight, curving and sinuous, based on sinuosity index values is analyzed in [8]. The present work uses this metric to classify the tracks. Interestingly, a strong positive correlation exists between track sinuosity and cyclone longevity [9].

Within semantic technologies, ontology generation has become one of the most revolutionary developments in many fields over the past decade [10]. In [11], timbre features are extracted from various musical instruments and classified. Another study incorporated a concept ontology to achieve effective video classification and automatic annotation of large-scale video archives [12]. To address the issue of suitable feature extraction for video content representation, semantic visual templates are generated in order to retrieve video shots [13]. A Drosophila gene expression pattern annotation model was developed for multi-label learning, and an ontology was employed for automatic video annotation in smart TV applications [14,15]. In recent years we have witnessed the continual growth of ontology-based classification, which uses ontologies to solve classification problems by enabling reasoning. In the medical domain, cancer staging is intensive work, and interoperation with the semantic content in medical imaging was developed to overcome such problems [16,17]. Visualization is becoming increasingly important in the semantic web; an ontology for software risk planning and control was developed and visualized in [18]. It is important to identify the purpose of developing an ontology and its intended uses; [19] delineates an ontology for e-learning by defining the concepts, properties, and relationships of teaching resources, thereby improving their quality. Feature selection enables the selection of the set of attributes that are most relevant to the classification modeling problem [20].

3  Hybrid Ontology Based TC Tracks Classification

The proposed Hybrid Ontology-SVM model for TC tracks classification is presented in this section. Fig. 1 outlines the architecture used in this work. In the hybrid classification model, the domain ontology and the SVM have different capabilities for capturing the data characteristics. The proposed approach starts with the development of the domain ontology TCTRO along with the SWRL rule-based extension to formalize the domain knowledge. A rule-based reasoner then provides the inferred results that identify the feature vectors. The obtained feature patterns containing the optimized parameters are applied to the SVM model to classify the TC tracks. Each process is explained in detail in the following sections.

images

Figure 1: Architecture of the proposed approach

The ontology is expressed using the Web Ontology Language (OWL) to represent the knowledge entities and the relationships between them. In an ontology model, classes, object properties and data properties represent the concepts, relations, and attributes respectively. Axioms are typically expressed as logical expressions intended to refine the concepts and relations. Moreover, defining the classes and relationship properties in a hierarchical order refines the logic-based ontology modeling in semantic graph databases. By applying the ontology modeling, the structure of TCTRO is denoted by a knowledge graph KG = (T, L). T denotes all the tracks of the knowledge representation, with the nodes of the graph defined by a finite set of vertices T = {t1, t2, …, tn}. Here, T is partitioned into four pairwise disjoint subsets:

•   Tstr set of straight tracks

•   Tsin set of sinuous tracks

•   Tqstr set of Quasi straight tracks

•   Tcur set of curving tracks

T = Tstr ∪ Tsin ∪ Tqstr ∪ Tcur

Tx ∩ Ty = ∅ for all Tx, Ty ϵ {Tstr, Tsin, Tqstr, Tcur} with x ≠ y.
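The partition condition on T can be checked mechanically. The following is a minimal sketch in Python; the track identifiers and the helper name `is_partition` are hypothetical and serve only to illustrate the two conditions (full coverage of T and pairwise disjointness).

```python
from itertools import combinations


def is_partition(universe, parts):
    """Return True if the subsets in `parts` are pairwise disjoint and cover `universe`."""
    for a, b in combinations(parts, 2):
        if a & b:  # a shared element violates Tx ∩ Ty = ∅
            return False
    return set().union(*parts) == set(universe)


# Hypothetical track identifiers, for illustration only.
T_str = {"t1", "t4"}
T_qstr = {"t2"}
T_cur = {"t3", "t6"}
T_sin = {"t5"}
T = {"t1", "t2", "t3", "t4", "t5", "t6"}

partition_ok = is_partition(T, [T_str, T_sin, T_qstr, T_cur])
```

A track assigned to two track types, or a track left without a type, makes the check fail.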

3.1 Reasoning with SWRL

SWRL is a Semantic Web rule language that combines sublanguages of OWL with the Datalog sublanguages of the Rule Markup Language (RuleML). SWRL includes a high-level abstract syntax for Horn-like rules of the form antecedent ⇒ consequent, in which both the antecedent and the consequent are conjunctions of atoms written a1 ∧ … ∧ an. Atoms in rules can be of the types class C(x), individual property P(x, y), data range restricted property Q(x, z), sameAs(x, y), differentFrom(x, y) or built-in atoms. The SWRLTab is a Protégé plug-in that provides a development environment for working with SWRL rules. We use it to edit and execute SWRL rules in order to make TCTRO more complete and to enable the inference of new knowledge.

Rule 1: The following rule models the conditions that are favorable for the formation of TCs [21]. The rule checks for enhanced TC activity by examining a synthesis of the environmental conditions needed for TCs to form and thrive. TCs form over a warm ocean surface, and the essential climatological aspects of TC formation are: (a) sea surface temperature (SST) greater than 26°C, (b) low values of vertical wind shear (VWS), (c) cyclogenesis location, (d) vortex and (e) high relative humidity.

images

Rule 2: Rule 2 represents the classification of cyclonic disturbances in the North Indian Ocean (NIO) as per the India Meteorological Department (IMD). The rule below uses data range restrictions on the wind speed variable ws to map the intensity of a TC.

images
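The data range mapping in Rule 2 can be sketched outside the ontology as a simple lookup. The Python sketch below is an illustration only: the rule body is not reproduced in this text, so the category bounds (the commonly cited IMD scale, in knots) and the function name `imd_category` are assumptions rather than the rule as implemented in TCTRO.

```python
# Commonly cited IMD intensity scale as (lower bound in knots, category).
# Assumption: Rule 2 in TCTRO may use different bounds or units.
IMD_SCALE = [
    (120, "Super Cyclonic Storm"),
    (90, "Extremely Severe Cyclonic Storm"),
    (64, "Very Severe Cyclonic Storm"),
    (48, "Severe Cyclonic Storm"),
    (34, "Cyclonic Storm"),
    (28, "Deep Depression"),
    (17, "Depression"),
]


def imd_category(ws_knots):
    """Map a maximum sustained wind speed (knots) to an intensity category."""
    if ws_knots < 0:
        raise ValueError("wind speed must be non-negative")
    # The scale is sorted from strongest to weakest, so the first match wins.
    for lower_bound, category in IMD_SCALE:
        if ws_knots >= lower_bound:
            return category
    return "Low Pressure Area"
```

In SWRL the same mapping would be written as one rule per category, each with a pair of comparison built-ins on ws in the antecedent.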

Rule 3: This rule interprets the enhancement of TC activity. In particular, it is not known how quickly a TC may intensify or weaken as it interacts with various environmental factors. However, a drastic drop of pressure at the center indicates rapid intensification of the TC vortex. Furthermore, pressure drops greater than 4 hPa were more likely to start intensification events.

images

Rule 4: Phases 3–4 of the MJO provide favorable conditions that enhance TC convection activity in the BoB. Specifically, during La Niña years the frequency and longevity of TCs in the BoB region are higher than in El Niño years. Above-normal convection is observed during the combined effect of Phases 3–4 of the MJO with the La Niña regime. Rule 4 interprets the combined effect of MJO Phases 3–4 with La Niña, which results in both spatial and temporal patterns of interference and modulation of Indian rainfall.

images

Rule 5: Rule 5 interprets the sinuosity index values to identify the track type as one of the classes straight, quasi-straight, curving and sinuous. Cyclogenesis_Point corresponds to the starting location of the TC, whereas Location corresponds to the current location and contains the latitude and longitude subclasses.

images

Rule 6: Rule 6 models the prevailing steering winds influencing the track of a cyclone. The subclasses of steering winds are High_Level_Winds_250 mb, Mid_Level_Winds_500 mb, Low_Level_Winds_800 mb.

images

Fig. 2 displays the OntoGraf view of the developed TCTRO ontology O TCTRO. Fig. 3 displays the SWRL tab for creating and executing SWRL rules for querying TCTRO. We have developed many rules for analyzing the behavior of TC tracks.

images

Figure 2: View of TCTRO ontology using OntoGraf plugin in the Protégé environment

images

Figure 3: SWRLTab displaying the rules of TCTRO in the Protégé environment

3.2 Tracks Feature Extractor

After the O TCTRO is constructed by defining classes, properties, instances and defining further information about properties, the Turtle representation of the ontology is generated and exported. Turtle is a syntax and file format for expressing data in the Resource Description Framework (RDF) data model. The RDF triples and OWL are now input to the Tracks Feature Extractor which converts each tuple in the training set to a feature vector. The obtained feature vectors are normalized.
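The normalization method is not specified above. As a minimal sketch, assuming min-max scaling of each feature to [0, 1], the step could look as follows in Python; the feature names and values are hypothetical.

```python
def min_max_normalize(vectors):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    if not vectors:
        return []
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in vectors:
        normalized.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return normalized


# Hypothetical feature vectors:
# [max wind speed (kt), min central pressure (hPa), sinuosity index]
features = [
    [35.0, 998.0, 1.02],
    [65.0, 970.0, 1.30],
    [120.0, 920.0, 1.85],
]
scaled = min_max_normalize(features)
```

Scaling matters for the later SVM stage because the RBF kernel depends on Euclidean distances, which are otherwise dominated by features with large numeric ranges such as pressure.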

3.3 Grouping Scheme Learner

The feature vectors are now input to the Grouping Scheme Learner. In our approach the Grouping Scheme Learner employs two algorithms, namely the Class Weighting (CW) algorithm and the Collaborative Concept Weight Correlation (C2WC) algorithm, to select the optimal feature subset for improving the classification performance. The pseudocode of the proposed Class Weighting algorithm is presented in Algorithm 1. The algorithm weights the classes and subclasses of O TCTRO based on the hierarchy and the concept importance.

images

images

Algorithm 2 shows the pseudocode of the Collaborative Concept Weight Correlation (C2WC) algorithm. TC records usually contain many attributes; however, the performance of classification depends on the choice of attributes. The proposed C2WC algorithm constrains each vector to contain the top k features, and a vector is selected only if it covers k features. Moreover, duplicates are added to the data partition in order to reduce the effect of overfitting. C2WC computes the score of each vector according to a heuristic function based on correlations. The weighted Pearson correlation is computed between the meteorological vectors to select the optimal feature subset. Further, to reduce the size of the training set, the algorithm selects those candidate vectors whose score exceeds the given threshold value t, and removes the candidate vectors whose score is below t.
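The correlation and thresholding steps can be illustrated with a short Python sketch. It is not the C2WC pseudocode, which is not reproduced in this text: the scoring heuristic used here (absolute weighted Pearson correlation of each candidate with a reference vector) and the names `weighted_pearson` and `select_candidates` are assumptions made for illustration.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    if not (len(x) == len(y) == len(w)) or not x:
        raise ValueError("x, y and w must be non-empty and of equal length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov_xy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector; treat as no signal
    return cov_xy / math.sqrt(var_x * var_y)


def select_candidates(vectors, reference, weights, t):
    """Keep the vectors whose score exceeds the threshold t.

    Assumed score: |weighted Pearson correlation| with `reference`.
    """
    return [v for v in vectors if abs(weighted_pearson(v, reference, weights)) > t]
```

In this sketch the weights play the role of the concept weights produced by the CW algorithm, so that features attached to more important concepts contribute more to the correlation.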

3.4 Support Vector Machine Classifier

The optimal training set generated by the Grouping Scheme Learner is the input to the trainer. Our approach adopts an SVM classifier and performs classification with 5-fold cross-validation. The optimized training set, which typically possesses multiple features, is given as input to the SVM classifier, which classifies the data into four classes, i.e., straight, quasi-straight, curving and sinuous, based on the shape of the track. The main idea behind the SVM technique is to construct hyperplanes that separate the dataset into these four classes. Consider a set T of n training vectors xi ϵ ℝD, i = 1, …, n; the mapping from the data space to a higher-dimensional feature space is given by ϕ: ℝD → ℝF. The decision hyperplane is written in the following form

$$y_i = f(x_i) = w^{T} x_i + b = 0 \tag{1}$$

For the M different classes we label the current m-th track type class as positive and all others negative such that the set of labels is given by

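$$y_i = \begin{cases} +1 & \text{if } l = m \\ -1 & \text{if } l \neq m \end{cases} \quad \text{where } l \in \{1, 2, \ldots, M\} \tag{2}$$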

Therefore, in order to optimize the hyperplane, we optimize for the maximal margin between the classes and this can be formulated as

$$\underset{w,\,\xi}{\text{minimize}} \;\; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \tag{3}$$

Here ξi ϵ ℝ are slack variables and C ϵ ℝ+ is a penalty parameter for the slack variables. The Lagrangian in its primal form for the above problem can be written as

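$$L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^{2} - \sum_{i=1}^{n} \alpha_i y_i \left( w^{T} x_i + b \right) + \sum_{i=1}^{n} \alpha_i \tag{4}$$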

Accordingly, the dual formulation becomes

$$L_D(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{5}$$
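The step from Eq. (4) to Eq. (5) follows from the stationarity conditions of the Lagrangian. As a sketch, setting the derivatives of Eq. (4) with respect to w and b to zero and substituting back gives the dual in terms of inner products:

```latex
\begin{aligned}
\frac{\partial L}{\partial w} = 0 \;&\Rightarrow\; w = \sum_{i=1}^{n} \alpha_i y_i x_i, \\
\frac{\partial L}{\partial b} = 0 \;&\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0, \\
L_D(\alpha) &= \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^{T} x_j .
\end{aligned}
```

Replacing the inner product $x_i^{T} x_j$ with the kernel $K(x_i, x_j)$ yields Eq. (5). When the slack variables of Eq. (3) are included, the multipliers are additionally bounded by $0 \le \alpha_i \le C$.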

LD denotes the dual form of the Lagrangian and K(xi, xj) is the kernel function. A nonlinear radial basis function (RBF) is used as the kernel and is defined as

$$K(x_i, x_j) = \exp\left( -\gamma \lVert x_i - x_j \rVert^{2} \right) \quad \text{where } \gamma = \frac{1}{2\sigma^{2}} \tag{6}$$

In addition, the accuracy of classification depends on the selection of γ, ξ and C.
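The one-versus-rest labeling of Eq. (2) and the RBF kernel of Eq. (6) are simple enough to write out directly. The Python sketch below uses only the standard library; the function names are hypothetical, and it illustrates the two formulas rather than a full SVM solver.

```python
import math


def one_vs_rest_labels(class_ids, m):
    """Eq. (2): label class m as +1 and every other class as -1."""
    return [1 if l == m else -1 for l in class_ids]


def rbf_kernel(xi, xj, gamma):
    """Eq. (6): K(xi, xj) = exp(-gamma * ||xi - xj||^2)."""
    if len(xi) != len(xj):
        raise ValueError("vectors must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def gamma_from_sigma(sigma):
    """Eq. (6): gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)
```

A small γ, such as the value 0.005 reported in Section 4, corresponds to a wide kernel (σ = 10), so that distant training vectors still influence the decision function.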

4  Results and Discussion

The experiments were performed on a 64-bit Windows server with an Intel Core i7 4.2 GHz processor and 8 GB of RAM. The International Best Track Archive for Climate Stewardship (IBTrACS) best track data has been utilized in this study. The proposed ontology based tropical cyclone tracks classification framework is applied to 236 TC tracks over the period 1980–2019. We performed the simulation of Random Forest (RF), KNN, SVM and the proposed methodology (Ont + SVM) using Python's open-source library scikit-learn. The metrics used to evaluate the performance of the proposed work are accuracy, precision and recall, defined in Eqs. (7)–(9).

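$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{7}$$

$$\text{True Positive Rate (TPR or Recall)} = \frac{TP}{TP + FN} \tag{8}$$

$$\text{Positive Predictive Value (PPV or Precision)} = \frac{TP}{TP + FP} \tag{9}$$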
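Eqs. (7)–(9) translate directly into code. The following Python sketch computes the three metrics from confusion-matrix counts; the counts in the example are hypothetical and are not taken from Tab. 1.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall and precision as in Eqs. (7)-(9)."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical counts for a single class, for illustration only.
metrics = classification_metrics(tp=50, tn=40, fp=5, fn=5)
```

For a four-class problem such as track type classification, the counts are obtained per class in a one-versus-rest fashion from the confusion matrix and the per-class metrics are then averaged.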

Tab. 1 shows the confusion matrix of the proposed ontology based SVM classifier, which has an overall accuracy of 98.3%. For the SVM classifier, the parameters C and γ were set to 0.25 and 0.005, respectively, following a 5-fold cross-validation analysis. Tab. 2 shows the performance analysis of the proposed ontology based SVM classifier in terms of precision, recall and accuracy. The proposed methodology achieves a precision of 94.5%, a recall of 99.4% and an accuracy of 98.3%. Tab. 3 compares the proposed ontology-based classifier with three selected classifiers, namely RF, KNN and SVM. The experimental results demonstrate that the ontology based SVM classifier outperformed the RF, KNN and SVM classifiers with respect to precision, recall and accuracy. Our proposed methodology improved the overall accuracy of the classifier by approximately 3.5 to 4 percent.

images

images

images

An important observation regarding TC tracks classification is that a model with high recall is well suited to the task. The comparison of precision and recall is given in Fig. 4. The highest recall, 99.4%, is observed for the proposed method (Ont + SVM). The increase in recall from SVM to Ont + SVM is 3.7%. This indicates that, for the TC tracks dataset, Ont + SVM is the better choice for classification.

images

Figure 4: Precision and Recall performance measurements of the proposed hybrid Ont + SVM algorithm and machine learning algorithms RF, KNN, Linear SVM

Fig. 5 shows the comparison of accuracy across the four classifiers. The classifiers are compared using accuracy to evaluate the performance which shows the proposed methodology outperformed the other three classifiers. Fig. 6 shows the precision, recall and accuracy results.

images

Figure 5: Accuracy comparison between the proposed hybrid Ont + SVM algorithm and machine learning algorithms RF, KNN, Linear SVM

images

Figure 6: Comparison of the evaluation metrics precision, recall and accuracy

The observed tracks are classified into categories based on their sinuosity values. The sinuosity index is the ratio of the actual distance traveled by the cyclone to the straight-line distance between the starting point (cyclogenesis) and the endpoint (landfall). Tab. 4 shows the track division for the different sinuosity ranges. The histogram of the average sinuosity index for each year over the period 2000–2016 is shown in Fig. 7. Finally, the GIS visualization of the classified TC tracks, produced using PyQGIS, the Python environment in QGIS, is illustrated in Fig. 8.
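The sinuosity ratio described above can be computed from a sequence of track fixes. The Python sketch below assumes great-circle (haversine) distances on a spherical Earth of radius 6371 km and a track given as (latitude, longitude) pairs in degrees; the function names are hypothetical, and the class boundaries of Tab. 4 are not reproduced here, so no track type is assigned.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical-Earth assumption


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sinuosity_index(track):
    """Ratio of the path length along the track to the start-to-end distance."""
    if len(track) < 2:
        raise ValueError("a track needs at least two points")
    path_length = sum(haversine_km(a, b) for a, b in zip(track, track[1:]))
    straight = haversine_km(track[0], track[-1])
    if straight == 0:
        raise ValueError("start and end points coincide; the ratio is undefined")
    return path_length / straight
```

A perfectly straight track yields a value of 1, and the value grows as the track meanders; a looping track that returns to its origin has no defined ratio, which the sketch reports as an error.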

images

images

Figure 7: Histogram of average Sinuosity Index SI (2000–2016)

images

Figure 8: GIS visualization of classified tracks (a) Straight (b) Quasi Straight (c) Curving (d) Sinuous

5  Conclusion

The ontology provides information on cyclone attribute vocabularies and conveys how they depend on each other. This makes it possible to analyze and interpret the semantic contexts by capturing the state and event elements of the semantic representation. In addition, a set of SWRL rules is inserted into the ontology for reasoning about the classes and individuals. In this paper we proposed the correlation-based C2WC algorithm for selecting the optimal feature subset. The resulting optimized feature subset significantly improves the performance of the SVM classifier. The experimental results show that the proposed hybrid ontology based SVM classification model achieves higher performance in terms of accuracy, precision and recall. In future work, we will extend this approach towards cyclone landfall prediction by incorporating a wider range of attributes.

Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.

References

  1. P. R. C. Rahul, P. S. Salvekar and P. C. S. Devara, “Super cyclones induce variability in the aerosol optical depth prior to their formation over the oceans,” IEEE Geoscience and Remote Sensing Letters, vol. 9, no. 5, pp. 985–988, 2012.
  2. D. Gopalakrishnan and A. Chandrasekar, “Improved 4-DVar simulation of Indian Ocean tropical cyclones using a regional model,” IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 9, pp. 5107–5114, 2018.
  3. S. K. Singh, N. Jaiswal, C. M. Kishtawal, R. Singh and P. K. Pal, “Early detection of cyclogenesis signature using global model products,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 8, pp. 5116–5121, 2014.
  4. C. Yu, W. Ding, M. Morabito and P. Chen, “Hierarchical spatio-temporal pattern discovery and predictive modeling,” IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 4, pp. 979–993, 2016.
  5. L. Shao, X. Zhen, D. Tao and X. Li, “Spatio-temporal Laplacian pyramid coding for action recognition,” IEEE Transactions on Cybernetics, vol. 44, no. 6, pp. 817–827, 2014.
  6. J. Weng, C. Weng, J. Yuan and Z. Liu, “Discriminative spatio-temporal pattern discovery for 3D action recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 4, pp. 1077–1089, 2019.
  7. J. Wang, X. Liu, H. W. Shen and G. Lin, “Multi-resolution climate ensemble parameter analysis with nested parallel coordinates plots,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 81–90, 201
  8. J. P. Terry and G. Gienko, “Developing a new sinuosity index for cyclone tracks in the tropical South Pacific,” Natural Hazards, vol. 59, no. 2, pp. 1161–1174, 2011.
  9. J. P. Terry, I. K. Kim and S. Jolivet, “Sinuosity of tropical cyclone tracks in the South West Indian Ocean: Spatiotemporal patterns and relationships with fundamental storm attributes,” Applied Geography, vol. 45, no. D11, pp. 29–40, 2013.
  10. R. Studer, V. R. Benjamins and D. Fensel, “Knowledge engineering: Principles and methods,” Data & Knowledge Engineering, vol. 25, no. 2, pp. 161–197, 1998.
  11. S. Kolozali, M. Barthet, G. Fazekas and M. Sandler, “Automatic ontology generation for musical instruments based on audio analysis,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2207–2220, 2013.
  12. J. Fan, H. Luo, Y. Gao and R. Jain, “Incorporating concept ontology for hierarchical video classification, annotation, and visualization,” IEEE Transactions on Multimedia, vol. 9, no. 5, pp. 939–957, 2007.
  13. J. Fan, Y. Gao and H. Luo, “Integrating concept ontology and multitask learning to achieve more effective classifier training for multilevel image annotation,” IEEE Transactions on Image Processing, vol. 17, no. 3, pp. 407–426, 2008.
  14. Y. Li, S. Ji, S. Kumar, J. Ye and Z. Zhou, “Drosophila gene expression pattern annotation through multi-instance multi-label learning,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 1, pp. 98–112, 2012.
  15. J. Jeong, H. Hong and D. Lee, “Ontology-based automatic video annotation technique in smart TV environment,” IEEE Transactions on Consumer Electronics, vol. 57, no. 4, pp. 1830–1836, 2011.
  16. D. L. Rubin, P. Mongkolwat, V. Kleper, K. Supekar and D. S. Channin, “Annotation and image markup: accessing and interoperating with the semantic content in medical imaging,” IEEE Intelligent Systems, vol. 24, no. 1, pp. 57–65, 2009.
  17. R. Yao, G. Lin, C. Shen, Y. Zhang and Q. Shi, “Semantics-aware visual object tracking,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 6, pp. 1687–1700, 2019.
  18. C. R. Rene Robin and G. V. Uma, “Design and development of ontology suite for software risk planning, software risk tracking and software risk control,” Journal of Computer Science, vol. 7, no. 3, pp. 320–327, 2011.
  19. C. R. Rene Robin and G. V. Uma, “An ontology driven e-learning agent for software risk management,” International Journal of Academic Research, vol. 3, no. 2, pp. 30–36, 2011.
  20. Y. Soe-Tsyr and S. Jerry, “Ontology-based structured cosine similarity in document summarization: With applications to mobile audio-based knowledge management,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 35, no. 5, pp. 1028–1040, 2005.
  21. P. Bhardwaj, D. R. Pattanaik and O. Singh, “Tropical cyclone activity over Bay of Bengal in relation to El Niño–Southern Oscillation,” International Journal of Climatology, vol. 39, no. 14, pp. 5452–5469, 2019.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.