Abstract: Tropical cyclones (TCs) are often associated with severe weather conditions that cause great losses of life and property. The precise classification of cyclone tracks is of significant importance in weather forecasting. In this paper we propose a novel hybrid model that integrates an ontology and a Support Vector Machine (SVM) to classify tropical cyclone tracks into four classes, namely straight, quasi-straight, curving, and sinuous, based on the track shape. The Tropical Cyclone TRacks Ontology (TCTRO) described in this paper is a knowledge base that comprises classes, object properties, and data properties representing the interactions among the TC characteristics. A set of SWRL (Semantic Web Rule Language) rules is inserted directly into the TCTRO ontology for reasoning and inferring new knowledge from the ontology. Furthermore, we propose a learning algorithm that uses the inferred knowledge to optimize the feature subset. In experiments on the IBTrACS dataset, the proposed ontology-based SVM classifier achieves an accuracy of 98.3% with reduced classification error rates.
Tropical cyclones are extreme weather phenomena that cause heavy destruction during landfall. TC is characterized by a warm core vortex in the atmosphere consisting of cyclonic circulation in the lower troposphere and anticyclonic circulation in the upper troposphere. The vortex develops from a low-pressure system that exists over a warm tropical ocean of high sea surface temperature [ design and development of the TC tracks ontology namely TCTRO using the TCTRO ontology for the classification of tracks.
In the first part, the ontology is developed by creating the concept, instance, and relation between them. Also, the concepts and sub-concepts of the ontology are weighted by generating scores based on concept hierarchy. In the second part, this paper uses an SVM classifier to classify the TCs based on track types [
The rest of this article is organized as follows. Section 2 reviews the literature. Section 3 presents the details of the proposed Hybrid Ontology-SVM model, including the reasoning performed in the TCTRO ontology by inserting SWRL rules. Section 4 presents the performance and evaluation of the proposed work. Finally, Section 5 draws conclusions and outlines our future work.
Since TCs are destructive in nature, there is always a need for an improved understanding of the behaviour of TC tracks and their landfall locations. The synthetic track is the most probabilistic track constructed using the historical database of cyclone tracks. The spatial and temporal pattern discovery scheme is discussed for TC [
Within semantic technologies, ontology generation has become one of the most revolutionary developments in many fields over the past decade [
The proposed Hybrid Ontology-SVM model for TC track classification is presented in this section.
The ontology is expressed in the Web Ontology Language (OWL) to represent the knowledge entities and show the relationships between them. In an ontology model, classes, object properties, and data properties represent the concepts, relations, and attributes, respectively. Axioms are typically expressed as logical expressions intended to refine the concepts and relations. Moreover, defining the classes and relationship properties in a hierarchical order refines the logic-based ontology modeling in semantic graph databases. By applying ontology modeling, the structure of TCTRO is denoted by a knowledge graph KG = (T, L). T denotes all the tracks of the knowledge representation, with the nodes of the graph defined by a finite set of vertices T = {t1, t2, …, tn}. Here, T is partitioned into four disjoint subsets:
Tstr: set of straight tracks
Tsin: set of sinuous tracks
Tqstr: set of quasi-straight tracks
Tcur: set of curving tracks
T = Tstr ∪ Tsin ∪ Tqstr ∪ Tcur
Tx ∩ Ty = ∅, where Tx, Ty ∈ {Tstr, Tsin, Tqstr, Tcur} and x ≠ y.
SWRL is a Semantic Web rule language that combines parts of OWL with the Datalog sublanguages of the Rule Markup Language (RuleML). SWRL includes a high-level abstract syntax for Horn-like rules of the form antecedent ⇒ consequent. In this form, the antecedent and the consequent are conjunctions of atoms written a1 ∧ … ∧ an. Atoms in rules can be of the types class C(x), individual property P(x, y), data-range-restricted property Q(x, z), sameAs(x, y), differentFrom(x, y), or built-in atoms. The SWRLTab is a Protégé plug-in that provides a development environment for working with SWRL rules. We use it to edit and execute SWRL rules in order to provide a more complete TCTRO and to make it possible to infer new knowledge.
After the
The feature vectors are now input to the Grouping Scheme Learner. In our approach, the Grouping Scheme Learner employs two algorithms, namely the Class Weighting (CW) algorithm and the Collaborative Concept Weight Correlation (C2WC) algorithm, to select the optimal feature subset for improving the classification performance. The pseudocode of the proposed Class Weighting algorithm is presented in Algorithm 1. The algorithm weights the classes and subclasses of the
Algorithm 2 shows the pseudocode of the Collaborative Concept Weight Correlation (C2WC) algorithm. TC records usually contain many attributes; however, the performance of classification depends on the choice of attributes. The proposed C2WC algorithm constrains each vector to contain k most features, and each vector is selected if it covers k features. Moreover, duplicates are added to the data partition in order to reduce the effect of overfitting. C2WC computes the score of each vector according to a heuristic function based on correlations: the weighted Pearson correlation is computed between the meteorological vectors to select the optimal feature subset. Further, to reduce the size of the training set, the algorithm selects those candidate vectors whose score exceeds the given threshold t, and removes the candidate vectors whose score falls below t.
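The correlation-and-threshold step can be illustrated with a minimal sketch. It is not the authors' Algorithm 2: the scoring rule (absolute weighted Pearson correlation against a single reference vector), the function names, and the sample values are all assumptions made for illustration.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation between x and y with non-negative weights w."""
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y and w must have the same length (at least 2)")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def select_candidates(candidates, reference, weights, threshold):
    """Keep candidates whose score exceeds the threshold.

    Assumption: score = |weighted Pearson correlation with `reference`|.
    The paper's actual heuristic function is not specified here.
    """
    scores = {
        name: abs(weighted_pearson(vector, reference, weights))
        for name, vector in candidates.items()
    }
    return {name: score for name, score in scores.items() if score > threshold}


# Hypothetical feature vectors (made-up values, one entry per TC record).
candidates = {
    "feature_a": [2.0, 4.0, 6.0, 8.0],
    "feature_b": [1.0, -1.0, 1.0, -1.0],
}
reference = [1.0, 2.0, 3.0, 4.0]   # hypothetical reference vector
weights = [1.0, 1.0, 1.0, 1.0]     # hypothetical concept weights
kept = select_candidates(candidates, reference, weights, threshold=0.9)
print(sorted(kept))
```

Using the absolute value treats strong negative correlation as equally informative; a different sign convention would change which vectors survive the threshold.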
The optimal training set generated by the Grouping Scheme Learner is the input to the trainer. Our approach adopts an SVM classifier evaluated with 5-fold cross-validation. The optimized training set, which typically possesses multiple features, is given as input to the SVM classifier, which classifies the data into four classes (straight, quasi-straight, curving, and sinuous) based on the shape of the track. The main idea behind the SVM technique is to construct hyperplanes that separate the dataset into these four classes. Consider a set T of n training vectors xi ∈ ℝ^D, i = 1, …, n; the mapping from the data space to a higher-dimensional feature space is given by ϕ: ℝ^D → ℝ^F. The decision hyperplane is written in the following form
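In the standard SVM formulation, which is assumed here, the hyperplane in feature space can be written as follows, where the weight vector w and the bias b are notation introduced for this sketch:

```latex
f(x) = w^{\top} \phi(x) + b = 0, \qquad w \in \mathbb{R}^{F},\; b \in \mathbb{R}
```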
For the M different classes we label the current m-th track type class as positive and all others negative such that the set of labels is given by
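Under the usual one-vs-rest convention (an assumption consistent with the description above), the labels for the m-th class would take the form:

```latex
y_i^{(m)} =
\begin{cases}
+1, & \text{if } x_i \text{ belongs to class } m \\
-1, & \text{otherwise}
\end{cases}
\qquad i = 1, \dots, n, \quad m = 1, \dots, M
```

For a fixed class m, the superscript is dropped and the label is written y_i.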
Therefore, in order to optimize the hyperplane, we optimize for the maximal margin between the classes and this can be formulated as
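Assuming the standard soft-margin formulation, with w and b the weight vector and bias of the hyperplane and y_i ∈ {+1, −1} the labels of the current one-vs-rest problem, the optimization problem reads:

```latex
\min_{w,\, b,\, \xi} \;\; \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \quad
\xi_i \ge 0, \quad i = 1, \dots, n
```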
Here ξi ϵ ℝ are slack variables and C ϵ ℝ+ is a penalty parameter for the slack variables. The Lagrangian in its primal form for the above problem can be written as
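For the standard soft-margin problem (assumed here), the primal Lagrangian has the form below. The multipliers α_i and μ_i are names introduced for this sketch, and w, b, and y_i denote the hyperplane weight vector, the bias, and the ±1 class labels:

```latex
L_P = \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
      - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( w^{\top} \phi(x_i) + b \right) - 1 + \xi_i \right]
      - \sum_{i=1}^{n} \mu_i \xi_i,
\qquad \alpha_i \ge 0, \quad \mu_i \ge 0
```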
Accordingly, the dual formulation becomes
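Setting the derivatives of the primal Lagrangian with respect to w, b, and ξ_i to zero yields the standard dual problem. This is a sketch of the textbook result, with α_i the Lagrange multipliers, y_i the ±1 class labels, and K(x_i, x_j) = ϕ(x_i)ᵀϕ(x_j):

```latex
\max_{\alpha} \;\; L_D = \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0
```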
LD denotes the dual form of the Lagrangian and K(xi, xj) is the kernel function. A nonlinear radial basis function (RBF) is used as the kernel and is defined as
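In its usual parameterization (assumed here), with kernel width parameter γ, the RBF kernel is:

```latex
K(x_i, x_j) = \exp\!\left( -\gamma \, \lVert x_i - x_j \rVert^{2} \right), \qquad \gamma > 0
```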
In addition, the accuracy of classification depends on the selection of γ, ξ and C.
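One common way to choose γ and C is a cross-validated grid search. The sketch below uses scikit-learn with a one-vs-rest RBF SVM and 5-fold cross-validation on synthetic four-class data. The data, the feature scaling step, and the grid values are assumptions for illustration and are not the settings used in the experiments.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the optimized feature subset (not IBTrACS data).
X, y = make_classification(
    n_samples=240,
    n_features=8,
    n_informative=5,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# One binary RBF SVM per track class (class m positive, all others negative).
model = make_pipeline(
    StandardScaler(),
    OneVsRestClassifier(SVC(kernel="rbf")),
)

# Hypothetical search grid for the penalty C and the kernel width gamma.
param_grid = {
    "onevsrestclassifier__estimator__C": [0.1, 1, 10, 100],
    "onevsrestclassifier__estimator__gamma": [0.001, 0.01, 0.1, 1],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(model, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Stratified folds keep the class proportions similar across folds, which matters when one class (here, sinuous tracks in the real data) is much rarer than the others.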
The experiments were performed on a 64-bit Windows server with an Intel Core i7 4.2 GHz processor and 8 GB of RAM. The International Best Track Archive for Climate Stewardship (IBTrACS) best track data has been used in this study. The proposed ontology-based tropical cyclone track classification framework is applied to 236 TC tracks over the period 1980–2019. We ran Random Forest (RF), KNN, SVM, and the proposed methodology (Ont + SVM) using Python's open-source library scikit-learn. The metrics used to evaluate the performance of the proposed work are accuracy, precision, and recall. These are discussed through
Predicted class ↓ / Actual class → | ST | QST | CUR | SIN | Total |
---|---|---|---|---|---|
Straight (ST) | 39 | 1 | 0 | 0 | 40 |
Quasi Straight (QST) | 0 | 70 | 1 | 0 | 71 |
Curving (CUR) | 0 | 2 | 111 | 0 | 113 |
Sinuous (SIN) | 0 | 0 | 0 | 12 | 12 |
Total | 39 | 73 | 112 | 12 | 236 |
Analysis metrics | Experimental results (%) |
---|---|
Precision | 94.5 |
Recall | 99.4 |
Accuracy | 98.3 |
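The accuracy figure can be recomputed directly from the confusion matrix reported above. The sketch below does so and also derives per-class diagonal shares by row and by column. Which of these corresponds to precision and which to recall depends on the matrix orientation, and the averaging scheme behind the aggregate precision and recall figures is not stated, so the per-class values need not match those aggregates.

```python
LABELS = ["ST", "QST", "CUR", "SIN"]

# Counts copied from the confusion matrix (totals row and column omitted).
MATRIX = [
    [39, 1, 0, 0],
    [0, 70, 1, 0],
    [0, 2, 111, 0],
    [0, 0, 0, 12],
]


def accuracy(matrix):
    """Share of all tracks that lie on the diagonal (orientation-independent)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def diagonal_share_by_row(matrix):
    """Diagonal count divided by the row total, per class."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


def diagonal_share_by_column(matrix):
    """Diagonal count divided by the column total, per class."""
    size = len(matrix)
    return [matrix[i][i] / sum(row[i] for row in matrix) for i in range(size)]


print(f"accuracy = {100 * accuracy(MATRIX):.1f}%")
for label, by_row, by_col in zip(
    LABELS, diagonal_share_by_row(MATRIX), diagonal_share_by_column(MATRIX)
):
    print(f"{label}: row share = {by_row:.3f}, column share = {by_col:.3f}")
```

If rows hold the predicted classes, the row share is the per-class precision and the column share is the per-class recall; with the opposite orientation the two roles swap.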
Analysis metrics | RF | KNN | Linear SVM | Ont + SVM |
---|---|---|---|---|
Precision | 85.4 | 86.8 | 90.7 | 94.5 |
Recall | 90.8 | 92.3 | 96.1 | 99.4 |
Accuracy | 89.4 | 90.7 | 94.6 | 98.3 |
An important observation regarding TC track classification is that a model with high recall is suitable for track classification. The comparison of precision and recall is given in
The observed tracks are classified into categories based on their sinuosity values. The sinuosity index is the ratio of the actual distance traveled by the cyclone to the straight-line distance between the starting point (cyclogenesis) and the endpoint (landfall).
Sinuosity | Track category name |
---|---|
1–1.06 | Straight |
1.06–1.17 | Quasi-Straight |
1.17–1.47 | Curving |
>= 1.48 | Sinuous |
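The sinuosity index and the category lookup can be sketched as follows. The great-circle (haversine) distance, the Earth radius value, and the treatment of the category boundaries as half-open intervals (with values between 1.47 and 1.48 assigned to Curving) are assumptions of this sketch. The sample coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sinuosity_index(track):
    """Path length along the track divided by the start-to-end distance."""
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    path = sum(haversine_km(a, b) for a, b in zip(track, track[1:]))
    chord = haversine_km(track[0], track[-1])
    if chord == 0:
        raise ValueError("start and end positions coincide")
    return path / chord


def track_category(si):
    """Map a sinuosity value to a track category (boundaries assumed half-open)."""
    if si < 1.06:
        return "Straight"
    if si < 1.17:
        return "Quasi-Straight"
    if si < 1.48:
        return "Curving"
    return "Sinuous"


# Hypothetical track positions as (latitude, longitude) pairs.
straight_track = [(0.0, 80.0), (0.0, 82.0), (0.0, 84.0)]
looping_track = [(10.0, 85.0), (13.0, 85.0), (10.0, 85.5)]

for name, track in [("straight", straight_track), ("looping", looping_track)]:
    si = sinuosity_index(track)
    print(name, round(si, 2), track_category(si))
```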
The ontology provides information on cyclone attribute vocabularies and conveys how they rely on each other. This makes it possible to analyze and interpret the semantic contexts by capturing the state and event elements of the semantic representation. In addition, a set of SWRL rules is inserted into the ontology for reasoning about the classes and individuals. In this paper we proposed the correlation-based C2WC algorithm for selecting the optimal feature subset. The resulting optimized feature subset significantly improves the performance of the SVM classifier. The experimental results show that the proposed hybrid ontology-based SVM classification model achieves higher performance in terms of accuracy, precision, and recall. In future work, we will continue towards cyclone landfall prediction by incorporating a wider range of attributes.