The rapid growth of coastal observation sensors produces copious data that carry weather information. The main difficulties with sensor-generated big data are heterogeneity and interpretation, which call for high-end Information Retrieval (IR) systems. The Semantic Web (SW) can address these difficulties by integrating data into a single platform for information exchange and knowledge retrieval. This paper exploits an SW-based system to provide interoperability through ontologies by aligning data concepts with ontology classes. It presents a four-phase weather data model: data processing, ontology creation, SW processing, and query engine. The developed Oceanographic Weather Ontology helps to enhance data analysis, discovery, IR, and decision making. In addition, the paper evaluates the developed ontology with other
Several marine disasters happen every year, mainly due to weather phenomena; hence, weather prediction and analysis over ocean areas are essential. Ocean observation sensors monitor the coastal areas and record the values of weather parameters. The resulting information is voluminous and heterogeneous, which makes precise Information Retrieval (IR) a challenging task [
Ontologies help to describe the knowledge of a domain and let machines understand the user requirement. A Domain knowledge Ontology (DO) represents the semantic relationships among the data. It is constructed manually, which involves a significant workforce and manual intervention to combine expert knowledge and domain knowledge [
Similarly, Web Ontology (WebOnto), NeOn, Semantic Web Ontology Overview and Perusal (SWOOP), Web Ontological Design Environment (WebODE), OilEd, Protégé, and Onto Editor are some of the platforms available to create an ontology. Among these, Protégé is widely regarded as the most capable because of its range of functionalities [
The paper is structured as follows: Section 2 presents the background of the research area, Section 3 reviews the literature, and Section 4 describes the proposed four-phase weather data model for ocean IR, which builds the Oceanographic Weather Ontology (OWO), together with the expressions for the quality measures. The results and discussion, along with the quality analysis, are presented in Section 5. Finally, Section 6 concludes the paper and outlines the future scope of the research.
Information Technology (IT) has made extraordinary progress over the years, but it has not yet been widely applied to environmental management. The size, intricacy, and variety of the data create a significant problem for data utilization. Creating an integrated interface over a given set of existing heterogeneous databases is a challenge many database administrators face today. In the early stages, developers used the Extensible Markup Language (XML) data model to experiment with the datasets. Since XML could not address semantic heterogeneity, developers turned to other computer-based approaches. The proposed work focuses on ontology, an evolving research area that offers ample scope for addressing semantic heterogeneity. The study area is India’s south-eastern coastal region, on which the proposed method has been tested. Approximately 1200 weather stations have been established across India, regularly generating copious data. The generated datasets are geographically referenced in various geospatial file formats, which calls for new IR approaches. Moreover, the data format, naming convention, and units differ from source to source. An ontology addresses semantic interoperability among the weather systems so that the information can be used effectively.
The weather phenomena most frequently reported along India’s coastal areas are cyclones, storms, water spouts, heavy monsoon rains, and marine heatwaves. According to recent research, the Indian Ocean is the warmest of the five oceans and generates 7% of the world’s cyclones [
Water spouts were reported over the Nazare dam reservoir in Pune in 2018 and at Yanam on the eastern coast of India in 2020 [
Government and private sectors are increasingly committed to transparency of information regarding satellite data [
The research works [
Owing to the increasing popularity of the SW, researchers measure the quality of various aspects such as linked data, ontology, inference engine, data backing, and user interface. Even though ontologies are designed for a particular domain, determining their quality is a challenging task. Researchers have developed software for ontology matching, but these tools fall short in efficiency and accuracy. Quality measurement frameworks such as Semantic Web Application Quality Evaluation (SWAQ), which uses fuzzy logic to evaluate the quality attributes of SW applications, are proposed in [
This section gives a detailed explanation of the proposed SW-based approach for ocean Weather Phenomenon Prediction (WPP). It consists of four phases: input data processing, ontology creation, SW processing, and semantic query engine, as shown in
Sensor data, a major data resource, are used by many researchers and domain experts for weather-related research. They pose two major issues, heterogeneous files and heterogeneous vocabularies, which make the data difficult to exchange, share, and reuse. Data from different observation sensors come in various formats such as Comma Separated Values (CSV, *.csv), Total’s file (TUV, *.tuv), Excel (*.xls), Network Common Data Form (NetCDF, *.nc), and Hierarchical Data Format 5 (HDF5, *.h5). In the first phase, these data are converted to a standard machine-understandable format, namely RDF, to facilitate efficient semantic retrieval. Heterogeneous-Geospatial Climatic Data to RDF (Hetero-GCD2RDF) assists the transformation of simple vocabularies into an SW format.
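The first phase can be illustrated with a minimal sketch that turns CSV sensor rows into RDF triples in N-Triples syntax. It is not the Hetero-GCD2RDF implementation: the namespace `http://example.org/owo#`, the column names (`wspd`, `rh`, `sst`), and the function `csv_to_ntriples` are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical namespace; the paper does not state the actual OWO IRIs.
BASE = "http://example.org/owo#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Assumed mapping from source-specific column names to a shared vocabulary.
COLUMN_TO_PROPERTY = {
    "wspd": "Wind_Speed",
    "wind_speed_ms": "Wind_Speed",
    "rh": "Relative_Humidity",
    "sst": "Sea_Surface_Temperature",
}


def csv_to_ntriples(text, id_column="station"):
    """Convert CSV text into a list of N-Triples lines.

    Columns that are not listed in COLUMN_TO_PROPERTY, and empty cells,
    are skipped.
    """
    triples = []
    rows = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(rows, start=1):
        subject = f"<{BASE}obs/{row[id_column]}/{row_number}>"
        for column, value in row.items():
            prop = COLUMN_TO_PROPERTY.get(column)
            if prop is None or not value:
                continue
            triples.append(
                f'{subject} <{BASE}{prop}> "{float(value)}"^^<{XSD_DECIMAL}> .'
            )
    return triples


sample = "station,wspd,rh\nBUOY01,7.5,64\nBUOY02,,58\n"
for line in csv_to_ntriples(sample):
    print(line)
```

Because two differently named columns (`wspd`, `wind_speed_ms`) map to the same property, files from different sources end up sharing one vocabulary, which is the point of the harmonization step.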
Similarly, the Integrated Ocean Observing System’s (IOOS) standard vocabulary represents the heterogeneous parameters as RDF. IOOS provides a common RDF vocabulary created for interoperability between data catalogues published on the web. The IOOS standard vocabularies can improve data retrieval from various data sources by providing a user-friendly environment with a uniform query mechanism.
The second phase provides a conceptualization of WPP through an ontology named OWO. The proposed ontology follows the concept of a DO for the oceanographic weather domain and defines how the attributes are related to different weather conditions. Existing satellite data retrieval systems retrieve information effectively based on geolocation, date and time, sensor, satellite, weather attribute, and so on. However, retrieving additional domain-specific, concept-based data from the copious information generated by satellites is challenging. For instance, the values of weather parameters such as wind speed, precipitation, or temperature can be queried through an API, but retrieving knowledge-based data such as fresh gale, no wind, or fresh breeze is a complex task. Hence, the proposed IR approach should support semantic concepts, or knowledge-based satellite data retrieval, to provide an efficient retrieval system and fulfill the user requirements. An ontology-based web service supports efficient knowledge representation and data access by enhancing existing satellite data retrieval methods. Once the ontology is created for a particular application domain, updating and maintaining it become the developer’s responsibility in order to preserve the accuracy of IR. The proposed OWO ontology consists of 37 concepts, 112 instances, 85 relations, and 126 attributes.
Top-level concepts | Respective sub-concepts |
---|---|
GEO_LOCATION | Latitude, longitude |
DATE | Instances, date, year |
TIME | Interval, hours, minutes, seconds |
WEATHER_ATTRIBUTES | Fog, Aerosol_Optical_Thickness, Cloud_Cover, Precipitation_Rate, Relative_Humidity, Wind_Speed, Wind_Direction, Wind_Gust, Barometric_Pressure, Atmospheric_Pressure, Sea_Surface_Temperature, Air_Temperature, Conductivity, Solar_Irradiance, UV_Index |
WEATHER_CONDITION | Good, bad, severe |
WEATHER_PHENOMENON | Cyclone, storm, water spout, marine heat waves |
Each sub-concept is categorized further into several instances (individuals). For example, the individuals of the concept Wind_Speed are NoWind, LightAir, LightBreeze, GentleBreeze, ModerateBreeze, FreshBreeze, StrongBreeze, ModerateGale, NearGale, FreshGale, StrongGale, WholeGale, Storm, ViolentStorm, and Hurricane. Each instance holds a range of data values of different data types through a data property. The value of any attribute of a concept is linked to its data value through an object property. Various weather sites and analysis reports provide details about the classification of ontology instances and their values. The weather concepts are related to each other by 85 different relational links, such as is-a, has-a, has long, has an attribute, and condition. The Protégé tool is used to develop the proposed OWO ontology from these specifications. In addition, the OWO ontology includes four ocean weather phenomena: storm, cyclone, water spout, and marine heatwaves. The weather factors related to these phenomena, along with their values, are included in the ontology.
Weather phenomenon | Weather attributes |
---|---|
Cyclone | Barometric_Pressure, Relative_Humidity, Wind_Speed, Precipitation_Rate, Sea_Surface_Temperature |
Storm | Barometric_Pressure, Precipitation_Rate, Wind_Speed, Cloud_Cover |
Water Spout | Relative_Humidity, Wind_Speed, Air_Temperature, Water_Temperature |
Marine Heat Waves | Relative_Humidity, Sea_Surface_Temperature, Water_Temperature_100 |
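The phenomenon-to-attribute table above can be read as a small lookup structure. The following sketch, which is illustrative and not part of the OWO implementation, encodes the table as a dictionary and answers two questions: which phenomena depend on a given attribute, and which phenomena can be assessed from a given set of available attributes. The function names and the underscore-joined phenomenon keys are assumptions.

```python
# Table contents copied from the paper; keys use underscores for convenience.
PHENOMENON_ATTRIBUTES = {
    "Cyclone": {
        "Barometric_Pressure", "Relative_Humidity", "Wind_Speed",
        "Precipitation_Rate", "Sea_Surface_Temperature",
    },
    "Storm": {
        "Barometric_Pressure", "Precipitation_Rate", "Wind_Speed", "Cloud_Cover",
    },
    "Water_Spout": {
        "Relative_Humidity", "Wind_Speed", "Air_Temperature", "Water_Temperature",
    },
    "Marine_Heat_Waves": {
        "Relative_Humidity", "Sea_Surface_Temperature", "Water_Temperature_100",
    },
}


def phenomena_using(attribute):
    """Return the phenomena (sorted by name) that list the given attribute."""
    return sorted(
        name for name, attrs in PHENOMENON_ATTRIBUTES.items() if attribute in attrs
    )


def assessable_phenomena(available):
    """Return the phenomena whose listed attributes are all available."""
    available = set(available)
    return sorted(
        name for name, attrs in PHENOMENON_ATTRIBUTES.items() if attrs <= available
    )


print(phenomena_using("Wind_Speed"))
print(assessable_phenomena(
    ["Barometric_Pressure", "Precipitation_Rate", "Wind_Speed", "Cloud_Cover"]
))
```

In the ontology itself these links would be expressed as relations between concepts rather than as a dictionary; the sketch only shows the shape of the information.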
Ontologies developed and used in online systems are large; hence, a database is needed for storage, efficiency, and optimal utilization [
Mappings are assertions that relate the data in the RDBMS to the vocabulary of the ontology. Hence, the mapped ontology makes it easier for the user to retrieve data through the query engine. In the third phase, the semantic layer interacts directly with the DO through the Jena API. Jena’s model factory can create an inference graph by connecting datasets with a reasoner that supports a general-purpose rule engine. The main aim of the reasoner is to answer queries by transforming them into queries over the source. This platform uses the mappings to let users access data from multiple sources through a single interface. W3C provides a standard called RDB to RDF Mapping Language (R2RML) for specifying mappings in Ontology-Based Data Access (OBDA). The IOOS standard vocabularies are incorporated in both the RDF vocabulary and the ontology concepts. This makes it easier for the query engine to map the data and return both data values and knowledge.
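The idea of a mapping between relational data and ontology vocabulary can be sketched as follows. This is a simplified stand-in, not R2RML syntax and not the Jena API: an in-memory SQLite database plays the role of the relational store, and the table name, column names, `MAPPING` structure, and namespace are all hypothetical. A real OBDA system usually rewrites queries over the source instead of materializing triples as done here.

```python
import sqlite3

BASE = "http://example.org/owo#"  # hypothetical namespace

# Hand-written stand-in for a mapping: one table, a subject template,
# a class for every row, and one ontology property per column.
MAPPING = {
    "table": "observation",
    "subject_template": BASE + "obs/{id}",
    "class": BASE + "WEATHER_ATTRIBUTES",
    "columns": {
        "wind_speed": BASE + "Wind_Speed",
        "humidity": BASE + "Relative_Humidity",
    },
}


def rows_to_triples(connection, mapping):
    """Materialize (subject, predicate, object) triples from one mapped table."""
    columns = ["id"] + list(mapping["columns"])
    query = f"SELECT {', '.join(columns)} FROM {mapping['table']} ORDER BY id"
    triples = []
    for row in connection.execute(query):
        record = dict(zip(columns, row))
        subject = mapping["subject_template"].format(id=record["id"])
        triples.append((subject, "rdf:type", mapping["class"]))
        for column, prop in mapping["columns"].items():
            if record[column] is not None:  # NULL cells produce no triple
                triples.append((subject, prop, record[column]))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation (id INTEGER PRIMARY KEY, wind_speed REAL, humidity REAL)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?)",
    [(1, 7.5, 64.0), (2, 12.1, None)],
)
triples = rows_to_triples(conn, MAPPING)
for triple in triples:
    print(triple)
```

Keeping the mapping as data, separate from the code that applies it, mirrors the way a declarative mapping language lets the same engine serve several sources.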
The final phase provides IR through queries for any weather-related application. An OBDA query is typically declarative: it describes what the user wants rather than instructing the system how to compute the answer. OBDA makes the query independent of the data source and gives uniform access to heterogeneous sources. Numerous query languages have been designed for RDF databases, namely SPARQL Protocol and RDF Query Language (SPARQL), RDF Data Query Language (RDQL), RDF Query Language (RQL), Versa, and Sesame RDF Query Language (SeRQL). Still, SPARQL is widely preferred [
A SPARQL query works by matching its triple patterns against the RDF triples and retrieving the matching information. Users can access more information by querying an integrated database built and saved in the ontology. Queries can be made user-friendly by creating a web page with JavaServer Pages (JSP) that retrieves the information dynamically. JSP helps to create a dynamic website that is easier to maintain than one built from plain servlets. It embeds Java code inside Hypertext Markup Language (HTML) using JSP tags. With JSP, presentation and business logic are easily separated: a web designer can design and update the JSP pages that form the presentation layer, while Java developers write complex server-side computational code without being concerned with the web design. Both layers interact over HyperText Transfer Protocol (HTTP) requests. Even though various environments are available for developing an ontology, evaluating an ontology’s quality is still challenging. Hence, some performance metrics are discussed in Section 4.2 to assess the quality of the developed OWO ontology.
The quality of an ontology is evaluated against a reference ontology called the golden standard (GS) ontology. Let the developed weather ontology be
The size of classes
Let us define the GS ontology as
where,
Semantic coverage metrics
An ontology is said to be semantically compatible only if its contents are consistent with the GS ontology
An element of an ontology is said to be redundant if it can be derived from other elements. For instance, if a concept is defined in the ontology but can also be derived from other concepts, that concept is redundant. Expressions in
where
The graph of any ontology is denoted by
The length of a path
Set of reachable nodes
Relation
The proposed data model was evaluated using ocean data from the Indian meteorological satellite INSAT-3D, collected along the south-eastern coastal areas of India. Observation sensors such as Argo floats (NetCDF), buoys (CSV), coastal radars (TUV), gliders (NetCDF), sondes (Excel/CSV), and others monitor the field area. The work used a 64-bit Intel Core i5 processor with 4 GB of RAM and a 2 TB hard disk, running Linux (Ubuntu 16.04). The Java (JDK 1.8.0_181) code was developed on Eclipse 5.0 with Apache Jena, using Apache Tomcat 9.0.14 with JSP as the server and Internet Explorer or any other web browser as the client.
Sensor-observed data arrive in heterogeneous formats. Before further processing, the data are harmonized by converting them into an SW-compatible RDF file format. The input data processing phase is essential to produce a machine-readable format that lets a computer search the data and understand how the terms of a particular domain are related to each other. Once the datasets are represented as RDF, many tools are available for visualizing and working with them.
In the second phase, the OWO weather ontology is developed using the Protégé 5.1 tool by applying the specifications given in Section 4.1. It serves as a knowledge base for ocean WPP and includes the hierarchy of weather conditions, the attributes related to weather conditions, and their relationships. The elements of WPP consist of concepts, sub-concepts, attributes, instances, and the relations between them. For example, the individuals of the sub-concept Relative_Humidity are Dry_Humidity, Optimum_Humidity, and Moist_Humidity. They are related to the top-level concept Weather_Attribute through the link named “Has_humidity” and hold the data value ranges <35, 35 to 60, and >60, respectively, of data type “xsd:decimal” measured in “%.” The developed OWO ontology incorporates the respective values for its instances, and the same information is stored in the H2 relational database as tables that describe their relationships.
In the third phase, the Ontop mapping tool helps map the input data to the proposed ontology using the domain expert’s knowledge. Ontology mapping is a critical phase of the knowledge-building process. The current work creates instances from the sensor data and maps them to ontology concepts by aligning the ontology vocabulary with the sensor data vocabulary. RDF represents the mapped instance data in a homogeneous, machine-understandable format, which supports inference and enables semantic IR. The proposed OWO ontology includes the concepts of weather phenomenon illustrated in
After completing the ontology mapping, the SPARQL query is applied to the RDF graph to extract the information in the fourth phase of the proposed data model.
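As a rough illustration of how knowledge-based retrieval differs from raw value lookup, the sketch below first classifies humidity readings into the Relative_Humidity individuals using the ranges stated earlier (<35, 35 to 60, >60), and then answers a single triple pattern in the spirit of a SPARQL basic graph pattern. It is a toy matcher, not SPARQL and not Jena: the station names, the readings, and the direct station-to-individual use of `Has_humidity` are simplifying assumptions.

```python
def humidity_individual(percent):
    """Map a relative humidity value (%) to the individual named in the paper."""
    if percent < 35:
        return "Dry_Humidity"
    if percent <= 60:
        return "Optimum_Humidity"
    return "Moist_Humidity"


def match(pattern, triples):
    """Return one binding dict per triple that matches the pattern.

    Pattern terms starting with '?' are variables; other terms must be
    equal to the corresponding triple term.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Hypothetical readings; real values would come from the mapped sensor data.
readings = {"BUOY01": 28.0, "BUOY02": 47.5, "BUOY03": 81.0}
triples = [
    (station, "Has_humidity", humidity_individual(value))
    for station, value in readings.items()
]

# Roughly: SELECT ?station WHERE { ?station :Has_humidity :Moist_Humidity }
print(match(("?station", "Has_humidity", "Moist_Humidity"), triples))
```

A full SPARQL engine additionally joins several patterns, applies filters, and can draw on inferred triples; the sketch covers only the matching of one pattern.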
An experimental study compares the quality of the proposed ontology with that of other state-of-the-art ontologies. Some existing weather ontologies O1 [
Ontology group | Ontology | Classes | Instances | Attributes | Relations |
---|---|---|---|---|---|
Golden standard ontology | GS | 15 | 11 | 18 | 27 |
Weather domain ontologies | O1 | 13 | 72 | 31 | 11 |
Weather domain ontologies | O2 | 24 | 78 | 37 | 89 |
Weather domain ontologies | O3 | 19 | 11 | 52 | 37 |
Weather domain ontologies | O4 | 35 | 49 | 114 | 84 |
Weather domain ontologies | PO | 37 | 112 | 126 | 85 |
Defining classes | O1 (WO1) | O2 (WO2) | O3 (WO3) | O4 (WO4) | PO (WO5) |
---|---|---|---|---|---|
Defined both in GS and WOn | 7 | 9 | 6 | 11 | 12 |
Defined in GS and derivable from WOn | 1 | 1 | 0 | 0 | 1 |
Defined in GS but not in WOn | 7 | 6 | 9 | 4 | 2 |
Defined in WOn but not in GS | 4 | 8 | 13 | 24 | 24 |
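To make the class-overlap table easier to interpret, the sketch below computes a simple coverage ratio: the share of the 15 GS classes that each ontology either defines or allows to be derived. This ratio is an illustrative assumption and is not necessarily the completeness metric defined in this paper; the counts are copied from the table, and the dictionary keys use the WO1 to WO5 labels from its header.

```python
GS_CLASSES = 15  # number of classes in the GS ontology, from the size table

# Per ontology: (defined in both, defined in GS and derivable,
#                defined in GS only, defined in WOn only)
CLASS_OVERLAP = {
    "WO1": (7, 1, 7, 4),
    "WO2": (9, 1, 6, 8),
    "WO3": (6, 0, 9, 13),
    "WO4": (11, 0, 4, 24),
    "WO5": (12, 1, 2, 24),  # proposed ontology (PO)
}


def class_coverage(name, count_derivable=True):
    """Fraction of GS classes covered by the named ontology.

    Derivable classes are counted as covered unless count_derivable is False.
    """
    both, derivable, _gs_only, _own_only = CLASS_OVERLAP[name]
    covered = both + (derivable if count_derivable else 0)
    return covered / GS_CLASSES


for name in CLASS_OVERLAP:
    print(name, round(class_coverage(name), 3))
```

Under this assumed ratio, a higher value means more of the GS vocabulary is represented; it says nothing about the classes an ontology defines beyond the GS.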
The study evaluates the performance metrics of ontologies expressed in Section 4.2 on the chosen ontologies, metrics
Quality factors | Performance metrics involved |
---|---|
Completeness | |
Correctness | |
Conciseness | |
Structural complexity | |
The evaluated results of the performance metrics of the proposed ontology are illustrated further in this paper. The completeness of the proposed ontology against the GS ontology is evaluated by considering either vocabulary coverage or semantic coverage. The experimental results show that the PO is more closely aligned with the GS ontology than the existing ontologies. It can be noted from
The compatibility factor generally indicates the correctness of an ontology with respect to the GS ontology. However, weather ontologies include many elements, some of which do not carry domain knowledge but are specific to a particular web service. Hence, not all the elements are equivalent to those of the GS ontology.
The overall comparison of quality factors, including the corresponding metrics defined in
This paper presents a weather data model based on an OWL ontology that integrates satellite data through the semantic web for ocean applications. The main objective of this data model is to predict weather phenomena through ontology models by presenting information about the weather conditions influenced by weather attributes. The proposed data model reduces effort, since translating information from heterogeneous sources otherwise causes scalability issues in processing applications. The proposed ontology's performance metrics improve by 26.23% and 7.14% in completeness and correctness, respectively. Likewise, they decrease by 4.3% and 45.32% in conciseness and structural complexity, respectively, relative to the O4 ontology. The experimental outcomes show that the proposed technique outperforms the existing approaches in terms of scalability and IR. The ontological approach supports various scientific domains, research data sharing, semantic query execution, and efficient visualization. Future work is to develop a web application that derives weather predictions from user-defined queries and to develop a more comprehensive ontology to infer any kind of weather forecast. A further goal is to implement SPARQL queries that perform arithmetic and logical calculations according to the user requirement. The proposed ocean ontology can be extended with additional concepts and supporting attributes for more efficient and accurate knowledge retrieval in the ocean weather domain.
The authors would like to express their sincere thanks to MoES, Govt of India for their financial support, and Adhiyamaan College of Engineering for their moral support for completing the project successfully.