An ontology is a specification of the concepts of an information domain for a group of users. Applying ontologies to information retrieval (IR) helps improve search results for the relevant information that a user requires. Conventional keyword-matching IR relies on advanced algorithms for retrieving facts from the Internet, mapping the connections between keywords and information, and categorizing the retrieval results. Existing IR procedures consume considerable time and do not retrieve information efficiently. In this study, a modified neuro-fuzzy algorithm (MNFA) is applied to reduce the IR time and improve the retrieval accuracy, thereby overcoming these drawbacks. The proposed method comprises three phases: i) development of a crop ontology, ii) implementation of the IR system, and iii) processing of the user query. In the first phase, a crop ontology is developed and evaluated by gathering crop information. In the second phase, a hash tree is constructed using closed frequent patterns (CFPs), and MNFA is used to train the database. In the last phase, the CFPs of a given user query are calculated, and similarity assessment results are retrieved from the database. The performance of the proposed system is measured and compared with that of existing techniques. Experimental results demonstrate that the proposed MNFA has an accuracy of 92.77% for simple queries and 91.45% for complex queries.
The volume of information on the Web is growing rapidly with the continuing advancement of information technology. Although information retrieval (IR) has been actively researched for over 30 years, it became ubiquitous only with the advent of the World Wide Web. Retrieving information efficiently and precisely has become highly important [
Farmers require information on seasonal weather, seeds, fertilizers, and the best cultivars, as well as knowledge of pests and diseases, control techniques, harvesting and postharvesting techniques, accurate market prices, and current supply and demand, to make informed decisions at various stages of the agricultural cycle [
The foundational agricultural crop ontology comprises information that is general to every sort of crop. This information comprises production practices, postproduction practices, environmental data, varieties, cropping systems, botanical description, and origin [
Here, an ontology-based IR system specifically for the agriculture domain is proposed. The objectives are to query a database through an IR system, to ensure accurate and reliable results, and to present an ontology-based, rapid, and effective IR system. Ontologies are modeling techniques capable of formally representing domain knowledge. They play a significant role in annotating and organizing large amounts of experimental, clinical, and real-world data, and their everyday usage is well recognized in the scientific community [
The rest of this paper is organized as follows. Section 2 reviews the work related to the proposed technique. Section 3 briefly describes the proposed work. Section 4 discusses the experimental results. Lastly, Section 5 concludes the paper.
Ming et al. [
Rajendran et al. [
Sayed et al. [
Selvalakshmi et al. [
To achieve rapid and effective IR, a modified neuro-fuzzy-based crop ontology system is proposed. The proposed work comprises three steps: i) development of a crop ontology, ii) implementation of the IR system, and iii) processing of the user query. The first step is dataset creation, in which relevant information about farmers and their crops is collected from various sources. Next, these data are preprocessed by restructuring and rearranging them into a more understandable form and are stored in the Apache Jena Fuseki database. Then, a crop ontology is built through knowledge acquisition, OWL file generation, visualization, and ontology evaluation. The derived files are also saved in the Apache Jena Fuseki database. Subsequently, the IR system is implemented by performing the following operations on the dataset: establishing closed frequent patterns (CFPs), generating hash codes, and applying a modified neuro-fuzzy algorithm (MNFA) for IR. First, CFPs are established for the data values. Then, hash values are created for all CFPs by using the Secure Hash Algorithm (SHA) 512. Lastly, a hash tree is generated from the hash values of the CFPs. For the information retrieval step, MNFA is applied, in which
This phase comprises three main steps: knowledge acquisition, ontology development, and ontology evaluation. First, information about each crop, its diseases, and the remedies for those diseases is collected from the website. This collection of crop data is known as knowledge acquisition. Once knowledge acquisition is complete, the crop ontology is developed by creating an OWL file, which contains a large body of facts about the collected crops. The ontology-based IR system retrieves data on the basis of the semantic similarity between the indexed data and the user query. Consequently, only the pertinent data are retrieved, and the retrieval time is reduced. After the OWL file is created, it is visualized. For file creation and visualization, the proposed system uses Protégé integrated with the Eclipse IDE platform. Protégé is a free, open-source ontology editor and knowledge management system. It provides a graphical user interface for defining ontologies. Like Eclipse, Protégé is a framework for which various other projects provide plug-ins. The application is written in Java and relies heavily on Swing to create the user interface. The crop ontology generated in the proposed system with the Protégé tool is shown in
Then, the ontology is evaluated by extracting the values from the OWL file created in the previous stage and applying the reasoner in the Protégé tool. Reasoning is the task of deriving implicit facts from a set of given explicit facts. The derived details are stored in the Apache Jena Fuseki dataset.
The initial phase of the proposed technique is data collection. Here, details of crops and farmers are collected from various sources and stored as a dataset. The dataset comprises two kinds of information: the details of farmers and the details of crops. The farmers’ details comprise the farmer’s name, address, contact number, and other information; the crops’ details comprise the locality of the paddy field, the yield, the beginning and ending period, the paddy type, and the length of paddy growth.
Preprocessing in the proposed method involves reorganizing and rearranging the farmers’ and crops’ details. During this phase, after the farmers’ data are analyzed, the following data are removed from the dataset: ambiguous information, incomplete data, irrelevant data, and noisy data.
The final filtered dataset is kept in the Apache Jena Fuseki dataset.
The implementation of the IR system consists of three processes: creating CFPs, generating hash codes, and applying MNFA. First, CFPs are established from the derived values (crops’ and farmers’ details). Next, the hash code for each CFP is created by using the SHA 512 algorithm, and a hash tree is built from the hash values of all CFPs. The hash value serves primarily as the index in the hash tree: every leaf node holds the indexed CFPs together with the matching hash value. Lastly, MNFA is used for effective IR. Each process is described in the following sections.
To detect the CFPs in the dataset, frequent patterns (FPs) are established. The patterns indicate the number
where
Afterward, the FPs of the dataset are computed. The frequency of a pattern is the total number of occurrences of that particular pattern in the dataset. It is also known as the pattern count or the pattern frequency.
where
where
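The CFP discovery step can be illustrated with a minimal sketch. It assumes the usual definition in which a frequent pattern is closed when no strict superset has the same count. The function names `frequent_patterns` and `closed_frequent_patterns`, the threshold `min_count`, and the toy records are hypothetical and are not taken from the proposed system; the exhaustive enumeration is only practical for very small records.

```python
from itertools import combinations


def frequent_patterns(transactions, min_count):
    """Count every itemset and keep those seen in at least `min_count` records."""
    counts = {}
    for items in transactions:
        unique = sorted(set(items))  # canonical order so each itemset has one key
        for size in range(1, len(unique) + 1):
            for combo in combinations(unique, size):
                counts[combo] = counts.get(combo, 0) + 1
    return {frozenset(p): c for p, c in counts.items() if c >= min_count}


def closed_frequent_patterns(frequent):
    """Keep a frequent pattern only if no strict superset has the same count."""
    return {
        p: c
        for p, c in frequent.items()
        if not any(p < q and c == cq for q, cq in frequent.items())
    }


# Hypothetical toy records (assumption: each record is a set of crop attributes).
records = [
    {"paddy", "blast", "kharif"},
    {"paddy", "blast"},
    {"paddy", "kharif"},
    {"wheat", "rust"},
]
fps = frequent_patterns(records, min_count=2)
cfps = closed_frequent_patterns(fps)
```

In this toy example, the single items `blast` and `kharif` are frequent but not closed, because each always occurs together with `paddy` at the same count.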
After CFP discovery, the hash value for every CFP is created by using the SHA 512 algorithm. SHA 512 is a hash algorithm based on a one-way hash function. It is a successor to earlier hash algorithms such as SHA 0 and SHA 1 and belongs to the same family as SHA 256 and SHA 384. The SHA 512 hash function accepts input data of any size and produces a 512-bit message digest, processing the input in 1024-bit blocks. First, the message bits are padded with extra bits so that the total length is a multiple of 1024 bits. Next, the padded message is split into blocks of 1024 bits. The first block is combined with the initialization vector to produce a hash code. Each subsequent block is combined with the previously generated hash code. Afterward, a hash tree is initialized, and each hash code is passed down to a leaf node of the tree. The leaf node is checked to determine whether it is full. If it is not, then the hash value is inserted into the hash tree. Every hash value is inserted into the hash tree in this way. Lastly, the hash tree is built from the hash values linked with the CFPs. The hash values serve as the index in the hash tree, and each leaf node holds the CFPs indexed by the associated hash value.
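The hashing and indexing steps can be sketched as follows, using the `hashlib.sha512` function from the Python standard library. The helper names `cfp_hash` and `build_index`, the `|` separator, and the sample patterns are assumptions made for illustration. A plain dictionary stands in for the leaf nodes of the hash tree, whose exact layout is not specified here.

```python
import hashlib


def cfp_hash(pattern):
    """Return the SHA-512 hex digest of a pattern, using a canonical item order."""
    canonical = "|".join(sorted(pattern))  # assumption: '|' never occurs in items
    return hashlib.sha512(canonical.encode("utf-8")).hexdigest()


def build_index(patterns):
    """Map each digest to its pattern (a dict stands in for the hash tree leaves)."""
    return {cfp_hash(p): frozenset(p) for p in patterns}


# Hypothetical CFPs used only to exercise the sketch.
patterns = [{"paddy"}, {"paddy", "blast"}, {"paddy", "kharif"}]
index = build_index(patterns)

# A query pattern with the same items hashes to the same digest,
# regardless of the order in which the items are listed.
query_digest = cfp_hash({"blast", "paddy"})
match = index.get(query_digest)
```

Sorting the items before hashing is a design choice of this sketch: it makes the digest depend only on the set of items, so that a query pattern and a stored CFP with identical items always collide on purpose.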
All CFPs from the leaf nodes are provided as the input to MNFA. The proposed MNFA combines two techniques: the k-medoids algorithm (KMA) and the neuro-fuzzy algorithm (NFA). KMA is combined with NFA to improve the quality of IR in the proposed work; the resulting IR method is therefore termed MNFA. In MNFA, the unordered CFPs are first clustered
KMA functions in two phases: build and swap. First,
where
where
For each selected medoid
If
The pseudo code of the
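The build and swap phases can be sketched as below. This is a minimal illustration under stated assumptions, not the pseudocode of the proposed system: KMA is taken to be a k-medoids procedure, the build phase is a greedy cost-minimizing selection, and the function names `total_cost` and `k_medoids`, the distance function, and the numeric toy data are all hypothetical.

```python
def total_cost(points, medoids, dist):
    """Sum of the distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def k_medoids(points, k, dist):
    n = len(points)

    # Build phase: greedily add the point that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates, key=lambda i: total_cost(points, medoids + [i], dist))
        medoids.append(best)

    # Swap phase: replace a medoid with a non-medoid whenever the cost drops.
    improved = True
    while improved:
        improved = False
        current = total_cost(points, medoids, dist)
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial, dist)
                if cost < current:
                    medoids, current, improved = trial, cost, True

    # Assign each point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]])) for p in points]
    return medoids, labels


# Hypothetical one-dimensional data with two obvious groups.
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
medoids, labels = k_medoids(data, k=2, dist=lambda a, b: abs(a - b))
```

Each accepted swap strictly lowers the total cost, so the swap loop terminates once no single replacement improves the clustering.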
Following the KMA steps given above, the hash code values are clustered, and the KMA results are fed into the neuro-fuzzy system for IR. A neuro-fuzzy system is essentially a fuzzy system that uses a learning algorithm inspired by neural network theory to determine its parameters (fuzzy sets and rules) by processing data samples. A neuro-fuzzy system can always be interpreted as a system of fuzzy rules. It can be created from scratch from training data, or it can be initialized with prior knowledge in the form of fuzzy rules. Such systems are generally represented as special multilayer feed-forward neural networks. The NFA structure contains five layers. Here, the clustered hash values for a specific CFP obtained in the previous step are the inputs to the first layer of NFA. Of the five layers, the first and fourth have adaptive nodes, whereas the other layers comprise fixed nodes. The information concerning a farmer or crop is precisely retrieved by using NFA. The two elementary rules of NFA are stated in the following equations.
where
where
where
The throughput of this layer
For accessibility, the throughput of this layer is termed normalized firing strength.
where
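The five-layer structure can be illustrated with a forward pass through a two-rule network. The sketch assumes a first-order Sugeno-type system with two inputs, Gaussian membership functions, and product firing strengths; these choices, the function names `gaussian` and `nfa_forward`, and all parameter values are assumptions for illustration and may differ from the formulation used in the proposed system.

```python
import math


def gaussian(x, center, width):
    """Gaussian membership grade of x for a fuzzy set (assumed shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def nfa_forward(x, y, premise, consequent):
    # Layer 1 (adaptive): membership grades of each input in its fuzzy sets.
    mu_a = [gaussian(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian(y, c, s) for c, s in premise["B"]]

    # Layer 2 (fixed): firing strength of each rule as a product of grades.
    w = [a * b for a, b in zip(mu_a, mu_b)]

    # Layer 3 (fixed): normalized firing strengths (sum to 1).
    # Note: raises ZeroDivisionError if every grade underflows to zero.
    total = sum(w)
    w_bar = [wi / total for wi in w]

    # Layer 4 (adaptive): each rule's linear consequent, weighted by w_bar.
    f = [p * x + q * y + r for p, q, r in consequent]
    weighted = [wb * fi for wb, fi in zip(w_bar, f)]

    # Layer 5 (fixed): overall output as the sum of the weighted consequents.
    return sum(weighted), w_bar


# Hypothetical parameters: (center, width) pairs and (p, q, r) coefficients.
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, 2.0, 1.0)]
output, strengths = nfa_forward(0.5, 0.5, premise, consequent)
```

In training, the parameters of the first and fourth layers (here `premise` and `consequent`) would be the quantities adjusted by the learning algorithm, whereas the other layers perform fixed computations.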
After the IR system is implemented, user query processing is performed by using MNFA. The user's query, submitted through the Semantic Web search engine, is provided as the input. The testing procedure of the proposed method is similar to the training process. Here, the user input query is preprocessed, and the CFPs of the input query are established. Next, the SHA 512 algorithm is applied to the CFPs to derive hash values, and a hash tree is created from these hash values. Then, the CFP hash values of the input query are compared with the trained database, and the result is retrieved by using MNFA. The work is assessed by using the PageRank (PR) algorithm, given that the system is built around the Semantic Web search engine.
The proposed IR methodology using MNFA is implemented on the Java platform.
In this subsection, the performance of the proposed MNFA is compared with that of the existing IBRI-CASONTO and ST techniques in terms of precision, recall, F-score, accuracy, returned
Performance metrics (%) | IBRI-CASONTO (simple queries) | ST (simple queries) | Proposed MNFA (simple queries) | IBRI-CASONTO (complex queries) | ST (complex queries) | Proposed MNFA (complex queries) |
---|---|---|---|---|---|---|
Precision | 52.49 | 95.2 | 96.56 | 50.12 | 93.562 | 94.45 |
Recall | 75 | 80.8 | 82.77 | 73.56 | 78.34 | 81.45 |
F-score | 68.54 | 74.32 | 84.45 | 66.34 | 72.34 | 83.44 |
Accuracy | 45 | 67 | 92.77 | 45 | 67 | 91.45 |
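For reference, the four metrics can be computed from retrieval counts as sketched below. The sketch assumes the standard definitions, with the F-score taken as the harmonic mean of precision and recall; the exact variants used in the evaluation are not stated here. The function name `retrieval_metrics` and the counts in the example are hypothetical and are unrelated to the reported results.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Return precision, recall, F-score, and accuracy as percentages.

    tp: relevant items retrieved      fp: irrelevant items retrieved
    fn: relevant items missed         tn: irrelevant items correctly left out
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": 100 * precision,
        "recall": 100 * recall,
        "f_score": 100 * f_score,
        "accuracy": 100 * accuracy,
    }


# Hypothetical counts chosen only to exercise the function.
metrics = retrieval_metrics(tp=8, fp=2, fn=2, tn=8)
```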
The performance of the proposed MNFA and the existing methods in terms of precision, recall, F-score, and accuracy for simple queries is shown in
A comparison of the proposed MNFA with the existing IBRI-CASONTO and ST on complex queries in terms of precision, recall, F-score, and accuracy is shown in
In this paper, an ontology-based IR system using MNFA is proposed. The system comprises three phases: training, testing, and assessment. After the three phases are performed, the results of the proposed MNFA are compared with those of the existing IBRI-CASONTO and ST techniques. The performance of the proposed and existing methods is evaluated for two types of queries: simple and complex. For both types of queries, the evaluation considers precision, recall, F-score, accuracy, returned