Semantic Annotation of Land Cover Remote Sensing Images Using Fuzzy CNN

This paper presents a novel fuzzy logic based Convolutional Neural Network (CNN) intelligent classifier for accurate image classification. The proposed approach employs a semantic class label model that classifies input land cover images into a set of semantic categories and classes according to their content. An intelligent feature selection algorithm selects the prominent attributes from the data set using weighted attribute functions, and fuzzy logic builds the rules from the membership values. To annotate remote sensing images, the CNN creates semantics and categorises the images. A decision manager then integrates the fuzzy logic rules with the CNN algorithm to achieve accurate classification. The proposed approach achieves a classification accuracy of 90.46% across various training and test images, and the three class labels, vegetation (84%), buildings (90%), and roads (90%), are classified more accurately than with other existing algorithms. In terms of true positive rate, false positive rate, and classification accuracy, the proposed approach outperforms the existing methods.


Introduction
Remote sensing is a technique for monitoring and detecting the earth's surface without physical contact, using specialized sensing devices such as high-resolution cameras and satellite imagery [1]. Its main benefit is that it enables continual, dynamic monitoring of an unattended environment from a distance [2]. It also provides information about changes that occur in the environment. With the advancement of remote sensing technology, research has shifted its focus to precise image classification through the application of image processing and machine learning algorithms [3]. Image annotation is a popular machine learning technique that uses artificial intelligence to annotate images according to their context. Most existing image annotation systems classify an image with a single class label in order to provide an overall understanding of the image. Unfortunately, a single class label gives too little information to categorise and annotate images accurately. Segmentation-based semantics is a widely used technique that classifies the objects in a scene based on their image pixels in order to extract information [4]. Semantic image segmentation based on a single class label, however, is a difficult and time-consuming process. It also incurs computational overhead, which can deplete system resources, and the resulting annotation is imprecise [5]. To address the limitations of single class label classifiers, classifiers based on multiple class labels have been developed; they provide adequate information about the scene so that the image can be annotated with multiple labels and categorized with greater accuracy [6]. Compared with single-label remote sensing image classification, multi-label remote sensing image classification is a more realistic challenge.
The purpose of multi-label annotation is to predict the multiple semantic labels that characterize a remote sensing image scene. Owing to its higher descriptive capacity, multi-label annotation is used in numerous areas, such as image annotation [7,8] and image retrieval [9][10][11]. However, existing multi-label approaches show limited performance because they classify images using handcrafted features and do not represent the images in high-level semantics, which is needed for precise classification and annotation. To overcome these limitations, this study proposes an efficient fuzzy logic-based CNN intelligent semantic multi-label annotation technique for classifying and annotating land cover high-resolution remote sensing images more precisely. The proposed intelligent classifier uses a semantic multi-label model in which the image is represented using high-level semantics. It classifies the image into a series of semantic categories, each with its own set of classes, based on the content of the remote sensing image. Furthermore, it applies intelligent feature selection, in which the prominent features are chosen by weighted attribute functions that use the information gain ratio.
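As an illustration of the information gain ratio mentioned above, the following Python sketch scores one discrete attribute against the class labels. The function names and the toy attribute values are hypothetical; the sketch assumes the textbook definition (information gain divided by split information) and does not reproduce the weighted attribute functions of the proposed system.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain ratio of one discrete feature with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0

labels = ["road", "road", "vegetation", "vegetation"]
informative = ["low", "low", "high", "high"]    # hypothetical NDVI level
uninformative = ["a", "b", "a", "b"]            # hypothetical unrelated attribute
print(gain_ratio(informative, labels), gain_ratio(uninformative, labels))
```

Attributes with a higher ratio would be ranked as more prominent under this assumption.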

State of the Art
In remote sensing, annotating scene images with multiple labels is important for understanding the images [12,13]. Qi et al. (2020) constructed a multi-label high spatial resolution dataset to support the semantic understanding of scene images from the overhead perspective using deep learning. Their method enables the classification and retrieval of multi-label images via deep learning and outperformed previous methods in both tasks. The authors evaluated performance using mean average precision, average F1 score, precision at the number of retrieved images, and average normalised modified retrieval rank. Zhu et al. (2020) proposed a deep learning framework for multi-label annotation of remote sensing images. One of the primary features of this system is the use of convolutional neural networks to learn features from dual-level semantic concepts. One drawback of this approach is that it ignores label dependence at the object level and the label relationships between the scene level and the object level. Vanegas et al. (2019) presented a semi-supervised online learning approach based on kernel matrix factorization for automatic multi-label annotation. The method works with large datasets, which addresses one of the primary shortcomings of kernel-based methods, namely their inability to scale. It is also well suited to complicated non-linear relationships and significantly reduces the memory and computation time required for multi-label annotation tasks. Hu et al. (2013) proposed a multi-level max-margin discriminative analysis for the annotation of high-resolution images. To create discriminative features, the algorithm uses the maximum entropy discrimination latent Dirichlet allocation (MedLDA) technique.
It uses a bag-of-words representation to incorporate both word-level and topic-level features in order to increase annotation performance through multi-level semantics and contextual information. Jeppesen et al. (2019) introduced a remote sensing network, a deep learning model for detecting clouds in optical satellite imagery. The model was trained and evaluated on the Landsat 8 Biome and SPARCS datasets, which include biomes with cloud over snowy and icy regions. The model also handled noisy data and improved on the performance of the cloud masking method. Kadhim et al. (2019) proposed a satellite image classification technique that uses deep learning and CNN-based feature extraction, and presented four ways of improving the performance of satellite image classification. Cao et al. (2020) presented an automatic image annotation technique based on a CNN with threshold optimization to address the problem of over-labeling or under-labeling in multi-label image annotation. Hoxha et al. (2020) proposed a remote sensing image retrieval system capable of generating and using textual descriptions that characterise the relationships between objects and their associated attributes in remote sensing images. Xia et al. (2021) suggested a stacked ensemble method for improving the pairwise label correlation and weight learning processes, together with an optimization approach that finds an efficient and optimal ensemble solution. Markatopoulou et al. (2019) proposed a deep convolutional neural network architecture that tackles multi-label video/image annotation by exploiting multi-task learning to find the relations between targets and structured output learning to find the correlations between concepts. Both models are built from standard layers that can be trained with backpropagation to increase annotation accuracy. Wanga et al. 
(2019) experimented with an automatic image annotation technique based on a multiclass label selection algorithm. Using a convolutional neural network, this technique improves annotation performance. Alshehri (2020) discussed a technique for extracting image features using principal component analysis and the wavelet transform, and suggested a neural network based prediction technique for classifying the retrieved data. Jabari et al. (2013) proposed a fuzzy logic classification method for high-resolution urban satellite images. Fuzzy logic is used to handle the main problem in high-resolution image classification, namely the uncertainty in the position of object borders. Li et al. (2017) investigated a method for extracting visual attention features using a multi-scale procedure and created a fuzzy classification method for classifying high-resolution remote sensing scene images. This approach provides accurate classification compared with other methods on quantitative accuracy measures. Gheshlaghi et al. (2017) proposed a decision-making system based on the analytical network process and fuzzy logic for detecting landslide problems. Bharti and Kurmi (2017) described a novel approach that uses fuzzy logic to classify high-resolution urban satellite images into three categories: road, building, and vegetation. Ma et al. (2017) discussed the use of remote sensing imagery to classify land cover images using an object-based approach. According to the literature review, most existing image classification algorithms are ineffective at accurately detecting class labels and at semantically annotating multiple class labels. Motivated by these findings, this work proposes a new intelligent classification technique that combines intelligent fuzzy rules with the CNN algorithm to categorise image class labels more accurately.
Guided by high-level semantics, the proposed intelligent classifier combines the CNN algorithm, with its convolutional layers and min-max pooling layers, and a decision manager to classify images efficiently into different types of label classes. Finally, the decision manager decides on the image annotation by integrating the intelligently produced fuzzy rules with the CNN classification.
The contributions of the proposed system are as follows:
1. Multi-label semantics, in which the images are annotated with single-class labels and represented in high-level semantics.
2. An intelligent feature selection algorithm, in which the prominent features are selected from the class labels.
3. An intelligent classifier that uses intelligent fuzzy rules and a CNN classifier to annotate images appropriately using the retrieved feature set.

Proposed System Architecture
The proposed system's architecture is depicted in Fig. 1. It is composed of nine modules: an image dataset module, a semantic analysis module, a class label classification module, an intelligent feature extraction module, a CNN classification module, a fuzzy rule generator module, a fuzzy inference module, a knowledge base module, and a decision manager module.
The image data set module is the first module of the proposed system. It supplies the UC Merced data set, with 70% of the images used for training and 30% for testing the proposed system. The second module is devoted to semantic analysis. Its primary function is to provide a higher-level understanding of the given scene and to annotate the given image with various class labels [14,15]. The next module is intelligent feature extraction, whose main role is to identify and extract the prominent features from the annotated class labels. The CNN classification module follows; it is divided into three submodules: max pooling, convolutional, and decision. The fuzzy inference module employs fuzzy rules, and it is accompanied by the fuzzy rule generator module, which is divided into four submodules: fuzzification, rule creation, rule firing and matching, and rule execution. The knowledge base module stores the generated fuzzy rules. The decision manager is the last module of the proposed system. Its main role is to make decisions and to control and coordinate the other modules in the system.

Proposed System
The semantic annotation phase is the initial phase of the proposed system. The UC Merced data set, which contains high-resolution remote sensing images of land cover, is used as input; some sample images are shown in Fig. 2. The main task of the semantic annotation phase is to perform high-level semantic analysis and assign different class labels to the image [16]. The proposed approach employs multiple label annotation to classify the images more precisely. Algorithm 1 details the semantic annotation module.

Initial Level Image Segmentation Phase
In this step, the images are segmented into homogeneous, non-overlapping, discrete sections based on attributes such as grey pixel values, texture, and auxiliary data. The proposed system segments the image using a multi-resolution segmentation technique that takes into account three critical parameters: scale, shape, and compactness. Based on the results of the initial segmentation, class label items such as shadows, vegetation areas, and roadways can be defined. However, accurate detection of buildings requires a second level of image segmentation.

Intelligent Fuzzy Based Image Classification
In fuzzy image classification, the segments are classified on the basis of the values defined by the membership functions instead of a decision based on binary values. The membership functions have values ranging from 0 to 1, where 0 indicates that the object is not a member of the class and 1 indicates that the object is a member of the class. A triangular membership function is used in the proposed system [17][18][19][20]. The approach uses three linguistic variables: low, medium, and high. The fuzzy inference system generates intelligent fuzzy rules based on the linguistic variables, and the decision manager makes its decision based on the generated fuzzy rules. The proposed system tests all object classes by classifying each image segment using the intelligent fuzzy rules. The parameters used for object-based image classification are explained as follows.
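The fuzzification step described above can be sketched in Python as follows. This is a minimal illustration, assuming a feature normalised to [0, 1]; the breakpoints of the low, medium, and high terms and the names `triangular`, `TERMS`, and `fuzzify` are hypothetical and are not taken from the proposed system.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical breakpoints (a, b, c) for a feature normalised to [0, 1].
# The outer terms extend beyond the range so that 0 is fully "low" and 1 fully "high".
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

def fuzzify(x):
    """Map a crisp feature value to its degree of membership in each linguistic term."""
    return {name: triangular(x, *abc) for name, abc in TERMS.items()}

print(fuzzify(0.25))  # partly "low" and partly "medium"
```

A value of 0.25 belongs partly to "low" and partly to "medium", which is the graded behaviour that replaces a hard binary decision.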

Segment Shadowing
In object-based image classification, segment shadowing is used to identify elevated objects. The proposed system uses two parameters, segment brightness and segment density, to determine the shadow of the image segments.

Brightness
Brightness is defined as the mean value of each pixel or segment over all the bands of the image. The brightness of the image segments is computed using Eq. (1) [21]. In object-based image classification, the shadow objects in the image segments have low brightness values. To improve the accuracy of the brightness measure, the proposed system employs a fuzzy logic based K-means clustering approach to detect the cluster with the darkest values. The mean and Standard Deviation (SD) of the darkest cluster are then computed. To build intelligent fuzzy rules, the computed darkest cluster is used as a linguistic variable for the shadow brightness.
Two parameters, the NIR ratio [21] and NDVI [21], are used in the proposed approach to identify the vegetation area in a given image segment. Eq. (2) gives the formula for the NIR ratio, where NIR denotes the near-infrared band and NDVI is the normalized difference vegetation index, which is used to identify the vegetation area in the given image segment.
Two important metrics, L_cm [21] and L_e [21], are taken into account when identifying roads in an image segment. L_cm and L_e are computed using Eqs. (3) and (4). Based on these criteria, the intelligent fuzzy rules can identify the road class label in a given image segment.
To determine the building class labels in a given image segment, three critical characteristics must be considered: the elliptical fit, the rectangular fit, and the shadow position within the segment. Rectangular fit is the degree to which objects (buildings) fit within a rectangle: the value 0 indicates that the objects do not fit within the rectangle, whereas the value 1 indicates that they do. Elliptical fit is the degree to which objects fit within an ellipse: the value 0 indicates that the objects do not fit within the ellipse, whereas the value 1 indicates that they do. The shadow position is used to locate buildings in an image segment. The shadow position most frequently used for identifying buildings in a given image segment is the southern or western side. Intelligent fuzzy rules are formed from all of these computed parameters [22][23][24].
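The brightness and vegetation measures described above can be illustrated with a short Python sketch. It assumes that segment brightness is the mean of the per-band mean pixel values, a common object-based definition, and it uses the standard NDVI formula (NIR - Red) / (NIR + Red). The function names and sample values are hypothetical, and the NIR ratio, L_cm, and L_e of Eqs. (2) to (4) are not reproduced.

```python
from statistics import mean

def segment_brightness(band_pixels):
    """Mean over bands of the mean pixel value of one segment.

    band_pixels holds one list of pixel values per spectral band.
    """
    return mean(mean(pixels) for pixels in band_pixels)

def ndvi(nir, red):
    """Normalized difference vegetation index from mean NIR and red reflectance."""
    denom = nir + red
    return (nir - red) / denom if denom else 0.0

# Hypothetical three-band segment with four pixels per band.
segment = [[10, 12, 8, 10], [20, 22, 18, 20], [30, 28, 32, 30]]
print(segment_brightness(segment))
print(ndvi(nir=0.8, red=0.2))
```

Dark segments (low brightness) are shadow candidates, while segments with NDVI close to 1 are vegetation candidates; the thresholds themselves would come from the fuzzy rules.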

Semantic Annotation Module
The algorithm for the semantic annotation module is given in Algorithm 1. The algorithm takes the UC Merced image data set as input and outputs a set of annotated class labels [25]. First, the Image Set (IS) is defined as the set of images IS_1 to IS_n, and its elements are stored in an array. The Class Label (CL) set is defined as the set of class labels CL_1 to CL_n, and these elements are also stored in an array. The next phase is image segmentation, in which each image in the Image Set (IS) is divided into equal, non-overlapping segments S_1 to S_n. The class label detection probability is then calculated for each partitioned image using the MedLDA and CNN algorithms. The class label detection probabilities of the MedLDA and CNN algorithms are combined to obtain the total class label detection probability. Class labels are assigned to each image segment based on the estimated total class label detection probability, and annotation is performed using these class labels.
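The final steps of the algorithm can be sketched as follows. How the MedLDA and CNN probabilities are combined is not specified above, so this Python sketch assumes a simple weighted average and a fixed decision threshold; the `weight` and `threshold` parameters, the function names, and the sample probabilities are all hypothetical.

```python
def combine_probabilities(p_medlda, p_cnn, weight=0.5):
    """Blend two per-label probability dicts for one segment (assumed convex combination)."""
    labels = set(p_medlda) | set(p_cnn)
    return {
        label: weight * p_medlda.get(label, 0.0) + (1 - weight) * p_cnn.get(label, 0.0)
        for label in labels
    }

def annotate(total_probs, threshold=0.5):
    """Return, in sorted order, the labels whose total probability reaches the threshold."""
    return sorted(label for label, p in total_probs.items() if p >= threshold)

# Hypothetical detection probabilities for one image segment.
p_medlda = {"vegetation": 0.8, "road": 0.2}
p_cnn = {"vegetation": 0.6, "road": 0.6, "building": 0.1}
total = combine_probabilities(p_medlda, p_cnn)
print(annotate(total))
```

Lowering the threshold lets a segment receive several labels at once, which is how a multi-label annotation would arise under this assumption.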

Intelligent Class Labels Extraction Phase
In this phase, features are extracted from the annotated class labels in order to classify the target images more accurately. This step extracts features using machine learning algorithms. Algorithm 2 explains the stages involved in the extraction of intelligent class labels. The method takes a set of annotated class labels as input and outputs the intelligent features extracted from that set. The class labels are first loaded and stored in an array, and each class label is pre-processed. The proposed algorithm then extracts intelligent features from the set of class labels using an optimized VGG16 model and a ResNet model [21,22]. Finally, for improved classification, the retrieved features are assigned to the appropriate classes.

Algorithm 2: Intelligent Class labels extraction phase
Input: Set of class labels.
Output: Intelligent features from the set of class labels.
(1) Begin
(2) Load the class labels and store them into an array CL_1 to CL_n // where CL is the set of class labels.
In this step, land cover satellite images are intelligently classified using fuzzy logic and the CNN algorithm. This step takes the annotated class labels and the extracted intelligent features as input. Tab. 1 lists the annotations for the possible land cover images. Intelligent fuzzy rules are built using these annotated class labels and the collected intelligent features [26][27][28]. The object shape, the annotated class labels, and their features are represented by membership functions in the proposed system. Algorithm 3 provides the intelligent fuzzy rules of the proposed system for improving land cover image classification [21]. The proposed intelligent fuzzy classification algorithm uses a triangular membership function, which is more appropriate for Mamdani models than for Sugeno models. As a result, the Mamdani model is preferred over the Sugeno model in the proposed system. The algorithm illustrates the classification of vegetation areas using intelligent fuzzy rules.
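A Mamdani-style evaluation of vegetation rules of this kind can be sketched in Python as follows. The two rules, the membership breakpoints, and the function names are hypothetical stand-ins and are not the rules of Algorithm 3. The sketch uses min for AND, max for OR and for aggregation, and centroid defuzzification, which are the usual Mamdani choices.

```python
def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def vegetation_score(ndvi, nir_ratio, steps=101):
    """Crisp vegetation score in [0, 1] from two hypothetical Mamdani rules."""
    # Rule 1: IF NDVI is high AND NIR ratio is high THEN vegetation is high.
    high = min(tri(ndvi, 0.2, 1.0, 1.8), tri(nir_ratio, 0.2, 1.0, 1.8))
    # Rule 2: IF NDVI is low OR NIR ratio is low THEN vegetation is low.
    low = max(tri(ndvi, -0.8, 0.0, 0.8), tri(nir_ratio, -0.8, 0.0, 0.8))
    # Clip each output set by its rule strength, aggregate with max,
    # and defuzzify with the centroid over a sampled output axis [0, 1].
    num = den = 0.0
    for i in range(steps):
        y = i / (steps - 1)
        mu = max(min(high, tri(y, 0.0, 1.0, 2.0)), min(low, tri(y, -1.0, 0.0, 1.0)))
        num += y * mu
        den += mu
    return num / den if den else 0.5

print(vegetation_score(0.9, 0.9), vegetation_score(0.05, 0.05))
```

A segment with high NDVI and NIR ratio scores above 0.5 and one with low values scores below it; a decision manager could then compare such a score with the CNN output.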

CNN Based Classification Algorithm
By extending the CNN algorithm with intelligent fuzzy rules, the proposed system develops a novel intelligent CNN-based image classification algorithm. The proposed algorithm performs the convolution operations for two functions x and y for the operator using the integral given in Eq. (5) as follows.
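For reference, the standard definition of the convolution of two functions x and y is given below, together with the discrete two-dimensional form that a convolutional layer evaluates on an image. It is assumed here that Eq. (5) has this standard form; the symbols t, \tau, I, K, i, j, m, and n are introduced only for illustration.

```latex
(x * y)(t) = \int_{-\infty}^{\infty} x(\tau)\, y(t - \tau)\, d\tau ,
\qquad
(I * K)(i, j) = \sum_{m} \sum_{n} I(i - m,\, j - n)\, K(m, n)
```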
Using Eq. (5), the proposed system builds nine max pooling layers and ten convolutional layers, which are used to classify the images. The proposed model employs the sigmoid function as the activation function and uses the function f(x) = x + 1/x as the bias function along with the CNN. All of these layers operate on the image data set and provide a set of features, such as NIR ratio, NDVI, L_cm value, L_e value, Std value, rectangular fit, and elliptical fit, that can be used to classify the images in the given data set. The CNN applies fuzzy rules to obtain feedback on these features by comparing them with the features selected by the feature selection algorithm. If both are identical, the classification process is initiated. In the event of a mismatch, the CNN asks the decision manager for guidance on which features to use, depending on their sensitivity. Errors are propagated backward and minimized during the classification process. The intelligent fuzzy CNN proposed in this paper performs multiclass classification on a variety of distinct class labels, including buildings, vegetation, land, roads, and vehicles.
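The basic operations of one convolution and pooling stage can be sketched with NumPy as follows. This is a minimal single-channel illustration and not the ten-layer network described above; the function names, the 2x2 pooling window, the kernel, and the toy image are assumptions made for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid-mode 2-D convolution (kernel flipped, as in the integral definition)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

def sigmoid(z):
    """Sigmoid activation applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; odd trailing rows and columns are dropped."""
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 single-band image
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # hypothetical 2x2 kernel
features = max_pool2x2(sigmoid(conv2d_valid(image, kernel)))
print(features.shape)
```

Each 2x2 pooling stage halves the spatial size of the feature map, so the number of pooling stages that can be stacked depends on the input resolution.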

Experimental Setup and Results
The proposed intelligent classification model is implemented in MATLAB R2013a. The proposed model generates intelligent fuzzy rules using a Mamdani model with triangular membership functions. The proposed classification model is compared with previously published models using the performance criteria True Positive Rate (TPR), True Negative Rate (TNR), False Positive Rate (FPR), and Classification Accuracy (CA). The TPR, TNR, FPR, and CA are calculated in the following manner.
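The following Python sketch computes these four criteria, assuming the standard confusion-matrix definitions in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The function name and the sample counts are hypothetical.

```python
def classification_rates(tp, tn, fp, fn):
    """Standard confusion-matrix rates for one class label."""
    return {
        "TPR": tp / (tp + fn),                    # share of actual positives detected
        "TNR": tn / (tn + fp),                    # share of actual negatives rejected
        "FPR": fp / (fp + tn),                    # share of actual negatives misclassified
        "CA": (tp + tn) / (tp + tn + fp + fn),    # share of all decisions that are correct
    }

# Hypothetical counts for one class label on 100 test segments.
print(classification_rates(tp=40, tn=45, fp=5, fn=10))
```

Under these definitions TNR and FPR always sum to 1, so in practice only one of the two needs to be reported.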
Tab. 1 gives the classification detection accuracy of the binary cross entropy algorithm for three class labels: vegetation area, building, and road. For each class label, different sets of training and testing images are given as input to the binary cross entropy algorithm.
As shown in Tab. 1, for the binary cross entropy algorithm with varying training and testing images, the vegetation area class label has an average true positive value of 75.6%, an average true negative value of 23%, an average false positive value of 1.6%, and a classification accuracy of 75.42%. For the building class label, the average true positive value is 77.72%, the average true negative value is 18.64%, the average false positive value is 3.84%, and the classification accuracy is 77.56%. For the road class label, the average true positive value is 79.08%, the average true negative value is 14.16%, the average false positive value is 6.64%, and the classification accuracy is 79.16%.
Tab. 2 presents the classification detection accuracy of the CNN using the RNN algorithm for three class labels: vegetation area, building, and road. Different sets of training and testing images are fed into the CNN using the RNN algorithm for each class label. For the vegetation area class label, the average true positive value is 79%, the average true negative value is 15.3%, the average false positive value is 1.2%, and the classification accuracy is 79.14%. For the building class label, the average true positive value is 82.4%, the average true negative value is 15.66%, the average false positive value is 2.54%, and the classification accuracy is 83%. For the road class label, the average true positive value is 79.08%, the average true negative value is 12.3%, the average false positive value is 4.5%, and the classification accuracy is 83.16%.
Tab. 3 summarizes the classification detection accuracy of the proposed intelligent classification algorithm for the same three class labels. Different sets of training and testing images are fed into the proposed intelligent classification algorithm for each class label. For the vegetation area class label, the average true positive value is 88.42%, the average true negative value is 11%, the average false positive value is 1.38%, and the classification accuracy is 87.68%. For the building class label, the average true positive value is 90.96%, the average true negative value is 7.58%, the average false positive value is 1.66%, and the classification accuracy is 90.68%. For the road class label, the average true positive value is 90.5%, the average true negative value is 7.44%, the average false positive value is 2.06%, and the classification accuracy is 90.46%.
Fig. 4 illustrates the classification accuracy of three classification algorithms, CNN with binary cross entropy, CNN with the RNN algorithm, and the proposed intelligent classification algorithm, for three image class labels: vegetation area, buildings, and roads. Fig. 4 shows that the proposed intelligent classification algorithm achieves better classification accuracy for the vegetation (84%), building (90%), and road (90%) class labels than the existing classification algorithms: CNN with binary cross entropy achieves 75% for vegetation, 76% for building, and 79% for road, and CNN with the RNN algorithm achieves 79% for vegetation, 80% for building, and 83% for road. The proposed intelligent classification technique achieves a higher classification accuracy because it combines semantic analysis with single label image segmentation.
Moreover, unlike the other existing classification algorithms, the proposed intelligent classification algorithm employs intelligent fuzzy rules in conjunction with the CNN algorithm to classify the selected class labels (vegetation area, buildings, and roads) accurately. Furthermore, the proposed intelligent classification method has a higher proportion of true positives and a lower proportion of true negatives, as well as a lower false positive rate. As a result, the proposed intelligent classification method outperforms the other existing classification algorithms in terms of class label accuracy.

Conclusion and Future Work
A novel intelligent fuzzy-based CNN image classification model has been proposed in this paper. This strategy enriches the target information and outperforms manual image labeling by collecting semantic descriptors from the images automatically. The experimental results from three remote sensing image datasets demonstrate that the proposed framework significantly improves the performance of multi-label annotation when compared with alternative annotation approaches. In comparison with previous methods, the improved algorithm adaptively decides the number of semantic classifications within class labels during annotation. The proposed intelligent classifier overcomes the least probability retrieval error during classification. This approach produces more true positives and fewer true negatives, as well as lower false positive rates. Future research will modify the similarity measurements in order to generate more semantically related scenes using enhanced metric learning approaches. It will also focus on developing the Fuzzy-CNN to operate on many classes and to incorporate methods for classification judgments, integrating multi-label and multi-class output models for land cover remote sensing images.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.