A Cascaded Design of Best Features Selection for Fruit Diseases Recognition

Fruit diseases seriously affect agricultural production, which puts financial pressure on a country's economy. The manual inspection of fruit diseases is a laborious process that is both time-consuming and costly since it requires careful examination by an expert. Hence, it is essential to develop an automated computerised approach to recognise fruit diseases from leaf images. According to the literature, many automated methods have been developed for the recognition of fruit diseases at an early stage. However, these techniques still face some challenges, such as the similar symptoms of different fruit diseases and the selection of irrelevant features. Image processing and deep learning techniques have been extremely successful in the last decade, but there is still room for improvement due to these challenges. Therefore, we propose a novel computerised approach in this work using deep learning and an ant colony optimisation (ACO) based feature selection. The proposed method consists of four fundamental steps: data augmentation to address the imbalanced dataset, fine-tuning of pretrained deep learning models (NasNet-Mobile and MobileNet-V2), fusion of the extracted deep features using matrix length, and finally, selection of the best features using a hybrid of ACO and Neighbourhood Component Analysis (NCA). The best-selected features were then passed to several classifiers for final recognition. The experimental process involved an augmented dataset and achieved an average accuracy of 99.7%. Comparison with existing techniques showed that the proposed method was effective.


Introduction
Fruit production contributes substantially to a country's economy through the agricultural sector, so the proper and early detection of spots and patches on the leaves or any other part of a fruit plant is needed to identify the symptoms of dangerous, non-curable diseases [1]. Nowadays, automatic techniques can detect fruit diseases in the early stages, thereby removing the need for manual intervention [2]. Image processing and computer vision techniques can recognise and classify fruit diseases from large image datasets [3,4]. Research on image processing in agriculture spans several domains: the pre-processing and segmentation of the dataset belong to computer vision and image processing [5,6], features are extracted and selected using computer vision and pattern recognition, and classification is performed using artificial intelligence and machine learning [7]. Pre-processing techniques include dataset augmentation, contrast enhancement, histogram techniques and noise removal [8]. Similarly, segmentation techniques include thresholding, edge-based segmentation and clustering-based segmentation [9]. Feature extraction techniques, also known as feature descriptors, include handcrafted, learning-based, deep learning-based, area-based and graph-based descriptors. Classical features, such as texture, colour, point and shape, have mainly been used to recognise plant diseases in the past two years [10,11].
Fruit diseases have recently been detected and classified automatically using computer technologies instead of being manually monitored by humans [12]. Machine learning (ML) and computer vision algorithms have been used to recognise the diseased part of a leaf, and they are particularly beneficial because they readily identify the signs of an infected region as soon as a disease appears on a plant leaf [13]. Deep learning is a part of ML that uses neural networks to extract deep features [14,15], and it has been applied to plant disease detection and classification in many applications in recent years [16]. Deep learning techniques generally produce more accurate results than classical techniques because the features are automatically extracted and utilised for classification purposes. Zhu et al. [17] introduced an automated system for recognising grape diseases using image analysis and a backpropagation neural network. This model could recognise four grape leaf diseases (Sphaceloma ampelinum de Bary, anthracnose, round spot and downy mildew). Sladojevic et al. [18] proposed an automatic classification method that was able to visually recognise thirteen different plant diseases with an accuracy of 96.3%. Jhuria et al. [19] presented a technique to identify fruit diseases during the farming process, and it was able to identify two apple diseases and one grape disease. Firstly, the image dataset was pre-processed, which included resizing each image to (200, 250). Next, morphological and textural features were extracted separately from each image, with the morphological features producing good results. The final classification accuracy was 90%.
Pixia et al. [20] proposed a method for detecting cucumber disease using image-processing tools. The pre-processing step involved grey scaling and smoothing the image, and in the segmentation step, the corrosion operation was applied to the lesions in the image dataset. As for the feature extraction, colour feature extraction was used to extract the morphological features in order to reduce any inappropriate features. The final disease detection rate was 96%. Degadwala et al. [21] utilised both classification and segmentation techniques and identified three apple diseases (apple scab, apple rot, apple blotch). Firstly, they used a K-means clustering technique to segment the image dataset and then extracted colour features (global colour histogram, colour coherence vector) and textural features (local binary pattern), which were then fused. Lastly, a random forest classifier was used to improve the accuracy of the classification. In discussing how to identify and classify green litchi disease, He et al. [22] noted that separating the background is more challenging than when detecting red litchi disease. As for feature extraction, colour features were extracted first, then the appropriate features were selected using LDA and finally, a support vector machine was used for the classification. According to the results, the precision rate of green litchi recognition was 80.4% and the recall rate was 76.4%.
The above-listed studies still face many challenges, which reduce their recognition accuracy. The first challenge is the limited availability of sufficiently large datasets, which is compounded by class imbalance. The second challenge is the strong similarity of the symptoms of different fruit diseases, which leads to the misclassification of images. Another issue faced by researchers is the selection of the best features because irrelevant and redundant features reduce the accuracy of the recognition. Therefore, we propose an automated system based on deep learning and an ant colony optimisation (ACO) based feature selection to overcome these challenges. The major contributions of this work are as follows:
• The data were augmented to increase the size of the dataset and balance the original dataset, which improves the training performance of the deep learning models.
• NasNet-Mobile and MobileNet-V2 were fine-tuned at the classification layer by adding a new layer that includes information about the target dataset (the augmented fruit diseases dataset).
• The best features were selected using a hybrid of ACO and Neighbourhood Component Analysis (NCA) and passed to several classifiers for final recognition.
The remainder of this manuscript is organised as follows: The proposed methodology, including deep learning models, fusion of deep features and selection scheme, is discussed in Section 2. The results are presented and compared in Section 3 and the work is concluded in Section 4.

Proposed Methodology
In the proposed methodology, two pre-trained deep learning models, NasNet-Mobile and MobileNet-V2, are used to extract the deep features. A hybrid approach based on the ACO and NCA is then proposed to select the best features, which are classified using supervised learning algorithms. The proposed flow diagram of the automated recognition of fruit diseases is illustrated in Fig. 1 and each step is detailed below.

Dataset Acquisition
The Plant Village dataset, which was obtained from Kaggle, was used for this work [23]. It contains a large number of classes of different plant diseases, but we only considered fruit leaf diseases; hence, the shortened dataset consisted of 15 classes covering six types of fruit: apples (scab, black rot, cedar rust, healthy), cherries (healthy, powdery mildew), grapes (black rot, esca (black measles), healthy, leaf blight (Isariopsis leaf spot)), oranges (Huanglongbing, i.e., citrus greening), peaches (bacterial spot, healthy) and strawberries (healthy, leaf scorch).

Dataset Augmentation
Using imbalanced data for classification creates bias, which affects the results. Hence, it is necessary to augment the data for image-processing tasks. As mentioned above, the original dataset contained 15 classes, a few of which were imbalanced. Therefore, we balanced them using three different operations: a horizontal flip, a vertical flip, and a transposition. Each class contained 600 images after the augmentation. A few sample images are shown in Fig. 2. The main purpose of this step was to increase the number of images in the dataset for better training.
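The three operations can be sketched as follows. This is only a minimal illustration on a tiny image stored as nested Python lists; the function names are hypothetical and are not taken from the MATLAB implementation used in this work.

```python
# Minimal sketch of the three augmentation operations on a 2-D image
# stored as a list of rows (nested lists keep the example dependency-free).
# All function names are illustrative assumptions.

def horizontal_flip(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom by reversing the row order."""
    return [row[:] for row in reversed(image)]


def transpose(image):
    """Swap rows and columns."""
    return [list(column) for column in zip(*image)]


def augment(image):
    """Return the three augmented variants of one image."""
    return [horizontal_flip(image), vertical_flip(image), transpose(image)]


image = [[1, 2, 3],
         [4, 5, 6]]
for variant in augment(image):
    print(variant)
```

In a balancing scheme of this kind, the variants would be added only to the under-represented classes until every class reaches the target size.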

Deep Feature Extraction
Deep convolutional neural networks (DCNNs) have achieved remarkable success in AI, ML and image processing in the past two decades [24,25]. Convolutional neural networks (CNNs) have improved the performance of different recognition tasks over the years through the exploration of different network architectures and structural modifications. The advances in CNN technology can be categorised in different ways, such as activation functions, learning algorithms, and the optimisation and regularisation of architectural layers, to name but a few [26]. In this work, we implemented two pre-trained DCNN architectures, NasNet-Mobile and MobileNet-V2, to recognise fruit leaf diseases.

NasNet-Mobile
Neural architecture search (NAS) is a recent DL technique in the field of artificial neural networks (ANN). It was proposed by the Google Brain team in 2016 and has three constituents: search space, search strategy and performance estimation [27]. The search space defines the candidate operations (convolution, fully-connected, max-pooling, etc.) and the connections between layers from which complete feasible network architectures are formed. The search strategy uses random search and reinforcement learning to sample the population of candidate network architectures, guided by the performance rewards of the child models (maximum accuracy, time management). Meanwhile, performance estimation aims to reduce the computational resources or time needed to evaluate a network architecture, so that the performance can be estimated when the search strategy receives the child model performance rewards [28,29].
In the proposed work, we fine-tuned this model and then trained it on images of fruit leaf diseases. In the fine-tuning process, the fully connected layer was removed and a new layer was added that only included the selected classes of fruit diseases. The modified model was then trained using transfer learning to obtain a new model. The deep features were extracted from the average pool layer and later utilised for classification purposes. The architecture of NasNet-Mobile is illustrated in Fig. 3.
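The idea of replacing the final layer and reading features from an average pooling layer can be illustrated with a small sketch. It is not the NasNet-Mobile network itself: the toy activation maps, the weight initialisation and all function names are assumptions made for illustration, and the training step is omitted.

```python
import math
import random

NUM_CLASSES = 15  # number of fruit leaf classes in the shortened dataset


def global_average_pool(feature_maps):
    """Reduce each channel (an H x W grid) to its mean value."""
    return [sum(map(sum, channel)) / (len(channel) * len(channel[0]))
            for channel in feature_maps]


def new_classification_layer(num_features, num_classes, seed=0):
    """Create small random weights and zero biases for the replacement layer."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.01) for _ in range(num_features)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    return weights, biases


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Apply the new layer to pooled features and return class probabilities."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Toy backbone output (assumed): 4 channels, each a 2 x 2 activation map.
feature_maps = [[[1.0, 3.0], [5.0, 7.0]],
                [[0.0, 0.0], [0.0, 4.0]],
                [[2.0, 2.0], [2.0, 2.0]],
                [[1.0, 0.0], [0.0, 1.0]]]
deep_features = global_average_pool(feature_maps)
weights, biases = new_classification_layer(len(deep_features), NUM_CLASSES)
probabilities = classify(deep_features, weights, biases)
print(deep_features, len(probabilities))
```

The pooled vector plays the role of the deep features, while the new layer only needs as many outputs as there are target classes.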

MobileNet-V2
MobileNet-V2, which includes 53 deep layers, is a widely used architecture in artificial neural networks (ANN). It is less complex and shallower than many CNNs, and it performs well as a lightweight model (e.g., on mobile devices with low computational or processing power). The non-linearity present in some MobileNet-V1 layers is removed in MobileNet-V2, which contains separable filters composed of a depth-wise convolution and a point-wise convolution. The point-wise convolution uses 1 × 1 filters to reduce the computational cost of a standard convolution, which makes the network lightweight [30].
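The saving obtained by splitting a convolution into depth-wise and point-wise parts can be checked with simple arithmetic. The sketch below counts multiply-accumulate operations for one layer; the layer sizes in the example are arbitrary assumptions and the function names are hypothetical.

```python
# Count multiply-accumulate operations for one convolutional layer with a
# square kernel and a square output feature map (stride and padding ignored).

def standard_conv_cost(kernel, in_channels, out_channels, feature_map):
    """Every output channel filters every input channel with a k x k kernel."""
    return kernel * kernel * in_channels * out_channels * feature_map * feature_map


def depthwise_separable_cost(kernel, in_channels, out_channels, feature_map):
    """Depth-wise k x k filtering per channel plus a 1 x 1 point-wise mixing step."""
    depthwise = kernel * kernel * in_channels * feature_map * feature_map
    pointwise = in_channels * out_channels * feature_map * feature_map
    return depthwise + pointwise


# Example layer (sizes chosen only for illustration).
kernel, in_channels, out_channels, feature_map = 3, 32, 64, 112
standard = standard_conv_cost(kernel, in_channels, out_channels, feature_map)
separable = depthwise_separable_cost(kernel, in_channels, out_channels, feature_map)
ratio = separable / standard
print(standard, separable, round(ratio, 4))
```

The ratio equals 1/out_channels + 1/kernel², so for 3 × 3 kernels the separable form needs roughly one eighth to one ninth of the operations of a standard convolution.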
In this work, we first fine-tuned this model and trained it on images of fruit leaf diseases. In the fine-tuning process, we removed the fully connected layer and added a new layer that only included the selected classes of fruit diseases. The modified model was then trained using transfer learning to learn a new target model. The deep features were extracted from the convolutional layer and later utilised for classification purposes. The architecture of MobileNet-V2 is illustrated in Fig. 4.

Feature Fusion
Feature fusion is the process of combining multiple features into one matrix to capture more information about an object [31,32]. Many feature fusion techniques have been introduced in the literature, such as serial-based and parallel approaches. Richer feature information is needed to achieve a more accurate classification. In this work, we used a serial-based approach to fuse the deep features of both models. The dimensions of the features of the two models were N × 2048 and N × 1056, respectively. Both vectors were fused according to Eq. (1). This process is visually illustrated in Fig. 5, from which it can be seen that the features were extracted from two dense layers called prediction and logits and then fused using this equation to enrich the feature information. However, since it was found during the experimental process that this step increased the redundancy among features, it was essential to remove this redundancy using a feature selection approach.
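The fusion equation is not reproduced in this copy of the text. Assuming that the serial approach amounts to a plain concatenation along the feature axis (an assumption, not something the text confirms), the step could be sketched as below, giving an N × 3104 matrix from the two stated dimensions. The function and variable names are hypothetical.

```python
# Serial (concatenation-based) fusion of two per-image feature matrices.
# ASSUMPTION: serial fusion is modelled here as simple concatenation.

def serial_fusion(features_a, features_b):
    """Concatenate two feature matrices row by row along the feature axis."""
    if len(features_a) != len(features_b):
        raise ValueError("both matrices must describe the same N images")
    return [list(row_a) + list(row_b)
            for row_a, row_b in zip(features_a, features_b)]


num_images = 3  # N, kept tiny for the example
first_model_features = [[0.1] * 2048 for _ in range(num_images)]   # N x 2048
second_model_features = [[0.2] * 1056 for _ in range(num_images)]  # N x 1056
fused = serial_fusion(first_model_features, second_model_features)
print(len(fused), len(fused[0]))
```

Because concatenation keeps every column of both inputs, any redundancy between the two models is carried into the fused matrix, which is why a selection step follows.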

Best Feature Selection
The selection of the best features is an active research area in pattern recognition [33,34]. Features from different sources, including some redundant information, are fused in one matrix. Many feature selection techniques introduced in the literature show an improved performance. Genetic algorithms, particle swarm optimisation (PSO) and entropy-based selection are just a few of the well-known feature selection techniques [35,36]. In this work, we implemented a hybrid ACO-NCA feature selection approach, which first selects features using ACO and then passes the output to NCA as an input. The final output was classified using supervised learning algorithms.

Feature Selection Using Ant Colony Optimization (ACO)
Swarm intelligence draws on observations of the social behaviour of animals, birds and insects to obtain ideas for solving everyday problems. Ant colony optimisation (ACO) is an example of swarm intelligence that is inspired by the foraging (food-searching) behaviour of ant species. Ants deposit pheromone on the ground while searching for food, marking a specific path for other members of the colony to follow. Since most ants are blind, pheromone is the only way they can communicate, which is an example of stigmergy. Pheromone is a chemical substance produced by animals that changes the behaviour of other animals of the same species [37]. Eq. (2) shows the pheromone level in a graph.
$$\tau_{i,j}^{k} = \begin{cases} \dfrac{1}{L_k} & \text{if the } k\text{th ant travels on edge } (i,j) \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
where $\tau$ is the quantity of pheromone that an ant deposits, $i$ and $j$ denote the edge connecting node $i$ and node $j$ on the graph, $k$ indexes the $k$th ant, and $\tau_{i,j}^{k}$ is the amount of pheromone deposited by the $k$th ant on the edge connecting node $i$ to node $j$. $L_k$ is the length of the path found by the $k$th ant. Because we seek the shortest path, the deposit is defined as the reciprocal $1/L_k$: the shorter the path, the more pheromone is deposited by the $k$th ant. This equation covers the scenario of a single ant. With multiple ants, a summation is needed to calculate the amount of pheromone on each edge. Evaporation is not considered at this stage because pheromone is only being added over time.
$$\tau_{i,j} = \sum_{k=1}^{m} \tau_{i,j}^{k} \tag{3}$$
where $m$ denotes the total number of ants. The final update accounts for both the current pheromone and the new pheromone deposited by all ants. Here, evaporation occurs because the update is performed once all ants have completed their paths on the edges of the graph.
$$\tau_{i,j} = (1-\rho)\,\tau_{i,j} + \sum_{k=1}^{m} \tau_{i,j}^{k} \tag{4}$$
where $(1-\rho)\,\tau_{i,j}$ is the retained portion of the current pheromone level and $\rho$ is a constant that defines the evaporation rate. In this work, we initially fed the fused features from both deep models into the ACO as the first phase of feature selection. A total of 2000 features were passed to the ACO algorithm, which retained the 200 most relevant features and discarded the others, as shown in Fig. 6. When used for classification, these selected features reduce the computational time while maintaining a good accuracy rate. However, we found that there was still room to improve the accuracy and computational time; hence, we applied an NCA reduction approach to the features selected by the ACO.
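The pheromone rules described above (a deposit of 1/L_k per ant, a sum over all ants, and evaporation at rate ρ) can be sketched as a single update step. The graph, the ant paths and the function names below are hypothetical, and the sketch does not cover how edges are mapped to features in this work.

```python
# One pheromone update on a small graph.
# pheromone: dict mapping an edge (i, j) to its current pheromone level.
# ant_paths: list of (edges_visited, path_length) tuples, one per ant.

def deposit(path_length):
    """Pheromone laid by one ant on each edge of its path: shorter paths deposit more."""
    return 1.0 / path_length


def update_pheromone(pheromone, ant_paths, rho):
    """Evaporate a fraction rho of every edge, then add the deposits of all ants."""
    updated = {edge: (1.0 - rho) * level for edge, level in pheromone.items()}
    for edges, path_length in ant_paths:
        for edge in edges:
            updated[edge] = updated.get(edge, 0.0) + deposit(path_length)
    return updated


# Illustrative values: three edges with equal initial pheromone and two ants.
pheromone = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
ant_paths = [
    ([("a", "b"), ("b", "c")], 4.0),  # ant 1: longer route via node b
    ([("a", "c")], 2.0),              # ant 2: shorter direct route
]
new_levels = update_pheromone(pheromone, ant_paths, rho=0.5)
print(new_levels)
```

After the update, the edge on the shorter route holds more pheromone than the edges on the longer route, which is what biases later ants towards it.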

Feature Selection Using Neighborhood Component Analysis (NCA)
The concept of distance metric learning (DML) is utilised in many ML algorithms because it makes it easier to capture the patterns of the input dataset. NCA is based on the Mahalanobis distance (MD) used to find the k-nearest neighbours (KNN); it learns a linear transformation of a given dataset that maximises the accuracy of nearest-neighbour classification in the transformed space [38]. The Euclidean distance (ED) is an alternative distance metric. It measures the straight-line distance between two points and gives every feature the same weight, treating the features as independent.
The NCA is embedded with a gradient descent-based optimisation function, which starts by randomly selecting an arbitrary point $a_i$ based on a predefined distance measure. It then receives a vote from each of its connected neighbours, which are selected from its surroundings with associated probabilities. The probability $pr_{ij}$ that a surrounding point $a_j$ is chosen as the reference point for $a_i$ depends on the closeness of the two sample points, which is calculated using the preferential distance $d$.
where $w_n$ is the weight allocated to feature $n$. The relationship between the probability $pr_{ij}$ and the computed distance $d$ is established by introducing a function $c$ that acts as a kernel mapping the distance to the probability.
where $i \neq j$; otherwise, $pr_{ij} = 0$. The kernel $c$ is defined in terms of a width parameter $\partial$, which controls the breadth of the kernel during the neighbour selection and exploration phase. The probability $pr_i$ that the point is correctly labelled and predicted by the NCA is given in Eq. (8) in terms of $pr_{ij}$. Finally, the objective function to be maximised, together with the error rate of the assessed prediction, is formulated in terms of $g(f)$, the maximisation and error computation function, and $pr_i$, the final probability. The NCA was applied to the 200 features selected by the ACO to further reduce their dimensionality. This approach refines the features of a model, removes unnecessary depth parameters, and enhances the model's time performance. The workflow of the NCA is illustrated in Fig. 7. This approach removed around 10% of the features, and the remaining features were passed to supervised learning classifiers for the final recognition. The linear discriminant (LDA) classifier performed best in this work.
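The equations of this subsection are not reproduced in this copy of the text. As a hedged illustration, the sketch below follows the commonly used NCA feature-selection formulation, which is assumed rather than confirmed to be the one used here: a weighted distance with squared weights, an exponential kernel, the neighbour probabilities pr_ij, and the probability pr_i of correct labelling. The name `sigma` stands in for the kernel width, the regularisation term and the gradient-based weight learning are omitted, and all function names are hypothetical.

```python
import math

# ASSUMED formulation (standard NCA feature selection, not taken from the text):
#   d(a_i, a_j) = sum_n w_n^2 * |a_in - a_jn|
#   c(z)        = exp(-z / sigma)
#   pr_ij       = c(d(a_i, a_j)) / sum_{k != i} c(d(a_i, a_k)),  pr_ii = 0
#   pr_i        = sum of pr_ij over neighbours j with the same label as i


def weighted_distance(x, y, weights):
    """Feature-weighted L1 distance between two samples."""
    return sum((w ** 2) * abs(a - b) for w, a, b in zip(weights, x, y))


def neighbour_probabilities(samples, i, weights, sigma=1.0):
    """Probability that sample i picks each other sample as its reference point."""
    kernel = [0.0 if j == i
              else math.exp(-weighted_distance(samples[i], s, weights) / sigma)
              for j, s in enumerate(samples)]
    total = sum(kernel)
    return [value / total for value in kernel]


def correct_label_probability(samples, labels, i, weights, sigma=1.0):
    """Probability that sample i is assigned its true label."""
    probs = neighbour_probabilities(samples, i, weights, sigma)
    return sum(p for p, label in zip(probs, labels) if label == labels[i])


def objective(samples, labels, weights, sigma=1.0):
    """Sum of correct-label probabilities; larger means better feature weights."""
    return sum(correct_label_probability(samples, labels, i, weights, sigma)
               for i in range(len(samples)))


# Toy data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.0, 5.0], [0.1, -3.0], [3.0, 4.9], [3.1, -3.1]]
labels = [0, 0, 1, 1]
informative_score = objective(samples, labels, weights=[1.0, 0.0])
noisy_score = objective(samples, labels, weights=[0.0, 1.0])
print(informative_score, noisy_score)
```

Weights that emphasise the informative feature give a higher objective value, which is the signal a gradient-based optimiser would follow when deciding which features to keep.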

Results and Analysis
The results of the proposed framework are presented in this section. The dataset details are discussed in Section 2. Both the augmented and original datasets were utilised in the experimental process, and a total of 15 classes of six different fruits were considered in the recognition process. The selected fruits were apples, cherries, grapes, oranges, peaches and strawberries, and the images covered both healthy and diseased leaves. 70% of the images were used to train the modified deep learning models, while the remaining 30% were used for testing. The deep models were trained with a learning rate of 0.0001 for 100 epochs with 30 iterations per epoch. Multiple classifiers were utilised to obtain the classification results. Several performance measures were also utilised: accuracy, sensitivity, precision, false-negative rate (FNR), false-positive rate (FPR), and area under the curve (AUC). All the experiments were performed in MATLAB 2020a on an HP laptop with a Core i7-7500U processor, 8 GB of RAM and a 228 GB SSD.
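Most of the listed measures can be read directly from a confusion matrix. The sketch below shows one way to compute them; the three-class matrix is made-up illustrative data, not a result of this work, the function names are hypothetical, and AUC is left out because it requires classifier scores rather than counts.

```python
# confusion[t][p] = number of samples of true class t predicted as class p.
# Assumes every class has at least one true sample and one prediction.

def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[c][c] for c in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_rates(confusion, cls):
    """One-vs-rest sensitivity, precision, FNR and FPR for a single class."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp                 # true cls, predicted as another class
    fp = sum(row[cls] for row in confusion) - tp  # other classes predicted as cls
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "fnr": fn / (tp + fn),
        "fpr": fp / (fp + tn),
    }


# Made-up example with three classes of 50 samples each.
confusion = [[48, 1, 1],
             [2, 47, 1],
             [0, 3, 47]]
overall = accuracy(confusion)
rates = per_class_rates(confusion, 0)
print(overall, rates)
```

Sensitivity and FNR always sum to one for a class, which is why a confusion matrix is sufficient to verify the reported sensitivity rates.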

Experiment 1
The first experiment began by passing the augmented images to the modified pre-trained NasNet-Mobile network. The extracted deep features were passed to the classifiers, and the results in Tab. 1 show that the linear discriminant classifier performed best, with an accuracy of 97.2%. The sensitivity rate (Sen) was 97.2% and the precision rate (Pre) was 97.24%. The sensitivity rate can be verified from the confusion matrix in Fig. 8, which shows that the correct prediction accuracy of each class was >95%. The other classifiers, namely linear SVM, MG-SVM, CG-SVM, Fine KNN, and Weighted KNN, achieved accuracies of 94.5%, 93.5%, 89.9%, 87.7%, and 88.3%, respectively.

Experiment 2
The second experiment began by passing the augmented images to the modified pre-trained MobileNet-V2 deep CNN model. The results of passing the extracted deep features to the classifiers are presented in Tab. 2, which shows that the linear discriminant classifier performed best, with an accuracy of 98.5%. The sensitivity rate (Sen) was 98.48% and the precision rate (Pre) was 98.5%. The sensitivity rate can be verified from the confusion matrix in Fig. 9, which shows that the correct prediction accuracy of each class was >96%. The other classifiers, namely linear SVM, MG-SVM, CG-SVM, Fine KNN, and Weighted KNN, achieved accuracies of 98.2%, 97.9%, 96.8%, 96.3%, and 96.1%, respectively. Hence, the modified MobileNet-V2 improved the accuracy compared to Experiment 1.

Experiment 3
The features of both deep models were fused in this experiment and the ACO was applied to select the best features. Then, the features were further reduced using the NCA approach and the results are presented in Tab. 3. This table shows that the linear discriminant classifier was the most accurate, with an accuracy of 99.7%. The sensitivity rate of this classifier was 99.7% and the precision rate was 99.71%. The other classifiers, namely linear SVM, MG-SVM, CG-SVM, Fine KNN, and Weighted KNN, achieved accuracies of 99.1%, 98.7%, 98.3%, 95.8%, and 96.4%, respectively. The confusion matrix of the linear discriminant classifier using the proposed framework is illustrated in Fig. 10, which can be used to verify the sensitivity rate of the proposed framework. This figure shows that the prediction accuracy of each class was above 99%. Therefore, the proposed framework improved the accuracy compared to Experiments 1 and 2. The results of the fusion process are shown in Tab. 4, from which it can be seen that the fusion of the features of both deep models achieved an accuracy of 98.4%. The ACO-based features were then selected and achieved an accuracy of 98.1%, which is nearly the same as that of the fusion step (see Tab. 5). The steps involved in the proposed framework, namely, the fusion of deep CNN features and the ACO-based selection, were subjected to a time-based comparison and the results are shown in Fig. 11. It can be seen from this figure that the ACO-based selection process was less time-consuming. Similarly, the computational performance of the fusion process, the selection using the ACO and the proposed framework is shown in Fig. 12, from which it can be seen that the proposed framework was faster than the other methods. Finally, we compared the proposed framework with some recent studies and found that it improved the recognition accuracy, as shown in Tab. 6.

Conclusion
A framework based on automated deep learning and best feature selection has been presented in this work for the recognition of fruit leaf diseases. Two pre-trained deep learning models were fine-tuned by modifying their classification layers. The features were extracted from both modified models and fused using a serial-based approach. An ACO was applied to select the best features in the first phase, and the selection was then refined with an NCA reduction approach. The final selected features were classified using multiple classifiers, of which the linear discriminant gave the best accuracy. The proposed method performed well compared to existing techniques. It can be concluded from the results that the augmentation process improved the recognition accuracy compared to the original dataset. The recognition accuracy was further improved by the fusion of deep learning features and improved again by the proposed framework. The main strengths of the proposed framework are that it increases the recognition accuracy and consumes less computational time.
Funding Statement: This research work was partially supported by Chiang Mai University.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.