Plant Identification Using Fitness-Based Position Update in Whale Optimization Algorithm

Abstract: Since the beginning of time, humans have relied on plants for food, energy, and medicine. Plants are recognized by their leaves, flowers, or fruits and assigned to a suitable cluster. Classification methods extract and select the traits that help identify a plant. In plant leaf image categorization, each plant is assigned a label according to its class. Classifying plant leaf images enables farmers to recognize plants, which supports several aspects of plant management. This study presents a modified whale optimization algorithm and uses it to categorize plant leaf images into classes. The modified algorithm works on different sets of plant leaves. It is first examined on several benchmark functions, on which it shows adequate performance. The classification method was then validated on leaf images of ten plants. Precision, recall, F-measure, and accuracy are calculated for the ten plant leaf image datasets and compared with those of other existing algorithms. The experimental data show that the proposed method outperforms the other algorithms under consideration, improving accuracy by 5%.

The automated identification of plants is based on the analysis of leaf images, so leaves are a critical source of information about the plant. However, the task faces many hurdles, such as similarity between the leaves of different plants, background variation, and variation in leaf colour. Moreover, natural images require an efficient segmentation approach before further processing. Thus, developing machine learning and deep learning approaches that identify plants with higher accuracy is highly desirable. This study presents a new method for plant identification with improved WOA-based feature selection, leading to efficient classification. The significant research contributions of this paper are as follows:
1. A fitness-based WOA (FWOA) is proposed, and its performance is evaluated over a set of benchmark functions.
2. Feature extraction is performed using SIFT, and an SVM classifier is used for classification.
Following is a breakdown of the rest of the paper. Section 2 discusses some recent developments in plant identification and WOA. Feature extraction, feature selection, and classification are discussed in Section 3. Section 4 discusses the experimental results of FWOA and its application to plant leaf identification. Section 5 concludes the paper.

Preliminaries
Computational intelligence-based techniques can solve complex optimization problems with fewer resources. These techniques are grouped into classes based on their source of inspiration, such as swarm-based, evolutionary, and bio-inspired. These algorithms start with a randomly generated population. Subsequently, each individual updates its position and shares information with the other individuals, and the best individuals are selected for the next iteration. This section discusses recent developments in plant identification and the basics of WOA [15].

Plant Identification
Machine learning and deep learning are becoming more popular for the identification of plants. Some recent research contributions that use these techniques are discussed here. Pankaja and Suma deployed WOA to reduce dimensions and classified using Random Forest (RF) [15]. The authors extracted texture, shape, and color features from the leaf image dataset, and the WOA-based approach selected a set of optimal features. They used the Flavia and Swedish leaf datasets for this experiment. The reported results show that the WOA-based strategy outperformed the other considered algorithms for feature extraction, feature selection, and plant identification. Sun et al. deployed a deep residual network with 26 layers on the BJFU100 dataset collected from their university campus [16]. The new approach was first validated on the Flavia leaf dataset with a 99.65% recognition rate. The main feature of this work is that Sun et al. acquired this dataset on a mobile device [16]. Ghazi et al. employed transfer learning with a deep neural network [12]. Here, the authors fine-tuned pre-trained models and used AlexNet, VGGNet, and GoogLeNet. The new model gives significantly improved results. Zhu et al. deployed a deep CNN with five max-pooling layers, five soft-max layers, three fully connected layers, and sixteen convolutional layers [11]. This study concluded that the use of ReLUs along with these layers improved overall performance. Finally, Rzanny et al. studied various image acquisition and preprocessing techniques to identify plants with varying backgrounds [17]. Kho [13]. Some of the combinations achieved very high accuracy. A summary of some of the recent developments in plant identification is given in Tab. 1.

Year | Authors | Method | Remarks
 | Chen et al. [18] | Deep transfer learning | Performed disease identification in rice and maize leaves using pre-trained models with higher accuracy
2020 | Figueroa and Montero [14] | Convolutional Siamese Network | The proposed approach is suitable for a small dataset
2021 | Reddy et al. [19] | Convolutional Neural Network | CNN-based approach analyzed on five plant leaf datasets

Whale Optimization Algorithm
The WOA is a nature-inspired algorithm developed by Mirjalili and Lewis in 2016 [4]. It has been used to solve many complex real-world optimization problems. WOA is inspired by the bubble-net hunting approach used by humpback whales during foraging, which is known as the bubble-net feeding strategy [20]. The method mimics this hunting style by using the fittest search agent to chase the prey, and a spiral is used to model the bubble-net attacking mechanism. The mathematical model of this optimization algorithm consists of three main steps. The first step is encircling prey, the second step is the bubble-net attacking method (exploitation phase), and the last step is the search for prey (exploration phase) [4]. Each phase is described in the subsequent sections.

Encircling Prey
Humpback whales locate the target and encircle it. Initially, the optimal design is unknown; hence, the WOA method assumes that the current best candidate solution is the target prey or is close to the optimum. Once the best search agent is defined, the other search agents update their positions towards it:

$$\vec{D} = |\vec{C} \cdot \vec{X}^*(t) - \vec{X}(t)| \tag{1}$$
$$\vec{X}(t+1) = \vec{X}^*(t) - \vec{A} \cdot \vec{D} \tag{2}$$

where $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}$ and $\vec{X}^*$ denote the position vector and the best solution obtained so far, respectively, and $t$ denotes the iteration counter. $\vec{X}^*$ is updated in every iteration in which an improved solution is found. The vectors $\vec{A}$ and $\vec{C}$ are calculated as given below:

$$\vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a} \tag{3}$$
$$\vec{C} = 2 \cdot \vec{r} \tag{4}$$

where $\vec{a}$ is linearly decreased from 2 to 0 throughout the iterations and $\vec{r}$ is a random vector whose values lie between 0 and 1.
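The encircling step can be illustrated with a short sketch. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `encircle_prey`, the use of separate random draws for A and C, and the example vectors are all hypothetical.

```python
import numpy as np

def encircle_prey(x, x_best, a, rng):
    """Move one whale position x towards the best position x_best."""
    r1 = rng.random(x.shape)        # random vector in [0, 1) used for A
    r2 = rng.random(x.shape)        # random vector in [0, 1) used for C
    A = 2.0 * a * r1 - a            # coefficient vector A = 2a*r - a
    C = 2.0 * r2                    # coefficient vector C = 2*r
    D = np.abs(C * x_best - x)      # distance term |C * X_best - X|
    return x_best - A * D           # updated position X_best - A * D

rng = np.random.default_rng(0)
x = np.array([4.0, -3.0])           # hypothetical current position
x_best = np.array([0.5, 0.5])       # hypothetical best position so far
early = encircle_prey(x, x_best, a=2.0, rng=rng)  # large a: wide moves
late = encircle_prey(x, x_best, a=0.0, rng=rng)   # a = 0: lands on x_best
```

When a reaches 0, A vanishes and the whale lands exactly on the best position, which is why decreasing a shrinks the encircling region.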

Bubble-Net Attacking Method
The exploitation phase in WOA is simulated by the bubble-net behaviour of humpback whales with two steps.
1. Shrinking encircling mechanism: This behaviour is achieved by decreasing the value of a in Eq. (3) from 2 to 0. As the value of a decreases, the variation range of A also decreases. Setting random values for A in [−1, 1] places the new location of the individual anywhere between its original location and the location of the current best agent.
2. Spiral updating position: This step first computes the distance between the whale situated at (X, Y) and the prey found at (X*, Y*). A spiral equation is then formulated between the positions of whale and prey to imitate the humpback whale's spiral-shaped movement, as shown in Eq. (5):

$$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^*(t) \tag{5}$$

where $\vec{D}' = |\vec{X}^*(t) - \vec{X}(t)|$ is the distance of the i-th whale from the prey (the best solution obtained so far), l is a random number in [−1, 1], and b is a constant that defines the shape of the spiral.

Humpback whales swim around the prey within a shrinking circle and along a spiral-shaped path at the same time. To model this simultaneous behaviour, a 50% probability is assumed of selecting either the shrinking circle or the spiral model to update the whale's location. The mathematical model is shown in Eq. (6):

$$\vec{X}(t+1) = \begin{cases} \vec{X}^*(t) - \vec{A} \cdot \vec{D}, & p < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^*(t), & p \geq 0.5 \end{cases} \tag{6}$$

where p is a random number within the range [0, 1].
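The choice between the two bubble-net moves can be sketched as follows. This is an illustrative sketch only: the function name `bubble_net_step` is hypothetical, and the spiral constant b = 1 is an assumed value that the text does not specify.

```python
import numpy as np

def bubble_net_step(x, x_best, a, rng, b=1.0):
    """Update one whale with either the shrinking circle or the spiral."""
    p = rng.random()                               # selects the mechanism
    if p < 0.5:
        # Shrinking encircling mechanism
        A = 2.0 * a * rng.random(x.shape) - a
        C = 2.0 * rng.random(x.shape)
        D = np.abs(C * x_best - x)
        return x_best - A * D
    # Spiral updating position
    l = rng.uniform(-1.0, 1.0)                     # random number in [-1, 1]
    D_prime = np.abs(x_best - x)                   # distance whale to prey
    return D_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + x_best

rng = np.random.default_rng(1)
x_best = np.array([1.0, 2.0, 3.0])                 # hypothetical prey position
moved = bubble_net_step(np.array([5.0, -1.0, 0.0]), x_best, a=1.0, rng=rng)
```

Both branches are centred on the best position, so this phase only refines the neighbourhood of the current best solution.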

Search for Prey
Humpback whales also search for prey randomly according to each other's locations. The vector A, calculated as in the first phase, is used to search for prey in the exploration phase. In this step, the location of a search agent is updated with respect to a randomly selected search agent instead of the best one. This emphasizes exploration and allows the algorithm to perform a global search. The mathematical model is shown in Eqs. (7) and (8):

$$\vec{D} = |\vec{C} \cdot \vec{X}_{rand} - \vec{X}| \tag{7}$$
$$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A} \cdot \vec{D} \tag{8}$$

where $\vec{X}_{rand}$ is an arbitrarily chosen whale from the current population. The detailed pseudo-code for WOA is given in Algorithm 1. Initially, the WOA starts with a set of arbitrary solutions. Then, in each iteration, individuals update their positions with respect to either the best solution found so far or an arbitrarily picked individual. The vector A decides which one is used: if $|\vec{A}| > 1$, a random search agent is selected, and if $|\vec{A}| < 1$, the best solution is selected. WOA includes both an exploration and an exploitation phase; hence, it is considered a global optimizer. Moreover, its position update defines a search space in the neighbourhood of the best solution. WOA has two main vector parameters, namely $\vec{A}$ and $\vec{C}$. However, modifications and additional evolutionary procedures may be included in the WOA formulation to mimic the behaviour of humpback whales.

The performance of WOA depends on balancing these two opposing processes. In the WOA algorithm, bubble-net attacking is responsible for exploitation, and the search for prey phase performs exploration. Both are essential phases and affect the convergence behavior of WOA. In the exploration phase, a whale updates its position with reference to another whale, and this reference whale is selected using a random function [4]. To improve the performance of WOA, a new version of WOA is proposed here and named fitness-based status update WOA (FWOA). The new variant exploits the neighbourhood of highly fit solutions and explores the search space when a solution has low fitness. The introduced concept works on the principle that solutions in the proximity of highly fit solutions are also highly fit, so such solutions try to exploit the best solution. A solution with low fitness updates its position according to the search for prey phase. Detailed pseudo-code for the new strategy is given in Algorithm 2.
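For orientation, the three phases of the baseline WOA (encircling, bubble-net attack, and search for prey) can be combined into one loop. The sketch below is not a reproduction of Algorithm 1 or Algorithm 2; it is a compact illustration under several assumptions: A is drawn as a scalar per whale so that |A| is a simple absolute value, b = 1, positions are clipped to the bounds, and the sphere function stands in for a benchmark problem. The names `woa` and `sphere` are hypothetical.

```python
import numpy as np

def sphere(x):
    """Stand-in objective: sum of squares, minimum 0 at the origin."""
    return float(np.sum(x ** 2))

def woa(objective, dim, low, high, n_whales=20, n_iter=100, b=1.0, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(low, high, size=(n_whales, dim))   # random initial whales
    values = np.array([objective(x) for x in X])
    best = X[values.argmin()].copy()
    best_val = float(values.min())
    history = [best_val]                               # best value per iteration
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)                   # decreases from 2 towards 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a             # scalar A (assumption)
            C = 2.0 * rng.random(dim)
            if rng.random() < 0.5:
                # |A| < 1: move relative to the best; otherwise a random whale
                ref = best if abs(A) < 1.0 else X[rng.integers(n_whales)]
                X[i] = ref - A * np.abs(C * ref - X[i])
            else:
                l = rng.uniform(-1.0, 1.0)             # spiral update
                X[i] = (np.abs(best - X[i]) * np.exp(b * l)
                        * np.cos(2.0 * np.pi * l) + best)
            X[i] = np.clip(X[i], low, high)            # keep inside the bounds
            val = objective(X[i])
            if val < best_val:                         # keep the best ever found
                best, best_val = X[i].copy(), val
        history.append(best_val)
    return best, best_val, history

best, best_val, history = woa(sphere, dim=5, low=-10.0, high=10.0)
```

Because the best solution is only replaced when a better one is found, the recorded history never increases, which makes convergence curves straightforward to plot.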
Additionally, a fitness-based method is used to compute the values of $\vec{f}_1$ and $\vec{f}_2$ instead of a random function, which improves the performance of the current method. In addition, the accuracy of the proposed model is increased by using the fitness function. The values of $\vec{f}_1$ and $\vec{f}_2$ for the encircling and hunting stage are calculated according to Eqs. (9) and (10).
In the vectors A and C, the new variables $\vec{f}_1$ and $\vec{f}_2$ take the place of the random variables $r_1$ and $r_2$. Hence, the vectors A and C are calculated as $\vec{A} = 2\vec{a} \cdot \vec{f}_1 - \vec{a}$ and $\vec{C} = 2 \cdot \vec{f}_2$, where a is decreased from 2 to 0 throughout the iterations, and $\vec{f}_1$ and $\vec{f}_2$ are calculated using the fitness-based position update.
The new approach takes advantage of highly fit solutions. It assumes that the neighbourhood of a highly fit solution is likely to contain a good solution for the considered problem. As a result, the swarm always moves in the direction of solutions with good fitness in a self-organizing manner, which improves the convergence speed and avoids skipping true solutions.
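The fitness-based idea can be illustrated with a deliberately simplified sketch. Algorithm 2 and Eqs. (9) and (10) are not reproduced in this text, so the code below is only one possible reading and not the authors' formulation: it assumes a minimization problem, uses the median fitness as a hypothetical threshold between "high" and "low" fitness, and keeps the standard random coefficients. The name `fitness_based_update` is hypothetical.

```python
import numpy as np

def fitness_based_update(X, fitness, a, rng):
    """One illustrative fitness-driven update (lower fitness value is better)."""
    n, dim = X.shape
    best = X[np.argmin(fitness)].copy()
    threshold = np.median(fitness)          # assumed split, not from the paper
    X_new = np.empty_like(X)
    for i in range(n):
        A = 2.0 * a * rng.random(dim) - a
        C = 2.0 * rng.random(dim)
        if fitness[i] <= threshold:
            ref = best                      # highly fit: exploit near the best
        else:
            ref = X[rng.integers(n)]        # low fitness: search-for-prey move
        X_new[i] = ref - A * np.abs(C * ref - X[i])
    return X_new

rng = np.random.default_rng(2)
X = rng.uniform(-5.0, 5.0, size=(6, 3))     # hypothetical population
fitness = np.sum(X ** 2, axis=1)            # sphere values as stand-in fitness
X_next = fitness_based_update(X, fitness, a=1.5, rng=rng)
```

The design point is that the choice between exploitation and exploration is driven by each solution's own fitness rather than by a random draw.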
The performance of the newly proposed FWOA is evaluated over a set of thirteen benchmark problems [4]. The selected problems are uni-modal and multi-modal optimization problems with known solutions and search spaces. The performance of FWOA and the other competitive algorithms is compared in terms of the average function value (Avg), standard deviation (SD), and optimal function value. To measure these parameters, all the algorithms are implemented in MATLAB R2020b on an Intel Core i7 machine with 16 GB RAM and an 8 GB GeForce GTX1650Ti graphics processor. Tabs. 2 and 3 report the results for FWOA, WOA [4], SSA [16], and SCA [8]. Tab. 2 illustrates the efficiency and robustness of FWOA in comparison to the other algorithms. Graphical representations of the results for functions F1, F8, and F12 are depicted in Fig. 1. These results show that FWOA outperformed the considered algorithms in terms of best function value, as shown in Tab. 3.
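The three reported statistics (Avg, SD, and best function value) are computed over repeated independent runs. The sketch below shows only that bookkeeping. The random-search optimizer is a stand-in rather than FWOA, and the run count of 30, the dimension, and the bounds are assumptions chosen for illustration.

```python
import numpy as np

def sphere(x):
    """Sum of squares, used here as an example uni-modal benchmark."""
    return float(np.sum(x ** 2))

def random_search(objective, dim, low, high, n_eval, rng):
    """Stand-in optimizer: best value among uniformly sampled points."""
    samples = rng.uniform(low, high, size=(n_eval, dim))
    return min(objective(s) for s in samples)

def summarize_runs(run_once, n_runs=30, seed=0):
    """Collect the final value of each independent run and summarize."""
    finals = np.array([run_once(np.random.default_rng(seed + k))
                       for k in range(n_runs)])
    return {"avg": float(finals.mean()),
            "sd": float(finals.std(ddof=1)),   # sample standard deviation
            "best": float(finals.min())}

stats = summarize_runs(
    lambda rng: random_search(sphere, dim=5, low=-100.0, high=100.0,
                              n_eval=500, rng=rng))
```

A low SD across runs indicates robustness, while the best value indicates how close the algorithm can get to the known optimum.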

FWOA-Based Plant Identification System
The proposed model uses a fitness-based WOA to classify plants based on a leaf image dataset. The model has three significant steps: feature extraction using the SIFT algorithm, feature selection and histogram generation using the modified WOA, and classification of plants from their leaf images using the SVM classifier, as shown in Fig. 2. A detailed description is given in later sections. The detailed process of the plant leaf classification model is shown in Fig. 3. To validate the proposed model, images of apple, banana, borages, maize, grapes, mint, orange, pepper, potato, and tomato leaves were used in this research as a sample dataset. Some sample images from each category are depicted in Fig. 4.

Feature Extraction
In image processing and computer vision, a feature is a piece of information about the content of a picture [21]. Objects, edges, and points, for example, have distinctive qualities and structure. Feature extraction is the process of identifying the essential features of an image, finding common themes in a broad collection of images, and recognizing patterns [22]. The first step of the proposed model is to extract all image features and group them into corresponding groups. This extraction is one of the leading steps in analyzing images by their features. Similar and different image features have to be extracted and stored in their respective clusters for practical analysis. The SIFT algorithm is used to extract the features in the proposed method. SIFT is a feature detection method that detects and describes local features of plant leaf images. These local features are key points in the image that aid in identifying the object in the image [23]. The method is robust to rotation and changes of scale and can handle noisy points. Therefore, it is a practical algorithm for feature extraction.

Feature Selection
Feature selection significantly affects the performance of the proposed classification model. Selecting an optimal subset of features from a vast set of extracted features is a combinatorial optimization problem. Thus, it is highly desirable to solve this problem with a non-conventional optimization algorithm. This step chooses the most relevant features, which aid in estimating the class of each leaf image. The features extracted in the previous step are used to select the optimal features and to create clusters from the selected features, which increases accuracy and decreases overfitting. This paper uses the modified WOA for clustering to select optimal features.
CMC, 2022, vol.71, no.3
Finally, a histogram is plotted from the features selected by the proposed model. This histogram shows the frequency distribution of the selected features. In addition, the histogram allows the selected features to be reviewed for outliers and skewness. The graphical representation is depicted in Fig. 5.
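One common way to turn clustered local descriptors into a per-image histogram is a bag-of-visual-words encoding. The sketch below assumes that reading, which the text does not spell out: random arrays stand in for 128-dimensional SIFT descriptors and for the cluster centres an optimizer would select, and the names `clustering_cost` and `feature_histogram` are hypothetical.

```python
import numpy as np

def squared_distances(descriptors, centres):
    """Squared Euclidean distance from every descriptor to every centre."""
    diff = descriptors[:, None, :] - centres[None, :, :]
    return np.sum(diff ** 2, axis=2)

def clustering_cost(descriptors, centres):
    """Candidate objective for the optimizer: total within-cluster distance."""
    return float(squared_distances(descriptors, centres).min(axis=1).sum())

def feature_histogram(descriptors, centres):
    """Normalized count of descriptors assigned to each cluster centre."""
    nearest = squared_distances(descriptors, centres).argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(centres)).astype(float)
    return counts / counts.sum()

rng = np.random.default_rng(0)
descriptors = rng.random((300, 128))   # stand-in for one image's descriptors
centres = rng.random((8, 128))         # stand-in for optimized cluster centres
cost = clustering_cost(descriptors, centres)
hist = feature_histogram(descriptors, centres)
```

Under this reading, the optimizer searches for centres that minimize the cost, and each image is then represented by a fixed-length histogram regardless of how many key points it contains.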

Plant Leaf Classification
Classification of plants using a leaf image dataset is the final step of the proposed method. The histogram generated in the previous step from the selected features is passed to the SVM classifier along with the class labels [24]. SVM is a high-performance binary classifier that constructs a hyperplane in a high-dimensional feature space to separate leaf images into their respective classes [25]. In this step, the trained SVM classifier predicts the class label of each plant leaf image. The accuracy of the model is then confirmed by labelling, training, and testing on the plant leaf image dataset. Experimental results are discussed in the next section.
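The classification step can be sketched with scikit-learn's `SVC`, which extends the binary SVM to several classes with a one-vs-one scheme. Everything in the example is an assumption for illustration: the synthetic histograms, the three classes, the linear kernel, and the even/odd split are not taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, per_class, n_bins = 3, 40, 8

# Synthetic stand-in histograms: each class peaks in a different bin.
features = rng.random((n_classes * per_class, n_bins)) * 0.1
labels = np.repeat(np.arange(n_classes), per_class)
features[np.arange(len(labels)), labels] += 1.0
features /= features.sum(axis=1, keepdims=True)   # normalize each histogram

# Even rows for training, odd rows for testing (illustrative split only).
train_X, train_y = features[::2], labels[::2]
test_X, test_y = features[1::2], labels[1::2]

clf = SVC(kernel="linear", C=1.0)   # kernel and C are assumed values
clf.fit(train_X, train_y)
predicted = clf.predict(test_X)
accuracy = float(np.mean(predicted == test_y))
```

With real leaf histograms the kernel and the regularization constant C would normally be tuned on a validation split.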

Experimental Results for Plant Identification
Three steps are used to analyze the proposed FWOA-based plant classification on a leaf image dataset. The first step describes the plant leaf dataset, the second step shows the performance on benchmark functions, and the third step analyses the results of FWOA-based plant leaf classification.

Dataset Description
The dataset consists of more than 10000 images; 200 images from each category are used for training and testing the model. The dataset is divided into ten classes named apple, banana, borages, corn, grapes, mint, orange, pepper, potato, and tomato. It is taken from Plant Village [26] and Kaggle [27]. The dataset is used to measure the performance of the proposed method in terms of the classification accuracy of each class. The images of each class are divided into a 70%-30% train-test split.
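A per-class 70%-30% split of 200 images per class yields 140 training and 60 test images for each of the ten classes. The sketch below shows one way to produce such a split; the function name `stratified_split`, the fixed seed, and the use of integer labels in place of image files are assumptions.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return train and test indices with the same fraction for every class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))  # shuffle class
        n_train = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

classes = ["apple", "banana", "borages", "corn", "grapes",
           "mint", "orange", "pepper", "potato", "tomato"]
labels = np.repeat(np.arange(len(classes)), 200)   # 200 images per class
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class keeps the class balance identical in the training and test sets.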

Experimental Results for Plant Leaf Image Classification
The proposed model was implemented in Python. In this section, the proposed approach is evaluated using experimental results on the input image dataset. Tab. 4 shows some of the parameters and the best fitness values. The values of these parameters were decided through exhaustive experiments. The proposed modified WOA has been compared with SCA, BAT, SSA, DE, and WOA. An equal number of images from each class was used for all of these algorithms. A confusion matrix is created for each class for performance analysis; these matrices are depicted in Fig. 6 and compare the actual data with the predicted data. The performance of all the considered algorithms for classification is illustrated in Fig. 6. This is important because the considered dataset has ten classes: with three or more categories, it is better to visualize the results with a confusion matrix, as accuracy alone can be misleading. The results are measured by calculating the F1 score, precision, recall, and accuracy, as shown in Fig. 7. Hence, it can be stated that the proposed modified WOA classification method is better than the existing algorithms.

Conclusion
This study presents a new plant classification method that uses a plant leaf image dataset. The new version of WOA uses a fitness-based status update method instead of random numbers. The method demonstrates its effectiveness by attaining the maximum accuracy value. The study primarily uses three steps: feature extraction using the SIFT method, feature selection using the modified WOA method, and classification using the SVM classifier. The proposed method achieves the maximum recall, precision, F1 score, and accuracy, reaching 80.16%. Analysis of the experimental results shows that the WOA with the fitness function increased the efficiency of the proposed algorithm. WOA is employed to handle the problem of feature selection and clustering in this study.
The proposed algorithm results are compared with well-known stochastic algorithms such as BAT, DE, WOA, SCA, and SSA.
Furthermore, when compared to other algorithms, the proposed method's results were effective, practical, and simple to implement. In the future, the proposed method can be applied to various plant classifications utilizing different plant leaf image datasets. Besides this, the WOA can be combined with another clustering approach to improve performance.
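The evaluation measures used in the experiments (precision, recall, F1 score, and accuracy) can all be derived from a multi-class confusion matrix. The sketch below shows that computation. The 3-class matrix is made-up example data, not a result from this study, and the convention that rows are actual classes and columns are predicted classes is an assumption.

```python
import numpy as np

def metrics_from_confusion(cm):
    """Per-class precision, recall, F1 and overall accuracy."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm).copy()               # correctly classified per class
    predicted = cm.sum(axis=0)            # column sums: predicted per class
    actual = cm.sum(axis=1)               # row sums: actual per class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp),
                          where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2.0 * precision * recall, denom, out=np.zeros_like(tp),
                   where=denom > 0)
    accuracy = float(tp.sum() / cm.sum())
    return precision, recall, f1, accuracy

# Made-up confusion matrix for three classes (rows actual, columns predicted).
cm = [[50, 10, 0],
      [5, 45, 10],
      [0, 5, 55]]
precision, recall, f1, accuracy = metrics_from_confusion(cm)
```

Reporting the per-class values alongside overall accuracy exposes classes that are frequently confused with each other, which a single accuracy figure would hide.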