Deep Learning-Based Skin Lesion Diagnosis Model Using Dermoscopic Images

In recent years, intelligent automation has become increasingly common in the healthcare sector owing to the integration of artificial intelligence (AI) techniques. Intelligent healthcare systems assist in making better decisions, which in turn enables improved medical services for patients. At the same time, malignant skin lesions constitute a deadly disease that affects people of all age groups. Skin lesion segmentation and classification play a vital part in early and precise skin cancer diagnosis by intelligent systems. However, the automated diagnosis of skin lesions in dermoscopic images is challenging because of artifacts (hair, gel bubbles, ruler markers), blurry boundaries, poor contrast, and the variable sizes and shapes of the lesions. To address these problems, this study develops an intelligent multilevel thresholding with deep learning (IMLT-DL) based skin lesion segmentation and classification model using dermoscopic images. Primarily, the presented IMLT-DL model incorporates top-hat filtering and an inpainting technique for pre-processing the dermoscopic images. In addition, Mayfly Optimization (MFO) with multilevel Kapur's thresholding-based segmentation is employed to determine the infected regions. Besides, an Inception v3 based feature extractor is applied to derive a valuable set of feature vectors. Finally, classification is carried out using a gradient boosting tree (GBT) model. The presented model is evaluated on the International Skin Imaging Collaboration (ISIC) dataset, and the experimental outcomes are inspected under different evaluation measures. The resultant experimental values confirm that the proposed IMLT-DL model outperforms the existing methods by achieving a higher accuracy of 0.992.


Introduction
Skin cancer is one of the most commonly occurring kinds of cancer across the globe [1]. Melanoma, squamous cell carcinoma, basal cell carcinoma, and intraepithelial carcinoma are different kinds of skin cancer [2]. The human skin comprises three layers, namely the hypodermis, epidermis, and dermis. The epidermis contains melanocytes, which can create melanin at a highly unusual rate under certain conditions. For example, long-term exposure to strong ultraviolet radiation can trigger excessive melanin production. The abnormal growth of melanocytes can cause a lethal kind of skin cancer [3]. The American Cancer Society reported in 2019 that around 96,480 new cases of melanoma were anticipated and that 7230 persons would die from the disease [4,5]. Early diagnosis of melanoma is essential for better treatment: when melanoma is identified in its earlier phases, the 5-year survival rate is 92% [6].
Nevertheless, the resemblance between malign and benign skin lesions is the central problem of melanoma detection. Consequently, detecting melanoma is complicated even for skilled professionals, and it is difficult to determine the lesion type with the human eye.
In recent years, distinct imaging techniques have been utilized to capture skin images. Dermoscopy is a noninvasive imaging method that enables a visual image of the skin surface using an immersion fluid and a light magnification device [7,8]. However, simple visualization for identifying melanoma in skin lesions might be subjective, irreproducible, or inaccurate because it depends on the specialist's knowledge. The diagnostic accuracy for melanoma from dermoscopic images by non-professionals lies in the range of 75%-84%. To resolve these problems in the melanoma diagnosis process, Computer-Aided Diagnosis (CAD) methods are required to assist the professionals with the analysis. The processes involved in a CAD model for melanoma identification comprise pre-processing, segmentation, feature extraction, and classification. To effectively identify a melanoma, lesion segmentation is a crucial phase in the CAD system, but it becomes difficult because of considerable variations in the texture, size, color, and position of skin lesions in dermoscopic images. Besides, additional elements like hair, dark frames, air bubbles, color illumination, ruler marks, and blood vessels pose further challenges to lesion segmentation. Several techniques have been presented for the segmentation of skin lesions. In recent times, the Convolutional Neural Network (CNN), one of the deep learning (DL) techniques, has attained effective outcomes in CAD models [9]. Some of the DL architectures are AlexNet, MobileNet, ResNet, etc. In this study, the Inception model is employed for the following reasons: it achieves high computational efficiency with fewer parameters, and it offers a high performance gain and effective utilization of computing resources with only a slight increase in computational load.
This study designs an Intelligent Multilevel Thresholding with Deep Learning (IMLT-DL) based skin lesion segmentation and classification model using dermoscopic images. Principally, the presented IMLT-DL model integrates the top-hat filtering and inpainting technique for pre-processing the dermoscopic images. Moreover, the Mayfly Optimization (MFO) with multilevel Kapur's thresholding-based segmentation process is involved in determining the infected regions. Also, an Inception v3 based feature extractor is applied to generate a meaningful collection of feature vectors from the segmented image. Lastly, a GBT model-based classification process is carried out to allocate proper class labels to the applied dermoscopic images. The proposed IMLT-DL model is simulated using the International Skin Imaging Collaboration (ISIC) dataset, and the experimental results are inspected under different evaluation measures. The paper is organized as follows: Section 2 reviews state-of-the-art skin lesion segmentation techniques, Section 3 explains the proposed IMLT-DL model, and Section 4 validates the simulation results. At last, the conclusion of the IMLT-DL model is drawn.

Literature Review
This section reviews some of the existing skin lesion segmentation and classification models. Jaisakthi et al. [10] presented a semi-supervised technique combining the GrabCut and K-means clustering methods for segmenting skin lesions. First, graph cuts are used to segment the melanoma, and then K-means clustering fine-tunes the boundary of the lesion. Pre-processing steps such as noise removal and image normalization are applied to the input image before it is fed to the pixel classification process. Agrawal et al. [11] used the scale-invariant feature transform method for feature extraction. Madaan et al. [12] implemented convolutional neural networks for medical image classification. Similarly, Aljanabi et al. [13] presented an artificial bee colony (ABC) technique for segmenting lesions. This swarm-based method includes pre-processing of the digital image; subsequently, it determines the optimal threshold value of the melanoma, by which the lesion is segmented using Otsu thresholding.
Pennisi et al. [14] presented a method that segments the image using the Delaunay triangulation method (DTM). This technique includes a parallel segmentation process that creates two different images, which are later combined to obtain the final lesion mask. Artifacts are removed from the images, and then one model filters the skin from the image to provide a binary mask of the lesions. The DTM method is automatic and does not need a trained model, which makes it quicker than other techniques. Bi et al. [15] presented a novel automatic technique that executes image segmentation by image-wise supervised learning (ISL) and multi-scale superpixel-based cellular automata (MSCA). The researchers utilized probability maps for automatic seed selection, which removes user-defined seed selection; subsequently, the MSCA method is applied to segment the skin lesions. Bi et al. [16] presented a Fully Convolutional Network (FCN) based technique to segment the skin lesion. The image features are learned by embedding multiple stages of the FCN, and an enhanced segmentation accuracy (compared to earlier works) is attained without applying any pre-processing steps (for example, contrast improvement, hair removal, and so on).
Yuan [17] presented a convolution-deconvolution neural network (CDNN) to automate skin lesion segmentation. This method concentrates on the training approach, making it highly effective without the utilization of several pre- and post-processing steps. It creates a probability map in which each component corresponds to the possibility of a pixel belonging to a melanoma. Berseth [18] proposed a U-Net framework to segment skin lesions based on a probability map of the image dimension, where a 10-fold cross-validation scheme is utilized for training. Paulraj [19] introduced a DL method to extract the lesion parts from the skin image.

The Proposed Intelligent Skin Lesion Diagnosis Model
The system architecture of the presented IMLT-DL model is illustrated in Fig. 1. As the figure shows, the IMLT-DL model diagnoses the skin lesion using different stages of operations such as pre-processing, segmentation, feature extraction, and classification. The detailed working of each operation is offered in the succeeding subsections.

Image Pre-Processing
Initially, pre-processing of the skin lesion images is performed in two stages, as defined below. Primarily, the format conversion and region of interest (RoI) detection processes are performed. As the existence of hair affects the detection and classification results, a hair removal process is carried out [20]. The RGB image is transformed into a grayscale image, and then the top-hat filtering technique is utilized to identify the thick and dark hair in the dermoscopic images. The result of this operation captures the high variation between the input and output images, as given in Eq. (1) below:

T(G) = (G ○ b) − G, (1)

where ○ signifies the closing operation, G represents the grayscale input image, and b designates the grayscale structuring element. Lastly, in the inpainting process, the hairline pixels are replaced with nearby pixel values.
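The hair-detection step in Eq. (1) can be sketched with elementary grayscale morphology. The following is a minimal NumPy illustration, not the authors' exact implementation; the structuring-element size `k` is an arbitrary choice here:

```python
import numpy as np

def dilate(img, k=3):
    """Grayscale dilation: max over a k x k neighbourhood."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape
    return np.array([[p[i:i + k, j:j + k].max() for j in range(w)]
                     for i in range(h)])

def erode(img, k=3):
    """Grayscale erosion: min over a k x k neighbourhood."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape
    return np.array([[p[i:i + k, j:j + k].min() for j in range(w)]
                     for i in range(h)])

def bottom_hat(img, k=3):
    """Eq. (1): closing(G) - G. Thin dark structures (hair) respond strongly."""
    return erode(dilate(img, k), k) - img
```

Thresholding the `bottom_hat` response yields a binary hair mask; the inpainting stage then overwrites the masked pixels with values from their unmasked neighbourhood.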

MFO with Multilevel Thresholding-Based Segmentation
Once the dermoscopic input images are pre-processed, the MFO with multilevel Kapur's thresholding-based segmentation model is performed to determine the infected lesion regions in the dermoscopic images. Kapur et al. [21] presented an effective thresholding technique to determine the optimal thresholds for image segmentation. It mainly depends upon the entropy, and thus the probability distribution, of the image histogram. This technique computes the optimal threshold (th) that maximizes the overall entropy. In the case of bi-level thresholding, the objective function of Kapur's problem can be represented as in Eq. (2):

f_Kapur(th) = H_1 + H_2, (2)

where H_1 and H_2 can be computed as

H_1 = − Σ_{i=1}^{th} (Ph_i / x_0(th)) ln(Ph_i / x_0(th)), (3)

H_2 = − Σ_{i=th+1}^{L} (Ph_i / x_1(th)) ln(Ph_i / x_1(th)), (4)

where Ph_i is the probability distribution of intensity level i, and x_0(th) and x_1(th) are the probability distributions of the two classes H_1 and H_2, as shown in Eqs. (3) and (4). This entropy-based technique can be extended to multilevel thresholding: to divide the image into k classes, k − 1 threshold values are required [22]. The objective function is then altered using Eq. (5):

f_Kapur(TH) = Σ_{c=1}^{k} H_c, TH = [th_1, th_2, ..., th_{k−1}], (5)

where TH is a vector comprising multiple threshold values. All the entropies are determined individually with the corresponding threshold values, so the bi-level form is extended to k entropies:

H_c = − Σ_{i=th_{c−1}+1}^{th_c} (Ph_i / x_{c−1}) ln(Ph_i / x_{c−1}). (6)
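Kapur's multilevel objective of Eqs. (2)-(6) reduces to summing the entropies of the histogram segments cut by the candidate thresholds. A compact sketch, assuming an (unnormalized) grey-level histogram as input:

```python
import numpy as np

def kapur_entropy(hist, thresholds):
    """Sum of per-class entropies H_c for a candidate threshold vector.

    hist: grey-level histogram counts Ph_i (normalised internally).
    thresholds: sorted interior cut indices th_1 < ... < th_{k-1}.
    Returns the value an optimizer such as MFO would maximize.
    """
    hist = np.asarray(hist, dtype=float)
    hist = hist / hist.sum()
    edges = [0] + sorted(thresholds) + [len(hist)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        p = hist[lo:hi]
        w = p.sum()                 # class probability x_c
        if w <= 0:
            continue                # empty class contributes no entropy
        q = p[p > 0] / w            # Ph_i / x_c, skipping zero bins
        total += -np.sum(q * np.log(q))
    return total
```

For a uniform 4-bin histogram split at the midpoint, each class is uniform over 2 bins, so the objective equals 2·ln 2.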
where the probability occurrences (x_0, x_1, ..., x_{k−1}) of the k classes are attained. For the optimal selection of multiple threshold values, the MFO algorithm is applied. The MFO algorithm is inspired by the flight behavior and mating process of mayflies [23]. In the MFO algorithm, the individuals in the swarm are distinguished as male and female mayflies (MFs). The male MFs are generally robust, which results in improved optimization. The MFO algorithm updates each position based on the existing position p_i(t) and velocity v_i(t) at the present round:

p_i(t + 1) = p_i(t) + v_i(t + 1). (7)

Every male and female MF updates its position using Eq. (7); however, the male and female MFs have distinct velocity updating characteristics.

Movement of Male MFs
Male MFs in the swarm perform exploration or exploitation over the iterations. The velocity is updated based on the present fitness value f(x_i) and the past optimal fitness value in the trajectory f(xh_i). When f(x_i) > f(xh_i), the male MFs update the velocity based on the current velocity along with the distances to their past optimal position and to the gbest position:

v_i(t + 1) = g·v_i(t) + a_1·e^{−b·r_p²}·(xh_i − x_i(t)) + a_2·e^{−b·r_g²}·(gbest − x_i(t)), (8)

where g is a variable reduced linearly from a maximum value to 1; a_1, a_2, and b are constants; and r_p and r_g are the Cartesian distances of the individual from its past optimal position and from the gbest position in the swarm, respectively. The Cartesian distance is the 2nd norm of the distance array:

||x_i − X_j|| = sqrt(Σ_k (x_{ik} − X_{jk})²). (9)

At the same time, when f(x_i) < f(xh_i), the male MFs update the velocity from the present one with a random dance coefficient d:

v_i(t + 1) = v_i(t) + d·r_1, (10)

where r_1 is a random number drawn from a uniform distribution.
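The male velocity rules above can be sketched in a few lines. This is a schematic reading of the update, with the coefficient values chosen arbitrarily for illustration:

```python
import numpy as np

def male_update(p, v, pbest, gbest, g=0.8, a1=1.0, a2=1.5, beta=2.0):
    """Attraction step: pull toward the personal best and the swarm best,
    damped exponentially by the squared Cartesian distances r_p and r_g."""
    rp = np.linalg.norm(p - pbest)   # distance to own past best
    rg = np.linalg.norm(p - gbest)   # distance to swarm best
    v_new = (g * v
             + a1 * np.exp(-beta * rp**2) * (pbest - p)
             + a2 * np.exp(-beta * rg**2) * (gbest - p))
    return p + v_new, v_new          # position update as in Eq. (7)

def male_dance(p, v, d=0.1, rng=None):
    """Nuptial dance for an already-best male: v <- v + d*r1, r1 ~ U(-1, 1)."""
    rng = rng or np.random.default_rng(0)
    v_new = v + d * rng.uniform(-1.0, 1.0, size=np.shape(v))
    return p + v_new, v_new
```

When a male already sits on both best positions, the attraction terms vanish and only the damped inertia g·v remains, which is the expected fixed-point behaviour.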

Movement of Female MFs
The female MFs update their velocities in a different way. A female MF with wings only survives for 1-7 days; therefore, the female MFs rush to identify male MFs for mating and reproduction. So, the velocity is updated depending upon the male MF they wish to mate with. Here, the topmost female and male MFs are considered the first mating pair, the next optimal female and male MFs are treated as the second pair, and so on. Therefore, for the i-th female mayfly, when f(y_i) < f(x_i):

v_i(t + 1) = g·v_i(t) + a_3·e^{−b·r_mf²}·(x_i(t) − y_i(t)), (11)

where a_3 represents a constant employed for balancing the velocity and r_mf denotes the Cartesian distance between them. Contrastingly, when f(y_i) ≥ f(x_i), the female MFs update the velocity from the existing one with another random dance coefficient fl:

v_i(t + 1) = v_i(t) + fl·r_2, (12)

where r_2 is a random number drawn from a uniform distribution.
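Mirroring the male case, the female rule either attracts the female toward its ranked male mate or falls back to a random dance. A schematic sketch with illustrative coefficients:

```python
import numpy as np

def female_update(y, v, x_mate, attracted, g=0.8, a3=1.5, beta=2.0,
                  fl=0.1, rng=None):
    """Female mayfly velocity update.

    attracted: True when the mate's fitness warrants attraction;
    otherwise the female performs a random dance with coefficient fl.
    """
    rng = rng or np.random.default_rng(0)
    if attracted:
        rmf = np.linalg.norm(y - x_mate)       # Cartesian distance to the mate
        v_new = g * v + a3 * np.exp(-beta * rmf**2) * (x_mate - y)
    else:
        v_new = v + fl * rng.uniform(-1.0, 1.0, size=np.shape(v))
    return y + v_new, v_new
```

As with the males, when the female coincides with its mate the attraction term vanishes and only the inertia g·v remains.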

MFs Mating
The top half of the male and female MFs undergo mating and reproduce offspring. The offspring are randomly developed from their respective parents, as defined below:

offspring1 = L·male + (1 − L)·female, (13)
offspring2 = L·female + (1 − L)·male, (14)

where L is a random number drawn from a Gaussian distribution.
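The crossover is a weighted blend of the two parents; note that the two offspring always sum to the parents' sum, whatever L is drawn. A minimal sketch (the Gaussian parameters of L are an illustrative assumption):

```python
import numpy as np

def mate(male, female, rng=None):
    """Gaussian-weighted crossover producing two complementary offspring."""
    rng = rng or np.random.default_rng(0)
    L = rng.normal(0.5, 0.1)          # blending weight, L ~ Gauss
    off1 = L * male + (1 - L) * female
    off2 = L * female + (1 - L) * male
    return off1, off2
```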

Feature Extraction
During feature extraction, the segmented image is passed into the Inception v3 model, which generates a meaningful set of feature vectors. Krizhevsky et al. [24] proposed the AlexNet model for object recognition and classification, and it achieved improved performance. Subsequently, different convolutional techniques were developed to minimize the Top-5 error rate of object recognition and classification. Compared with the GoogleNet (Inception-v1) model, the Inception-v3 model achieves improved performance. Notably, it has three parts: a fundamental convolution block, an enhanced Inception block, and a classification block. Fig. 3 illustrates the structure of the Inception v3 model.
The fundamental convolution block, which alternates convolution with max-pooling layers, is employed to extract features. Then, the enhanced Inception block is developed using the Network-In-Network approach [25], where multi-scale convolution operations are performed simultaneously and the convolution outcomes of every branch are concatenated. Because of the utilization of an auxiliary classifier, highly stable outcomes and better gradient convergence can be accomplished, and concurrently the vanishing gradient and overfitting problems are mitigated. In Inception-v3, the 1 × 1 convolution kernel is commonly employed to reduce the feature channel count and accelerate training. Moreover, the decomposition of large convolutions into small ones also minimizes the number of parameters and the computational complexity. Therefore, the Inception v3 model is applied to extract the features from the dermoscopic images.
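The parameter savings from factorized convolutions and 1 × 1 bottlenecks are easy to verify by counting weights. The channel counts below are illustrative, not taken from the actual Inception v3 configuration:

```python
def conv_params(k, c_in, c_out):
    """Number of weights in a k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

# a single 5x5 convolution, 64 -> 64 channels
big = conv_params(5, 64, 64)                                  # 102400
# two stacked 3x3 convolutions cover the same 5x5 receptive field
factorized = conv_params(3, 64, 64) + conv_params(3, 64, 64)  # 73728
# a 1x1 bottleneck (64 -> 16) before a 3x3 conv (16 -> 64)
bottleneck = conv_params(1, 64, 16) + conv_params(3, 16, 64)  # 10240
```

Factorization cuts the weight count by about 28% here, and the 1 × 1 bottleneck by roughly 10×, which is the efficiency argument made above.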

Image Classification
At the final stage of image classification, the feature vectors extracted by the Inception v3 model are fed as input to the GBT model to determine the presence of skin lesions, i.e., to allocate proper class labels to the applied dermoscopic images. The GBT model is trained using XGBoost on the features obtained in the earlier process [26,27]. The GBT model is invariant to input scaling, and it learns higher-order interactions among the features. In addition, the GBT model undergoes training in an additive way: at every time step t, it grows another tree to minimize the residuals of the present model. The objective function is defined using Eq. (15):

Obj^{(t)} = Σ_{i=1}^{n} l(y_i, ŷ_i^{(t−1)} + f_t(x_i)) + Ω(f_t), (15)

where l represents a loss function that determines the variation between the label of the i-th sample y_i and the prediction at the previous step plus the current tree output f_t(x_i), and Ω(f_t) is the regularization term that penalizes the complexity of the new tree. Finally, the GBT model generates appropriate class labels for all the applied test skin lesion images.

Performance Validation
The performance validation of the presented model takes place on the ISIC dataset [28], comprising images under different classes such as Angioma, Nevus, Lentigo NOS, Solar Lentigo, Melanoma, Seborrheic Keratosis, and Basal Cell Carcinoma (BCC). The images in the ISIC dataset have a size of 640 × 480 pixels. A few sample test images are illustrated in Fig. 3, which shows the original dermoscopic images with their masked versions: Fig. 4a shows the actual skin lesion images, and the lesion region in each image is correctly masked in Fig. 4b. A detailed comparative results analysis of the IMLT-DL model with other existing methods is given in Fig. 8 and Tab. 2 [29][30][31][32][33][34]. From the results, it is revealed that the SVM model showcases the worst outcome, with a sensitivity of 0.732, specificity of 0.754, and accuracy of 0.743. Next to that, the high-level features model attains a moderately improved outcome. From the tables and figures mentioned above, it is apparent that the IMLT-DL model accomplishes an effective skin lesion segmentation and classification outcome. Therefore, it can be an appropriate tool to segment and classify skin lesions from dermoscopic images in a real-time environment.

Conclusion
This study has developed a novel IMLT-DL model for effective skin lesion segmentation and classification using dermoscopic images. The IMLT-DL model diagnoses the skin lesion using different stages of operations such as pre-processing, segmentation, feature extraction, and classification. At the initial level, the presented IMLT-DL model integrates the top-hat filtering and inpainting technique for pre-processing the dermoscopic images. Then, multilevel thresholding-based segmentation is carried out to determine the infected skin lesion regions in the dermoscopic images. Inception v3 based feature extraction and GBT based classification processes are performed for effective skin lesion detection. The proposed IMLT-DL model is simulated using the ISIC dataset, and the experimental outcomes are examined with respect to several measures. The obtained simulation outcomes verified the superior performance of the IMLT-DL model, which accomplished a maximum accuracy of 0.992. In the future, the performance of the skin lesion segmentation process can be improved using advanced DL-based instance segmentation techniques.
Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.