Computers, Materials & Continua
A Lightweight Approach for Skin Lesion Detection Through Optimal Features Fusion
1Department of Information Technology, University of Gujrat, Gujrat, 50700, Pakistan
2Department of Software Engineering, University of Gujrat, Gujrat, 50700, Pakistan
3Department of Computer Science, COMSATS University Islamabad - Wah Campus, Wah Cantt, 47040, Pakistan
4Centre for Smart Systems, AI and Cybersecurity, Staffordshire University, Stoke-on-Trent, UK
5Advanced Manufacturing Institute, King Saud University, Riyadh, 11421, Saudi Arabia
6Department of Industrial Engineering, College of Engineering, King Saud University, Riyadh, 11421, Saudi Arabia
*Corresponding Author: Hafiz Tayyab Rauf. Email: email@example.com
Received: 14 March 2021; Accepted: 16 April 2021
Abstract: Skin diseases affect all aspects of life. Early and accurate detection of skin cancer is necessary to avoid significant loss. Manual detection of skin diseases by dermatologists can lead to misclassification because different lesions share similar intensity and color levels. Therefore, an automated system to identify these skin diseases is required. Few studies have addressed skin disease classification, and previous techniques failed to identify multi-class skin disease images due to their similar appearance. This study presents a computer-aided framework for automatic skin disease detection. We collected and normalized datasets from two databases (ISIC archive, Mendeley) covering six common skin diseases: Basal Cell Carcinoma (BCC), Actinic Keratosis (AK), Seborrheic Keratosis (SK), Nevus (N), Squamous Cell Carcinoma (SCC), and Melanoma (M). Segmentation is performed using a deep Convolutional Neural Network (CNN). Three types of features are then extracted from the segmented skin lesions: ABCD rule, GLCM, and deep features. AlexNet transfer learning is used for deep feature extraction, while a support vector machine (SVM) is used for classification. Experimental results show that the SVM-based framework outperforms other studies in terms of accuracy: AK achieved 100% accuracy, BCC 92.7%, M 95.1%, N 97.8%, SK 93.1%, and SCC 91.4%, with a global accuracy of 95.4%.
Keywords: AlexNet; CNN; GLCM; SVMs; skin disease
1 Introduction
The skin is the largest organ of the human body, as it covers all body elements, such as bones and tissues. Skin diseases account for nearly 1.79% of global diseases. They can be infectious, cancerous, or inflammatory, and they affect all kinds of people, including children, young adults, and older adults. Many people suffer from skin problems, some of which eventually lead to skin cancer. Skin diseases are the fourth leading cause of nonfatal disease burden worldwide and include three of the most common diseases in the world. They cause substantial economic burdens in both high-income and low-income countries. Skin disease can affect all parts of life, such as personal relationships, the workplace, the social environment, physical activity, and emotional well-being. Suicide attempts are more frequent among patients with skin diseases. An automated diagnostic system can improve the early detection of these diseases. Many advances in medical image processing, particularly in magnetic resonance imaging (MRI), computed tomography (CT), and digital subtraction angiography (DSA), provide high-resolution features that support automated diagnosis. Similarly, many public datasets on skin diseases are available for research; the International Skin Imaging Collaboration (ISIC) archive and Mendeley are used in the proposed study.
Skin cancer is the most common type of cancer. It arises from the abnormal growth of skin cells. There are four skin cancer classes: Actinic Keratosis, Basal cell carcinoma, Squamous cell carcinoma, and Melanoma. Early detection of cancer helps to treat it effectively. Cancer that has spread to other body organs because of late diagnosis can no longer be treated. According to research in the USA, at least one person dies from melanoma every hour. Every year, 87,110 melanoma cases are reported and 9730 patients lose their lives. Furthermore, in 2016, there were 6800 melanoma patients in Canada, of whom 1150 died.
SK arises from benign epidermal cells with delayed maturation. Melanocytic nevus (MN) and Basal cell carcinoma (BCC) produce deeper lesions, which can be differentiated accurately and adequately. BCC is the most common skin cancer, and it mainly affects older people. Epidemiological studies show that the proportion of BCC is rising every year and that the incidence of new cases continues to trend upward. Moreover, Actinic keratosis (AK) and Squamous cell carcinoma (SCC) are very similar, with no noticeable difference in appearance between them.
Severe skin infections can have many consequences in a patient's life, such as impairment of daily activities, unemployment, loss of confidence, distress, suicide attempts, damage to internal organs, and death in the case of deadly skin cancers such as melanoma. If the detection of these diseases is delayed or incorrect, it may lead to delayed treatment, no treatment, or even improper treatment. In the literature, several machine learning and deep learning algorithms are used to separate the target classes [11,12]. To reduce morbidity, cost, and mortality, skin diseases should be treated in the early stages. All the diseases described above are prone to misclassification because they share similar color or intensity levels; another reason for misclassification is an imbalanced dataset. The contributions of the proposed framework are:
• An improved deep learning-based semantic segmentation of lesions is proposed.
• Multi-type features (deep, GLCM, and ABCD rule) are extracted.
• Results are improved in terms of accuracy and false-positive reduction.
The rest of the article is organized as follows: Section 2 reviews related work, Section 3 describes the proposed methodology, Section 4 presents the results and discussion, and Section 5 concludes the article.
2 Related Work
This section summarizes recent and earlier work on skin disease classification at the single-class, binary, or multi-class level using different features. Using the ABCD rule, an automated system was developed for melanoma, a common skin cancer, that identifies whether a lesion is malignant or benign. After applying the ABCD rule, dermatologists confirm the stage of the melanoma and propose possible treatments according to that stage. A smartphone application has also been used to detect melanoma skin cancer with the ABCD rule. Image capture, combined with preprocessing and segmentation, is used to extract ABCD features from each skin lesion image in order to identify the disease.
The ABCD rule and texture features have been combined to provide an automatic diagnosis of melanoma. In that study, a computational method using dermoscopic images was developed to help dermatologists differentiate between melanoma and non-melanoma skin lesions. Another approach identifies melanoma using the ABCD rule on mobile devices: different sensor-equipped mobile devices capture the input images, from which melanoma is detected. A further study used the ABCD rule to discriminate benign lesions from malignant melanoma.
Melanoma is a dangerous type of cancer that kills more people each year than other types of skin cancer. One reported system diagnoses melanoma automatically based on linear and non-linear features. Features extracted from melanoma images, such as texture features and the ABCD rule, are used to distinguish between malignant melanoma and benign lesions. Effective treatment of melanoma at the initial stage is critical. All of the above studies treat melanoma as the most dangerous form of skin cancer, one that causes many deaths every year. Thresholding-based techniques have been used for segmentation and statistical feature extraction, where GLCM (Gray Level Co-occurrence Matrix), color, asymmetry, diameter, and border features are extracted from the collected images. Based on specific characteristics (energy, autocorrelation, homogeneity, contrast, entropy), an automated system was introduced to help physicians classify or detect skin cancer types (AK, BCC, SCC, M).
Deep learning techniques have been used to classify six significant diseases: AK, rosacea (ROS), BCC, SCC, lupus erythematosus (LE), and SK. Skin cancer classification remains a challenging task in deep learning and machine learning. One method classifies two disease classes, melanoma and seborrheic keratosis, using deep neural networks. Another algorithm combines a deep neural network with human expertise to classify four common skin diseases: MN, BCC, SK, and psoriasis (P).
3 Proposed Methodology
The steps of the proposed framework, with CNN-based segmentation and fused-feature classification, are shown in Fig. 1. First, the dataset covering six classes of skin disease is normalized. The data is then fed to a CNN for segmentation, which returns the lesion areas; these are refined using a size filter operation. After that, deep features (using AlexNet), GLCM features, and ABCD rule features are extracted from the segmented lesions. These features are then given to an SVM for classification.
The publicly available ISIC and Mendeley datasets are used. A total of 4363 images were collected for six common skin diseases. The six target classes are Nevus (N), SK, Melanoma (M), SCC, AK, and BCC.
The melanoma and nevus images were gathered from Mendeley, and the images of the other four diseases (SK, BCC, SCC, AK) were collected from the ISIC archive. The proposed study focuses on these diseases for two reasons. First, BCC, SCC, M, and N appear on the face. Second, AK and SK commonly progress from benign to malignant due to inappropriate medication.
Image input is the first and essential step of the proposed framework. After the images are collected, labels are prepared by annotating rectangular or irregular-shaped polygons, which produces the ground-truth values in the form of image masks. The originally collected and normalized input images of the target diseases are presented in Fig. 2.
3.1 Data Normalization
Because the dataset was collected from two databases whose images have different sizes for different types of diseases, the collected images were resized. Eq. (1) is used in this process.
The calculated MSE value is then used to compute the PSNR value, and all resizing operations are performed using the nearest-neighbor interpolation method.
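As a rough illustration of this normalization step, the sketch below resizes a grayscale image with nearest-neighbor interpolation and computes the MSE and PSNR between two equally sized images. The function names, the index-mapping convention, and the 8-bit peak value of 255 are assumptions made for this example; Eq. (1) of the paper is not reproduced here.

```python
import math

def resize_nearest(img, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) by nearest-neighbor sampling."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for r in range(out_h):
        # Map each output row/column back to the closest source index.
        src_r = min(in_h - 1, int(r * in_h / out_h))
        row = [img[src_r][min(in_w - 1, int(c * in_w / out_w))] for c in range(out_w)]
        out.append(row)
    return out

def mse(a, b):
    """Mean squared error between two images of the same size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)
```

A color image would be handled by applying the same functions to each channel.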
3.2 Semantic Segmentation Using CNN
In the proposed technique, the background and lesion areas are segmented using a CNN model. The input size is crucial because it determines the basic structure of the CNN. Images of size 256 × 256 × 3 with “zero-center” normalization are given to the input layer of the CNN. The convolution layer takes the images from the input layer and applies 64 filters of size 3 × 3 with a [1 1] stride and [1 1 1 1] padding. The ReLU activation function is applied to the output of the convolution layer.
Following that, 2 × 2 max pooling is performed with a [2 2] stride and [0 0 0 0] padding. Then, two more convolutional blocks with the same configuration are added. Finally, after the three convolution blocks, two output layers are added: a SoftMax layer and a pixel classification layer.
The SoftMax layer calculates the probability of each predicted class, and the pixel classification layer labels the pixels of the target images as lesion or background. The proposed CNN architecture is shown in Fig. 3, and a sample of skin disease images segmented using the CNN is shown in Fig. 4.
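The layer settings above can be checked with a small shape calculation. The sketch below is only illustrative: it assumes that each of the three blocks consists of one 3 × 3 convolution (stride 1, padding 1) followed by one 2 × 2 max pooling (stride 2, no padding), and that every block uses 64 filters. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * pad - kernel) // stride + 1

def trace_shapes(size=256, blocks=3, filters=64):
    """List the feature-map shape after every conv and pooling layer."""
    shapes = [(size, size, 3)]  # input image
    for _ in range(blocks):
        size = conv_out(size, kernel=3, stride=1, pad=1)  # 3x3 conv keeps the size
        shapes.append((size, size, filters))
        size = conv_out(size, kernel=2, stride=2, pad=0)  # 2x2 pooling halves it
        shapes.append((size, size, filters))
    return shapes

for shape in trace_shapes():
    print(shape)
```

Under these assumptions the spatial size goes from 256 to 128, 64, and 32 across the three blocks. The text does not describe how the pooled maps are brought back to the input resolution for pixel-wise labeling, so the sketch stops at the last pooling layer.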
3.3 Size Filter
After applying the CNN to the testing data, some noise has to be removed before the lesion class can be used. We use connected-component labeling to remove it, since noise regions contain fewer connected pixels than the lesion area. In the proposed study, keeping only components with an area greater than 50 removes the noise from the images and leaves the lesion as the final segmented component.
Let I be the binary image and (x, y) its digital lattice coordinates. The 8-connectedness of pairs of pixels then leads to specific metrics that can be used to find objects based on shape or size. These calculations are shown in Eq. (2).
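A minimal sketch of such a size filter is given below. It labels 8-connected components with a breadth-first search and keeps only those whose area exceeds the threshold. The function name and the list-of-lists mask representation are assumptions for illustration; the threshold of 50 follows the text.

```python
from collections import deque

def size_filter(mask, min_area=50):
    """Keep only 8-connected foreground components whose area exceeds min_area."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for r0 in range(h):
        for c0 in range(w):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected component.
            comp, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                comp.append((r, c))
                for dr in (-1, 0, 1):      # visiting all 8 neighbors
                    for dc in (-1, 0, 1):  # gives 8-connectedness
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < h and 0 <= cc < w and mask[rr][cc] and not seen[rr][cc]:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            if len(comp) > min_area:
                for r, c in comp:
                    out[r][c] = 1
    return out
```

Applied to a CNN output mask, small isolated blobs are dropped while a lesion region larger than the threshold is preserved.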
3.4 Feature Extraction
After the segmentation process, features of the skin lesions are extracted and used to overcome the problem of misclassification. Different types of features, namely the ABCD rule, GLCM features, and deep learning-based features, are extracted from the segmented images. Principal component analysis (PCA) is used to obtain the gray-level co-occurrence matrix from which the GLCM features are calculated. These features are explained below.
3.4.1 ABCD Rule
One of the most commonly used methods for accurate skin lesion detection is the ABCD (asymmetry, border, color, diameter) rule. Because the collected images of the target classes look similar and are difficult to separate, the ABCD rule is used in the proposed approach, and it provides good classification accuracy. The ABCD acronym categorizes the clinical and morphological features of a skin lesion or mole. The asymmetry feature is computed by aligning the principal axes of the lesion with the x–y axes of the image and measuring any asymmetry of the lesion about those axes. An irregular border is a notched, ragged, or blurred edge of the skin lesion. The border is translated into a function whose peaks indicate the irregularity of the border.
The color feature indicates the various colors present in the collected images, such as white, blue, black, brown, and red. The diameter feature describes the size of the lesion in a particular image, and it differs for each target disease.
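To make the asymmetry and diameter ideas concrete, the following sketch computes two simple scores from a binary lesion mask. It is an assumption-laden illustration rather than the scoring used in the study: asymmetry is taken as the fraction of lesion pixels without a mirror partner across the mid-lines of the bounding box (no principal-axis alignment is performed), and diameter is the largest pixel-to-pixel distance. The function names are hypothetical.

```python
import math

def lesion_pixels(mask):
    """Coordinates of all foreground pixels in a binary mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]

def asymmetry(mask):
    """Fraction of lesion pixels without a mirror partner across the bounding-box axes."""
    pix = set(lesion_pixels(mask))
    if not pix:
        return 0.0
    rows = [r for r, _ in pix]
    cols = [c for _, c in pix]
    r_sum, c_sum = min(rows) + max(rows), min(cols) + max(cols)
    # Mirror each pixel across the vertical and horizontal mid-lines.
    miss_v = sum((r, c_sum - c) not in pix for r, c in pix)
    miss_h = sum((r_sum - r, c) not in pix for r, c in pix)
    return (miss_v + miss_h) / (2 * len(pix))

def diameter(mask):
    """Largest Euclidean distance between two lesion pixels, in pixels."""
    pix = lesion_pixels(mask)
    # Brute force over all pairs; acceptable for small masks only.
    return max((math.dist(p, q) for p in pix for q in pix), default=0.0)
```

A perfectly symmetric shape scores 0 for asymmetry, and larger values indicate a less symmetric lesion.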
3.4.2 GLCM Features
After ABCD feature extraction, different statistical parameters, namely correlation, homogeneity, contrast, and energy, are computed.
Statistical analysis is used for skin lesion detection. The PCA algorithm is used to extract these types of features. To obtain the statistical parameters, the images are converted to gray level. The statistical measures are then extracted by creating the GLCM and computing its features. These measures provide information about the texture of the image. The individual statistical features are described below.
Homogeneity: This feature describes the regularity of the calculated region, because it measures how close the distribution of elements in the GLCM is to the GLCM diagonal. Eq. (3) is used to calculate homogeneity.
Energy: The energy feature measures the uniformity of an image. It is also known as the angular second moment. Its range is from 0 to 1, and a constant image has an energy of 1, as shown in Eq. (4).
Correlation: The correlation feature measures the correlation between the pixels of a skin image. Eq. (5) shows how strongly a pixel of an image is correlated with its neighbor; its range is from −1 to 1.
Contrast: Contrast measures the difference in brightness and color within the selected lesion. It is calculated as shown in Eq. (6).
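The four measures can be sketched as follows. Since Eqs. (3)–(6) are not reproduced here, the code uses the commonly used textbook definitions, which is an assumption about the exact formulas in the study; the function names, the single horizontal offset, and the non-symmetric matrix are also illustrative choices. The input is expected to be an image of integer gray levels in the range 0 to levels − 1.

```python
import math

def glcm(img, levels, dr=0, dc=1):
    """Normalized gray-level co-occurrence matrix for pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                counts[img[r][c]][img[rr][cc]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Homogeneity, energy, correlation, and contrast of a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, j, v in cells)
    mu_j = sum(j * v for i, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, j, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for i, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        # Weights shrink as cells move away from the diagonal.
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        # Sum of squared entries (angular second moment).
        "energy": sum(v * v for i, j, v in cells),
        # Undefined for a constant image (zero variance), so NaN is returned.
        "correlation": cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else float("nan"),
        # Squared gray-level difference weighted by co-occurrence probability.
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
    }
```

For a constant image the only non-zero GLCM entry equals 1, which gives an energy of 1 and a contrast of 0, in line with the description above.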
3.4.3 Deep Features
The AlexNet network is used to perform deep feature extraction on the images. For this process, the images were resized to 227 × 227 × 3. The pre-trained model is loaded into the workspace, and the fully connected layer named “fc7” is used for transfer learning. Activations were computed at the “fc7” layer for the training and testing images. In total, activations were computed for 4363 images, with 3053 images (70%) used for training and 1310 images (30%) used for testing, which yields a feature vector of size 4096 for each image.
The Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification problems. The SVM determines the optimal separating hyperplane between the classes. In our classification process, the SVM achieved higher accuracy than other classifiers. The SVM relies only on the support vectors, which are the training samples that lie exactly on the hyperplanes that define the margin. We used the SVM as a multi-class classifier, splitting the data into training and testing sets in a 70–30 ratio.
4 Results and Discussion
The main goal of this study is to classify multiple skin disease images using a deep convolutional neural network. A total of 4363 skin disease images covering six widespread diseases were collected from the ISIC archive and Mendeley. The dataset is described in Tab. 1.
The collected images are of different sizes, which makes them unsuitable for training the CNN model directly. A resize function is therefore applied to the dataset so that all images have the same size: images of 256 × 256 × 3 with “zero-center” normalization are given to the CNN input layer. A total of 3053 images are used for training, and the other 1310 images are used for testing.
4.1 Evaluation of CNN
The CNN for semantic segmentation, inspired by recent papers [31,32], is trained on the lesion and background classes and evaluated on the test data, where it returns a global mean accuracy of 0.80313. This accuracy is the ratio of correctly classified pixels to the total number of pixels, regardless of class. Intersection over union (IoU) is a segmentation metric that measures the overlap between the predicted and ground-truth areas. The proposed CNN achieves a mean IoU of 0.44409. Tab. 2 shows the evaluation metrics of the CNN.
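For reference, both segmentation metrics can be computed from a predicted label map and its ground-truth mask as in the sketch below. The function name and the encoding of background as 0 and lesion as 1 are assumptions for this example.

```python
def segmentation_metrics(pred, truth, classes=(0, 1)):
    """Global pixel accuracy and mean IoU for two label maps of equal size."""
    pairs = [(p, t) for rp, rt in zip(pred, truth) for p, t in zip(rp, rt)]
    # Share of pixels whose predicted label matches the ground truth.
    accuracy = sum(p == t for p, t in pairs) / len(pairs)
    ious = []
    for k in classes:
        inter = sum(p == k and t == k for p, t in pairs)
        union = sum(p == k or t == k for p, t in pairs)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return accuracy, sum(ious) / len(ious)
```

Because IoU penalizes both missed lesion pixels and false lesion pixels, a model can reach a fairly high pixel accuracy while its mean IoU stays much lower, which matches the pattern of the two values reported above.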
4.2 Evaluation of Classification Approaches
After the segmentation process, different features (ABCD, statistical, and deep features) were extracted to support the classification process. We first extracted deep features and evaluated the SVM on the testing data using these features alone. Tab. 3 shows the evaluation metrics.
Tab. 3 shows that the results for Melanoma and Nevus are good, whereas the accuracy of the other classes needs improvement. To improve the results with more features, we concatenated the deep feature vector with the GLCM (statistical) feature vector, trained the SVM classifier again, and tested it on the testing data. The results are shown in Tab. 4.
Tab. 4 shows improved results for the less accurately classified classes as well as for the classes that were already well classified, and the overall accuracy across all classes increases from 89.8% to 94.2%.
Finally, we concatenated all three feature types to further improve the results for all classes. This fusion improved both the overall accuracy and the results for the less accurately classified classes. The results are shown in Tab. 5.
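The fusion step amounts to concatenating the per-image feature vectors before they are passed to the classifier. The sketch below illustrates this with plain lists; the helper name is hypothetical, and the lengths of the GLCM and ABCD vectors (four values each) are assumptions, since only the deep feature length of 4096 is stated in the text.

```python
def fuse(*feature_sets):
    """Concatenate several per-image feature lists, image by image."""
    n = len(feature_sets[0])
    if any(len(fs) != n for fs in feature_sets):
        raise ValueError("every feature set must describe the same images")
    return [[x for fs in feature_sets for x in fs[i]] for i in range(n)]

# Two example images with placeholder values (not real features).
deep = [[0.1] * 4096, [0.2] * 4096]   # fc7 activations, length 4096
glcm = [[1, 2, 3, 4], [5, 6, 7, 8]]   # assumed: 4 GLCM statistics
abcd = [[0, 1, 2, 3], [0, 1, 2, 3]]   # assumed: 4 ABCD scores
fused = fuse(deep, glcm, abcd)
print(len(fused), len(fused[0]))
```

In practice, features with very different ranges are often rescaled before concatenation so that no single group dominates the classifier; whether the study did so is not stated.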
Tab. 5 shows the results of the proposed study, which improve on those obtained with one or two feature types. The accuracy, specificity, precision, and F1-score improve overall. The confusion matrix of the proposed study is shown in Fig. 5, and a graphical comparison of all feature approaches using the SVM is shown in Fig. 6.
The SVM achieved 95.4% accuracy with a 4.6% error rate; this accuracy is higher than that of previous studies. The proposed method classifies multiple skin diseases accurately using the combination of features described above.
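The per-class metrics reported in the tables can be derived from a multi-class confusion matrix in a one-vs-rest fashion, as sketched below. The function name and the convention that rows hold true classes and columns hold predicted classes are assumptions for this example.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics from a square confusion matrix (rows: true, columns: predicted)."""
    n, total = len(cm), sum(map(sum, cm))
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k predicted as another class
        fp = sum(cm[r][k] for r in range(n)) - tp   # another class predicted as k
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        results.append({"sensitivity": sens, "specificity": spec,
                        "precision": prec, "f1": f1})
    overall = sum(cm[k][k] for k in range(n)) / total  # overall accuracy
    return results, overall
```

The overall accuracy is the trace of the matrix divided by the number of test samples, and the error rate is one minus that value.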
4.3 Comparison of Proposed Study
The overall results of our method are better than those of previously reported works, which achieved accuracies of 90%, 92.1%, 67%, 90.69%, 87.25%, 86.21%, and 89.0%. The proposed study achieves better accuracy than these studies, as shown in Tab. 6.
5 Conclusion
This paper discussed the analysis and classification of multi-class skin disease images using a combination of features for accurate detection. Segmentation was performed on the images using the CNN, which achieved an accuracy of 0.80313. After segmentation, the segmented images were passed through a size filter operation to remove noise. Statistical features were extracted using the PCA and GLCM algorithms, and deep features were extracted using the AlexNet transfer-learning method. The ABCD rule features were also extracted. These features were concatenated step by step for classification. In terms of sensitivity, specificity, accuracy, and error rate, the SVM algorithm outperforms other methods. The proposed framework achieves 91% sensitivity, 98% specificity, 94% precision, 95.4% accuracy, and an error rate of 4.6%, which is better than the other compared approaches.
Using ABCD, statistical, and deep features, the proposed method achieves higher accuracy than the other methods compared in this work. However, the proposed framework uses only 4363 images for six types of skin diseases. Increasing the dataset size and the number of diseases may affect the performance of the proposed study.
In the future, more datasets from the ISIC challenges can be added so that more diseases can be identified. Moreover, more images can improve the semantic segmentation results.
Acknowledgement: We would like to thank the anonymous reviewers for their help in improving the quality of the manuscript.
Funding Statement: The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group number RG-1440-048.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.