Skin cancer, and melanoma in particular, is among the most aggressive cancers, and its prevalence has risen significantly with increased exposure to ultraviolet radiation. Timely detection and management of lesions are therefore critical to improving quality of life and reducing mortality. To this end, we have designed, implemented and analyzed a hybrid approach combining convolutional neural networks (CNNs) and local binary patterns (LBP). Experiments were performed on the publicly accessible ISIC 2017, 2018 and 2019 (HAM10000) datasets, with data augmentation for in-distribution generalization. As a novel contribution, the CNN architecture is enhanced with an interpretable LBP layer that extracts pertinent visual patterns. Classification of basal cell carcinoma, actinic keratosis, melanoma and squamous cell carcinoma was evaluated on 8035 training and 3494 testing cases. Cross-validated results show strong performance, with an average accuracy of 97.29%, sensitivity of 95.63% and specificity of 97.90%. The proposed approach can thus be used in research and clinical settings to provide second opinions that closely approximate expert judgment.
The skin, the largest organ of the human body, performs the critical tasks of providing a protective barrier against mechanical, thermal and physical injury and against exposure to hazardous substances and light, and of maintaining body temperature. During normal function, skin cells are regularly shed and replaced by new cells. This systematic process may go awry, however, and produce cells when they are not required. Malignant disease of the skin, most commonly melanoma, carries the highest mortality, so timely detection and treatment are imperative to lessen its adverse effects [
According to the Global Cancer Statistics reported by the International Agency for Research on Cancer (IARC), approximately 19.3 million new cancer cases and around 10 million cancer deaths were recorded in 2020 [
Computer-based technology now plays a vital role in almost every field of human life, aiding humans in tasks where it is difficult to work efficiently and to identify objects of interest. The same holds for detecting diseases such as cancer, where a significant impediment is that early diagnosis of melanoma is difficult even for seasoned specialists. A method that simplifies diagnosis can therefore help specialists, since identifying diseases like cancer at an early stage is essential for proper diagnosis and timely treatment [
Skin cancer arises from the abnormal behavior or anomalous growth of cells and may be treatable if detected in its early stages. Even a little neglect may prove fatal, because an untreated lesion can infiltrate the lymphatic or circulatory system and reach other parts of the body [
Deep learning models are largely black boxes that represent data hierarchically. Several frameworks support real-time detection and recognition of objects, such as TensorFlow, proposed and developed by [
The rest of the paper is organized as follows. Section 2 explains the basic idea of skin cancer detection; Section 3 presents the proposed CNN-based framework; results and discussion appear in Section 4; and Section 5 concludes the paper and outlines future work.
Cancer rates have recently been rising rapidly, partly due to lifestyle; skin cancers include actinic keratosis, basal cell carcinoma, squamous cell carcinoma and melanoma. Detected early, these cancers are curable, and early detection can save lives. Numerous skin cancer detectors have been built using techniques from computer vision, machine learning and image processing.
Using convolutional neural networks, cancer classification is developed by Esteva et al. [
Klautau et al. [
The novel approach is inspired by image feature representation learning and deep learning, as proposed by Klautau et al. [
Malignant melanoma is the most aggressive and life-threatening form of skin cancer. It involves uncontrolled, rapid growth of skin cells, yet recovery is likely if it is diagnosed in its early stages. The stage at diagnosis determines the survival rate: catching malignant melanoma early can substantially reduce the risk of death [
For early detection of skin lesions, different computer vision and machine learning methodologies are used. Such intelligent systems help differentiate malignant melanoma from benign (non-cancerous) lesions.
These newly explored features have not yet been combined with the ABCD criteria (asymmetry, border irregularity, color and diameter of the skin lesion). Combining all features should enhance classification performance: pairing skin-pattern features with ABCD features increases lesion discrimination compared with using either feature set alone. First, features are extracted from the skin pattern; computational algorithms then calculate the ABCD features; finally, lesion classification is conducted using the individual or combined features and the results are reported [
Boundary information is used for shape analysis in skin lesion classification. Shape classifies skin lesion images as benign (non-melanoma) or malignant, with a shape descriptor encoding a given shape as a set of feature vectors. In the literature, Zhang et al. [
Among the most explored areas of computer science are artificial intelligence, machine learning and computer vision. In machine learning, we train a model on datasets so that it becomes capable of decisions, ultimately providing the most reliable output for a given input. Sometimes, however, the outputs do not meet our requirements [
Several research communities have done substantial work on melanoma detection in dermoscopy images using convolutional neural networks. Different methodologies and techniques have been used to classify skin lesion images, [
Reportedly, millions of people are diagnosed with skin cancer every year, and late diagnosis is a major cause of death in many cases. To improve detection and ensure timely diagnosis, computer-aided diagnostic mechanisms have been designed using deep learning techniques. Thomas et al. [
Another study by Tougaccar et al. [
Pacheco et al. [
Tan et al. [
Srinivasu et al. [
This thorough literature review suggests that hybrid convolutional neural network techniques with local binary patterns have not previously been employed for the detection and multi-class classification of skin lesions via the extraction of local and atomic features. In skin lesion classification especially, local features capture the inner details of the lesion. Precise and accurate classification of medical images (particularly of deadly diseases) requires appropriate, pertinent features extracted from the dataset quickly and with minimal computational resources. We therefore propose a hybridized convolutional neural network with local binary patterns that acquires the local and atomic features of the ISIC dataset to classify skin lesions at multiple levels. In this study, we apply data augmentation, preprocessing to remove hair from the skin lesion images (a combination of filters with morphological operations and inpainting) and contrast enhancement using top-hat and bottom-hat approaches. After preprocessing, we apply our LBPCNN model with different filter sizes to obtain accuracy, sensitivity and specificity results, and compare them with pretrained models.
In convolutional neural networks, adjacent layers are connected to each other: each node is connected to the nodes of the neighboring layer. The input layer holds pixel intensities as values; we use 16,384 input neurons for the 128 × 128-pixel images. Input values pass through randomly initialized weights to the neurons of the next (hidden) layer, and results are finally received by the output neurons.
The main aim of this paper is to classify skin lesion images with a convolutional neural network according to skin cancer type: basal cell carcinoma (BCC), actinic keratosis (AK), melanoma and squamous cell carcinoma (SCC).
The results are obtained from 11,529 images downloaded from the ISIC dataset (a combination of ISIC 2017, 2018 and 2019 (HAM10000)), including 327 images of actinic keratosis, 379 of basal cell carcinoma (BCC) and 153 of squamous cell carcinoma (SCC).
Cancer type | Testing | Training | Total |
---|---|---|---|
Basal Cell Carcinoma (BCC) | 716 | 2113 | 2829 |
A convolutional neural network (CNN) is a special type of network: in machine learning terms, a class of deep (or, as some argue, possibly shallow) feed-forward artificial neural networks applied to analyze visual imagery. CNNs were inspired by biological processes in the visual cortex and require relatively little preprocessing compared with other image classification algorithms. Like any other neural network, a CNN consists of an input layer, an output layer and one or more hidden layers; the hidden layers typically comprise convolutional, pooling, fully connected and normalization layers. CNNs work best on inputs whose elements are related to one another, which makes images very suitable, since in an image almost every pixel is related to the pixels in its neighborhood. In the convolution layers, filtering is applied under certain conditions. A
In this paper, we propose a hybridized convolutional neural network framework and describe the main preprocessing applied to skin cancer images collected from ISIC 2017, 2018 and 2019 (HAM10000). Before using convolutional neural networks, we apply the basic image processing steps depicted in the
Further, we removed noise (hair and unwanted objects) using several filtering algorithms (Gaussian –
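A minimal sketch of this morphology-plus-inpainting style of hair removal, using a bottom-hat response followed by an iterative local-mean fill; the kernel size, threshold and iteration count below are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np
from scipy import ndimage

def remove_hairs(gray, size=7, thresh=10, iters=25):
    """Mask thin dark structures (hairs) and fill them in (illustrative)."""
    # morphological closing removes dark features smaller than the window
    closed = ndimage.grey_closing(gray, size=(size, size))
    # bottom-hat (closing minus image) highlights what the closing filled
    hair_mask = (closed.astype(int) - gray.astype(int)) > thresh
    out = gray.astype(float)
    # crude diffusion inpainting: repeatedly replace masked pixels with
    # their 3x3 local mean (a stand-in for a real inpainting routine)
    for _ in range(iters):
        smoothed = ndimage.uniform_filter(out, size=3)
        out[hair_mask] = smoothed[hair_mask]
    return out.astype(gray.dtype), hair_mask
```

On a synthetic bright patch crossed by a one-pixel dark line, the mask captures the line and the fill restores the background intensity.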
For filtering, we used the input image shown in
After filtering, we applied several edge detection algorithms (Canny and Laplacian). The Laplacian algorithm gave better results than the others (Canny). After edge detection (IMed(
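For reference, the Laplacian response used for edge detection can be computed directly; this is a generic sketch, not the paper's exact configuration:

```python
import numpy as np
from scipy import ndimage

def laplacian_edges(gray):
    # second-derivative (Laplacian) response: ~0 in flat regions,
    # strong positive/negative swings at intensity edges
    return ndimage.laplace(gray.astype(float))
```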
where g
The following basic LBP parameters control feature extraction from an image patch:
Base: the real-valued weights used to encode the LBP descriptor. Pivot: the center value of the patch, against which the neighboring pixel intensities are compared; choosing different intensity values in the patch yields different local textures. Ordering: partially encodes/preserves the local texture for a specific traversal order (clockwise or anti-clockwise) through the choice of neighborhood size (P = 8, 12 and 16) and pivot value.
The rotation-invariant LBP operator returns the minimum value among all circular rotations of the code. The circular right rotation is represented by
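As an illustration of these parameters for P = 8, R = 1, here is a minimal NumPy sketch of the LBP code of a patch and of the rotation-invariant variant (minimum over circular right rotations) described above:

```python
import numpy as np

def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch (P = 8, R = 1), clockwise ordering."""
    center = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    bits = [1 if n >= center else 0 for n in neighbors]  # compare to pivot
    return sum(b << i for i, b in enumerate(bits))

def rotation_invariant(code, P=8):
    """Minimum value over all circular right rotations of the P-bit code."""
    best = code
    for _ in range(P - 1):
        code = ((code >> 1) | ((code & 1) << (P - 1))) & ((1 << P) - 1)
        best = min(best, code)
    return best
```

Any single-bit pattern therefore maps to the same rotation-invariant code, regardless of which neighbor fired.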
In our experiments, LeakyReLU is used in all layers where activation is required; it passes scaled negative activations from neurons that would otherwise be set to 0 with a plain ReLU. For the negative slope, we use a set value of
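A one-line NumPy sketch of LeakyReLU versus ReLU; the negative slope α = 0.01 below is the common default, assumed for illustration rather than taken from the paper:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)  # negative activations are zeroed

def leaky_relu(x, alpha=0.01):
    # negative activations are scaled by alpha instead of being dropped
    return np.where(x > 0, x, alpha * x)
```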
Layer (type) | Output shape | Param # |
---|---|---|
Conv2d_1 (Conv2D) | (None, 128, 128, 64) | 1792 |
Conv2d_2 (Conv2D) | (None, 128, 128, 64) | 36928 |
Lbpconv_1 (Conv2D) | (None, 128, 128, 64) | 36928 |
Batch_normalization_1 | (None, 128, 128, 64) | 256 |
Max_pooling2d_1 | (None, 64, 64, 64) | 0 |
Conv2d_3 (Conv2D) | (None, 64, 64, 128) | 73856 |
Conv2d_4 (Conv2D) | (None, 64, 64, 128) | 147584 |
Lbpconv_2 (Conv2D) | (None, 64, 64, 128) | 147584 |
Batch_normalization_2 | (None, 64, 64, 128) | 512 |
Max_pooling2d_2 | (None, 32, 32, 128) | 0 |
Conv2d_5 (Conv2D) | (None, 32, 32, 264) | 304392 |
Conv2d_6 (Conv2D) | (None, 32, 32, 264) | 627528 |
Conv2d_7 (Conv2D) | (None, 32, 32, 264) | 627528 |
Lbpconv_3 (Conv2D) | (None, 32, 32, 264) | 627528 |
Batch_normalization_3 | (None, 32, 32, 264) | 1056 |
Max_pooling2d_3 | (None, 16, 16, 264) | 0 |
Conv2d_8 (Conv2D) | (None, 16, 16, 264) | 627528 |
Conv2d_9 (Conv2D) | (None, 16, 16, 264) | 627528 |
Conv2d_10 (Conv2D) | (None, 16, 16, 264) | 627528 |
Lbpconv_4 (Conv2D) | (None, 16, 16, 264) | 627528 |
Batch_normalization_4 | (None, 16, 16, 264) | 1056 |
Max_pooling2d_4 | (None, 8, 8, 264) | 0 |
Flatten_1 (Flatten) | (None, 16896) | 0 |
Dense_1 (Dense) | (None, 4096) | 69210112 |
Dropout_1 (Dropout) | (None, 4096) | 0 |
Dense_2 (Dense) | (None, 512) | 2097664 |
Dropout_2 (Dropout) | (None, 512) | 0 |
Dense_3 (Dense) | (None, 1) | 513 |
Total params: | 75,825,401 | |
Trainable params: | 75,823,961 | |
Non-trainable params: | 1,440 |
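The parameter counts in the summary above follow the standard formulas: (k·k·c_in + 1)·c_out for a k × k convolution and (n_in + 1)·n_out for a dense layer, where the +1 accounts for the bias. A quick sanity check against the table:

```python
def conv2d_params(k, c_in, c_out):
    # each of the c_out filters has k*k*c_in weights plus one bias
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    # full weight matrix plus one bias per output unit
    return (n_in + 1) * n_out
```

For example, Conv2d_1 (3 × 3 kernels, 3 input channels, 64 filters) gives 1792 parameters, Conv2d_5 (3 × 3, 128 → 264) gives 304,392 and Dense_1 (16,896 → 4096) gives 69,210,112, matching the table.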
The feature maps obtained from the proposed LBPCNN layer are a linear combination of different intermediate bitmaps and anchor weights. Each bitmap patch is composed by convolving the input map of the image with a predefined kernel and intermediate channel (
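One plausible reading of such a layer, in the spirit of local binary convolutions (Juefei-Xu et al.), uses fixed ±1 difference kernels to produce bitmaps that learned anchor weights combine linearly; the specific kernels and hard binarization below are assumptions for illustration, not the paper's exact specification:

```python
import numpy as np
from scipy import ndimage

def lbp_conv(x, weights):
    """Combine fixed neighbor-minus-center bitmaps with anchor weights.

    x: 2-D feature map; weights: one anchor weight per neighbor offset."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros_like(x, dtype=float)
    for w, (dy, dx) in zip(weights, offsets):
        k = np.zeros((3, 3))
        k[1, 1] = -1.0
        k[1 + dy, 1 + dx] = 1.0  # fixed kernel: neighbor minus center
        bitmap = (ndimage.convolve(x, k, mode='nearest') >= 0).astype(float)
        out += w * bitmap        # learned linear combination of bitmaps
    return out
```

Only the anchor weights would be trained; the difference kernels stay fixed, which is what makes such layers cheap relative to standard convolutions.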
The batch normalization layers in the network minimize internal covariate shift. If we consider
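The batch normalization transform itself is standard: normalize each feature by the batch mean and variance, then apply a learned scale γ and shift β. A NumPy sketch:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each column (feature) of a batch, then scale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta
```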
The literature reports convolutional neural networks with diverse configurations, for example using a filter size of 11 × 11 with stride 4 [
For demonstration and evaluation of the LBPCNN model, we used a Lenovo system with 8 GB RAM, an Intel Core i7-7700 CPU @ 3.60 GHz × 8 and a GeForce GTX 1050 Ti/PCIe/SSE2 graphics card. The implementation uses PyTorch and other Python libraries for preprocessing of the ISIC dataset.
We applied various data augmentation techniques to achieve in-distribution generalization: generating examples that are novel but drawn from the same distribution as the training set. In our proposed LBPCNN model we used sequential rotation (45°), shearing with a factor of 0.2, and horizontal and vertical flipping; the same augmentations were applied for the pretrained models. Data augmentation makes the model rotation-invariant, resolves transformation issues and helps avoid over-fitting on the ISIC dataset. We also addressed the class imbalance problem in the ISIC dataset to achieve better performance.
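A sketch of one such augmentation step using flips and rotation in 45° increments (via SciPy's `ndimage.rotate`); the shear component is omitted here, and the random choices are illustrative rather than the paper's exact pipeline:

```python
import numpy as np
from scipy import ndimage

def augment(img, rng):
    """Randomly flip and rotate a single H x W image array."""
    if rng.random() < 0.5:
        img = np.flip(img, axis=1)  # horizontal flip
    if rng.random() < 0.5:
        img = np.flip(img, axis=0)  # vertical flip
    k = int(rng.integers(0, 8))     # rotation angle: a multiple of 45 degrees
    # reshape=False keeps the output the same size as the input
    img = ndimage.rotate(img, 45.0 * k, reshape=False, mode='nearest')
    return img
```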
For training our proposed model and the pretrained models, we used PyTorch with other Python-based libraries. Our proposed model comprises a combination of convolutional and local binary pattern (LBPCNN) layers with ReLU and batch normalization layers. The approach of the LBPCNN layers is depicted in
We use the local binary pattern convolutional neural network (LBPCNN) in a network model similar to the VGG network, with hidden layers, ReLU, dropout, batch normalization and flatten layers. Our LBPCNN model is depicted in
Using LBPCNN and the preprocessing steps above, we achieved strong results. We visually analyzed the separation and region-growing performance of the proposed LBPCNN,
We passed 128 × 128 images to the input layer and convolved them through 17 different layers (4 LBP and 10 convolutional layers) using ReLU to mitigate the vanishing gradient problem. To overcome covariate shift and speed up training, we used batch normalization in LBPCNN, as is widespread in deep learning models such as Inception and ResNet [
The crux of LBPCNN is to extract local (neighborhood) features based on the current pixel and so rapidly acquire local spatial patterns. After detailed experimentation on the ISIC dataset, we concluded that LBPCNN is the best approach for detection and classification of skin cancer. It extracts local features via the radius (R) and neighborhood (P) parameters, where P is the number of neighbors of a pixel and R is the radius of the neighborhood. We used a 3 × 3 neighborhood to extract the atomic features from the ISIC dataset.
Classifying skin lesions requires the local features of an image, and LBP is a standard local feature extractor. Most images contain only a single lesion, and on such single-lesion images LBPCNN achieves good results. With LBPCNN, the P and R values can be customized to extract discriminative features at different levels.
No previous work offers a specific approach for extracting local features and describing atomic-scale appearance with LBPCNN. The pretrained models mentioned, such as AlexNet, ResNet, DenseNet169, InceptionV3, VGG16 and Xception, use different combinations of convolutional layers with different hyperparameters to achieve accuracy. LBPCNN uses fewer computational resources than these pretrained models while acquiring the most discriminative and prominent features for the detection and classification of skin lesions.
We have trained our proposed novel LBPCNN model with filters of three different sizes (depicted in
The ROC curves and confusion matrices of our proposed novel hybrid LBPCNN model are depicted in
We then changed the filter size of the LBPCNN model and trained and tested with the same datasets. The confusion matrices of training and testing are depicted in
For the third round of training/testing, we modified the filter size (7 × 7) for training, validation and testing with the same dataset. The confusion matrices of training and testing are depicted in
The multi-class classification performance of the proposed novel hybrid CNN (LBPCNN) model was measured for each fold. The 3 × 3 filter gave better accuracy than the other filter sizes (5 × 5 and 7 × 7).
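The per-class (one-vs-rest) metrics reported in the tables below can be derived from a confusion matrix as follows; note that sensitivity and recall are the same quantity, which is why the corresponding rows match:

```python
import numpy as np

def per_class_metrics(cm):
    """One-vs-rest accuracy, sensitivity (recall) and specificity per class.

    cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp      # members of the class that were missed
    fp = cm.sum(axis=0) - tp      # other classes predicted as this one
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)  # identical to recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / cm.sum()
    return accuracy, sensitivity, specificity
```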
To benchmark our proposed hybrid LBPCNN model, we applied pretrained models (AlexNet, ResNet, DenseNet169, InceptionV3, VGG16 and Xception) to the ISIC dataset; the results are depicted in
We have compared the results of our proposed framework/model with existing previous work [
Metric | Training (3 × 3) | Training (5 × 5) | Training (7 × 7) | Testing (3 × 3) | Testing (5 × 5) | Testing (7 × 7) |
---|---|---|---|---|---|---|
Accuracy | 96.53 | 95.64 | 98.37 | 95.64 | 94.14 | |
Precision | 94.32 | 94.19 | 90.63 | 95.95 | 90.08 | 88.13 |
Recall | 95.63 | 92.98 | 93.28 | 96.22 | 89.58 | 85.39 |
Specificity | 97.90 | 97.86 | 96.49 | 98.93 | 97.30 | 96.67 |
Sensitivity | 95.63 | 92.98 | 93.28 | 96.22 | 89.58 | 85.39 |
Accuracy | 96.88 | 96.07 | 95.58 | 97.85 | 94.99 | 93.35 |
Precision | 94.26 | 92.49 | 92.33 | 96.38 | 92.87 | 89.25 |
Recall | 92.77 | 91.31 | 89.68 | 95.54 | 89.06 | 86.87 |
Specificity | 98.19 | 97.61 | 97.53 | 98.69 | 97.31 | 95.89 |
Sensitivity | 92.77 | 91.31 | 89.68 | 95.54 | 89.06 | 86.87 |
Accuracy | 96.61 | 96.19 | 95.38 | 98.42 | 95.92 | 94.60 |
Precision | 94.20 | 92.83 | 91.97 | 96.50 | 91.66 | 89.10 |
Recall | 92.42 | 92.22 | 90.09 | 96.24 | 90.20 | 87.34 |
Specificity | 98.05 | 97.55 | 97.24 | 99.03 | 97.60 | 96.78 |
Sensitivity | 92.42 | 92.22 | 90.09 | 96.24 | 90.20 | 87.34 |
Accuracy | 97.47 | 96.92 | 96.04 | 98.37 | 95.95 | 94.02 |
Precision | 94.36 | 92.83 | 91.87 | 97.15 | 91.54 | 88.42 |
Recall | 96.15 | 95.76 | 93.48 | 97.85 | 96.25 | 93.68 |
Specificity | 97.94 | 97.34 | 96.97 | 98.62 | 95.81 | 94.19 |
Sensitivity | 96.15 | 95.76 | 93.48 | 97.85 | 96.25 | 94.19 |
Overall Accuracy | 94.29 | 93.10 | 91.67 | 96.57 | 91.61 | 88.72 |
Model | Accuracy | Sensitivity | Specificity |
---|---|---|---|
AlexNet | 89.09 | 92.01 | 84.19 |
ResNet | 92.06 | 92.51 | 92.13 |
DenseNet169 | 91.36 | 92.25 | 88.83 |
InceptionV3 | 93.36 | 93.37 | 93.05 |
VGG16 | 89.86 | 92.34 | 84.69 |
Xception | 90.36 | 92.17 | 87.61 |
LBPCNN model (proposed) |
Method | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Capdehourat et al. [ | – | 90 | 77 |
Anas et al. [ | 83.33 | – | – |
Almansour et al. [ | 90.32 | 93.97 | 85.84 |
Khan et al. [ | 95.8 | – | – |
Abbas et al. [ | – | 89.28 | 93.75 |
Isasi et al. [ | 85 | – | – |
Giotis et al. [ | 81 | – | – |
Esteva et al. [ | 72.1 | 96 | – |
Dorj et al. [ | 94.2 | 97.83 | 90.74 |
Ruiz et al. [ | – | 78.43 | 97.87 |
LBPCNN model (proposed) | | | |
In this paper, a local binary pattern with a convolutional neural network was presented to classify melanoma and non-melanoma skin lesions. We used different R and P operators of LBP to obtain different textures from the region of interest (ROI). LBPCNN showed excellent performance during training and testing, with accuracies of 0.97 and 0.98 and sensitivities of 0.95 and 0.96, respectively. The crux of LBPCNN is to extract local (neighborhood) features based on the current pixel and so rapidly acquire local spatial patterns.
In future work, we plan to extract features with handcrafted feature extraction algorithms such as Maximally Stable Extremal Regions (MSER) and Speeded-Up Robust Features (SURF), embed them in custom layers of a convolutional neural network and classify with traditional machine learning algorithms or ensemble classifiers. We also plan to support real-time classification of skin lesions (melanoma or non-melanoma) using a smartphone camera. The developed model will be hosted in a remote cloud to automate skin lesion detection, helping affected patients immediately and reducing clinics' manual workload. After the necessary detailed tests, we will deploy the model in local clinics and hospitals for immediate screening and diagnosis.