Intelligent Breast Cancer Prediction Empowered with Fusion and Deep Learning

Abstract: Breast cancer is the most frequently detected tumor and can eventually lead to a significant increase in female mortality worldwide. According to clinical statistics, one woman in eight is at risk of breast cancer. Lifestyle and inheritance patterns may contribute to its prevalence among women. However, preventive measures such as screening tests and periodic clinical checks can mitigate the risk and substantially improve the chances of survival. Early diagnosis and initial-stage treatment help increase the survival rate. For that purpose, pathologists can draw on nondestructive and efficient computer-aided diagnosis (CAD) systems. This study explores a breast cancer CAD method that relies on multimodal medical imaging and decision-based fusion. In multimodal medical imaging fusion, a deep learning approach is applied, obtaining 97.5% accuracy with a 2.5% miss rate for breast cancer prediction. A deep extreme learning machine technique applied to feature-based data provided 97.41% accuracy. Finally, decision-based fusion applied to both breast cancer prediction models to diagnose its stages resulted in an overall accuracy of 97.97%. The proposed system model provides more accurate results than other state-of-the-art approaches and diagnoses breast cancer rapidly, which can help decrease the mortality rate.


Introduction
Breast cancer is a major health threat and a significant factor in female mortality. Its incidence is increasing steadily, and it has become the second-leading disease because of its rapid spread among women worldwide [1]. Early diagnosis can effectively reduce the risk of mortality and, combined with proper treatment, improves survival because breast cancer is one of the most curable malignancies when detected early [2]. However, the detection procedure is expensive and time-consuming, and the diagnosis depends on the consistency and knowledge of medical examiners [3]. The two primary types of tumors are benign (noncancerous) and malignant (cancerous), and both are further divided into subtypes with distinct properties. Malignant tumors are considered life-threatening, whereas benign tumors are not typically harmful [4]. Human examiners are prone to oversights, and a misdiagnosis can allow the disease to progress to an incurable stage. Hence, a fast and efficient computer-aided diagnosis technique can serve as an assistive tool for the early and accurate detection of breast cancer. Mammogram screening assists the early detection of breast cancer and enables physicians to make accurate decisions regarding treatment [5].
In breast cancer detection, computer-aided diagnosis (CAD) is an assistive tool for early detection. CAD techniques are used effectively on mammograms to decrease the burden on medical experts and reduce the misdiagnosis of breast cancer [6]. Recently, artificial intelligence technology has been applied to various machine learning and computer vision problems. The deep learning (DL) approach has been used in many scientific and engineering applications and has improved their performance [7,8]. Recently, a variety of DL technologies have been successfully adopted in the medical domain in the prediction of heart disease, infant brain [9], lung detection [10], and breast cancer classification [11].
Moreover, DL has been used for many types of cancer and has achieved significant success in breast cancer screening [12]. With the emergence of DL, many studies have considered deep architectures, among which the convolutional neural network (CNN) is a prominent type [13]. Araujo et al. applied a CNN to categorize breast biopsy images to diagnose breast cancer. The accuracy of their model was 83.3% for distinguishing cancerous from noncancerous tissue, whereas the accuracy was 77.8% for the four-class problem of invasive carcinoma, carcinoma in situ, benign, and normal tissues [14].
In another study, Yao et al. [15] presented a novel DL model to classify histological images into four classes. This model extracted image features using a combination of a CNN and a recurrent neural network (RNN), and the extracted features were then used as input to the RNN. The fusion method was applied to three datasets of histological images. Similarly, Wang et al. adopted a CNN and a hybrid CNN with a support vector machine (SVM) model to classify a breast cancer histological image dataset. Their multi-model approach achieved 92.5% accuracy [16].

Literature Review
The focus of research efforts related to the prevalence of breast cancer is primarily on the diagnosis and detection of tumors. This section of the research briefly summarizes the existing research on breast cancer. During the last decade, research related to this topic has increased, and various computer-based systems have been developed to overcome this challenge.
Wang et al. [17] suggested the mammogram as an essential element in CAD for early breast cancer diagnosis and treatment. They designed a model based on feature fusion with CNN deep features for the detection of breast cancer. The model consists of three phases. In the first phase, an unsupervised extreme learning machine (ELM) and CNN deep features are used for mass detection. In the second phase, deep, density, texture, and morphological features are combined to create a feature set. In the third phase, ELM classifiers use the fused feature set to classify breast masses as malignant or benign. The experimental results showed that the ELM classifier handles multidimensional feature classification well.
Chiang et al. [18] proposed a computer-aided detection system to detect tumors using breast ultrasound. The proposed system is based on a 3D CNN and prioritized candidate aggregation. First, the volumes of interest are extracted using a sliding window method. Then, a 3D CNN is used to estimate the tumor probability for each volume of interest. Volumes with a high estimated probability are marked as tumor candidates, and these candidates may overlap one another. To merge overlapping candidates, a dedicated aggregation scheme was designed in which the tumor probability is used to prioritize the candidates. In experiments on 171 tumors, the proposed model obtained a sensitivity of 95% with an execution time of less than 22 s, which the authors report as faster than existing approaches.
Agnes et al. [19] proposed a model called multiscale all CNN (MA-CNN) for the detection of breast cancer. The CNN approach was used in the MA-CNN model to classify mammogram images, and the classifier benefits from the multiscale features obtained from those images. The proposed MA-CNN model classifies mammographic images into benign and malignant classes, and the experimental results indicated that it is a powerful tool for the detection of breast cancer using mammogram images.
Shen et al. [20] developed a model based on a DL algorithm for the detection of breast cancer. The proposed model uses end-to-end training techniques to screen mammograms. With this approach, annotations are required only in the initial training stage, and the remaining stages require only image-level labels. The CNN was used to classify screening mammograms, and a breast mammography dataset was used to train and evaluate the proposed model. The results reveal that the proposed model achieved high accuracy compared with other existing work on heterogeneous mammography. Zhou et al. [21] presented a CNN-based radiomics approach to detect breast cancer. The proposed model applies shear-wave elastography data to obtain important morphology information and performs feature extraction through a CNN. The training set comprised 540 images, of which 318 were malignant and 222 were benign. The experimental results indicated 95.8% accuracy, 95.7% specificity, and 96.2% sensitivity.
Qiyuan et al. [22] stated that multiparametric magnetic resonance imaging increased the performance of radiologists in analyzing breast cancer. The study used 927 images, and a pretrained CNN model was applied for feature extraction. An SVM classifier was trained to distinguish benign from malignant images using the CNN-extracted features. Fusion at several levels (feature fusion, image fusion, and classifier fusion) was examined, achieving an accuracy of 95%.
Chaves et al. [23] stated that the early diagnosis of breast cancer can increase the chances of successful treatment and cure. According to the researchers, infrared thermography is an essential and promising technique to detect breast cancer. It was found to be low cost and less harmful than radiation-based imaging when applied to young women. The authors applied pretrained transfer learning CNNs to detect breast cancer on infrared images. The dataset consisted of 440 infrared images divided into two classes, normal and pathology. The experimental results reveal that a CNN with infrared images can play a pivotal role in the early detection of breast cancer. George et al. [24] proposed a model for the detection of breast cancer based on nucleus feature extraction by a CNN. The CNN was used to extract features from images, and SVMs with a feature fusion technique were applied to the extracted CNN features to categorize breast cancer images. With the help of the feature fusion method, the local nucleus features were transformed into compact image features that improve performance. The proposed model achieved promising results, with 96.66% accuracy, compared with other existing approaches.
Kadam et al. [25] proposed feature ensemble learning based on softmax regression and sparse autoencoders to detect breast cancer. The model classifies tumors into benign and malignant classes. The Wisconsin breast cancer dataset was used in this study, and the model achieved 98.60% accuracy. The findings were also compared with previous work, and the statistical analysis indicated that the model is useful for breast tumor classification. Singh et al. [26] used an SVM classifier for detecting breast cancer. The proposed method was tested on a database and achieved 92.3% accuracy with a cubic SVM classifier.
Motivated by the previous research, this study focuses on cloud and decision-based fusion for an intelligent breast cancer prediction system using a hierarchical DL approach. DL approaches are implemented to train and evaluate the proposed model. This research is significant because it fuses different datasets and draws its conclusions from the fused dataset. The proposed approach is also generalizable to other datasets and should benefit future researchers in the medical field, especially for the early prediction of breast cancer.

Proposed CF-BCP System Model
The cloud and decision-based fusion model for an intelligent breast cancer prediction system using hierarchical DL (CF-BCP) is proposed to provide diagnoses. The proposed CF-BCP training model uses two types of datasets: image based and feature based. The multimodal medical image fusion technique was applied to the image dataset, and a preprocessing technique was used to remove noise in the image data acquisition layer. The moving average technique was used to handle missing values in the electronic medical record (EMR) dataset.
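The moving average step for missing EMR values can be illustrated with a minimal sketch. The source does not give the window length or state whether the window is trailing or centered, so the trailing window of three values, the function name `impute_moving_average`, and the sample column below are all assumptions made for illustration.

```python
from typing import List, Optional


def impute_moving_average(values: List[Optional[float]], window: int = 3) -> List[float]:
    """Replace each missing entry (None) with the mean of up to `window` preceding values.

    Previously imputed values count as preceding values, so a run of
    consecutive gaps is filled progressively. (Assumed behavior.)
    """
    filled: List[float] = []
    for value in values:
        if value is None:
            recent = filled[-window:]
            if not recent:
                # A trailing window has nothing to average at the very start.
                raise ValueError("cannot impute a leading missing value")
            value = sum(recent) / len(recent)
        filled.append(float(value))
    return filled


# Hypothetical feature column with two missing measurements.
column = [12.0, 14.0, None, 16.0, None]
imputed = impute_moving_average(column)
print(imputed)
```

A trailing window is used here only because it needs no look-ahead; a centered window would be an equally plausible reading of the text.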
In the application layer, the CNN is applied for breast cancer prediction on the image dataset. In the evaluation layer, the accuracy and miss rate of the proposed CF-BCP model were investigated. For the EMR data, the deep ELM (DELM) was applied for breast cancer prediction. If the learning criteria are not met in both cases, the system must retrain; if the learning criteria are met, the data are stored on the cloud, and the next step is the decision-based fusion empowered with fuzzy logic. This decision-based fusion determines whether the fused images are benign or malignant. In the second training layer, the fused malignant data are used to detect breast cancer types, such as ductal, lobular, mucinous, and papillary carcinoma, and the results are also stored on the cloud, as illustrated in Fig. 1.
After the preprocessing step, the proposed CF-BCP model imports fused data from the cloud to predict breast cancer. If breast cancer is not detected, then the model discards the prior part. If breast cancer is detected, then a fused database for intelligent breast cancer prediction (activation layer) is created, and a breast cancer stage prediction model is imported from the cloud. After the detection of breast cancer stages, the patient is referred to the hospital for further treatment, as presented in Fig. 2.

Convolutional Neural Network
Deep learning (DL) is a widespread technique applied in various fields ranging from lifespan to forecasting transport, diseases, agriculture, stock markets, and so on. Moreover, DL is very helpful in different areas due to its fast learning procedure.
The CNN involves two kinds of layers: convolutional and pooling layers. The proposed CF-BCP uses two CNN layers. The first layer is used for diagnosing breast cancer, and the second layer is used for predicting breast cancer types. Moreover, the CNN comprises three parts: the input, hidden, and output layers. The input images are resized to 700 × 460 × 3, where 700 × 460 represents the width and height of the fused input images, and 3 represents the number of channels. The convolutional layer performs most of the computation. Its purpose is to extract features by applying filters while preserving the spatial relationships among pixels. The pooling layer reduces the dimensions of the fused images and therefore the computational time. Two types of pooling are commonly used: max pooling and average pooling.
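The difference between the two pooling types can be shown on a small feature map. The following is a minimal sketch in plain Python; the 4 × 4 input, the 2 × 2 non-overlapping window, and the helper names `pool2d` and `average` are assumptions chosen for illustration and are not taken from the CF-BCP implementation.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pool2d(image: Matrix, size: int, reduce: Callable[[List[float]], float]) -> Matrix:
    """Apply non-overlapping size x size pooling using the given reduction."""
    rows, cols = len(image), len(image[0])
    pooled: Matrix = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect every value inside the current window.
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(reduce(window))
        pooled.append(pooled_row)
    return pooled


def average(window: List[float]) -> float:
    return sum(window) / len(window)


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [5.0, 7.0, 4.0, 2.0],
    [0.0, 2.0, 8.0, 6.0],
    [4.0, 2.0, 4.0, 2.0],
]

avg_pooled = pool2d(feature_map, 2, average)  # [[4.0, 2.0], [2.0, 5.0]]
max_pooled = pool2d(feature_map, 2, max)      # [[7.0, 4.0], [4.0, 8.0]]
print(avg_pooled, max_pooled)
```

Both variants halve each spatial dimension. Average pooling keeps a smoothed summary of each window, whereas max pooling keeps only the strongest response.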
As shown in Fig. 3, the average pooling layer is used in the proposed CF-BCP model. This layer preserves the specific image features captured in the CNN procedure. Additionally, all inputs pass through the rectified linear unit (ReLU) activation function before the fully connected layer. The output of the last convolutional layer is flattened into a single vector and passed to the fully connected layer to obtain the final output. The softmax layer is applied to transform logits into probabilities. In the last layer of the CNN model, the accuracy values are marked.
Here, $m$ denotes the number of classes, which depends on the application. The softmax transformation is shown in Eq. (2), where $C_m$ denotes the logits, which are converted into probabilities using softmax. The logits are computed as $C_a = \sum_{n=1}^{\eta} T_{n,a} B^{l}_{n}$, that is, $C_a$ is obtained by applying the interconnecting weights $T_{n,a}$ to the inputs $B^{l}_{n}$. Next, we find the loss w.r.t. the weights, which consists of the two summations shown in Eq. (3), where $n = 1$ to $k$ and $l = 1$ to $z$. Moreover, when $m \neq l$ (a unit other than the $l$th unit), this indicates a low probability.
Eqs. (5) and (6) can then be summarized. The cross-entropy loss does not contain $C^{l}_{a}$ explicitly; therefore, the partial derivative of $\log(w_{u})$ w.r.t. $C^{l}_{a}$ is taken through the chain rule. In the resulting expression, $\partial w^{l}_{u} / \partial C^{l}_{a}$ has already been calculated, as shown in Eq. (7), and Eq. (8) is divided into two parts, which are then simplified step by step using $\partial C^{l}_{a} / \partial T^{l}_{n,a} = B^{l}_{n}$, the input associated with each weight. Eq. (9) presents the derivative of the loss w.r.t. the weights for the fully connected layer.
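The derivation above ends in the gradient of the cross-entropy loss with respect to the fully connected weights. The following is a minimal numerical sketch of the standard textbook result, $\partial L / \partial T_{n,a} = B_n (w_a - y_a)$ for a one-hot target $y$, checked against finite differences. The displayed equations did not survive in this text, so this closed form is assumed to be what Eq. (9) expresses. The names `T`, `B`, and `y` mirror the symbols in the text, and their values are made up for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(logits: List[float]) -> List[float]:
    """Convert logits into probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(c - shift) for c in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_from(weights: Matrix, inputs: List[float]) -> List[float]:
    """C_a = sum_n T[n][a] * B[n]."""
    return [
        sum(weights[n][a] * inputs[n] for n in range(len(inputs)))
        for a in range(len(weights[0]))
    ]


def cross_entropy(weights: Matrix, inputs: List[float], target: List[float]) -> float:
    probs = softmax(logits_from(weights, inputs))
    return -sum(y * math.log(p) for y, p in zip(target, probs))


def analytic_gradient(weights: Matrix, inputs: List[float], target: List[float]) -> Matrix:
    """dL/dT[n][a] = B[n] * (w[a] - y[a]); assumes the target sums to 1."""
    probs = softmax(logits_from(weights, inputs))
    return [[b * (p - y) for p, y in zip(probs, target)] for b in inputs]


def numeric_gradient(weights: Matrix, inputs: List[float], target: List[float],
                     eps: float = 1e-6) -> Matrix:
    """Central finite differences, used only to check the closed form."""
    grad: Matrix = []
    for n in range(len(weights)):
        row = []
        for a in range(len(weights[0])):
            plus = [r[:] for r in weights]
            minus = [r[:] for r in weights]
            plus[n][a] += eps
            minus[n][a] -= eps
            diff = cross_entropy(plus, inputs, target) - cross_entropy(minus, inputs, target)
            row.append(diff / (2 * eps))
        grad.append(row)
    return grad


# Hypothetical weights (3 inputs x 2 classes), inputs, and one-hot target.
T = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
B = [1.0, 2.0, -1.0]
y = [0.0, 1.0]

exact = analytic_gradient(T, B, y)
approx = numeric_gradient(T, B, y)
max_gap = max(abs(e - a) for er, ar in zip(exact, approx) for e, a in zip(er, ar))
print(f"largest difference between the two gradients: {max_gap:.2e}")
```

The agreement between the two gradients shows why the chain-rule product collapses to the input times the difference between the predicted probability and the target.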

Deep Extreme Learning Machine
The DELM is a significant method that is primarily used for prediction. Fig. 4 represents the DELM architecture that consists of an input layer, hidden layers, and output layer. In the input layer, the various features are used as input. Six hidden layers are used in the proposed CF-BCP model.
For the mathematical DELM, Eq. (10) presents the input layer, and Eq. (11) represents the output of the first layer. Eq. (12) gives the feedforward propagation from the second layer to the output layer. The error in backpropagation is written in Eq. (15), where $\text{Target}_{l}$ and $\gamma^{k=6}_{l}$ represent the desired and calculated outputs, respectively.
Eq. (16) reflects the rate of the weight change for the output layer. It is expanded by applying the chain rule, as in Eq. (17). After substituting Eq. (17), the modified weight value is obtained as shown in Eq. (18), where $\mu^{k}_{l} = (\text{Target}_{l} - \gamma^{k}_{l}) \times \gamma^{k}_{l}(1 - \gamma^{k}_{l})$ for the output layer and $\mu^{k}_{i} = \left(\sum_{l} \mu^{k}_{l}\, x^{k}_{i,l}\right) \times \gamma^{k}_{i}(1 - \gamma^{k}_{i})$ for the hidden layers.
Eq. (19) updates the weights and biases between the output and hidden layers:
$x^{+,\,k=6}_{i,l} = x^{k=6}_{i,l} + \delta_{e}^{k=6}\, \Delta\gamma^{k=6}_{i,l}$ (19)
In Eq. (20), the weight and bias changes between the input and hidden layers are represented:
$\beta^{+,\,k}_{j,i} = \beta^{k}_{j,i} + \delta_{e}^{k}\, \Delta\beta^{k}_{j,i}$ (20)
where $\delta_{e}$ is the learning rate of the BCP-DELM, and its value is between 0 and 1. The convergence of the BCP-DELM depends on the careful selection of $\delta_{e}$.
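The update rules above follow the usual delta rule for a sigmoid network. A minimal sketch under stated assumptions: six hidden layers as in the text, but with layer widths, initial weights, the single training sample, and the learning rate all chosen arbitrarily. Biases are omitted for brevity, the weight increment is taken to be the error term times the incoming activation, and the function names are hypothetical. This is not the authors' MATLAB implementation.

```python
import math
import random
from typing import List

Layer = List[List[float]]  # layer[i][l]: weight from unit i to unit l


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights: List[Layer], inputs: List[float]) -> List[List[float]]:
    """Return the activations of every layer, starting with the inputs."""
    activations = [inputs]
    for layer in weights:
        prev = activations[-1]
        activations.append([
            sigmoid(sum(prev[i] * layer[i][l] for i in range(len(prev))))
            for l in range(len(layer[0]))
        ])
    return activations


def train_step(weights: List[Layer], inputs: List[float],
               target: List[float], rate: float) -> float:
    """One backpropagation update; returns the squared error before the update."""
    acts = forward(weights, inputs)
    out = acts[-1]
    # Output-layer term: (Target - gamma) * gamma * (1 - gamma).
    mu = [(t - g) * g * (1.0 - g) for t, g in zip(target, out)]
    for k in range(len(weights) - 1, -1, -1):
        layer, prev = weights[k], acts[k]
        if k > 0:
            # Hidden-layer term, computed with the weights before they change.
            mu_prev = [
                sum(mu[l] * layer[i][l] for l in range(len(mu))) * prev[i] * (1.0 - prev[i])
                for i in range(len(prev))
            ]
        else:
            mu_prev = []  # the input layer has no error term
        for i in range(len(prev)):
            for l in range(len(mu)):
                layer[i][l] += rate * mu[l] * prev[i]  # weight + rate * increment
        mu = mu_prev
    return 0.5 * sum((t - g) ** 2 for t, g in zip(target, out))


rng = random.Random(0)
sizes = [4, 5, 5, 5, 5, 5, 5, 1]  # 4 inputs, six hidden layers, 1 output (assumed widths)
weights = [
    [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
    for n_in, n_out in zip(sizes, sizes[1:])
]
sample, target = [0.2, 0.7, 0.1, 0.9], [1.0]
errors = [train_step(weights, sample, target, rate=0.1) for _ in range(200)]
print(f"error before: {errors[0]:.4f}, after: {errors[-1]:.4f}")
```

The learning rate plays the role of $\delta_{e}$. A value that is too large can make the error oscillate, which is why the text stresses choosing it carefully.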

Decision-Based Fusion Empowered with Fuzzy Logic
The proposed decision-based fusion model empowered with fuzzy logic is based on knowledge, expertise, and logical reasoning ability. The fuzzy logic model can manage the uncertainty and imprecision of the data. Fig. 5 represents the decision-based fusion lookup diagram for the detection of breast cancer and its types. If both CNN Layer 1 and DELM Layer 1 predict benign, there is no chance of breast cancer, and the case is presented as benign. If CNN Layer 1 is benign and DELM Layer 1 is malignant, then the breast cancer is diagnosed as malignant, and in all other conditions, malignancy is detected.
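The lookup described above can be sketched as a small rule function. This is a simplified, crisp version: the membership functions of the fuzzy system are not given in the text, and the benign-only-if-both-benign rule is inferred from the statement that every other combination yields malignancy. The function name and label strings are assumptions for illustration.

```python
VALID_LABELS = {"benign", "malignant"}


def fuse_decisions(cnn_label: str, delm_label: str) -> str:
    """Combine the Layer 1 CNN and DELM predictions into one diagnosis.

    Assumed rule: benign only when both models say benign; any
    disagreement or joint malignant prediction yields malignant.
    """
    if cnn_label not in VALID_LABELS or delm_label not in VALID_LABELS:
        raise ValueError("labels must be 'benign' or 'malignant'")
    if cnn_label == "benign" and delm_label == "benign":
        return "benign"
    return "malignant"


# Enumerate the full lookup table.
for cnn in ("benign", "malignant"):
    for delm in ("benign", "malignant"):
        print(f"CNN={cnn:9s} DELM={delm:9s} -> {fuse_decisions(cnn, delm)}")
```

Resolving disagreements toward malignant favors sensitivity: a case flagged by either model is passed on to the type-prediction stage rather than dismissed.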

Proposed CF-BCP Results
The proposed cloud and decision-based fusion for an intelligent breast cancer prediction system using a hierarchical DL (CF-BCP) model was developed for the earliest possible prediction of breast cancer and its severity. The results were obtained using MATLAB (2019a) simulations. The proposed CF-BCP model consists of two DL approaches: the CNN and the DELM. In Layer 1, the CNN and DELM approaches were applied to 7909 and 569 fused samples, respectively. For both approaches, 80% of the fused samples were used for training, and 20% were used for validation. The accuracy and miss rates of the proposed CF-BCP model were compared with those of existing state-of-the-art techniques. The proposed CF-BCP model diagnoses breast cancer as benign or malignant, where benign represents no breast cancer, and malignant represents breast cancer.
Tab. 3 lists the results of the proposed CF-BCP Layer 2 for predicting breast cancer types during the training phase. In total, 4344 fused samples (80%) were used during the training phase, divided into 2761, 501, 634, and 448 fused samples of malignant T1, T2, T3, and T4, respectively. Of the 2761 malignant T1 samples, 2747 were predicted correctly and 14 incorrectly. Of the 501 malignant T2 samples, 485 were predicted correctly and 16 incorrectly. Of the 634 malignant T3 samples, 609 were predicted correctly and 25 incorrectly. Of the 448 malignant T4 samples, 432 were predicted correctly and 16 incorrectly.
Tab. 4 presents the results of the proposed CF-BCP Layer 2 for the prediction of breast cancer types during the validation phase. In total, 1085 fused samples (20%) were used during the validation phase, divided into 690, 125, 158, and 112 fused samples of malignant T1, T2, T3, and T4, respectively. Of the 690 malignant T1 samples, 686 were predicted correctly and 4 incorrectly. Of the 125 malignant T2 samples, 121 were predicted correctly and 4 incorrectly. Of the 158 malignant T3 samples, 149 were predicted correctly and 9 incorrectly. Of the 112 malignant T4 samples, 107 were predicted correctly and 5 incorrectly.
Tab. 5 lists the overall performance of the proposed CF-BCP model for the training and validation phases. The proposed CF-BCP model achieved 98.37% overall accuracy and a 1.63% miss rate in the training phase. For the validation phase, the proposed CF-BCP model obtained 97.97% overall accuracy and a 2.03% miss rate.
Figure 7: Accuracy chart contrasted with state-of-the-art approaches for the proposed CF-BCP model
Fig. 7 contrasts state-of-the-art methods with the proposed CF-BCP model. The proposed CF-BCP Layer 1 model achieved 97.41% accuracy for the detection of breast cancer, which is better than the existing approaches. The proposed CF-BCP Layer 2 model also detects breast cancer types, achieving 97.97% accuracy in the validation phase.
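The overall Layer 2 figures can be recomputed from the per-class counts reported for the training and validation phases. The sketch below assumes that accuracy is the share of correctly predicted samples and that the miss rate is its complement, which matches the reported pairs summing to 100%. The function and variable names are illustrative.

```python
from typing import Dict, Tuple


def accuracy_and_miss_rate(correct: Dict[str, int], total: Dict[str, int]) -> Tuple[float, float]:
    """Return (accuracy %, miss rate %) over all classes combined."""
    accuracy = 100.0 * sum(correct.values()) / sum(total.values())
    return accuracy, 100.0 - accuracy


# Per-class counts as reported in the text for malignant types T1 to T4.
train_total = {"T1": 2761, "T2": 501, "T3": 634, "T4": 448}
train_correct = {"T1": 2747, "T2": 485, "T3": 609, "T4": 432}
valid_total = {"T1": 690, "T2": 125, "T3": 158, "T4": 112}
valid_correct = {"T1": 686, "T2": 121, "T3": 149, "T4": 107}

train_acc, train_miss = accuracy_and_miss_rate(train_correct, train_total)
valid_acc, valid_miss = accuracy_and_miss_rate(valid_correct, valid_total)
print(f"training:   {train_acc:.2f}% accuracy, {train_miss:.2f}% miss rate")    # 98.37, 1.63
print(f"validation: {valid_acc:.2f}% accuracy, {valid_miss:.2f}% miss rate")  # 97.97, 2.03
```

The recomputed values agree with the reported overall performance: 4273 of 4344 training samples and 1063 of 1085 validation samples were classified correctly.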

Conclusion
Rapidly spreading breast cancer has widely affected women's lives. Reliable and early detection reduces the breast cancer death rate, and CAD systems are highly assistive for medical practitioners in diagnosing breast tumors. Therefore, researchers have focused on early detection and proper treatment to increase the chances of survival. The significant contribution of the current study is a novel detection model consisting of cloud and decision-based fusion for an intelligent breast cancer prediction system using a hierarchical DL approach to diagnose breast cancer. Another important contribution of the proposed CF-BCP model is its potential to diagnose different types of breast cancer. The CF-BCP model achieves an accuracy of 97.41% for multimodal medical imaging fusion in detecting breast cancer phases and 97.97% accuracy in detecting breast cancer types after decision-based fusion empowered with fuzzy logic. The results of the proposed CF-BCP model were compared with state-of-the-art approaches, and the comparison indicates that the proposed model could greatly increase the efficiency and productivity of medical practitioners. The suggested model is generalizable to new datasets because of its flexible and extendable nature.