A Survey of Convolutional Neural Network in Breast Cancer

Problems Cancer is one of the most feared diseases worldwide, a major obstacle to improving life expectancy, and one of the leading causes of death before the age of 70 in 112 countries. Among all cancers, breast cancer is the most common cancer in women; recent data show that female breast cancer has become one of the most commonly diagnosed cancers. Aims A large number of clinical trials have shown that diagnosing breast cancer at an early stage gives patients more treatment options and improves both treatment outcomes and survival. Based on this situation, many diagnostic methods for breast cancer have been developed, such as computer-aided diagnosis (CAD). Methods After reviewing a large number of recent papers, we present a comprehensive review of the diagnosis of breast cancer based on the convolutional neural network (CNN). Firstly, we introduce several different imaging modalities. The structure of CNN is given in the second part. After that, we introduce some public breast cancer data sets. Then, we divide the diagnosis of breast cancer into three tasks: 1. classification; 2. detection; 3. segmentation. Conclusion Although CNN-based diagnosis has achieved great success, some limitations remain. (i) There are too few good data sets: building a good public breast cancer dataset involves many aspects, such as professional medical knowledge, privacy issues, financial issues, and dataset size. (ii) When the data set is very large, CNN-based models need a great deal of computation and time to complete the diagnosis. (iii) Small data sets easily cause overfitting.


Introduction
For people all over the world, cancer is one of the most feared diseases and one of the major obstacles to improving life expectancy in countries around the world [1][2][3]. According to the survey, cancer is one of the biggest causes of death before the age of 70 in 112 countries. At the same time, cancer is the third or fourth leading cause of death in another 23 countries [4][5][6][7].
Among all kinds of cancers, breast cancer is the most common cancer for women [8][9][10][11][12]. According to the data from the United States in 2017, there were more than 250,000 new cases of breast cancer [13]. 12% of American women may get breast cancer in their lifetime [14]. The data surveyed in 2020 showed that female breast cancer had become one of the most common cancers [4].
A large number of clinical trials have proved that if breast cancer is diagnosed at an early stage, patients have more treatment options, better treatment outcomes, and better survival [8,[15][16][17]]. Therefore, there are many diagnostic methods for breast cancer, such as biopsy [18].
The image of breast cancer is shown in Fig. 1. Invasive carcinoma and carcinoma in situ are the two types of breast cancer [19]. Carcinoma in situ does not spread beyond the tissue where it began. About one-third of newly diagnosed breast cancers are carcinoma in situ [20]. Most newly diagnosed breast cancer is invasive. Invasive cancer begins in the mammary duct and can spread to other breast sites [21].

Figure 1: The breast cancer image [22]

Sometimes, breast cancer images are divided into two categories: benign and malignant. Images of benign tumors and malignant tumors are given in Figs. 2 and 3. Several imaging modalities are used for the diagnosis and analysis of breast cancer [23][24][25]. The abbreviated imaging modality table is given in Table 1. (i) Screen-film mammography (SFM) is one of the most important imaging modalities for early breast cancer detection [26]. However, SFM has some disadvantages. First, its sensitivity is low for breasts with dense glandular tissue [27]. This disadvantage may be caused by the film itself: once the film is exposed, it cannot be adjusted, so images with low contrast sometimes result [28]. Furthermore, SFM is not digital. (ii) Digital mammography (DM) is one of the effective imaging modalities for early breast cancer detection [29,30] and has long been the standard imaging modality for female breast cancer diagnosis and detection [31]. However, DM has some limitations. Its specificity is low, which can lead to unnecessary biopsies [32]. Another limitation of DM is that patients may face high radiation exposure [27], which can pose a health hazard. (iii) Magnetic resonance imaging (MRI) is suitable for clinical diagnosis and high-risk patients [33] and is very sensitive to breast cancer [20]. MRI still has some problems: compared with DM, the MRI detection cost is higher [34], and although MRI has high sensitivity, its specificity is low [35]. (iv) Ultrasound (US) is one of the most common methods for the detection of breast cancer. US involves no ionizing radiation [36]; therefore, compared with SFM and DM, US is safer and has lower costs [37]. However, US is an operator-dependent imaging modality [38].
Therefore, the success of US in detecting and differentiating breast cancer lesions largely depends on the operator. (v) Digital breast tomosynthesis (DBT) is a different imaging modality. Compared with traditional mammography, DBT takes less time for imaging [39] and provides more details of the dense breast [40]. One problem with DBT is that it may miss malignant calcifications when they lie at the slice plane [41]. DBT images also take more time to read than DM images [42]. (vi) Histopathological (HP) images capture information about cell shape and structure [43]. However, obtaining them is invasive and requires additional costs [44]. The details of these different imaging modalities are presented in Table 2.
Medical imaging is usually analyzed manually by experts (pathologists, radiologists, etc.) [45]. From the overview above, manual analysis of medical imaging has several problems [46]. Firstly, experts are required to analyze the images manually, but there are few experts in this field in many developing countries [47]. Secondly, the process of manual analysis is long and cumbersome [48]. Thirdly, when experts manually analyze medical imaging, they can be influenced by external factors, such as lack of rest and decreased attention [27].
With the continuous progress of computer science, computer-aided diagnosis (CAD) models for breast cancer have become a hot prospect [49]. Scientists have been studying CAD models for breast cancer for more than 30 years [50,51]. CAD models for breast cancer have the following advantages [52]: (i) CAD models can improve specificity and sensitivity [53]. (ii) Unnecessary examinations can be omitted [54], which shortens the diagnosis time and reduces the cost. (iii) CAD models can reduce the mortality rate by 30% to 70% [13]. With the development of computing power, the convolutional neural network (CNN) has become one of the most popular methods for the diagnosis of breast cancer [55][56][57]. Recently, a large number of research papers have been published about breast cancer diagnosis based on CNN [58][59][60][61]. However, each of these papers proposes only one or a few methods, which cannot give readers a full picture of CNN-based breast cancer diagnosis. Therefore, we complete a comprehensive review of the diagnosis of breast cancer based on CNN after reviewing a large number of recent papers. In this paper, readers can not only see the CNN-based diagnostic methods for breast cancer of recent decades but also learn the advantages and disadvantages of these methods and future research directions. The main contributions of this survey are given as follows: • A large number of major papers about the diagnosis of breast cancer based on CNN are reviewed in this paper to provide a comprehensive survey. • This survey presents the advantages and disadvantages of these state-of-the-art methods.
• A presentation of significant findings shows readers the opportunities available for future research. • We give the future research directions and critical challenges of CNN-based diagnostic methods for breast cancer.
The rest of this paper is organized as follows: Section 2 discusses CNN. Section 3 introduces the breast cancer data sets. Section 4 presents the application of CNN in breast cancer. The conclusion is given in Section 5.

Figure: The typical CNN structure, consisting of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer

The convolution layer is one of the most important components of CNN and usually connects to the input layer [104][105][106][107][108]. The input is scanned by the convolution layer with convolution kernels to extract features. Different convolution kernels extract different features from the same input [109]. There may be multiple convolution layers in a CNN model [110]. Basic features are usually extracted by the front convolution layers, while the convolution layers at the back are more likely to extract advanced features [88].

We first define the parameters of the convolution layer: the input image size is I × I, the convolution kernel is K × K, S represents the stride, the padding is P, and the output size is O × O. Padding refers to additional zero-valued pixels used to pad around the input image [104,[111][112][113]]. Stride refers to the step size of each slide of the convolution kernel [114][115][116]. The formula is shown below:

O = (I − K + 2P) / S + 1

Fig. 7 gives a sample of convolution. In Fig. 7, the stride and padding are set to 1 and 0, respectively. With I = 7, K = 3, P = 0, S = 1, we get O = (7 − 3 + 0)/1 + 1 = 5.
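The output-size formula can be checked with a short sketch (a minimal illustration of the formula above, not code from any surveyed paper; the function name `conv_output_size` is our own):

```python
# Output size of a convolution layer: O = (I - K + 2P) / S + 1
def conv_output_size(i: int, k: int, p: int, s: int) -> int:
    """Compute the spatial output size of a convolution layer.

    i: input size, k: kernel size, p: padding, s: stride.
    """
    return (i - k + 2 * p) // s + 1

# The worked example: I = 7, K = 3, P = 0, S = 1 -> O = 5
print(conv_output_size(7, 3, 0, 1))  # 5
# With one-pixel zero padding, a 3x3 kernel keeps the output size equal to the input
print(conv_output_size(7, 3, 1, 1))  # 7
```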
More and more researchers use zero padding [117] in the convolution layer. In Fig. 8, with one-pixel zero padding, the output size is the same as the input size.
The features of the input are extracted by the convolution layer [118][119][120][121]. After multiple convolutions, the feature dimension becomes higher and higher, resulting in too much data [122]. Too much data may contain too much redundant information [122][123][124]. This redundant information not only increases the amount of training but can also lead to overfitting [123,[125][126][127]]. At this point, researchers can use a pooling layer to downsample the extracted features. The main functions of the pooling layer are (i) translation invariance and (ii) feature dimensionality reduction [124]. At present, the three main pooling methods are max pooling [128], average pooling [129], and stochastic pooling [130], as given in Fig. 9.
Max pooling obtains the maximum pixel value in a specific region A_R of the feature map at a certain stride [129]. The formula of max pooling (p_m) is as follows:

p_m = max_{i ∈ A_R} a_i

Average pooling averages the pixels in a specific region A_R of the feature map at a certain stride [131]. The formula of average pooling (p_a) is as follows:

p_a = (1 / |A_R|) Σ_{i ∈ A_R} a_i

where |A_R| means the number of elements in A_R.
Stochastic pooling selects the map response based on a probability map [132]. The probability b_k of each element is as follows:

b_k = a_k / Σ_{i ∈ A_R} a_i

Stochastic pooling outputs are picked from the multinomial distribution over these probabilities:

p_s = a_l, where l ~ Multinomial(b_1, …, b_{|A_R|})

Nonlinearity is introduced into CNN through activation functions. Two traditional activation functions are Sigmoid [133] and Tanh [134]. The equation of Sigmoid is given as:

f(x) = 1 / (1 + e^(−x))

The Tanh is written as:

f(x) = (e^x − e^(−x)) / (e^x + e^(−x))

These two traditional activation functions do not perform well in convergence. The rectified linear unit (ReLU) [135] accelerates the convergence. The equation of ReLU is as follows:

f(x) = max(0, x)

There are some problems with ReLU: when x is less than or equal to 0, the activation value is 0. In this case, leaky ReLU (LReLU) [136] was proposed. Compared with ReLU, when x is less than or equal to 0, the activation value is a small negative value. The equation of LReLU is given as:

f(x) = x if x > 0; zx if x ≤ 0

where z is a very small fixed constant. Based on LReLU, researchers proposed PReLU [137]. When x is less than or equal to 0, the slope is learned adaptively from the data. The PReLU is shown as:

f(x) = x if x > 0; ax if x ≤ 0

where a is a learnable parameter.
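The pooling and activation operations above can be sketched in a few lines of NumPy (an illustrative sketch, not from any surveyed model; the function names are our own, and the pooling functions assume even-sized 2-D feature maps):

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling with stride 2 on a 2-D feature map."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def avg_pool2x2(x):
    """2x2 average pooling with stride 2 on a 2-D feature map."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def relu(x):
    """ReLU: max(0, x)."""
    return np.maximum(0.0, x)

def leaky_relu(x, z=0.01):
    """LReLU: a small fixed slope z for x <= 0 (PReLU learns this slope)."""
    return np.where(x > 0, x, z * x)

fmap = np.array([[1.0, 2.0, 5.0, 6.0],
                 [3.0, 4.0, 7.0, 8.0],
                 [-1.0, -2.0, 0.0, 1.0],
                 [-3.0, -4.0, 2.0, 3.0]])
print(max_pool2x2(fmap))  # [[ 4.  8.] [-1.  3.]]
print(avg_pool2x2(fmap))  # [[ 2.5  6.5] [-2.5  1.5]]
```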
Each activation function has its characteristics, which is shown in Table 3.
The CNN model maps the input data to the feature space with the convolution layers, pooling layers, and activation functions. The function of the fully connected layer is to map these features to the sample space. The fully connected layer turns the feature maps into a one-dimensional vector, weights the features, and reduces the spatial dimension.
CNN may consist of multi-layer fully connected layers. Global average pooling is proposed to substitute the fully connected layer, which greatly reduces parameters. However, global average pooling does not always perform better than the fully connected layer, such as in transfer learning.
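The parameter saving can be illustrated with a rough sketch (our own illustrative numbers, not from a surveyed model): global average pooling collapses each feature map to a single value with zero parameters, while a fully connected layer needs one weight per input element.

```python
import numpy as np

def global_average_pooling(feature_maps):
    """Collapse a C x H x W stack of feature maps to a C-vector by averaging."""
    return feature_maps.mean(axis=(1, 2))

# e.g. 8 feature maps of size 4x4 -> an 8-dim vector with zero parameters;
# a fully connected layer mapping the 8*4*4 inputs to 8 outputs would need
# 8*4*4*8 = 1024 weights (plus biases).
x = np.ones((8, 4, 4))
print(global_average_pooling(x).shape)  # (8,)
```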
The increasing depth of the CNN model increases the difficulty of training it. The input distribution of each layer changes during training, which can cause the gradients of the lower layers to vanish. This vanishing gradient is the reason deep neural networks converge more and more slowly [138].
Batch normalization adjusts the input of each layer toward a standard normal distribution. Let the mini-batch be B = {x_1, …, x_m}. Firstly, calculate the mean value of batch B:

μ_B = (1/m) Σ_{i=1}^{m} x_i

Secondly, calculate the variance:

σ_B² = (1/m) Σ_{i=1}^{m} (x_i − μ_B)²

Thirdly, perform the normalization:

x̂_i = (x_i − μ_B) / √(σ_B² + ε)

where ε is greater than 0, which makes sure that the denominator is greater than 0.
Finally, two parameters are introduced to restore the representation ability of the network:

y_i = α x̂_i + β

where α is the scale parameter and β is the shift parameter.
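The batch-normalization steps above can be sketched directly (a minimal 1-D illustration; in training, α and β are learned, but here they are fixed):

```python
import numpy as np

def batch_norm(x, alpha=1.0, beta=0.0, eps=1e-5):
    """Batch normalization over a 1-D batch of activations.

    alpha is the scale parameter, beta the shift parameter.
    """
    mu = x.mean()                          # mean of batch B
    var = x.var()                          # variance of batch B
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalization
    return alpha * x_hat + beta            # scale and shift

batch = np.array([1.0, 2.0, 3.0, 4.0])
out = batch_norm(batch)
# The normalized batch has (approximately) zero mean and unit variance.
print(out.mean().round(6), out.std().round(3))
```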
In the CNN model, too few training samples can lead to overfitting. The overfitting problem is that the loss function of the CNN model is small and the accuracy is high during training, but the loss is large and the accuracy is low during testing. In this case, researchers usually use dropout to prevent overfitting. During CNN training, some nodes in the hidden layer are randomly set to 0, as shown in Fig. 10. This reduces the co-adaptation between hidden units [139].
One of the important indexes used to evaluate the performance of a CNN model is the confusion matrix, which is given in Table 4. However, the confusion matrix only counts numbers. Faced with lots of data, it is difficult to measure the quality of a model simply by counting the numbers. Therefore, several other indicators are derived from these basic statistical results.
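The dropout mechanism described above can be sketched in inverted-dropout form (an illustrative sketch, not from any surveyed model; rescaling the surviving units keeps the expected activation unchanged, so no correction is needed at test time):

```python
import numpy as np

def dropout(x, rate=0.5, rng=None):
    """Inverted dropout: zero each unit with probability `rate`,
    then rescale the survivors by 1 / (1 - rate)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

h = np.ones(10)
print(dropout(h, rate=0.5))  # some entries 0, the rest 2.0
```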
The Mean Absolute Error (MAE) is the average distance between the prediction (t) and the truth (y) of the samples:

MAE = (1/m) Σ_{i=1}^{m} |t_i − y_i|

where m is the number of samples. The Intersection over Union (IoU) evaluates the overlap between the predicted region (V) and the ground truth (G):

IoU = |V ∩ G| / |V ∪ G|

where |V ∩ G| means the area of intersection and |V ∪ G| means the area of union.
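These two metrics can be implemented in a few lines (an illustrative sketch; the function names are our own, and the IoU version assumes binary masks):

```python
import numpy as np

def mae(t, y):
    """Mean Absolute Error between predictions t and ground truth y."""
    t, y = np.asarray(t, float), np.asarray(y, float)
    return np.abs(t - y).mean()

def iou(v, g):
    """Intersection over Union of two binary masks v and g."""
    v, g = np.asarray(v, bool), np.asarray(g, bool)
    return (v & g).sum() / (v | g).sum()

print(mae([1.0, 2.0, 3.0], [1.0, 4.0, 3.0]))  # 0.666... (= 2/3)
print(iou([1, 1, 0, 0], [1, 0, 1, 0]))        # 0.333... (= 1/3)
```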

Common Datasets
In recent years, many data sets have been produced and published. Researchers can use some of them for research. Table 5 shows the details of some public data sets.
For DDSM, all images are 299 × 299. The DDSM project is a collaborative effort of the Massachusetts General Hospital (D. Kopans, R. Moore), the University of South Florida (K. Bowyer), and Sandia National Laboratories (P. Kegelmeyer). Additional cases from Washington University School of Medicine were provided by Peter E. Shile, MD, Assistant Professor of Radiology and Internal Medicine. There are a total of 55,890 samples in the DDSM dataset; 86% of these samples are negative, and the rest are positive. All data is stored as tfrecords files.

Application of CNN in Breast Cancer
The diagnosis of breast cancer through CNN is generally divided into three tasks: 1. classification; 2. detection; 3. segmentation. Therefore, this section is presented in three parts based on these tasks.

Breast Cancer Classification
In recent years, the CNN model has proven successful and has been applied to the diagnosis of breast cancer [140]. Researchers classify breast cancer images into several categories with CNN models. We review the classification of breast cancer based on CNN in this section.
Alkhaleefah et al. [141] introduced a model combining CNN and support vector machine (SVM) classifier with radial basis function (RBF) for breast cancer image classification, as shown in Fig. 11. This method was roughly separated into three steps: Firstly, the CNN model was trained through breast cancer images. Secondly, the CNN model was fine-tuned based on the data set. Finally, the features extracted by the CNN model would be used as the input to RBF-Based SVM. They evaluated the proposed method based on the confusion matrix.
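The general shape of this pipeline (CNN feature extraction followed by an RBF-kernel SVM) can be sketched as follows. This is a minimal illustration assuming scikit-learn, not the authors' implementation: random vectors stand in for the features a fine-tuned CNN would extract, and the labels stand for benign/malignant.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins for CNN-extracted feature vectors (8-dim) per image;
# class 0 = benign, class 1 = malignant.
rng = np.random.default_rng(0)
benign = rng.normal(loc=-1.0, size=(20, 8))
malignant = rng.normal(loc=1.0, size=(20, 8))
features = np.vstack([benign, malignant])
labels = np.array([0] * 20 + [1] * 20)

# An RBF-kernel SVM classifies the extracted features.
clf = SVC(kernel="rbf", gamma="scale").fit(features, labels)
print(clf.predict(np.full((1, 8), 1.0)))  # a point near the malignant mean
```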

Figure 11: The structure of CNN+SVM

Liu et al. [142] introduced the fully connected layer first CNN (FCLF-CNN) method. This method added fully connected layers before the convolution layers. They improved structured data transformation in two ways. The encoder in the first method was the fully connected layer. The second method was to use MSE losses. They tested different FCLF-CNN models, and four FCLF-CNN models were ensembled. The FCLF-CNN model got 99.28% accuracy, 98.65% sensitivity, and 99.57% specificity for the WDBC data set, and 98.71% accuracy, 97.60% sensitivity, and 99.43% specificity for the WBCD data set.
Gour et al. [143] designed a network to classify breast cancer (ResHist). To obtain better classification results, they proposed a data enhancement technique. This data enhancement technique combined affine transformation, stain normalization, and image patch generation. Experiments show that ResHist had better classification results than traditional CNN models, such as GoogleNet, ResNet50, and so on. This method finally achieved 84.34% accuracy and 90.49% F1.
Wang et al. [144] introduced a hybrid CNN and SVM model to classify breast cancer. This method used the VGG16 network as the backbone model. Because the data set was small, transfer learning was applied to the VGG16 network. They used multi-model voting to strengthen the results, and the images were also augmented by deformation. The accuracy of this method was 80.6%.
Yao et al. [145] introduced a new model to classify breast cancer. Extracting features from breast cancer images was based on CNN (DenseNet) and RNN (LSTM). Then the perceptron attention mechanism based on natural language processing (NLP) was selected to weight the extracted features. They used the targeted dropout in the model instead of the general dropout. They achieved 98.3% accuracy, 100% precision, 100% recall, 100% F1 for Bioimaging2015 Dataset.
Ibraheem et al. [24] proposed a three-parallel CNN branch network (3PCNNB-Net) to classify breast cancer. The 3PCNNB-Net was separated into three steps. The first step was mainly feature extraction. There were three parallel CNN to extract features. The three CNN models were the same. The second step was to use the average layer to merge the extracted features. The flattened layer, BN, and softmax layer were used as the classification layer. The 3PCNNB-Net achieved 97.04% accuracy, 97.14% sensitivity, and 95.23% specificity.
Agnes et al. [146] proposed a multiscale convolutional neural network (MA-CNN) to classify breast cancer, as presented in Fig. 12.
Zhang et al. [115] designed an 8-layer CNN network for breast cancer classification (BDR-CNN-GCN). This network mainly contained three innovations. First, they integrated BN and dropout. Second, they used rank-based stochastic pooling (RSP) instead of general maximum or average pooling. Finally, the network was combined with a two-layer graph convolutional network (GCN).
Wang et al. [147] introduced a breast cancer classification model according to CNN. In this paper, they selected inception-v3 as the backbone model for feature extraction of breast cancer images. And they did transfer learning to the inception-v3. This model got 0.886 sensitivity, 0.876 specificity, and 0.9468 AUC, respectively.
Saikia et al. [148] compared different classical CNN models in breast cancer classification. These classic CNN models used in this article were VGG16, VGG19, ResNet-50, and GoogLeNet-V3. The data set contained a total of 2120 breast cancer images.
Mewada et al. [149] introduced a new CNN-based model to classify breast cancer. In this new model, they added the multi-resolution wavelet transform. Spectral features were as important as spatial features in classification. Therefore, they integrated the features extracted from Haar wavelet with spatial features. They tested the new model on the BreakHist dataset and BCC2015 and obtained 97.58% and 97.45% accuracy, respectively.
Zhou et al. [150] proposed a new model for automatically classifying benign and malignant breast cancer, as shown in Fig. 13.
Vidyarthi et al. [152] introduced a classification model combining CLAHE and a CNN model for microscopic imaging of breast cancer. They tested the image preprocessing with and without the CNN. In this paper, they selected the BreakHist data set for testing. Finally, the hybrid CNN model got better classification results, producing an accuracy of about 90%.
Hijab et al. [153] used a classical CNN model (VGG16) for breast cancer classification. They made some modifications to the VGG16. First, they selected the pre-trained VGG16 as the backbone model. Then they fine-tuned the backbone model, freezing all convolution layers except the last layer. The weights were updated by using stochastic gradient descent (SGD). Finally, the fine-tuned VGG16 yielded 0.97 accuracy and 0.98 AUC.
Kumar et al. [154] proposed a self-made CNN model for breast cancer classification. Six convolutional layers, six max-pooling layers, and two fully connected layers were used to form the self-made CNN model. The ReLU activation function was selected in this paper. The self-made CNN model was tested on the 7909 breast cancer images and achieved 84% efficiency.
Kousalya et al. [155] compared a self-made CNN model with DenseNet201 for the classification of breast cancer. The self-made CNN model had two convolutional layers, two pooling layers, one flattened layer, and two fully connected layers. They tested these two CNN models with different learning rates and batch sizes. In conclusion, the self-made CNN model with Particle Swarm Optimization (PSO) yielded better specificity and precision.
Mikhailov et al. [156] proposed a novel CNN model to classify breast cancer. In this model, the max-pooling and depth-wise separable convolution were selected to improve the classification performance. What's more, different activation functions were tested in this paper, which were ReLU, ELU, and Sigmoid. The novel CNN model with ReLU can achieve the best accuracy, which was 85%.
Karthik et al. [157] offered a novel stacking ensemble CNN framework for the classification of breast cancer. Three stacked CNN models, all designed by the authors, were used for extracting features. The features from these three stacked CNN models were ensembled to yield better classification performance. The ensemble CNN model achieved 92.15% accuracy, 92.21% F1-score, and 92.17% recall.
Nawaz et al. [158] proposed a novel CNN model for the multi-classification of breast cancer. In this model, DenseNet was used as the backbone model. The open data set (BreakHis data set) was selected to test the proposed novel model. The novel model could achieve 95.4% accuracy for the multi-classification of breast cancer.
Deniz et al. [159] proposed a new model for breast cancer classification, which obtained transfer learning and CNN models. The pre-trained VGG16 and AlexNet were used to extract features. These extracted features from these two pre-trained CNN models would be concatenated and then fed to SVM for classification. The model can achieve 91.30% accuracy.
Yeh et al. [160] compared CNN-based CAD and feature-based CAD for classifying breast cancer based on DBT images. In the CNN-based CAD, the feature extractor was the LeNet. After experiments, the LeNet-based CAD could yield 87.12% (0.035) and 74.85% (0.122) accuracy. In conclusion, the CNN-based CAD could outperform the feature-based CAD.
Gonçalves et al. [161] tested three different CNN models to classify breast cancer, which were ResNet50, DenseNet201, and VGG16. In these three CNN models, transfer learning was used to improve classification performance. Finally, the DenseNet could get 91.67% accuracy, 83.3% specificity, 100% sensitivity, and 0.92 F1-score.
Bayramoglu et al. [162] proposed two different CNN models for breast cancer classification. The single CNN model was used to classify a malignancy. The multi-task CNN (mt_CNN) model was used to classify malignancy and image magnification levels. The single CNN model and mt_CNN model could yield 83.25% and 82.13% average recognition rates, respectively.
Alqahtani et al. [163] offered a novel CNN model (msSE-ResNet) for breast cancer classification. In the msSE-ResNet, the residual learning and different scales were used to improve the results. The msSE-ResNet can achieve 88.87% accuracy and 0.9541 AUC.
For the classification of breast cancer based on CNN, there are some limitations. When these existing methods use a large public dataset, training takes a lot of time. Five-fold cross-validation was used to evaluate some of the proposed methods in these papers. Even though some results were very good, there were still many unsatisfactory results. The details of these methods are given in Table 6.

Authors | Methods | Results
Lotter et al. [151] | A multi-scale CNN was designed for the classification of breast cancer. | They tested the multi-scale CNN on the DDSM dataset and obtained 0.92 AUROC.
Vidyarthi et al. [152] | A classification method combining CLAHE and a CNN model was proposed for microscopic imaging of breast cancer. | The hybrid CNN model got better classification results, producing an accuracy of about 90%.
Hijab et al. [153] | A classical CNN model (VGG16) was used for breast cancer classification, with some modifications. | The fine-tuned VGG16 yielded 0.97 accuracy and 0.98 AUC.
Kumar et al. [154] | Six convolutional layers, six max-pooling layers, and two fully connected layers formed the self-made CNN model. | The self-made CNN model was tested on the 7909 breast cancer images and achieved 84% efficiency.
Kousalya et al. [155] | A self-made CNN model was compared with DenseNet201; both were tested with different learning rates and batch sizes. | The self-made CNN model with Particle Swarm Optimization (PSO) yielded better specificity and precision.
Mikhailov et al. [156] | Max-pooling and depth-wise separable convolution were used in this novel CNN model; ReLU, ELU, and Sigmoid activations were tested. | The model with ReLU achieved the best accuracy, 85%.
Yeh et al. [160] | CNN-based CAD and feature-based CAD for classifying breast cancer were compared; the feature extractor of the CNN-based CAD was LeNet. | The CNN-based CAD outperformed the feature-based CAD.
Gonçalves et al. [161] | Three different CNN models were tested: ResNet50, DenseNet201, and VGG16. | DenseNet achieved the best results: 91.67% accuracy, 83.3% specificity, 100% sensitivity, and 0.92 F1-score.
Bayramoglu et al. [162] | A single CNN model classified malignancy; a multi-task CNN (mt_CNN) classified malignancy and image magnification level. | The single CNN and mt_CNN yielded 83.25% and 82.13% average recognition rates, respectively.
Alqahtani et al. [163] | A novel CNN model (msSE-ResNet) used residual learning and multiple scales to improve the results. | The msSE-ResNet achieved 88.87% accuracy and 0.9541 AUC.

Breast Cancer Detection
We will review the detection of breast cancer based on CNN in this section [164]. Researchers use the CNN model to detect candidate lesion locations in breast images. Melekoodappattu et al. [11] introduced a framework for breast cancer detection. The framework was mainly composed of CNN and image texture attribute extraction. They designed a 9-layer CNN model. In the extraction phase, they defined texture features and used Uniform Manifold Approximation and Projection (UMAP) to reduce the dimension of features. Then the multi-stage features were integrated for detection. They tested this model on two data sets which were MIAS and DDSM. This model obtained 98% accuracy and 97.8% specificity for the MIAS data set, and 97.9% accuracy and 98.3% specificity for the DDSM data set.
Zainudin et al. [170] designed three CNN models for mitosis and amitosis in breast cell detection. The layers of these three CNN were 6, 13, and 17, respectively. Experiments showed that the 17-layer CNN model achieved the best performance. Finally, the model achieved a 15.50% loss, 80.55% TPR, 84.49% accuracy, and 11.66% FNR.
Wu et al. [171] introduced a deep fused fully convolutional neural network (FF-CNN) for breast cancer detection. They selected the AlexNet model as the backbone model. They combined different levels of features to improve detection results. They used a multi-step fine-tuning method to reduce overfitting problems. The FF-CNN was tested on ICPR 2014 data set and obtained better detection accuracy and faster detection speed.
Gonçalves et al. [172] introduced a new framework for breast cancer detection. This new framework used two bionic optimization techniques to optimize the CNN model, which were particle swarm optimization and genetic algorithm. The authors used three CNN models, which were DenseNet-201, VGG-16, and ResNet-50. Experiments showed that the optimized network detection results were significantly improved. The F1 score of VGG-16 was increased from 0.66 to 0.92 and the F1 score of ResNet-50 was increased from 0.83 to 0.90. The F1 values of the three optimized networks were higher than 0.90.
Guan et al. [173] proposed two models to detect breast cancer. The first method was to train images by Generative Adversarial Network (GAN) and then put the trained images into CNN for experiments. The accuracy of this model was 98.85%. The second model was that they first select the VGG-16 model as the backbone model and then transferred the backbone model. The accuracy of this method was 91.48%. The authors combined the two methods, but the results of the combined model were not ideal.
Hadush et al. [174] proposed a breast mass abnormality detection model with CNN to reduce the manual cost. Feature extraction was completed by CNN. Then these features were input into the Region Proposal Network (RPN) and Region of Interest (ROI) pooling of fast R-CNN for detection. Finally, the method achieved 92.2% AUC-ROC, 91.86% accuracy, and 94.67% sensitivity.
Huang et al. [175] presented a lightweight CNN model (BM-Net) to detect breast cancer. The lightweight CNN model consisted of MobileNet-V3 and bilinear structure. The MobileNet-V3 was the backbone model to extract the features. To save resources, they just replaced the fully connected layer with a bilinear structure. The BM-Net could achieve 0.88 accuracy and 0.71 score.
Mahbub et al. [176] proposed a novel model to detect breast cancer. They designed a CNN model, which consisted of six convolutional layers, five max-pooling layers, and two dense layers. The proposed model was composed of the designed CNN model and the fuzzy analytical hierarchy process model. The proposed model can get 98.75% accuracy to detect breast cancer.
Prajoth SenthilKumar et al. [177] used a pre-trained CNN model for the detection and analysis of breast cancer. They selected the VGG16 model as the backbone model. They detected breast cancer from the histology images based on the variability, cell density, and tissue structure. The model could get 88% accuracy.
Charan et al. [178] designed a 16-layer CNN model for the detection of breast cancer, comprising six convolution layers, four average-pooling layers, and one fully connected layer. The public Mammograms-MIAS data set was used for training and testing, and the designed CNN achieved 65% accuracy.
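A network with the stated layer mix (six convolution layers, four average-pooling layers, one fully connected layer) can be sketched as follows. The channel widths, the 128 x 128 single-channel input, and the two-class output are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Sketch in the spirit of the described network: six convolutions,
# four average-pooling layers, one fully connected classifier layer.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(128 * 8 * 8, 2),  # 128 -> 64 -> 32 -> 16 -> 8 after four pools
)

logits = model(torch.rand(4, 1, 128, 128))  # batch of four mammogram patches
print(logits.shape)  # torch.Size([4, 2])
```

Average pooling (rather than the more common max pooling) smooths responses, which some mammography papers prefer for low-contrast masses.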
Alanazi et al. [179] offered a novel CNN model for the detection of breast cancer. They designed a new CNN and used three different classifiers to detect breast cancer: K-nearest neighbor, logistic regression, and support vector machines. The new model achieved 87% accuracy, about 9% higher than other machine learning methods.
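The pattern of pairing CNN-extracted features with classical classifiers is straightforward in scikit-learn. In this sketch, synthetic vectors stand in for the deep features, and the classifier settings are illustrative defaults rather than the paper's choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in for CNN-extracted feature vectors: 500 samples, 64 features.
X, y = make_classification(n_samples=500, n_features=64, n_informative=16,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# The same three classifier families used in the paper.
classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf"),
}
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in classifiers.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

In a real pipeline, `X` would come from the penultimate layer of the trained CNN, one row per image.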
Gonçalves et al. [180] presented a novel model to detect breast cancer. They proposed a random-forest surrogate, driven by particle swarm optimization and genetic algorithms, to find better parameters for pre-trained CNN models. Three pre-trained CNN models were used: ResNet50, DenseNet201, and VGG16. With the help of the proposed surrogate, the F1-scores of DenseNet201 and ResNet50 improved from 0.92 to 1 and from 0.85 to 0.92, respectively.
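The surrogate-plus-evolutionary-search idea can be sketched with a toy genetic algorithm over two hyperparameters. The surrogate function, the search bounds, and the GA settings below are illustrative stand-ins (a real pipeline would fit, e.g., a random forest to observed training runs), not the authors' actual system.

```python
import math
import random

random.seed(0)

def surrogate_score(lr, dropout):
    # Toy surrogate for validation F1, peaked at lr=1e-3, dropout=0.3.
    # In practice this would be a model fit to (hyperparameters, F1) pairs.
    return 1.0 - 0.1 * abs(math.log10(lr) + 3) - abs(dropout - 0.3)

def random_individual():
    return (10 ** random.uniform(-5, -1), random.uniform(0.0, 0.8))

def mutate(ind):
    lr, dropout = ind
    lr = min(1e-1, max(1e-5, lr * 10 ** random.gauss(0, 0.2)))
    dropout = min(0.8, max(0.0, dropout + random.gauss(0, 0.05)))
    return (lr, dropout)

# Elitist GA: keep the best half, refill with mutated copies of parents.
population = [random_individual() for _ in range(20)]
for generation in range(30):
    population.sort(key=lambda ind: surrogate_score(*ind), reverse=True)
    parents = population[:10]
    population = parents + [mutate(random.choice(parents)) for _ in range(10)]

best = max(population, key=lambda ind: surrogate_score(*ind))
print(best, surrogate_score(*best))
```

Because the surrogate is cheap to evaluate, thousands of candidate hyperparameter sets can be scored without retraining the CNN each time, which is the whole point of the surrogate approach.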
Guan et al. [181] applied a generative adversarial network (GAN) to generate additional breast cancer images, training the GAN on regions of interest (ROIs) extracted from the images. Classical augmentation methods such as scaling, shifting, and rotation were used for comparison, and a newly designed CNN served as the classifier. In the experiments, GAN-based augmentation yielded around 3.6% better results than the other transformations.
Sun et al. [182], inspired by how humans detect lesions, proposed a novel model for breast cancer detection in mammographic images. Mathematical morphology was used to preprocess the images, image template matching was selected to locate the suspected regions of a breast mass, and PSO was used to improve the accuracy. The proposed model achieved 85.82% accuracy, 66.31% F1-score, 95.38% recall, and 50.81% precision.
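The morphological preprocessing step can be illustrated with SciPy. The structuring element, the noise level, and the synthetic mask below are assumptions for demonstration, not the paper's settings.

```python
import numpy as np
from scipy import ndimage

# Synthetic thresholding result: one solid candidate blob plus salt noise.
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 20:40] = True                      # suspected mass region
noise = rng.random((64, 64)) < 0.02            # isolated noise pixels
mask_noisy = mask | noise

# Morphological opening removes isolated pixels; closing fills small holes.
structure = np.ones((3, 3), dtype=bool)
cleaned = ndimage.binary_opening(mask_noisy, structure=structure)
cleaned = ndimage.binary_closing(cleaned, structure=structure)

# Connected-component labeling yields the candidate regions that template
# matching would subsequently score.
labels, num_regions = ndimage.label(cleaned)
print(num_regions)
```

After cleaning, only the coherent blob survives, so the downstream template-matching stage has far fewer false candidates to examine.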
Chauhan et al. [183] used three different algorithms, CNN, KNN, and SVM, to detect breast cancer and compared them on a breast cancer data set. SVM achieved 98% accuracy, KNN 73%, and CNN 95%.
Gupta et al. [184] proposed a modified CNN model for the detection of breast cancer, with ResNet as the backbone. They modified the ResNet in three steps: first, they applied dropout with a rate of 0.5; second, adaptive average pooling and adaptive max pooling were combined with two batch-normalization (BN) layers, dropout, and the fully connected layer; third, the stride for down-sampling was moved to the 3 × 3 convolution. The modified CNN achieved 99.75% accuracy, 99.18% precision, and 99.37% recall.
Chouhan et al. [185] designed a novel framework (DFeBCD) for detecting breast cancer. In DFeBCD, a CNN-based highway network selected features, which were used to train two classifiers: an SVM and an Emotional Learning inspired Ensemble Classifier (ELiEC). The framework was evaluated by five-fold cross-validation and achieved 80.5% accuracy.
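Five-fold cross-validation of the kind used to evaluate such frameworks is a one-liner in scikit-learn. The tabular WDBC breast cancer dataset and the SVM below are stand-ins for the framework's image-derived features and classifiers.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# WDBC tabular features stand in for CNN-extracted image features.
X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), SVC())

# cross_val_score with cv=5 performs stratified five-fold cross-validation
# for classifiers, returning one accuracy per held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())
```

Reporting the mean (and ideally the spread) over folds guards against a single lucky train/test split, which matters on the small datasets common in this field.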
There are some limitations in the detection of breast cancer based on CNN. If the dataset used is very large, a great deal of computation and time is needed to complete the training; if it is very small, overfitting may occur. Moreover, most CNN-based breast cancer diagnosis models use a pre-trained CNN to extract features, but it remains unclear which layer provides the best features and which layer's features should be extracted. The summary of CNN for breast cancer detection is shown in Table 7.

Table 7: Summary of CNN for breast cancer detection

Chiao et al. [168]. Methods: A mask region detection method based on CNN, detecting breast cancer lesions in ultrasound images. Results: 0.75 average precision in detection and 85% accuracy in classification.

Das et al. [169]. Methods: A Deep Multiple Instance Learning (MIL) model based on CNN for breast cancer detection. Results: 96.63%, 93.06%, and 95.83% accuracy on the IUPHL, BreakHis, and UCSB data sets, respectively.

Melekoodappattu et al. [11]. Methods: A 9-layer CNN combined with texture features reduced by Uniform Manifold Approximation and Projection (UMAP); the multi-stage features were integrated for detection. Results: 98% accuracy and 97.8% specificity on MIAS; 97.9% accuracy and 98.3% specificity on DDSM.

Zainudin et al. [170]. Methods: Three CNN models with 6, 13, and 17 layers for detecting mitosis and amitosis in breast cells. Results: the 17-layer CNN performed best, with a 15.50% loss, 80.55% TPR, 84.49% accuracy, and 11.66% FNR.

Wu et al. [171]. Methods: A deep fused fully convolutional neural network (FF-CNN) with AlexNet as the backbone, combining features from different levels. Results: better detection accuracy and faster detection speed on the ICPR 2014 data set.

Gonçalves et al. [172]. Methods: Particle swarm optimization and the genetic algorithm were used to optimize CNN models with DenseNet-201, VGG-16, and ResNet-50 as backbones. Results: the F1-score of VGG-16 increased from 0.66 to 0.92 and that of ResNet-50 from 0.83 to 0.90; all three optimized networks reached at least 0.90.

Guan et al. [173]. Methods: Two approaches: images generated by a Generative Adversarial Network (GAN) fed into a CNN, and a transferred VGG-16 backbone. Results: 98.85% accuracy for the GAN-based method and 91.48% for the transferred VGG-16.

Huang et al. [175]. Methods: A lightweight CNN (BM-Net) combining MobileNet-V3 and a bilinear structure. Results: an accuracy of 0.88 and a score of 0.71.

Mahbub et al. [176]. Methods: A designed CNN (six convolutional layers, five max-pooling layers, two dense layers) combined with a fuzzy analytic hierarchy process model. Results: 98.75% accuracy.

Prajoth SenthilKumar et al. [177]. Methods: VGG16 for detection and analysis of breast cancer in histology images based on variability, cell density, and tissue structure. Results: 88% accuracy on the testing data set.

Charan et al. [178]. Methods: A 16-layer CNN (six convolution layers, four average-pooling layers, one fully connected layer), trained and tested on the public Mammograms-MIAS data set. Results: 65% accuracy.

Alanazi et al. [179]. Methods: A new CNN with three classifiers: K-nearest neighbor, logistic regression, and support vector machines. Results: 87% accuracy, about 9% higher than other ML methods.

Gonçalves et al. [180]. Methods: A random-forest surrogate built from particle swarm optimization and genetic algorithms to tune pre-trained CNN models (ResNet50, DenseNet201, VGG16). Results: the F1-scores of DenseNet201 and ResNet50 improved from 0.92 to 1 and from 0.85 to 0.92, respectively.

Guan et al. [181]. Methods: A GAN trained on regions of interest (ROIs) to generate additional images, compared with classical augmentations (scaling, shifting, rotation); a new CNN served as the classifier. Results: GAN-based augmentation yielded around 3.6% better results than the other transformations.

Sun et al. [182]. Methods: Mathematical morphology for preprocessing, image template matching to locate suspected mass regions, and PSO to improve accuracy. Results: 85.82% accuracy, 66.31% F1-score, 95.38% recall, and 50.81% precision.

Chauhan et al. [183]. Methods: CNN, KNN, and SVM were compared for breast cancer detection. Results: 98% accuracy for SVM, 73% for KNN, and 95% for CNN.

Gupta et al. [184]. Methods: A modified ResNet with a 0.5 dropout, adaptive average and max pooling combined with two BN layers, dropout, and the fully connected layer, and a modified down-sampling stride at the 3 × 3 convolution. Results: 99.75% accuracy, 99.18% precision, and 99.37% recall.

Chouhan et al. [185]. Methods: The DFeBCD framework with a CNN-based highway network for feature selection and two classifiers, SVM and the Emotional Learning inspired Ensemble Classifier (ELiEC). Results: 80.5% accuracy under five-fold cross-validation.

Breast Cancer Segmentation
In this section, we review the segmentation of breast cancer based on CNN, in which the abnormal areas in breast images are segmented by a CNN model. Breast cancer image segmentation compares the similarity of feature factors between image regions and divides the image into several regions. Breast segmentation involves the removal of the background region, pectoral muscles, labels, artifacts, and other defects added during image acquisition. The segmented area can be compared with a manually segmented area to verify the accuracy of the segmentation method.
Chen et al. [186] introduced a new model for the segmentation of breast cancer, consisting mainly of two parts: a segmentation CNN and a quality-assurance (QA) network based on ResNet-101. One branch of the QA network predicted the quality of each slice, and the other predicted the DSC value.
Tsochatzidis et al. [6] introduced a new CNN model to segment breast masses, in which each convolution layer was modified and an extra term was added to the loss function. They evaluated the method on the DDSM-400 and CBIS-DDSM datasets.
Lei et al. [56] developed a mask-scoring region-based CNN (R-CNN) to segment breast tumors. The network consisted of five parts: the region proposal network, the mask terminal, the backbone network, the mask scoring head, and the R-CNN head. In this model, the region of interest (ROI) was segmented by building a direct correlation between mask quality and region categories through integrated network blocks.
El Adoui et al. [187] proposed two CNN models to segment breast tumors in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). The first model was based on SegNet, as presented in Fig. 16; the second selected U-Net as the backbone. 85% of the data was used for training and the remaining 15% for validation. The first method obtained 68.88% IoU and the second 76.14% IoU. Kumar et al. [189] introduced a dual-layered CNN model (DL-CNN) for breast cancer region recognition and segmentation: the first-layer CNN identified possible regions and the second-layer CNN segmented them and reduced false positives. On breast image data sets, the model obtained a true positive rate of 0.9726 at 0.39706 false positives per image.
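The IoU and Dice figures reported throughout these segmentation papers are computed directly from binary masks. The two metrics are related by Dice = 2·IoU / (1 + IoU); the small masks below are purely illustrative.

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    return np.logical_and(pred, target).sum() / union if union else 1.0

def dice(pred, target):
    """Dice similarity coefficient (equals 2*IoU / (1 + IoU))."""
    pred, target = pred.astype(bool), target.astype(bool)
    total = pred.sum() + target.sum()
    return 2 * np.logical_and(pred, target).sum() / total if total else 1.0

# Two overlapping 4x4 squares: 16 pixels each, 9 pixels of overlap.
pred = np.zeros((8, 8), dtype=int);   pred[2:6, 2:6] = 1
target = np.zeros((8, 8), dtype=int); target[3:7, 3:7] = 1
print(iou(pred, target), dice(pred, target))  # 9/23 = 0.3913..., 0.5625
```

Because Dice weights the intersection twice, it is always at least as large as IoU on the same pair of masks, which is worth remembering when comparing numbers across papers.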
Ranjbarzadeh et al. [90] proposed a new CNN with multiple feature extraction paths for the segmentation of breast cancer (MRFE-CNN), as shown in Fig. 17. He et al. [193] proposed a novel network combining CNN models and transfer learning to classify and segment breast cancer. Two CNN models (AlexNet and GoogleNet) were selected as backbone feature extractors, and an SVM was selected as the classifier. The segmentation results of this model in breast cancer were similar to those of professional pathologists.
Soltani et al. [194] introduced a new model for automatic breast cancer segmentation based on Mask R-CNN, using Detectron2 as the backbone implementation. The model was tested on the INbreast data set and achieved 81.05% F1-score and 95.87% precision.
Min et al. [195] introduced a new fully integrated CAD system for the automatic segmentation of breast cancer, composed of a detection-segmentation method based mainly on Mask R-CNN and pseudo-color image generation. On the public INbreast data set, the system yielded a 0.88 Dice similarity index.
Arora et al. [196] proposed a model (RGU-Net) for breast cancer segmentation, which incorporated residual connections and group convolutions into U-Net, with encoder and decoder blocks of several different sizes. A conditional random field was selected to analyze the boundaries. The model was evaluated on the INbreast data set and produced 92.6% Dice.
Spuhler et al. [197] introduced a new CNN method to segment breast cancer in DCE-MRI. The manual regions of interest were delineated by an expert radiologist (R1), with additional sets (R2 and R3) produced by a resident and another expert radiologist. The new model achieved a Dice of 0.71 against R1.
Atrey et al. [198] proposed a customized CNN with nine layers for the segmentation of breast cancer based on mammography (MG) and ultrasound (US). Two convolutional layers, one max-pooling layer, one ReLU layer, one fully connected layer, one softmax layer, and a classification layer formed the customized model. It achieved 0.64 DSC and 0.53 JI for MG, and 0.77 DSC and 0.64 JI for US.
Sumathi et al. [199] proposed a new system to segment breast cancer, using artificial bee colony optimization with fuzzy clustering to select features and a CNN as the classifier. This hybrid system achieved 98% segmentation accuracy.
Xu et al. [200] designed an 8-layer CNN for the segmentation of breast cancer, consisting of three convolution layers, three pooling layers, a fully connected layer, and a softmax layer. This customized CNN yielded 85.1% JSI.
Guo et al. [201] proposed a novel network to segment breast cancer. They designed a 6-layer CNN with two convolutional layers, two pooling layers, and two fully connected layers; the features extracted by this CNN were fed to an SVM. The combined CNN-SVM achieved a sensitivity of 0.92, a DSC of 0.93, and a PPV of 0.95.
Cui et al. [202] proposed a novel patch-based CNN model for breast cancer detection based on MRI. They designed a 7-layer CNN consisting of four convolutional layers, two max-pooling layers, and one fully connected layer, which achieved a 95.19% Dice ratio.
For the segmentation of breast cancer based on CNN, there are some limitations. These methods rely on public datasets for experiments, but such datasets require many expert doctors to label the images. Moreover, unsupervised learning has so far performed poorly in the segmentation of breast cancer. The summary of CNN for breast cancer segmentation is shown in Table 8.

Table 8: Summary of CNN for breast cancer segmentation

Lei et al. [56]. Methods: A mask-scoring region-based R-CNN to segment breast tumors, consisting of five parts: the region proposal network, the mask terminal, the backbone network, the mask scoring head, and the R-CNN head.

Atrey et al. [190]. Methods: A computer-aided automatic segmentation system for breast lesions, mainly based on a self-made CNN model.

He et al. [193]. Methods: Two CNN models (AlexNet and GoogleNet) as backbones to classify and segment breast cancer. Results: segmentation similar to that of professional pathologists.

Soltani et al. [194]. Methods: Breast cancer segmentation with Mask R-CNN. Results: 81.05% F1-score and 95.87% precision on the INbreast data set.

Min et al. [195]. Methods: A fully integrated CAD system composed of a detection-segmentation method and pseudo-color image generation. Results: a 0.88 Dice similarity index.

Arora et al. [196]. Methods: RGU-Net, incorporating residual connections and group convolutions into U-Net. Results: 92.6% Dice on the INbreast data set.

Spuhler et al. [197]. Methods: A new CNN model to segment breast cancer in DCE-MRI. Results: a Dice of 0.71 against R1.

Atrey et al. [198]. Methods: A customized nine-layer CNN for segmentation based on MG and US, formed by two convolutional layers, one max-pooling layer, one ReLU layer, one fully connected layer, one softmax layer, and a classification layer. Results: 0.64 DSC and 0.53 JI for MG; 0.77 DSC and 0.64 JI for US.

Sumathi et al. [199]. Methods: Artificial bee colony optimization with fuzzy clustering to select features, with a CNN as the classifier. Results: 98% segmentation accuracy.

Xu et al. [200]. Methods: An 8-layer CNN consisting of three convolution layers, three pooling layers, a fully connected layer, and a softmax layer. Results: 85.1% JSI.

Guo et al. [201]. Methods: A 6-layer CNN (two convolutional layers, two pooling layers, two fully connected layers) whose features were fed to an SVM. Results: a sensitivity of 0.92, a DSC of 0.93, and a PPV of 0.95.

Cui et al. [202]. Methods: A patch-based 7-layer CNN (four convolutional layers, two max-pooling layers, one fully connected layer) based on MRI. Results: a 95.19% Dice ratio.

Conclusion
Recently, the diagnosis of breast cancer based on CNN has made rapid progress, leading more and more researchers to devote their energy to breast cancer diagnosis with CNN. We complete a comprehensive review of CNN-based breast cancer diagnosis after reviewing a large number of recent papers. In this paper, readers can not only see the CNN-based diagnostic methods for breast cancer of recent years but also learn the advantages and disadvantages of these methods and future research directions. The main contributions of this survey are: (i) a large number of major papers about CNN-based breast cancer diagnosis are reviewed to provide a comprehensive survey; (ii) the advantages and disadvantages of these state-of-the-art methods are presented; (iii) significant findings are presented to show readers the research opportunities in this field; (iv) future research directions and critical challenges of CNN-based diagnostic methods for breast cancer are given.
Based on the papers we have reviewed, many techniques have been used to boost the proposed CNN models for breast cancer diagnosis. Many researchers used pre-trained CNN models or their own customized CNN models to extract features from the input. To reduce training time and computational cost, some researchers replaced the last layers of CNN models with other techniques, such as SVM, ELM, and so on. In some papers, researchers selected more than one CNN model to extract different features, which were then ensembled and fed to classifiers to improve performance.
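The feature-ensembling strategy described above can be sketched by concatenating feature matrices from two backbones before a classical classifier. Here simulated Gaussian features stand in for the outputs of two CNNs; the dimensions and class shift are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for features extracted by two different CNN backbones from the
# same 400 images (e.g. 128-D from one network, 64-D from another), with a
# small class-dependent shift so the labels are learnable.
labels = rng.integers(0, 2, size=400)
feats_a = rng.normal(size=(400, 128)) + labels[:, None] * 0.5
feats_b = rng.normal(size=(400, 64)) - labels[:, None] * 0.5

# Ensemble by concatenating the two feature sets before classification.
fused = np.concatenate([feats_a, feats_b], axis=1)   # (400, 192)
X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))
```

Concatenation is the simplest ensembling scheme; weighted fusion or per-backbone classifiers with vote aggregation are common alternatives in the reviewed papers.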
Although breast cancer diagnosis with CNN has achieved great success, there are still some limitations. (i) There are too few good data sets: a good public breast cancer dataset must address many aspects, such as professional medical knowledge, privacy issues, financial issues, and dataset size. (ii) When the data set is very large, a CNN-based model needs a great deal of computation and time to complete the detection. (iii) Small data sets easily cause overfitting. (iv) Most CNN-based breast cancer diagnosis models use a pre-trained CNN to extract features, but which layer has the best features, and which layer's features should be extracted? These problems have not been well solved in recent studies.
Even though this paper reviews a large number of recent research papers, there are still some limitations. First, this survey only pays attention to CNN for breast cancer diagnosis; there are other CAD methods for breast cancer diagnosis as well. Second, this survey only focuses on two-dimensional images.
In the future, researchers can try more unlabeled data sets for breast cancer detection; compared with labeled datasets, unlabeled datasets are less expensive and more plentiful. Moreover, researchers can try newer methods for image feature extraction, such as EL, TL, xDNNs, U-Net, transformers, and so on. This paper reviews CNN-based breast cancer diagnosis technology in recent years. With the progress of CNN technology, the diagnostic accuracy for breast cancer keeps improving. We summarize the limitations and future research directions of CNN-based breast cancer diagnosis technology. Although this technology has achieved great success and can serve as an auxiliary means to help doctors diagnose breast cancer, there is still much to be improved.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.