Breast Lesions Detection and Classification via YOLO-Based Fusion Models

Abstract: With recent breakthroughs in artificial intelligence, deep learning models have achieved remarkable advances in computer vision, e-commerce, cybersecurity, and healthcare. In particular, numerous applications provide efficient solutions to assist radiologists in medical image analysis. For instance, automatic detection and classification of lesions in mammograms remains a crucial task that requires accurate diagnosis and precise analysis of abnormal lesions. In this paper, we propose an end-to-end system, based on the You-Only-Look-Once (YOLO) model, to simultaneously localize and classify suspicious breast lesions from entire mammograms. The proposed system first preprocesses the raw images, then recognizes abnormal regions as breast lesions and determines their pathology classification as either mass or calcification. We evaluated the model on two publicly available datasets: 2907 mammograms from the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) and 235 mammograms from the INbreast database. We also used a privately collected dataset of 487 mammograms. Furthermore, we propose a fusion models approach to obtain more precise detection and more accurate classification. Our best results reached detection accuracy rates of 95.7%, 98.1% and 98% for mass lesions and 74.4%, 71.8% and 73.2% for calcification lesions on CBIS-DDSM, INbreast and the private dataset, respectively.

With the increasing number of breast mammograms and the growing computational capacity of modern hardware, various deep learning models have been widely implemented to offer a better alternative. They aim to automatically extract deep, high-level features directly from raw images without requiring domain knowledge [18]. This has helped improve the results of automated systems and maintain a good tradeoff between the precision of lesion detection and the accuracy of distinguishing between different types of lesions in a single mammogram [19][20][21][22]. Deep learning models can extract deep, multi-scale features and combine them to assist experts in making the final decision. Accordingly, their ability to adapt to different cases has been proven in object detection and classification tasks across many applications [23][24][25][26]. This has resulted in many state-of-the-art models with outstanding success on natural and medical images. These models evolved from simple Convolutional Neural Networks (CNNs) into variants such as R-CNN, Fast R-CNN and Faster R-CNN [27][28][29]. These popular models have overcome many limitations of deep learning such as computational time, redundancy, overfitting and parameter size. However, training and deploying most of these models is often time-consuming and requires high computational memory. Therefore, another variant called You-Only-Look-Once (YOLO), characterized by low memory dependence, has been recognized as a fast object detection model suitable for CAD systems [30][31][32][33][34][35][36].
In this study, we propose an end-to-end system based on a YOLO model to simultaneously detect and classify breast lesions as mass tumors or calcifications. Our approach contributes a new capability: an end-to-end system that can recognize both types of suspicious lesions, whether only one type exists in an image or both appear simultaneously in the same image. Given the choice of the YOLO model stated earlier, this implementation will also serve as a base for future tasks toward a complete breast cancer diagnostic framework (e.g., lesion segmentation, malignancy prediction). The performance of this prerequisite step was validated on different mammography datasets using standard deep learning methodologies (i.e., data augmentation, early stopping, hyperparameter tuning and transfer learning). An additional contribution of this paper boosts lesion detection and classification performance as follows. Since performance varies with the input data of the model, single-model evaluation results were first reported over the image variations; then different fusion models were developed to increase the final detection accuracy rate and join models with different settings. This helps keep the best detected bounding boxes and remove poor predictions that could mislead subsequent diagnostic tasks. The proposed methodology was evaluated on the two most widely used datasets, CBIS-DDSM and INbreast, as well as on an independent private dataset. The outcome of this work justifies the performance of the YOLO-based model for deep learning lesion detection and classification in mammography. Furthermore, it serves as a comparative study of YOLO-based model performance across different mammogram sources.
The rest of the paper is organized as follows. First, the literature review of breast lesion detection and classification using deep learning is introduced in Section 2. In Section 3, details of our methodology are presented, including a description of the YOLO-based model architecture and the suggested fusion models approach, followed by details about the breast cancer datasets and preprocessing techniques used. Then, in Section 4, we discuss the hyperparameter tuning applied to train the model and present experimental results compared with other works. We conclude the paper in Section 5 with a discussion of our proposed methodology and future work.

Literature Review
Since the development of machine learning technology, many applications have increasingly adopted deep learning to solve complex problems, particularly in the fields of computer vision, image recognition, object detection [17][18][19] and segmentation [30][31][32][33][34][35]. Many studies have shown that traditional techniques fail to provide highly accurate models due to the limitations of hand-crafted features extracted from raw images. Indeed, traditional CAD systems proposed for breast lesion detection and classification could not overcome the large variations in lesion size and texture, compared to deep learning methods [36][37][38]. Therefore, numerous CAD systems were successfully developed using deep learning architectures to improve the detection and classification of organ lesions such as liver lesions, lung nodules and particularly breast lesions [39,40].
Researchers have demonstrated the feasibility of region-based models to build an end-to-end system for detecting and classifying malignant and benign tumors in the INbreast mammograms, achieving a detection rate of 89.4% [41]. The same idea was also presented in a recent work by Peng et al. [42], which introduced an automated mass detection approach integrating a Faster R-CNN model with a multiscale feature pyramid network. The method yielded a true positive rate of 0.93 on CBIS-DDSM and 0.95 on the INbreast dataset. Accordingly, Al-Antari et al. [43] employed the YOLO model for breast mass detection and reported a detection accuracy of 98.96%. The output then served for mass segmentation and recognition in order to provide a fully integrated CAD system for digital X-ray mammograms. Another work by Al-Antari et al. [44] in 2020 improved breast lesion detection and classification results by first adopting the YOLO model for detection and then comparing a feedforward CNN, ResNet-50, and InceptionResNet-V2 for classification. Similarly, Al-masni et al. [45] proposed a CAD framework that first detected breast masses using the YOLO model with an overall accuracy of 99.7%, and then classified them into malignant and benign using fully connected neural networks (FC-NNs) with an overall accuracy of 97%.
A deep convolutional neural network (DCNN) was also suggested for mammographic mass detection by using a transfer learning strategy from natural images [46]. In 2018, a work by Ribli et al. [47] proposed a CAD system based on the Faster R-CNN framework to detect and classify malignant and benign lesions and obtained an AUC score of 0.95 on the INbreast dataset. Another work employed a fully convolutional network (FCN) with adversarial learning in an unsupervised fashion to align different domains while conducting mass detection in mammograms [48].
Since breast tumor detection is a crucial step that remains a challenge for CAD systems, many reliable models have been used to support this automated diagnosis. For example, Singh et al. relied on the Single Shot Detector (SSD) model to localize tumors in mammograms, and then extracted the output boxes for segmentation and classification tasks [49]. This yielded a satisfactory true positive rate of 0.97 on the INbreast dataset. Other recent studies proposed using the YOLO model to achieve better performance in detecting bounding boxes surrounding breast tumors. For example, Al-masni et al. [50] presented a YOLO-based CAD system that achieved an overall accuracy of 85.52% on the DDSM dataset.
Tumor localization was also conducted in a detection framework for cancer metastasis using a patch-based classification stage and a heatmap-based post-processing stage [51]. This achieved a score of 0.7051 and served for whole-slide image classification. Breast tumor detection was also addressed in 2016 by Akselrod-Ballin et al. [52], where images were divided into overlapping patches and fed into a cascaded R-CNN model to first detect masses and then classify them as malignant or benign. In 2015, a work by Dhungel et al. [53] relied on a multi-scale Deep Belief Network (DBN) to first extract all suspicious regions from entire mammograms and then filter out the best regions using a Random Forest (RF). This technique achieved a true positive rate of 96%. In 2017, a work by Akselrod-Ballin et al. [54] developed a three-stage cascade of the Faster R-CNN model to detect and classify abnormal regions in mammograms. Their overall detection and classification accuracies reached 72% and 77%, respectively, on the INbreast dataset.
Most of these reviewed works and their diagnostic results show how artificial intelligence has successfully contributed to solving the challenge of breast cancer detection. However, practical implementation and system evaluation, along with high memory and time complexity, remain problems to investigate. The majority of these works tackled the problem of detecting only mass tumors in the entire breast and then classifying them into malignant and benign. Our approach was developed differently to address the detection and classification of two types of breast lesions (i.e., mass and calcification). We further expand our methodology with a fusion models approach that combines the predictions of different models to improve the final results.

Methods and Materials
In this study, we present an end-to-end model for simultaneous detection and classification of breast lesions in mammograms. The process uses a deep learning YOLO-based model that generates suspicious regions from the entire input breast images and classifies the type of lesions as either mass or calcification. We also propose a fusion models approach to improve the model performance and to join different learnings.

YOLO-Based Model
Object detection refers to a regression problem that maps image pixel coordinates to a bounding box surrounding a specific object. Popular region-based neural network models predict multiple bounding boxes and use regions to localize objects within images after feeding them into a CNN that generates a convolutional feature map. This approach applies a selective search that extracts the most adequate regions from images and then predicts the offset values for the final bounding boxes. This technique is experimentally slow and memory-consuming; therefore, the YOLO deep learning network was proposed, in which a single CNN simultaneously predicts bounding box locations and their class label probabilities from entire images. The low computational cost of YOLO comes from the fact that it does not require extracting features over sliding windows. Instead, it uses features from the entire image to directly detect each bounding box and its class label probability.
The YOLO architecture, as illustrated in Fig. 1, is based on a fully convolutional neural network (FCNN) design. In particular, it splits each image into m × m grid cells; for each cell, B bounding boxes are returned, each with a confidence score and C class probabilities.
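To make the grid formulation concrete, the size of the prediction tensor at one scale can be sketched as follows. This is a hypothetical illustration, not the authors' code: the grid size of 14 assumes a 448 × 448 input downsampled by a stride of 32, with B = 3 anchors and C = 2 classes (mass, calcification).

```python
def yolo_output_size(grid, num_anchors, num_classes):
    # Each anchor in each grid cell predicts 4 box offsets,
    # 1 objectness (confidence) score and C class probabilities.
    per_anchor = 4 + 1 + num_classes
    return grid * grid * num_anchors * per_anchor

# A 448x448 input with stride 32 gives a 14x14 grid.
print(yolo_output_size(14, 3, 2))
```

Multi-scale detection (as in YOLO-V3) repeats this computation at each of the three output strides.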

Figure 1: Proposed YOLO-based architecture
The confidence score is computed by multiplying the probability that the cell contains an object by the intersection over union (IoU) score, as detailed in Eq. (1):

Confidence score = Prob(object) × IoU score (1)
In addition, the detected object is classified as mass or calcification according to its class probability and its confidence score for that specific class label, as explained in Eq. (2):

Class probability = Prob(Class_i | object) × IoU score (2)

In this work, we adopted YOLO-V3, the third improved version of the YOLO networks, in order to detect objects at more varied scales; it uses multi-scale feature extraction and detection. As shown in Fig. 1, the architecture first employs a feature extraction step based on the DarkNet backbone framework [55]. Inspired by the ResNet and VGG-16 architectures, it presents a new design of 53 layers, illustrated in the lowest block of Fig. 1, with skip connections to prevent gradients from vanishing while propagating through deep layers. The features extracted at different scales are then fed into the detection part, which consists of three fully connected layers. It then applies the concept of anchor boxes borrowed from the Faster R-CNN model. Prior boxes are pre-determined by training a K-means algorithm on the entire set of images. The output matrices of multi-scale features are then defined as grid cells with anchor boxes. This helps determine the IoU between the ground-truth boxes and the anchor boxes, and ensures that the boxes with the best scores relative to a certain threshold are selected. Finally, four offset values for the bounding boxes against each anchor box are predicted, together with a confidence score and a class label probability. Hence, detection considers as correct the bounding boxes whose two scores both exceed a certain threshold [56].
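As a minimal illustration of Eqs. (1) and (2), the sketch below computes the IoU of two corner-format boxes and the resulting confidence and class scores. The function names and the box format are our own assumptions, not the authors' implementation.

```python
def iou(box_a, box_b):
    # Boxes given as (x1, y1, x2, y2) corner coordinates.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def confidence_score(p_object, iou_score):
    # Eq. (1): probability an object exists, weighted by localization quality.
    return p_object * iou_score

def class_probability(p_class_given_object, iou_score):
    # Eq. (2): conditional class probability, weighted by the IoU score.
    return p_class_given_object * iou_score
```

For example, boxes (0, 0, 2, 2) and (1, 1, 3, 3) overlap in a unit square and have a union of 7, giving an IoU of 1/7.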

Fusion Models Approach
According to the generalized YOLO-based model presented earlier in Fig. 1, bounding boxes surrounding suspicious breast lesions are detected with a certain confidence score, as explained in the previous subsection. This score varies with the model settings, the input data fed to the model, and the internal classification step performed by YOLO to determine the class label probability score (i.e., Mass or Calcification). Based on this hypothesis, the evaluation of such a model can be expanded to improve the final prediction results.
In this work, we first suggested selecting the best predicted bounding boxes among all augmented images (i.e., rotated, transformed, translated, etc.) according to their IoU score. This helped determine the most representative mammograms for correctly localizing and classifying breast lesions. Second, we suggested joining the predictions of different implementations of the model in order to lower the error rate and combine the performance of differently configured models. These models were trained and configured differently to finally create a fusion-based model dedicated to best performance.
We denote by M1 a model trained and configured for a single class, targeting either Mass or Calcification. Therefore, the two models developed from M1 are referenced as M1(Mass) for the Mass class and M1(Calcification) for the Calcification class. M2 denotes a model configured for multi-class training and identification, used in the fusion to improve the performance of the single-class models. The model M2 is identified as M2(Mass and Calcification) since it targets both classes.
After developing and testing each model Mi, our proposed fusion approach creates a fusion model for the Mass class using M1(Mass) and for the Calcification class using M1(Calcification), while benefiting from M2(Mass and Calcification) to improve the performance of the M1 models.
We first report the Mass predictions1 using M1(Mass) with an IoU score above threshold1. Next, we select only images with Mass lesions and report their predictions using M2(Mass and Calcification) with another threshold2. We then keep the predicted images that are not already within Mass predictions1 and save them as Mass predictions2. Finally, we combine the two sets into the final Mass predictions, as shown in Fig. 2. We repeat the same logic for the Calcification predictions according to the flow in Fig. 2. In all our fusion models, we set threshold1 to 0.5 and threshold2 to 0.35, which yielded satisfying results.
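The decision-level fusion flow for one class can be sketched as follows. The prediction containers and identifier names here are hypothetical, assuming each model yields one best IoU score per image:

```python
THRESHOLD1, THRESHOLD2 = 0.5, 0.35  # values used in all our fusion models

def fuse_predictions(m1_scores, m2_scores, class_image_ids):
    """Fuse the single-class model M1 with the multi-class model M2.

    m1_scores / m2_scores: dict mapping image id -> best IoU score.
    class_image_ids: images whose ground truth contains this lesion class.
    """
    # Step 1: keep M1 predictions whose IoU exceeds threshold1.
    predictions1 = {img for img, s in m1_scores.items() if s >= THRESHOLD1}
    # Step 2: among images of this class, keep M2 predictions above
    # threshold2 that are not already in predictions1.
    predictions2 = {img for img, s in m2_scores.items()
                    if img in class_image_ids
                    and s >= THRESHOLD2
                    and img not in predictions1}
    # Step 3: the final predictions combine both sets.
    return predictions1 | predictions2
```

The same routine would be applied once with M1(Mass) and once with M1(Calcification), both fused against M2(Mass and Calcification).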

Datasets
In this study, the CBIS-DDSM and INbreast public datasets were used to train and evaluate the proposed methodology. We also evaluated the performance on a small private dataset with different cases.

CBIS-DDSM
The CBIS-DDSM [57] is an updated and standardized version of the Digital Database for Screening Mammography (DDSM), where images were converted from the Lossless Joint Photographic Experts Group (LJPEG) format to the Digital Imaging and Communications in Medicine (DICOM) format. It was reviewed by radiologists after eliminating inaccurate cases and confirmed with histopathology classification. It contains 2907 mammograms from 1555 patients, organized in two pathology categories: Mass images (50.5%) and Calcification images (49.5%). Mammograms were collected with two different views of each breast (i.e., MLO and CC). Images have an average size of 3000 × 4800 pixels and are associated with pixel-level ground truth for the location and type of suspicious regions. Each mammogram may contain one or multiple lesions of different sizes and locations. Moreover, our experimental datasets have different resolutions and capture qualities, as can be observed visually in Fig. 3; this is due to the different modalities used to acquire the mammograms. Consequently, performance results varied, as demonstrated using multiple test sets.

Data Preparation
Mammograms are collected using digital X-ray mammography, a scanning technique that usually compresses the breast. This may generate deformed breast regions and degrade the quality of mammography images [59,60]. Therefore, some preprocessing steps should be applied to correct the data and remove additional noise [44,45]. In this work, we applied histogram equalization only to the CBIS-DDSM and private datasets to enhance any compressed region and create a smooth pixel equalization that helps distinguish suspicious regions from normal regions. We did not enhance the INbreast dataset, as it was correctly acquired using Full-Field Digital Mammography (FFDM) and its quality is therefore satisfactory. Furthermore, our suggested YOLO-based model requires mammograms together with the coordinates of the regions of interest (ROIs) that surround the breast lesions. From the existing ground truth representing the experts' annotations, we extracted the lesion coordinates (x, y, width and height) and the class (mass or calcification). Next, mammograms were resized using bi-cubic interpolation over a 4 × 4 neighborhood. For experimental reasons, we used an image size of 448 × 448 because the input size should be divisible by 32 according to the DarkNet backbone architecture of YOLO-V3, and this size should also fit in GPU memory.
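A minimal sketch of the histogram equalization step is shown below using a plain NumPy lookup-table implementation. In practice the pipeline could equally use a library routine such as OpenCV's `cv2.equalizeHist`, and the subsequent resize to 448 × 448 would use bicubic interpolation, e.g. `cv2.resize(img, (448, 448), interpolation=cv2.INTER_CUBIC)`.

```python
import numpy as np

def equalize_histogram(img):
    # img: 2-D uint8 mammogram. Build the cumulative distribution of the
    # 256 gray levels and map it linearly onto the full [0, 255] range.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = max(cdf[-1] - cdf_min, 1)  # guard against constant images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255)
    return lut.astype(np.uint8)[img]
```

This stretches the darkest occupied gray level to 0 and the brightest to 255, which is what makes compressed low-contrast regions easier to separate from normal tissue.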
Training deep learning models requires a large amount of annotated data to maintain generalization. For medical applications, most collected datasets have a small number of instances and often suffer from an imbalanced distribution, which remains a challenge for training deep learning models [61]. To overcome this problem, two solutions have recently been employed in many studies: data augmentation and transfer learning. Data augmentation offers a process to experimentally increase the size of the dataset [2,8,10,12,18,39,43,45]. In this paper, for the detection task in particular, we augmented the original mammograms six-fold. We rotated the original images by the angles θ = {0°, 90°, 180°, 270°} and transformed them using the Contrast Limited Adaptive Histogram Equalization (CLAHE) method [62] with two variations: a tile grid size of (4, 4) with a contrast threshold of 40, and a tile grid size of (8, 8) with a contrast threshold of 30. Thus, a total of 18,909, 1410, and 2922 mammograms were collected for CBIS-DDSM, INbreast, and the private dataset, respectively, to train and test the proposed model.
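The six-fold augmentation can be sketched as follows. The CLAHE step is passed in as a callable so the sketch stays library-agnostic; in practice it would be something like `cv2.createCLAHE(clipLimit=40, tileGridSize=(4, 4)).apply`. The function name and interface are our own assumptions.

```python
import numpy as np

def augment_mammogram(img, clahe_variants=()):
    """Return the six training variants of one mammogram.

    Four rotations (0, 90, 180, 270 degrees; the 0-degree copy is the
    original image) plus one transformed copy per CLAHE configuration.
    """
    rotated = [np.rot90(img, k) for k in range(4)]
    enhanced = [clahe(img) for clahe in clahe_variants]
    return rotated + enhanced
```

With the two CLAHE settings used in this work, (clip 40, grid 4 × 4) and (clip 30, grid 8 × 8), each mammogram yields six variants.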
Deep learning models start by initializing the trainable parameters (i.e., weights and biases). Two methods are commonly adopted for this: random initialization and transfer learning [2,10,19,43,45,49,63,64]. In our study, we relied solely on transfer learning, using the weights of a model pre-trained on a larger annotated dataset (e.g., ImageNet, MSCOCO) and then re-training and fine-tuning the weights on our specific task and augmented dataset. This helped accelerate convergence and avoid overfitting. Hence, we used weights trained with the DarkNet backbone framework on the MSCOCO dataset. The pre-trained model architecture was originally based on the VGG-16 model.

Experiments and Results
All experiments using the proposed deep learning model were conducted on a PC with the following specifications: an Intel(R) Core(TM) i7-8700K processor (3.70 GHz) with 32 GB RAM and one NVIDIA GeForce GTX 1080 Ti GPU.

Evaluation Metrics
In this study, we used only object detection and classification measures to evaluate the performance of our YOLO-based model. To verify the true detection of breast lesions in the mammograms, we first measured the intersection over union (IoU) score between each detected box and its ground truth, and then tested whether it exceeded a particular confidence score threshold that will be discussed later. Eq. (3) details the IoU score formula:

IoU score = Area of Intersection / Area of Union (3)

We also relied on another objective measure that considers the predicted class probability of the truly detected boxes. Inspired by the work in [65], we computed the number of truly detected masses and calcifications over the total number of mammograms, as defined in Eq. (4).

Detection accuracy rate = True detected cases / Total number of cases (4)

This means we excluded cases having a lower IoU score before computing the final detection accuracy rate. Indeed, only predicted boxes with confidence probability scores equal to or greater than the confidence score threshold were considered when computing the final detection accuracy rate. We measured the detection accuracy rate both globally and for each class independently to evaluate the performance of the simultaneous detection and classification.
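A sketch of the detection accuracy rate of Eq. (4), with the IoU gating described above, is given below. The names, the per-case representation, and the default IoU threshold of 0.5 are our own assumptions for illustration.

```python
def detection_accuracy_rate(cases, iou_threshold=0.5, conf_threshold=0.35):
    # cases: one (best_iou, confidence) pair per mammogram. A case counts
    # as truly detected only if its box overlaps the ground truth enough
    # AND its confidence passes the threshold (0.35 in our tuning).
    detected = sum(1 for best_iou, conf in cases
                   if best_iou >= iou_threshold and conf >= conf_threshold)
    return detected / len(cases)
```

For instance, of the three hypothetical cases (0.6, 0.4), (0.3, 0.9) and (0.7, 0.2), only the first passes both gates, giving a rate of 1/3.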

Hyperparameters Tuning
The proposed YOLO-based model presents a list of hyperparameters that includes the learning rate, number of epochs, dropout rate, batch size, number of hidden units, confidence score threshold and so on. Considering their effect on model performance, only three hyperparameters were selected for tuning. For all datasets, we randomly split the mammograms of each class into groups of 70%, 20%, and 10% for the training, testing, and validation sets, respectively.
In each experiment, the trainable parameters were fixed and one hyperparameter was varied. For all experimental datasets, we used the Adam optimizer, and all experiments were reported using the detection accuracy rate. First, we set the learning rate to 0.001, the number of epochs to 100 and the batch size to 64 following the work [45], and then trained the model with different confidence score thresholds until we found the value that provided satisfying detected objects for further tasks (i.e., segmentation and shape classification). As shown in Fig. 4a, the best confidence score value for all datasets is 0.35, i.e., we accept all detected objects for which the model is more than 35% confident. Next, we repeated the experiments while varying the learning rate to find the best detection accuracy rate for all datasets, as shown in Fig. 4b. In addition, an early-stopping strategy was used in the second half of the iterations to reduce the learning rate by 10% if the loss function did not decrease over 10 epochs. Next, we selected the best learning rate, 0.001, and varied the batch size to find the best results for the three datasets, as illustrated in Fig. 4c. Finally, we set the learning rate to 0.001 and the batch size to 16, and varied the number of epochs; all datasets reported the best performance at 100 epochs, as shown in Fig. 4d.
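The data split and the plateau-based learning rate reduction described above can be sketched as follows; the helper names are hypothetical.

```python
import random

def split_dataset(image_ids, seed=0):
    # Random 70% / 20% / 10% split into training, testing and validation.
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_train, n_test = int(0.7 * len(ids)), int(0.2 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_test], ids[n_train + n_test:]

def maybe_reduce_lr(lr, losses, patience=10, factor=0.9):
    # Reduce the learning rate by 10% if the loss has not decreased
    # over the last `patience` epochs (applied in the second half of training).
    if len(losses) > patience and min(losses[-patience:]) >= losses[-patience - 1]:
        return lr * factor
    return lr
```

The fixed seed keeps the per-class splits reproducible across the hyperparameter sweeps.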

Results
Different experiments were conducted to assess the effect of varying the input image data and the target classes (i.e., mass, calcification) of our suggested YOLO-based model. Furthermore, additional experiments were conducted with the fusion models approach to improve the results.

Single Models Evaluation
The breast lesion detection and classification model was trained in different configurations over the mammography datasets. We varied the input data fed to the model and configured the classification with multiple classes using M2. The performance of the model is reported in Tab. 1.
The results show the advantage of data augmentation and resizing over the original mammography datasets. In fact, performance increased by 10% on the CBIS-DDSM dataset with almost half the inference time. Similarly, on INbreast, the model achieved a detection accuracy rate more than 6.5% higher with 40% less inference time. The same kind of improvement, by 29.6%, is noticed on the private dataset, with a 28% drop in inference time. Accordingly, using the augmented and resized datasets, we varied the prediction classes by training M1 independently on Mass and on Calcification, and M2 on both; the results are reported in Tab. 2. They show that the private dataset had the highest performance compared with the public datasets, which can be explained by the good resolution and the easy localization of most of the lesions in those mammograms. Moreover, the public datasets contain more deteriorated lesions that are harder to simultaneously detect and classify.
Accordingly, the results in Tab. 2 show the clear ability of the YOLO-based model to detect and classify mass lesions from entire mammograms better than calcification lesions. This is aligned with the differences between the two types of lesions in terms of shape, size and texture. In fact, calcifications are often small and randomly distributed in challenging positions within the breast [66]. As shown in Fig. 5, calcifications do not have a standard shape: they can be bilateral, thick linear, clustered, pleomorphic, vascular, etc. These varied shapes can limit the detection and classification of this type of lesion and yield more failed cases than for mass lesions. Fig. 5 shows a case of a coarse-like calcification with crossed thick lines of irregular size (left, from the CBIS-DDSM dataset), a case of pleomorphic calcifications with a random distribution (middle, from the INbreast dataset), and an example of clustered calcifications located on the pectoral muscle, which presents a challenging case in mammography (right, from the private dataset).
Moreover, we notice that both models obtained their best results for mass lesions on the private dataset and for calcification lesions on the INbreast dataset. This can be explained by the degraded quality of the digitized X-ray mammograms of the CBIS-DDSM dataset. Consequently, performance is affected by image quality, and our study shows that detection and classification strongly benefit from full-field digital mammography images, which involve direct digital acquisition and preserve the shape and texture of breast lesions [67]. Moreover, Tab. 2 demonstrates that training the model on both prediction classes slightly decreased the performance, which can be explained by the inability of the YOLO-based model to distinguish different types of lesions that have similar shapes. However, we proved the robustness of our suggested model for mass detection, with a maximum detection accuracy rate of 96.2% on the private dataset. All experiments had similar inference times, with a maximum value of 0.58 seconds. Examples from each dataset are illustrated in Fig. 6, where each detected lesion is shown with its confidence score. We clearly notice that multiple lesions were accurately detected in the same mammogram.

Fusion Models Evaluation
This study proposed an additional step to evaluate the simultaneous detection and classification model: an expanded evaluation that fuses models trained with different settings, as detailed in Section 3.2. Before presenting the results, the single models M1 and M2 were first evaluated on the best-selected mammograms from the augmented datasets. That is, for every set of predicted mammograms comprising the original and its five augmented images (i.e., rotated, transformed), we selected the image with the highest IoU score. Next, different models were fused into a new fusion model, as detailed in Tab. 3, and we measured the detection accuracy rate for each prediction class. It is clearly observed that our suggested fusion models approach improved the detection and classification results on mammography images. Indeed, fusion strategies have previously been reviewed for medical image segmentation [68][69][70]; our approach is a new decision-level fusion strategy for object detection and classification that demonstrates the advantage of fusing the results of multiple models.
Finally, a comparison of the mass detection results of the latest studies and similar methods is listed in Tab. 4. Our implemented method using the fusion models approach is both fast and accurate. Comparing the detection accuracy rate and inference time with the other works shows that we achieved a better overall performance on the public datasets: CBIS-DDSM with a detection accuracy rate of 95.7% and INbreast with a detection accuracy rate of 98.1%.
Note that the comparison with the state-of-the-art methods relied on both the detection accuracy rate and the testing inference time; thus, even though the work by Al-Antari et al. [43] outperformed our detection results on INbreast, it was more expensive than our implementation in terms of inference time. Additionally, the experiments in each work were based on different preprocessing techniques, which can perform differently on the two standard datasets (e.g., Dhungel et al. [8]). The quality of the predicted images also affirms the robustness of YOLO in successfully identifying breast lesions over the pectoral muscle, next to the breast nipple, or above dense tissues, as shown in Fig. 6. Experimental results showed that training the YOLO-based deep learning model is overall fast and accurate: our results outperform the SSD method [35], the Faster R-CNN model [44], the CNN model [17] and other machine learning techniques [8,16] that reached a maximum detection accuracy rate of 98% on the INbreast dataset but with significantly higher inference time. The comparison revealed that the YOLO model is the right choice for mass detection in mammography, as presented in other existing YOLO implementations [41,43,44] with a maximum detection accuracy rate of 97.27% on the INbreast dataset; our study enhanced the state-of-the-art result to 98.1%. However, a limitation of the proposed YOLO model lies in the training configuration, which depends on preparing the right format of input data. Input images must be accompanied by the true locations and class labels of the lesions during training. This requires extracting the lesion coordinates from the ground truth, and consequently the YOLO model has an input dependency.
In addition, this paper provided feasible and promising results using the proposed fusion models approach, which was introduced to join different models and lower the misprediction error. Moreover, as breast lesion detection plays a critical role in CAD systems and fully integrated breast cancer diagnosis [32,43,45], our methodology provided improved detection performance compared with recent deep learning models. This helps avoid propagating errors when conducting further diagnosis on the detected lesions.
For a complete clinical application that can assist radiologists, future work aims at extracting the correctly detected masses and calcifications and conducting lesions segmentation, shape and type classification (malignant or benign), and malignancy degree prediction of breast tumors. This will provide an entire framework for breast cancer diagnosis that may also include clinical reports analysis.