An update on the pathological classification of breast cancer

Affiliations.

  • 1 Translational Medical Sciences Unit, School of Medicine, University of Nottingham, Nottingham, UK.
  • 2 Department of Cellular Pathology, Nottingham University Hospitals NHS Trust, Nottingham City Hospital Nottingham, Nottingham, UK.
  • 3 Department of Anatomical and Cellular Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Ngan Shing Street, Shatin, NT, Hong Kong SAR.
  • 4 Department of Histopathology, St. Vincent's University Hospital, Dublin, Ireland.
  • PMID: 36482272
  • PMCID: PMC10108289
  • DOI: 10.1111/his.14786

Breast cancer (BC) is a heterogeneous disease, encompassing a diverse spectrum of tumours with varying morphological, biological, and clinical phenotypes. Although tumours may show phenotypic overlap, they often display different biological behaviour and response to therapy. Advances in high-throughput molecular techniques and bioinformatics have contributed to improved understanding of BC biology and refinement of molecular taxonomy with the identification of specific molecular subclasses. Although the traditional pathological morphological classification of BC is of paramount importance and provides diagnostic and prognostic information, current interest focusses on the use of single-gene and multigene assays to stratify BC into distinct groups to guide decisions on systemic therapy. This review considers approaches to the classification of BC, including their limitations, with particular emphasis on the fundamental role of morphology in establishing an accurate diagnosis of primary invasive carcinoma of breast origin. This forms the basis for further morphological characterization and for all other approaches to BC classification that are used to provide prognostic and therapeutic predictive information.

Keywords: breast cancer; classification; clinical; differentiation; grade; molecular; outcome; stage.

© 2022 The Authors. Histopathology published by John Wiley & Sons Ltd.

Conflict of interest statement

The authors have no conflicts of interest to declare.

A case of mucinous cystadenocarcinoma featuring complex papillary growth pattern ( A )…

A case of tall cell carcinoma with reversed polarity featuring nuclei placed away…

ORIGINAL RESEARCH article

Breast cancer detection and classification empowered with transfer learning.

Sahar Arooj

  • 1 Riphah School of Computing and Innovation, Riphah International University Lahore, Lahore, Pakistan
  • 2 Department of Computer Science, College of Computer Science and Information Technology (CCSIT), Imam Abdulrahman Bin Faisal University (IAU), Dammam, Saudi Arabia
  • 3 Faculty of Computing, Riphah International University Islamabad, Islamabad, Pakistan
  • 4 Department of Forensic Sciences, University of Health Sciences, Lahore, Pakistan
  • 5 Networks and Communications Department, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University (IAU), Dammam, Saudi Arabia
  • 6 Department of Software, Gachon University, Seongnam, South Korea
  • 7 John von Neumann Faculty of Informatics, Obuda University, Budapest, Hungary
  • 8 Institute of Information Engineering, Automation and Mathematics, Slovak University of Technology in Bratislava, Bratislava, Slovakia
  • 9 Faculty of Civil Engineering, TU-Dresden, Dresden, Germany

Cancer is a major public health issue in the modern world. Breast cancer is a type of cancer that starts in the breast and can spread to other parts of the body, and it is one of the most common causes of cancer death among women. Cancer develops when cells grow uncontrollably. There are various types of breast cancer; the proposed model addresses benign and malignant breast cancer. In computer-aided diagnosis systems, the identification and classification of breast cancer using histopathology and ultrasound images are critical steps. Over the last few decades, investigators have demonstrated the ability to automate the initial identification and classification of tumors. Early detection of breast cancer allows patients to obtain proper therapy and thereby increases their chances of survival. Deep learning (DL), machine learning (ML), and transfer learning (TL) techniques are used to solve many medical problems. Several studies in the literature address the categorization and identification of cancer tumors using various types of models, but they have limitations, and research is hampered by the lack of datasets. The proposed methodology is designed to help with the automatic identification and diagnosis of breast cancer. Our main contribution is that the proposed model applies transfer learning to three datasets, A, B, and C, and to A2, which is dataset A restricted to two classes. In this study, ultrasound images and histopathology images are used. The model used in this work is a customized CNN-AlexNet, which was trained according to the requirements of the datasets; this is also one of the contributions of this work. The results show that the proposed system empowered with transfer learning achieved higher accuracy than the existing models on datasets A, B, C, and A2.

Introduction

Medical imaging is a valuable tool for detecting the existence of various medical diseases and analyzing investigational outcomes. The use of biomedical imaging in cancer treatment is crucial. Cancer is a major public health issue in the modern world. According to the World Health Organization (WHO), cancer caused 9.6 million deaths in 2018 and an estimated 10 million deaths in 2020 ( 1 ). Breast tumors are caused by the uncontrolled growth of cells in the breast. Breast cancer (BC) is one of the most frequent malignancies in women and is estimated to affect more than 8% of women at some point in their lives. BC can start in any part of the breast, but the majority of cases begin in the lobules or ducts. BC can be detected early, however, allowing patients to obtain proper therapy and thereby increasing their chances of survival.

Imaging technologies such as magnetic resonance imaging (MRI), diagnostic mammography ( 2 ) (X-rays), thermography, and ultrasound (sonography) can help analyze and identify breast cancer ( 3 ). Ultrasound images are used in this study. Breast tumors are classified as benign or malignant. Benign tumor cells grow only in the breast and do not spread to other tissues. A malignant tumor is made up of cancerous cells that can grow uncontrollably, spread to other areas of the body, and invade other tissues. Because cancer cells vary in size, shape, and location, automatically detecting and localizing them in BC images is a major challenge. Machine learning (ML) ( 4 ) approaches have found widespread use in a variety of domains, including educational prediction, pattern recognition, image editing, feature reduction, defect diagnosis, face identification, micro-expression recognition, NLP, and medical diagnosis. They have shown particular potential in the diagnosis of breast cancer ( 5 ).

In recent decades, many researchers have proposed strategies for the automatic classification of cells in breast cancer detection ( 6 ). By identifying nucleus traits, breast cancer cells can be classified as benign or malignant. However, the efficiency and accuracy of such systems decrease because of the complexity of typical machine learning steps such as pre-processing, segmentation, and feature extraction. The more recently developed DL techniques can address these problems of traditional ML. With their strong feature representation, they can handle image classification and object localization tasks. The transfer learning approach pre-trains a network on a natural-image dataset such as ImageNet and then fine-tunes it for the target problem. The main benefit of transfer learning is that it improves classification accuracy and speeds up the training process.

First, the network parameters are pre-trained on the source data and transferred to the target domain, and then the system's parameters are adjusted for improved performance. This study used a TL-based model for classification and detection. The proposed model has two components: training and testing. BC classification can be done using a pre-trained CNN such as ResNet50, VGG16, VGG19, or Inception-ResNet-V2. In this work, we performed BC classification and detection using the AlexNet model. AlexNet is a powerful model that can achieve high accuracy even on difficult datasets. It is a well-established architecture for object identification and classification tasks and has a wide range of applications in computer vision. Some previous studies used AlexNet, but in this work we used a customized AlexNet model that has not been used in previous studies. In the customized AlexNet, the first layer and the last three layers of the architecture are modified: the newly modified layers are the image input layer, fully connected layer, softmax layer, and classification layer, while the remaining layers remain fixed. The customized model retains all of the image-processing features that it learned during pre-training. The main goals of this project were to detect and classify breast cancer, reduce training time, increase accuracy, and enhance classification performance.
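The layer replacement described in this paragraph can be sketched without a deep learning framework, as an operation on a list of layer names. This is only an illustration: the layer names, the helper `customize_alexnet`, the 227 × 227 × 3 input size, and the `pretrained` flag are all assumptions, since the source does not give these details.

```python
# Illustrative AlexNet layer order (names are assumptions, not from the source).
ALEXNET_LAYERS = [
    "input", "conv1", "relu1", "norm1", "pool1",
    "conv2", "relu2", "norm2", "pool2",
    "conv3", "relu3", "conv4", "relu4", "conv5", "relu5", "pool5",
    "fc6", "relu6", "drop6", "fc7", "relu7", "drop7",
    "fc8", "softmax", "classification",
]


def customize_alexnet(layers, input_size, num_classes):
    """Replace the first layer and the last three layers; keep the rest.

    Returns a list of (name, config, pretrained) tuples, where `pretrained`
    marks layers whose weights are carried over from the source network.
    """
    if len(layers) < 5:
        raise ValueError("expected at least five layers")
    new_first = ("input", {"size": input_size}, False)
    kept = [(name, {}, True) for name in layers[1:-3]]
    new_last = [
        ("fc_new", {"outputs": num_classes}, False),
        ("softmax_new", {}, False),
        ("classification_new", {"classes": num_classes}, False),
    ]
    return [new_first] + kept + new_last


# Hypothetical example: a two-class (benign vs. malignant) head.
model = customize_alexnet(ALEXNET_LAYERS, input_size=(227, 227, 3), num_classes=2)
replaced = [name for name, _, pretrained in model if not pretrained]
print(replaced)
```

In a real framework the kept layers would carry their pre-trained weights, and only the replaced layers would be initialized from scratch before fine-tuning.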

There are many previous studies on breast tumors using various types of models, but they have limitations, and research on breast cancer is constrained by the lack of publicly available benchmark datasets. The proposed system was evaluated on three datasets, A, B, and C, and on A2, which is dataset A restricted to two classes, for a total of 10,336 images. This study is the first to compare three common datasets and to suggest the use of customized transfer learning algorithms for breast cancer classification and detection on multiple datasets. Using the customized AlexNet, we achieved the best accuracy. This work used ultrasound images and histopathology images; sample ultrasound images are shown in Figure 1, and sample histopathology images are shown in Figure 2.

Figure 1 . Ultrasound image samples: (A) benign, (B) malignant, and (C) normal.

Figure 2 . Histopathology image samples: (A) benign and (B) malignant.

This study is divided into five sections. Section 2 is the literature review, section 3 is the proposed system model, section 4 is the simulation and results, and section 5 is the conclusion of this work.

Literature Review

Diagnosing BC is a challenge for researchers. To address it, various models and techniques such as ML, DL, and TL are used. Researchers have used datasets based on mammography (X-rays), magnetic resonance imaging (MRI), ultrasound (sonography), and thermography to diagnose breast cancer.

According to the findings of ( 2 ), fractal dimension (FD) is the best indicator of ruggedness for regular elements. Breast lumps are uneven and can vary from malignant to benign; as a result, the breast is one of the best places to apply fractal geometry. The support vector machine (SVM), on the other hand, is a newer classification technique. The authors employed two techniques, FA: SVM and the box count method (BCM), in distinct operations that produced good results in their respective areas. The BCM is used to extract features. The extracted feature, FD, assesses the complexity of the 42-image input dataset. The resulting FD is then processed by the SVM classifier, which classifies cells as malignant or benign. Their highest accuracy is 98.13%.
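The source does not describe how ( 2 ) implemented the BCM. As a generic illustration only, box counting estimates FD as the slope of log N(s) against log(1/s), where N(s) is the number of boxes of side s that contain part of the shape. The function names, the toy shapes, and the box sizes below are assumptions chosen so that the expected dimensions (2 for a filled square, 1 for a line) are easy to check.

```python
import math


def box_count(points, box_size):
    """Count the boxes of side `box_size` that contain at least one point."""
    return len({(x // box_size, y // box_size) for x, y in points})


def fractal_dimension(points, box_sizes):
    """Estimate FD as the least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy shapes on a 64 x 64 pixel grid (foreground pixel coordinates).
filled_square = [(x, y) for x in range(64) for y in range(64)]
straight_line = [(x, 0) for x in range(64)]
sizes = [1, 2, 4, 8, 16]

fd_square = fractal_dimension(filled_square, sizes)
fd_line = fractal_dimension(straight_line, sizes)
print(round(fd_square, 3), round(fd_line, 3))
```

An irregular lesion boundary would typically give a non-integer value between these two extremes, and that single number is the kind of feature that can be passed on to a classifier.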

Breast cancer is a major disease among women between the ages of 59 and 69. The authors of ( 4 ) also showed that finding small tumors early improves prognosis and significantly reduces death rates. Mammography is a useful diagnostic screening method. However, because of small differences in tissue density within mammography images, mammography interpretation is challenging. This is particularly true for dense breast tissue, and according to this study, screening is more effective in fatty breast tissue than in dense breast tissue. Their research focuses on BC detection, as well as risk factors and breast cancer assessment. It also addresses the early diagnosis of BC using 3D MRI mammographic technologies and the classification of mammography images using ML.

The study in ( 5 ) proposes a heterogeneous, efficient machine learning strategy for the early detection of breast cancer. The suggested method follows the CRISP-DM process and uses stacking to construct the ensemble model, which involves three algorithms: KNN, SVM, and decision tree (DT). This meta-classifier's performance is compared with the individual performance of DT, SVM, and KNN, with other classifiers (NB, SGD, LR, and ANN), and with a homogeneous ensemble model of (KNN, SVM, DT) and (RF). Using chi-square, the top five features are selected: glucose, resistin, HOMA, insulin, and BMI. At K = 20, the proposed ensemble model has the best accuracy of 78% and the smallest log loss of 0.56, rejecting the null hypothesis. The one-tailed t-test yields a P-value of 0.014, which is below the significance level of α = 0.05.

In ( 7 ), the authors tested the performance of features transferred from a pre-trained model on a dataset of 1,125 breast ultrasound cases. Their dataset is composed of 2,392 regions of interest (ROIs), each labeled as cystic, malignant, or benign. Features were extracted from each ROI using a convolutional neural network (CNN) ( 6 ) and used to train support vector machine (SVM) classifiers. For comparison, classifiers were also trained on previously extracted, human-designed tumor features. Classifiers trained on CNN-extracted features performed similarly to classifiers trained on human-designed features. The SVM ( 8 ) trained on both human-designed and CNN-extracted features achieved 90% accuracy in the classification task. In the task of distinguishing malignant from benign, the accuracy of the SVM trained on CNN features was 88%, compared with 85% for the SVM trained on human-designed features. DL methods currently in use rely on large datasets. It is worth noting that the study's dataset is not publicly available.

In ( 8 ), the authors look at the potential uses of machine learning for brain disorders. They show why machine learning is generating so much interest among researchers and clinicians in the field of brain disorders ( 9 ) by highlighting three main applications: predicting disease onset, assisting with diagnosis, and predicting longitudinal outcomes. After presenting these applications, they explore the hurdles that must be overcome for a successful translational implementation of machine learning in routine psychiatric and neurologic care.

The study in ( 10 ) used two breast ultrasound datasets from two different systems. The first is called breast ultrasound images (BUSI) and contains a total of 780 images (133 normal, 210 malignant, and 437 benign). Dataset B has 163 images (110 benign and 53 malignant). The authors used a generative adversarial network (GAN) for data augmentation. Researchers can access the BUSI dataset for free. DL algorithms are applied in this study for breast ultrasound classification. The authors compare the performance of two alternative methods, a CNN-AlexNet approach and a transfer learning technique, with and without augmentation. Their network is trained with a learning rate of 0.0001 for 60 epochs. They achieved accuracies of 94% on the BUSI data, 92% on dataset B, and 99% with augmentation.

In ( 11 ), the authors introduce a publicly available collection of 7,909 breast cancer histopathology images. Both benign and malignant images are included in the dataset. The task associated with this dataset is to automatically classify these images into two categories, which would be a useful computer-aided diagnosis tool for the clinician. The accuracy ranges from 80 to 85%, indicating that there is still room for improvement. To evaluate the feature sets, they used multiple classifiers: KNN, SVM, quadratic linear analysis, and RF.

The study in ( 12 ) proposes the use of DL techniques for breast ultrasound lesion identification and investigates three alternative methods: a patch-based LeNet, transfer learning ( 13 ), and a U-Net approach with the AlexNet model. Two conventional ultrasound image datasets acquired from two separate ultrasound devices are compared and contrasted in this study. Dataset A contains 306 images (246 benign and 60 malignant), while dataset B has a total of 163 images (110 benign and 53 malignant). The authors employed grayscale ultrasound images that were divided into 28 × 28 patches. The network is trained using RMSProp with a learning rate of 0.01 for 60 epochs. With the AlexNet model, they attained a maximum accuracy of 91% for dataset A and 89% for dataset B.
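Dividing a grayscale image into 28 × 28 patches can be sketched as follows. The source does not state whether the patches in ( 12 ) overlap or how image borders were handled, so the non-overlapping stride and the skipping of incomplete border patches are assumptions, as is the function name `extract_patches`.

```python
def extract_patches(image, patch_size=28, stride=28):
    """Split a 2-D grayscale image (list of rows) into square patches.

    Patches that would run past the image border are skipped.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# A synthetic 84 x 56 "image" whose pixel value encodes its position.
image = [[r * 56 + c for c in range(56)] for r in range(84)]
patches = extract_patches(image)
print(len(patches))  # 3 rows of patches x 2 columns of patches
```

Setting `stride` below `patch_size` would produce overlapping patches, which increases the number of training samples at the cost of redundancy.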

In ( 14 ), a DL model based on the TL methodology is built, using two evaluation schemes (cross-validation and an 80–20 split), to help in the automatic detection and identification of suspicious breast cancer areas. Deep learning architectures are designed to solve specific problems, whereas transfer learning applies what has been learned on one problem to another. The authors used six evaluation metrics to assess the proposed model's performance. To train the model, they used a learning rate of 0.01 and 60 epochs. Transfer learning is effective in detecting breast cancer by categorizing mammogram images of the breast, with overall accuracy, sensitivity, specificity, precision, F-score, and accuracy of 98.96%, 97.83%, 99.13%, 97.35%, 97.6%, and 95%, respectively.
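Most of the evaluation metrics named in this paragraph are derived from the four counts of a binary confusion matrix. A minimal sketch follows; the counts in the example are hypothetical and are not taken from ( 14 ), and the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion counts.

    tp/fn count malignant cases predicted malignant/benign;
    tn/fp count benign cases predicted benign/malignant.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall, true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical counts for 100 malignant and 100 benign test images.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch does not guard against empty classes; a zero denominator (for example, no positive predictions) would raise `ZeroDivisionError` and needs explicit handling in practice.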

In ( 15 ), the authors investigate a quantitative solution to a machine learning problem. They used transfer learning to train a set of hybrid traditional neural networks based on the work of Azevedo et al. ( 15 ). Their aim was to tackle the BCDR task of classifying full-image mammograms as malignant or benign. Data collected in this study were used throughout our research to illustrate the regions of the mammograms that the networks were targeting while measuring various performance indicators. They also indicate that some designs perform much better than others depending on the task. According to their findings, the greatest accuracy is 84%.

The authors of ( 16 ) demonstrate that the early detection and classification of breast cancer are critical in helping patients take appropriate action. Mammography images, however, have low sensitivity and efficiency for identifying breast cancer, and MRI has a higher sensitivity for detecting breast cancer than mammography. In this study, a novel Back Propagation Boosting Recurrent Widening Model (BPBRW) with a Hybrid Krill Herd African Buffalo Optimization (HKH-ABO) method is created to diagnose breast cancer at an earlier stage using breast MRI data. The system is initially trained using breast MRI images. The proposed BPBRW with the HKH-ABO mechanism then distinguishes between benign and malignant breast tumors. Python is used to simulate this model. The authors report that their model has a 99.6% accuracy rate.

In ( 17 ), the authors constructed four distinct predictive models and proposed data exploratory techniques (DET) to increase breast cancer detection accuracy. Before building the models, they examined four-layered essential DET (feature distribution, correlation, elimination, and hyperparameter optimization) to find the most robust features for classification into malignant and benign classes. The proposed approaches and classifiers were tested on the WDBC and BCCD datasets. To evaluate each classifier's efficiency and training time, standard performance metrics such as confusion matrices and K-fold approaches were used. With DET, the models' diagnostic capacity improved: on the WDBC dataset, polynomial SVM achieved 99.3% accuracy, LR 98.06%, KNN 97.35%, and EC 97.61%.

The goal of ( 18 ) was to create a hierarchical breast cancer system model that would improve detection accuracy and reduce breast cancer misdiagnosis. To categorize breast tumors and compare performance, ANN and SVM were applied to the dataset. The SVM utilizing radial features produced the best classification accuracy of 91.6%, whereas the ANN obtained 76.6%. As a result, SVM was used to draw conclusions about the importance of breast screening. The second stage involved applying transfer learning to train AlexNet, InceptionV3, and ResNet101. According to the data, AlexNet scored 81.16%, ResNet101 scored 85.51%, and InceptionV3 scored 91.3%.

The authors of ( 19 ) present a framework based on transfer learning. In addition, a variety of augmentation procedures, including multiple rotation combinations, scaling, and shifting, were implemented to prevent overfitting and produce consistent outcomes by expanding the number of screened mammography images. Their proposed solution was tested on the Mammographic Image Analysis Society (MIAS) database and achieved an accuracy of 89.5% using ResNet50 and 70% using the NASNet-Mobile network. According to their results, pre-trained classification networks are much more efficient and effective, making them more suitable for diagnostic imaging, especially with small training datasets.
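Rotation and shifting, two of the augmentation operations mentioned for ( 19 ), can be illustrated on a tiny image stored as a list of rows. This is a simplified sketch, not the authors' pipeline: rotations are restricted to multiples of 90 degrees, shifts pad with zeros, scaling is omitted, and all function names are assumptions.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def shift(image, dx, dy, fill=0):
    """Translate an image by (dx, dy) pixels, padding exposed pixels with `fill`."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def augment(image):
    """Return the original plus three rotations and two one-pixel shifts."""
    variants = [image]
    rotated = image
    for _ in range(3):
        rotated = rotate90(rotated)
        variants.append(rotated)
    variants.append(shift(image, dx=1, dy=0))
    variants.append(shift(image, dx=0, dy=1))
    return variants


tiny = [[1, 2], [3, 4]]
augmented = augment(tiny)
print(len(augmented))  # one original image becomes six training samples
```

For non-square images the 90 and 270 degree rotations swap height and width, so a real pipeline would resize or crop the variants back to the network's input size.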

The authors of ( 20 ) used machine learning-based algorithms to help the radiologist read mammography images and classify the tumor in an acceptable amount of time. They extracted a number of features from the mammogram's region of interest, which the physician labeled manually. These features are fed to a classification engine to train and build the suggested classification models. They tested the suggested system's accuracy using a dataset that the model had never encountered before. This research found that a variety of circumstances can affect the results, which they ignored after investigating. After combining feature selection and optimization approaches, the study advises employing the optimized SVM or Naive Bayes, which provided 100% accuracy.

The research in ( 21 ) focuses on employing TL with fine-tuning and on training the CNN with regions derived from the INbreast and MIAS datasets to apply, evaluate, and compare architectures such as AlexNet, GoogLeNet, VGG19, and ResNet50 for classifying breast lesions. The authors looked at 14 classifiers, each of which corresponded to benign or malignant microcalcifications and masses, as several previous studies have done. With the CNN, they obtained the best results. With an AUC of 99.29%, an F1 score of 91.92%, accuracy of 91.92%, precision of 92.15%, sensitivity of 91.70%, and specificity of 97.66% on a balanced database, GoogLeNet is the best model in a CAD system for breast cancer.

The effectiveness of BC classification into malignant and benign tumors was evaluated using several machine learning algorithms (k-NN, RF, and SVM) and ensemble methods to predict BC survival with 10-fold cross-validation. The research in ( 22 ) used a dataset from WDBC that included 23 selected variables measured for 569 people, of whom 212 had malignant tumors and 357 had benign tumors. The analysis looked at the characteristics of the tumors using the mean, worst values, and standard error. There are 10 properties for each feature. According to the results, AdaBoost has the maximum accuracy for 30 features (98.95%), 10 mean features (98.07%), and 10 worst features (98.77%), with the lowest error rate. To obtain the best accuracy rate, the recommended approaches were also evaluated using 2-, 3-, and 5-fold cross-validation. When all approaches were compared, AdaBoost ensemble methods had the highest accuracy, with 98.77% for 10-fold cross-validation and 98.41% and 98.24% for 2- and 3-fold cross-validation, respectively. With 5-fold cross-validation, however, SVM produced the highest accuracy rate of 98.60% with the lowest error rate.
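The k-fold cross-validation used in ( 22 ) partitions the samples into k folds and uses each fold once as the test set. A minimal index-splitting sketch is shown below; the shuffling, the fixed seed, and the function name are assumptions, and only the sample count of 569 and the value k = 10 come from the text.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 569 samples split into 10 folds.
folds = list(k_fold_indices(n_samples=569, k=10))
print([len(test) for _, test in folds])
```

The reported accuracy is then the average over the k test folds. With imbalanced classes such as 212 malignant versus 357 benign, a stratified split that preserves the class ratio in each fold is a common refinement that this sketch does not implement.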

Breast cancer affects a large number of people all around the world. Mammography is a key advancement in breast cancer detection, but the intricate structure of the images makes cancer difficult for doctors to recognize. The research in ( 23 ) suggests using a CNN to detect cancer cells early. By separating malignant and benign mammography images, detection and accuracy can be greatly improved. The BreakHis ×400 database comes from Kaggle, and the architectures NASNet-Large, DenseNet-201, Big Transfer (M-r101x1x1), and Inception ResNet-V3 perform well. Among them, M-r101x1x1 has the maximum accuracy of 90%. The main goal of their research is to use selected neural networks to accurately classify breast cancer. This research could help to enhance the systematic diagnosis of early-stage breast cancer.

Although there are several scientific studies on the categorization and identification of cancer tumors using various types of models, they have limitations, and research on breast cancer is constrained by the lack of publicly available benchmark datasets. The study in ( 14 ) used multiple models such as ResNet50, Inception V3, Inception-ResNet-V2, VGG19, and VGG16, but its dataset is small, it works on a single dataset, and its maximum accuracy is 98.96%. The study in ( 10 ) applied transfer learning to two different datasets; the datasets are good, but the maximum accuracy is 94% on the BUSI dataset and 92% on dataset B. The study in ( 12 ) also used two different datasets with multiple CNN models and achieved its maximum accuracy with AlexNet: 91% on dataset A and 89% on dataset B. The study in ( 11 ) used a good, large dataset but achieved a maximum accuracy of only 80–85%. Table 1 compares previous studies in terms of accuracy and limitations. Previous studies ( 4 , 5 , 7 , 10 – 12 , 14 ) have limitations such as small numbers of images, low accuracy, reliance on hand-crafted features, lack of diverse datasets, no publicly available dataset, and imbalanced class sizes.

www.frontiersin.org

Table 1 . Comparison and limitations of previous studies.

The primary contributions of this work are as follows:

• This work uses three different breast cancer datasets and compares their results on the same model.

• It improves the accuracy of classification and detection by customizing the AlexNet model.

• The proposed model achieves its maximum accuracy using transfer learning approaches.

Proposed Model

Machine learning techniques were applied to assess and identify diseases in medical images. Many ML ( 24 ) and DL ( 25 ) approaches have been widely employed in medical image processing in recent years to detect and evaluate objects in medical images. Using DL techniques to detect breast cancer at an early stage helps medical practitioners determine its therapy, and a variety of DL and transfer learning approaches have been used for early diagnosis. Figure 3 shows the application-level representation of the suggested system model.

Figure 3 . Application level of the proposed system.

In deep learning, a few steps are essential: data acquisition, data pre-processing, and then training on the datasets. When the data are image-based, deep learning methods give more accurate results than classical machine learning, which is why we used a deep learning-based solution. There are various kinds of models, such as CNN and KNN, and they require computational resources such as processing power. When computational resources are limited, as they are here, transfer learning is preferable to other deep learning models, so we used transfer learning to save computing power. We took the pre-trained AlexNet model and customized it for our problem. After that, we stored the model in the cloud so that it can be reused.

The detailed proposed system model is shown in Figure 4 . The proposed method for breast cancer identification and classification contains two major components: the first is pre-processing and training, and the second is testing. Based on deep learning techniques, the proposed system model accepts images to help in the classification and early detection of the disease at various stages. The training data, consisting of ultrasound and histopathology images, were collected in raw form from previous research and the Kaggle repository. The pre-processing layer handled the raw data by converting the images to the input size required by AlexNet, 227 × 227, and the pre-trained AlexNet model was customized for transfer learning.

Figure 4 . Proposed system model of BC identification and detection.

The second layer is training. For training, this study used the pre-processed 227 × 227 images and imported the customized pre-trained model. The model must be retrained if the learning criteria are not met; otherwise, the trained model is saved in the cloud. The trained model detects and classifies breast images into three categories: benign, malignant, and normal. If the patient is normal, there is no need to visit the doctor; if the patient shows symptoms of a benign or malignant tumor, she is referred to a doctor for treatment of BC. Table 2 shows the pseudocode of the proposed algorithm.

Table 2 . Pseudocode of the proposed model.

In general, constructing a healthcare system that employs deep learning requires a dataset. Three separate datasets of breast images are used in this investigation, referred to as datasets A, B, and C. Dataset A is collected from ( 10 , 26 ), dataset B from ( 11 , 27 ), and dataset C from ( 28 ). Dataset A contains breast ultrasound images divided into three categories: normal, benign, and malignant. Dataset B contains histopathology images of malignant and benign breast cancers, taken as part of a clinical investigation. Dataset C also contains histopathology images, divided into two categories: malignant and benign. We also used dataset A with only two classes, benign and malignant, and call it dataset A2. The number of images in each dataset is shown in Table 3 .

Table 3 . Dataset parameters.

After data collection, the images are pre-processed. Pre-processing is critical for removing abnormalities that limit observation and for adjusting the image dimensions to those required by the AlexNet model. It can improve image quality and make the results more precise. Splitting the data is an important step for training and testing a model. In the proposed model, each dataset is split randomly into 80% for training and 20% for testing.
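The random 80/20 split can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB code: the function name `split_indices`, the fixed seed, and the round-half-up rule for the training count are assumptions. The dataset sizes are the totals quoted in the results section.

```python
import random

def split_indices(n_images, train_pct=80, seed=0):
    """Randomly split image indices into (train, test) lists."""
    # Integer round-half-up avoids floating-point surprises (assumed rule).
    n_train = (n_images * train_pct + 50) // 100
    indices = list(range(n_images))
    # The seed is an assumption, used only to make the sketch repeatable.
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]

# Total image counts per dataset, as quoted in the text.
dataset_sizes = {"A": 780, "B": 7783, "C": 1126, "A2": 647}

split_sizes = {}
for name, n_images in dataset_sizes.items():
    train, test = split_indices(n_images)
    split_sizes[name] = (len(train), len(test))
    print(f"dataset {name}: {len(train)} training, {len(test)} testing")
```

Under these assumptions the test-set sizes come out as 156, 1,557, 225, and 129 images, which agree with the totals of the confusion matrices reported for datasets A, B, C, and A2.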

Transfer Learning

Transfer learning is a technique in which a CNN model that has been trained to learn features across a wide range of domains is reused for a new task. The proposed TL method is based on AlexNet. The breast images are in grayscale. To make model training easier, pre-processing steps such as resizing the images to 227 × 227 were applied. This study divided each dataset randomly into training and testing groups so that the models could identify the significant elements in each image and score highly on the test set. The AlexNet model was trained on all datasets (A, B, and C), and the trained model is stored and reused.

The proposed methodology customizes the AlexNet model. AlexNet is a network with eight layers of learnable parameters: five convolutional layers, some followed by max pooling, and three fully connected layers. ReLU, a non-linear activation function, is used throughout the network. The network's input layer reads the images from the pre-processing layer. Early convolutional layers use filters to extract general features, such as edges, while preserving the spatial relationship between pixels; later convolutional layers extract progressively more specific features. The fully connected layers learn disease features to categorize images into specific classes.
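To see why the input must be 227 × 227, it helps to trace the spatial size of the feature maps through the convolution and pooling layers. The sketch below does this with the usual output-size formula. The kernel, stride, and padding values are those of the standard AlexNet design and are not stated in this article, so treat them as assumptions about the exact variant used.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer on an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding) for the standard AlexNet layout (assumed).
layers = [
    ("conv1", 11, 4, 0), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]

size = 227  # input height and width
sizes = {}
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes[name] = size
    print(f"{name}: {size} x {size}")

# conv5 has 256 output channels in the standard design.
flattened = size * size * 256
print("inputs to the first fully connected layer:", flattened)
```

With these values the last pooling layer yields 6 × 6 maps, so the first fully connected layer receives a fixed-length vector. An image of a different size would change that length, which is why resizing is part of pre-processing.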

This CNN ( 29 ) was modified to suit our needs, and the pre-processed images were then loaded into the proposed AlexNet transfer learning model ( 30 ). For this problem, the first layer and the last three layers of the architecture are modified: the newly modified layers are the image input layer, the fully connected layer, the softmax layer, and the classification layer, while the remaining layers stay fixed. This customized network is used for TL. The first layer sets the input dimension to 227 × 227, and the last three layers are configured according to the labels of the output classes so that they can assign images to their respective groups. The output size of the fully connected layer, which equals the number of classes, is its input parameter. In the proposed model, the fully connected layer connects to three classes: benign, malignant, and normal. The softmax layer applies the softmax function to its input. A fully connected layer learns class-specific features to differentiate between classes, so the fully connected layers are altered according to the dataset classes. To assign images to distinct class labels, the proposed network is trained on multi-class breast cancer labels.
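The role of the final softmax and classification layers can be illustrated with a small, self-contained sketch. The three input scores stand in for the outputs of the replaced fully connected layer and are hypothetical values, not results from the trained model; the function names are likewise illustrative.

```python
import math

CLASSES = ["benign", "malignant", "normal"]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Classification step: pick the class with the highest probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs

# Hypothetical outputs of the three-unit fully connected layer.
label, probs = classify([2.0, 0.5, -1.0])
print(label, [round(p, 3) for p in probs])
```

For a two-class dataset such as B, C, or A2, the same logic applies with a two-element score vector and class list.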

The learning rate and the number of epochs are two of the parameters that can be set as training options, and both are used to train the network. Training was run for 10, 30, and 50 epochs, and the best result was obtained at 50 epochs with a learning rate of 0.0008. Stochastic gradient descent with momentum (SGDM) is used as the optimization technique. The newly edited layers use these training settings for the breast cancer datasets. The CNN layers are responsible for extracting the general features of images; the learned parameters are then reused to classify and identify new datasets. Customized and trained models are stored in the cloud and can be reused. During the validation stage, pre-processed images are passed to the customized AlexNet model. The customized model retains all of the image features it learned during training, so it assesses the images and classifies them as normal, benign, or malignant. After classification, if the patient is normal, there is no need to visit a doctor; if the patient shows symptoms of disease, she is referred to a doctor.
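The SGDM update rule itself is short enough to show directly. The sketch below applies it to a toy one-parameter quadratic loss rather than to a network, using the learning rate of 0.0008 from the text. The momentum value of 0.9, the toy loss, and the number of steps are assumptions made only for illustration.

```python
def sgdm_step(weight, velocity, grad, lr=0.0008, momentum=0.9):
    """One SGDM update: the velocity accumulates a decaying sum of past gradients."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity

def loss(w):
    """Toy quadratic loss with its minimum at w = 3 (hypothetical)."""
    return (w - 3.0) ** 2

def grad(w):
    """Gradient of the toy loss."""
    return 2.0 * (w - 3.0)

w, v = 0.0, 0.0
history = [loss(w)]
for _ in range(300):
    w, v = sgdm_step(w, v, grad(w))
    history.append(loss(w))

print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.6f}, w = {w:.4f}")
```

In a real training run the gradient comes from back-propagation over a mini-batch of images, and the same update is applied to every trainable weight.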

Simulation and Results

Breast cancer is caused by the uncontrolled growth of cells in the breast, and it is one of the most frequent malignancies in women. BC is estimated to affect more than 8% of women at some point in their lives. However, BC can be detected early, allowing patients to obtain proper therapy and so increase their chances of survival. Many previous studies have examined breast tumors using various types of models, but with some limitations; in particular, studies on breast cancer are limited by the lack of publicly available benchmark datasets. In this study, we worked on three datasets, A, B, and C, together with A2 (dataset A restricted to two classes), for a total of 10,336 images, which is a sizeable collection. This study is the first to compare three common datasets, and it suggests the use of a customized AlexNet transfer learning algorithm for breast cancer classification and detection on multiple datasets. Table 14 shows that previous studies do not give accurate results, which is why a better solution is needed to diagnose breast cancer more accurately. Some previous studies used AlexNet, but this work uses a customized AlexNet model that has not been used in previous studies. The customized model retains all of the image features it learned during training. The good results achieved with the customized AlexNet are presented in this section.

In this section, multiple tests were carried out to investigate the performance of the model on datasets A, B, and C. The datasets were categorized into benign, malignant, and normal classes. AlexNet ( 31 ) was used to create the proposed model for detecting and classifying breast cancer. The classification experiments were run in MATLAB 2020a, and evaluation metrics are used to assess the results. In the training phase, 80% of each dataset is used for training and 20% for testing. Transfer learning is applied to AlexNet, and the results are compared in terms of accuracy (Acc), sensitivity (Sen), specificity (Spe), false-negative ratio (FNR), misclassification rate (MCR), false-positive ratio (FPR), true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) ( 24 , 25 ). These measures quantify a predictive model's performance.
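For reference, the measures listed above are conventionally defined from the four confusion-matrix counts as follows. These are the standard textbook definitions; the article's own formulas are not reproduced in this text, so the exact form used by the authors is an assumption.

```latex
\begin{align*}
\mathrm{Acc} &= \frac{TP + TN}{TP + TN + FP + FN}, &
\mathrm{MCR} &= 1 - \mathrm{Acc} = \frac{FP + FN}{TP + TN + FP + FN}, \\
\mathrm{Sen} &= \frac{TP}{TP + FN}, &
\mathrm{FNR} &= 1 - \mathrm{Sen} = \frac{FN}{TP + FN}, \\
\mathrm{Spe} &= \frac{TN}{TN + FP}, &
\mathrm{FPR} &= 1 - \mathrm{Spe} = \frac{FP}{TN + FP}.
\end{align*}
```

For a dataset with more than two classes, each class is treated in turn as the positive class and all other classes as negative.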

For binary classes of datasets A2, B, and C.

The proposed system classifies datasets A, B, and C into two or three classes, namely benign, malignant, and normal. The datasets were trained for 10, 30, and 50 epochs, and the best accuracy of the proposed model is 99.4% for dataset A, 96.7% for dataset B, 99.1% for dataset C, and 100% for dataset A2, at 50 epochs and a learning rate of 0.0008. The proposed model improves on earlier work: on dataset A, the accuracy in ( 10 ) was 94%; on dataset B, the accuracy in ( 11 ) was 80% to 85%; and there is no previous work on dataset C. The proposed model also achieved 100% on dataset A with two classes (dataset A2).

The algorithm is trained with multiple parameter settings. Transfer learning-based parameters are used to train the model and obtain the required output. To attain the optimal accuracy and loss rate, this study trained the model multiple times, for 10, 30, and 50 epochs. Table 4 shows the class-wise results at 50 epochs: for dataset A, the accuracy for benign, malignant, and normal is 98.9, 100, and 100%, with miss rates of 1.1, 0.0, and 0.0%, respectively; for dataset B, the accuracy for benign and malignant is 96.0 and 97.0%, with miss rates of 4.0 and 3.0%; for dataset C, it is 99.1 and 99.1%, with miss rates of 0.9 and 0.9%; and for dataset A2, it is 100 and 100%, with miss rates of 0.0 and 0.0%.

Table 4 . Training model on 50 epochs class-wise.

Table 5 shows the accuracy and miss rate of dataset A at 10, 30, and 50 epochs: accuracy is 70.5% (miss rate 29.5%) at 10 epochs, 96.8% (miss rate 3.2%) at 30 epochs, and 96.8% (miss rate 3.2%) at 50 epochs. Table 6 shows the same measures for dataset B: accuracy is 77.5% (miss rate 22.5%) at 10 epochs, 95.6% (miss rate 4.4%) at 30 epochs, and 96.7% (miss rate 3.3%) at 50 epochs. Table 7 shows them for dataset C: accuracy is 96.0% (miss rate 4.0%) at 10 epochs, 97.3% (miss rate 2.7%) at 30 epochs, and 99.1% (miss rate 0.9%) at 50 epochs. Table 8 shows them for dataset A2: accuracy is 89.1% (miss rate 10.9%) at 10 epochs, 96.1% (miss rate 3.9%) at 30 epochs, and 100% (miss rate 0.0%) at 50 epochs.

Table 5 . Training model on dataset A.

Table 6 . Training model on dataset B.

Table 7 . Training model on dataset C.

Table 8 . Training model on dataset A2.

Figure 5 shows the proposed system's labeled BC images for the dataset A classes: benign, malignant, and normal. Figures 6, 7, and 8 show the labeled images for the benign and malignant classes of datasets B, C, and A2, respectively.

Figure 5 . Image classification of dataset A.

Figure 6 . Image classification of dataset B.

Figure 7 . Image classification of dataset C.

Figure 8 . Image classification of dataset A2.

Table 9 shows the confusion matrix for breast cancer classification on dataset A at 50 epochs. Dataset A contains 780 images in total, of which 624 were used for training and 156 for testing. Of 86 benign images, 86 were classified as benign; of 42 malignant images, 42 were classified as malignant; and of 28 normal images, 27 were classified as normal and 1 as benign.

Table 9 . Confusion matrix of dataset “A” (testing).

Table 10 shows the confusion matrix for breast cancer classification on dataset B at 50 epochs. Dataset B contains 7,783 images in total, of which 6,226 were used for training and 1,557 for testing. Of 508 benign images, 476 were classified as benign and 32 as malignant; of 1,049 malignant images, 1,029 were classified as malignant and 20 as benign.

Table 10 . Confusion matrix of dataset B (testing).

Table 11 shows the confusion matrix for breast cancer classification on dataset C at 50 epochs. Dataset C contains 1,126 images in total, of which 901 were used for training and 225 for testing. Of 109 benign images, 108 were classified as benign and 1 as malignant; of 116 malignant images, 115 were classified as malignant and 1 as benign.

Table 11 . Confusion matrix of dataset C (testing).

Table 12 shows the confusion matrix for breast cancer classification on dataset A2 at 50 epochs. Dataset A2 contains 647 images in total, of which 518 were used for training and 129 for testing. Of 87 benign images, 87 were classified as benign and 0 as malignant; of 42 malignant images, 42 were classified as malignant and 0 as benign.

Table 12 . Confusion matrix of dataset A2 (testing).
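The overall accuracy and miss rate of the three binary datasets can be recomputed from the confusion-matrix counts quoted above. The sketch below is a plain-Python illustration (the study itself used MATLAB); the dictionary layout and function name are assumptions, while the counts are taken from the text.

```python
# Test-set confusion matrices quoted in the text,
# stored as {true_class: {predicted_class: count}}.
confusions = {
    "B": {"benign": {"benign": 476, "malignant": 32},
          "malignant": {"benign": 20, "malignant": 1029}},
    "C": {"benign": {"benign": 108, "malignant": 1},
          "malignant": {"benign": 1, "malignant": 115}},
    "A2": {"benign": {"benign": 87, "malignant": 0},
           "malignant": {"benign": 0, "malignant": 42}},
}

def accuracy_and_miss_rate(matrix):
    """Return (accuracy, miss rate) as fractions of all test images."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[cls][cls] for cls in matrix)  # diagonal entries
    acc = correct / total
    return acc, 1.0 - acc

results = {}
for name, matrix in confusions.items():
    acc, miss = accuracy_and_miss_rate(matrix)
    results[name] = (round(100 * acc, 2), round(100 * miss, 2))
    print(f"dataset {name}: accuracy {results[name][0]}%, miss rate {results[name][1]}%")
```

The recomputed values, 96.66 and 3.34% for dataset B, 99.11 and 0.89% for dataset C, and 100 and 0% for dataset A2, agree with the accuracy and miss rate figures reported in the text.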

Figure 9 shows the training accuracy plot, drawn over iterations and epochs, for 50 epochs on dataset A. The accuracy was initially modest but gradually improved as the number of epochs increased. The proposed system is trained at a learning rate of 0.0008 with six iterations per epoch. The chart shows the training accuracy as a percentage, from epoch 1 to epoch 50.

Figure 9 . Training of dataset A at 50 epochs.

Figure 10 shows the training accuracy plot, drawn over iterations and epochs, for 50 epochs on dataset B. The accuracy was initially modest but gradually improved as the number of epochs increased. The proposed system is trained at a learning rate of 0.0008 with 60 iterations per epoch. The chart shows the training accuracy as a percentage, from epoch 1 to epoch 50.

Figure 10 . Training of dataset B at 50 epochs.

Figure 11 shows the training accuracy plot, drawn over iterations and epochs, for 50 epochs on dataset C. The accuracy was initially modest but gradually improved as the number of epochs increased. The proposed system is trained at a learning rate of 0.0008 with eight iterations per epoch. The chart shows the training accuracy as a percentage, from epoch 1 to epoch 50.

Figure 11 . Training of dataset C at 50 epochs.

Figure 12 shows the training accuracy plot, drawn over iterations and epochs, for 50 epochs on dataset A2. The accuracy was initially modest but gradually improved as the number of epochs increased. The proposed system is trained at a learning rate of 0.0008 with five iterations per epoch. The chart shows the training accuracy as a percentage, from epoch 1 to epoch 50.

Figure 12 . Training of dataset A2 at 50 epochs.

The proposed model gives precise results, as shown in Table 13 : the TP counts are 86 of 87 for class benign, 42 of 42 for class malignant, and 26 of 27 for class normal. Table 14 shows that, on dataset A, the model achieves 99.35% accuracy and 1.149% FNR for class benign, 100% accuracy and 0.0% FNR for class malignant, and 99.35% accuracy and 0.0% FNR for class normal. Table 15 shows that the proposed model gives 96.66% accuracy and 4.03% FNR on dataset B, 99.11% accuracy and 0.9174% FNR on dataset C, and 100% accuracy and 0.0% FNR on dataset A2.

Table 13 . TP, FP, FN, and TN of dataset A.

Table 14 . Statistical measures of dataset A.

Table 15 . Statistical measures of datasets B, C, and A2.
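For a three-class dataset, the per-class TP, FP, FN, and TN counts are obtained by treating each class in turn as positive and the other two as negative. The sketch below shows that one-vs-rest bookkeeping. The 3 × 3 matrix is an assumption, not a copy of the article's table: it places a single benign test image in the normal column, a choice made because it is consistent with the per-class accuracy and FNR figures quoted for dataset A.

```python
CLASSES = ["benign", "malignant", "normal"]

# Rows are true classes, columns are predicted classes.
# ASSUMED matrix: one benign image misread as normal (see lead-in).
matrix = [
    [86, 0, 1],   # true benign
    [0, 42, 0],   # true malignant
    [0, 0, 27],   # true normal
]

def one_vs_rest(matrix, k):
    """Return (TP, FP, FN, TN) with class k treated as the positive class."""
    total = sum(map(sum, matrix))
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # class k images predicted as another class
    fp = sum(row[k] for row in matrix) - tp   # other images predicted as class k
    tn = total - tp - fn - fp
    return tp, fp, fn, tn

stats = {}
for k, name in enumerate(CLASSES):
    tp, fp, fn, tn = one_vs_rest(matrix, k)
    acc = (tp + tn) / (tp + tn + fp + fn)
    fnr = fn / (fn + tp)
    stats[name] = (tp, fp, fn, tn, acc, fnr)
    print(f"{name}: TP={tp} FP={fp} FN={fn} TN={tn} "
          f"accuracy={100 * acc:.2f}% FNR={100 * fnr:.3f}%")
```

Under this assumed matrix the benign class has an FNR of 1/87, about 1.149%, and both the benign and normal classes have an accuracy of 155/156, about 99.36%, which the text quotes as 99.35%.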

The statistical measures of dataset A are shown graphically in Figure 13 , and those of datasets B, C, and A2 in Figures 14 , 15 . Multiple methods for detecting BC have been used in the past. The proposed methodology attained good accuracy in identifying the disease for a given class. As a result, early diagnosis can assist medical experts in providing treatment to prevent the spread of breast cancer. Table 15 compares the performance of the proposed system model with the literature in terms of accuracy and miss rate. The proposed model obtained an accuracy and miss rate of 99.4 and 0.6% on dataset A, 96.66 and 3.34% on dataset B, 99.11 and 0.89% on dataset C, and 100 and 0% on dataset A2. These results show that the proposed model is more accurate than previous models: AlexNet, VGG 16, Inception, ResNet, and NASNet ( 10 ) on datasets BUSI and B; SVM ( 11 ); AlexNet ( 12 ) on datasets A and B; and Inception V3, ResNet50, VGG 16, VGG 19, and Inception V2 ResNet ( 14 ).

Figure 13 . Statistical measures of dataset A.

Figure 14 . Statistical measures of datasets B, C, and A2.

Figure 15 . Accuracy and miss rate in contrast with previous studies.

The early detection and classification of breast cancer help to prevent the disease's spread. This work examined the use of transfer learning with AlexNet for breast cancer classification and detection. Deep learning and transfer learning approaches can be adapted to the specific properties of any dataset. The proposed model applied the customized AlexNet to three datasets, A, B, and C, and to A2, which is dataset A with two classes. Empowered with transfer learning, the proposed model achieved its best results with the customized AlexNet: a maximum accuracy of 99.4% on dataset A, 96.70% on dataset B, 99.10% on dataset C, and 100% on dataset A2. In future work, we will apply fusion to these datasets for optimum results. We will also apply other CNN algorithms and our machine learning model to these datasets.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

Author Contributions

SA, A-u-R, and MFK collected data from different resources. SA, MZ, and MAK performed formal analysis and simulation. SA, MFK, KA, and MZ contributed to writing (original draft preparation). A-u-R, MZ, and MAK contributed to writing (review and editing). MAK and AM performed supervision. SA, KA, and MFK drafted pictures and tables. MZ and AM performed revisions and improved the quality of the draft. All authors have read and agreed to the published version of the manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

1. World Health Organization Cancer. (2022). Available online at: https://www.who.int/news-room/fact-sheets/detail/cancer (accessed September 1, 2021).

2. Saeed S, Jhanjhi NZ, Naqvi M, Humyun M, Ahmad M. Optimized breast cancer pre-mature detection method with computational segmentation: a systematic review mapping. Approach Appl Deep Learn Virt Med Care . (2022) 5:24–51. doi: 10.4018/978-1-7998-8929-8.ch002

3. Zhang X, Zhang Y, Zhang Q, Ren Y, Qiu T. Extracting comprehensive clinical information for breast cancer using deep learning methods. Int J Med Inform . (2019) 132:103985. doi: 10.1016/j.ijmedinf.2019.103985

4. Swain M, Kisan S, Chatterjee JM, Supramaniam M, Mohanty SN, Jhanjhi NZ, et al. Hybridized machine learning based fractal analysis techniques for breast cancer classification. Int J Adv Comput Sci Appl . (2020) 11:179–84. doi: 10.14569/IJACSA.2020.0111024

5. Nanglia S, Ahmad M, Khan FA, Jhanjhi NZ. An enhanced predictive heterogeneous ensemble model for breast cancer prediction. Biomed Signal Process Control . (2022) 72:103279. doi: 10.1016/j.bspc.2021.103279

6. Huynh B, Drukker K, Giger M. Computer-aided diagnosis of breast ultrasound images using transfer learning from deep convolutional neural networks. Med Phys . (2016) 43:3705. doi: 10.1118/1.4957255

7. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst . (2012) 25:1097–105. doi: 10.1145/3065386

8. Scarpazza C, Baecker L, Vieira S, Mechelli A. Applications of machine learning to brain disorders. Mach Learn . (2020) 3:45–65. doi: 10.1016/B978-0-12-815739-8.00003-1

9. Deepak S, Ameer PM. Automated categorization of brain tumor from MRI using CNN features and SVM. J Ambient Intell Humaniz Comput . (2021) 12:8357–69. doi: 10.1007/s12652-020-02568-w

10. Dhabyani WA, Gomaa M, Khaled H, Aly F. Deep learning approaches for data augmentation and classification of breast masses using ultrasound images. Int J Adv Comput Sci Appl . (2019) 10:1–11. doi: 10.14569/IJACSA.2019.0100579

11. Spanhol FA, Oliveira LS, Petitjean C, Heutte L. A dataset for breast cancer histopathological image classification. IEEE Trans Biomed Eng . (2015) 63:1455–62. doi: 10.1109/TBME.2015.2496264

12. Yap MH, Pons G, Martí J, Ganau S, Sentís M. Automated breast ultrasound lesions detection using convolutional neural networks. IEEE J Biomed Health Inform . (2017) 22:1218–26. doi: 10.1109/JBHI.2017.2731873

13. Ghazal TM, Abbas S, Munir S, Khan MA, Ahmad M. Alzheimer's disease detection empowered with transfer learning. Comput Mater Contin . (2021) 70:5005–19. doi: 10.32604/cmc.2022.020866

14. Saber A, Sakr M, Seida OMA, Keshk A, Chen H. A novel deep-learning model for automatic detection and classification of breast cancer using the transfer-learning technique. IEEE Access . (2021) 9:71194–209. doi: 10.1109/ACCESS.2021.3079204

15. Azevedo V, Silva C, Dutra I. Quantum transfer learning for breast cancer detection. Quantum Mach Intell . (2022) 4:1–14. doi: 10.1007/s42484-022-00062-4

16. Dewangan KK, Dewangan DK, Sahu SP, Janghel R. Breast cancer diagnosis in an early stage using novel deep learning with hybrid optimization technique. Multimed Tools App . (2022) 81:13935–60. doi: 10.1007/s11042-022-12385-2

17. Rasool A, Bunterngchit C, Tiejian L, Islam MR, Qu Q, Jiang Q. Improved machine learning-based predictive models for breast cancer Diagnosis. Int J Environ Res Public Health . (2022) 19:3211. doi: 10.3390/ijerph19063211

18. Lin RH, Kujabi BK, Chuang CL, Lin CS, Chiu CJ. Application of deep learning to construct breast cancer diagnosis model. Appl Sci . (2022) 12:1957. doi: 10.3390/app12041957

19. Alruwaili M, Gouda W. Automated breast cancer detection models based on transfer learning. Sensors . (2022) 22:876. doi: 10.3390/s22030876

20. Alshammari MM, Almuhanna A, Alhiyafi J. Mammography image-based diagnosis of breast cancer using machine learning: a pilot study. Sensors . (2021) 22:203. doi: 10.3390/s22010203

21. Castro-Tapia S, Castaneda-Miranda CL, Olvera-Olvera CA, Guerrero-Osuna HA, Ortiz-Rodriguez JM, Martinez Blanco MR, et al. Classification of breast cancer in mammograms with deep learning adding a fifth class. Appl Sci . (2021) 11:11398. doi: 10.3390/app112311398

22. Mashudi NA, Rossli SA, Ahmad N, Noor NM. Breast cancer classification: features investigation using machine learning approaches. Int J Integr Eng . (2021) 13:107–18. doi: 10.30880/ijie.2021.13.05.012

23. Islam MA, Tripura D, Dutta M, Shuvo MNR, Fahim WA, Sarkar PR, et al. Forecast breast cancer cells from microscopic biopsy images using big transfer (BiT): a deep learning approach. Int J Adv Comput Sci Appl . (2021) 12:478–86. doi: 10.14569/IJACSA.2021.0121054

24. Ahmed U, Issa GF, Khan MA, Aftab S, Khan MF, Said RAT, et al. Prediction of diabetes empowered with fused machine learning. IEEE Access . (2022) 10:8529–38. doi: 10.1109/ACCESS.2022.3142097

25. Siddiqui SY, Haider A, Ghazal TM, Khan MA, Naseer I, Abbas S, et al. IoMT cloud-based intelligent prediction of breast cancer stages empowered with deep learning. IEEE Access . (2021) 9:146478–91. doi: 10.1109/ACCESS.2021.3123472

26. Kaggle. (2021) Available online at: https://www.kaggle.com/mostafaeltalawy/brest-cancer (accessed September 1, 2021).

27. Kaggle. (2021) Available online at: https://www.kaggle.com/anaselmasry/breast-cancer-dataset (accessed September 1, 2021).

28. Kaggle. (2021) Available online at: https://www.kaggle.com/aryashah2k/breast-ultrasound-images-dataset (accessed September 1, 2021).

29. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw . (2015) 61:85–117. doi: 10.1016/j.neunet.2014.09.003

30. Arooj S, Khan MF, Khan MA, Khan MS, Taleb N. Machine Learning Models for the Classification of Skin Cancer. In: International Conference on Business Analytics for Technology and Security (ICBATS) (IEEE) (2022). p. 1–8. doi: 10.1109/ICBATS54253.2022.9759054

31. Alom MZ. The history began from alexNet: a comprehensive survey on deep learning approaches. Arxiv . (2018) 18:1–12. doi: 10.48550/arXiv.1803.01164

Keywords: breast cancer (BC), deep learning (DL), learning rate (LR), machine learning (ML), transfer learning (TL), convolutional neural network (CNN)

Citation: Arooj S, Atta-ur-Rahman, Zubair M, Khan MF, Alissa K, Khan MA and Mosavi A (2022) Breast Cancer Detection and Classification Empowered With Transfer Learning. Front. Public Health 10:924432. doi: 10.3389/fpubh.2022.924432

Received: 20 April 2022; Accepted: 31 May 2022; Published: 04 July 2022.

Copyright © 2022 Arooj, Atta-ur-Rahman, Zubair, Khan, Alissa, Khan and Mosavi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Muhammad Adnan Khan, adnan@gachon.ac.kr

  • Diagnostics (Basel)
  • PMC10341268

Breast Cancer Classification through Meta-Learning Ensemble Technique Using Convolution Neural Networks

Muhammad Danish Ali

1 Department of Computer Science, COMSATS University Islamabad, Abbottabad Campus, Abbottabad 22060, Pakistan; muhammaddanishali689@gmail.com (M.D.A.); adnan78560@gmail.com (A.S.); hubaibelahi@gmail.com (H.E.); mateen@cuiatd.edu.pk (M.M.Y.)

Adnan Saleem

Hubaib Elahi, Muhammad Amir Khan

2 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam 40450, Malaysia

Muhammad Ijaz Khan

3 Institute of Computing and Information Technology, Gomal University, Dera Ismail Khan 29220, Pakistan; ijaz171@gmail.com

Muhammad Mateen Yaqoob

Umar Farooq Khattak

4 School of Information Technology, UNITAR International University, Kelana Jaya, Petaling Jaya 47301, Malaysia

Amal Al-Rasheed

5 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh 11671, Saudi Arabia; aaalrasheed@pnu.edu.sa

Associated Data

We ran simulations to see how well the proposed approach performed. Any questions concerning the study in this publication are welcome and can be directed to the lead author (Muhammad Danish Ali) upon request.

This study aims to develop an efficient and accurate breast cancer classification model using meta-learning approaches and multiple convolutional neural networks (CNNs). The Breast Ultrasound Images (BUSI) dataset contains various types of breast lesions. The goal is to classify these lesions as benign or malignant, which is crucial for the early detection and treatment of breast cancer. Traditional machine learning and deep learning approaches often fail to classify these images accurately because of their complex and diverse nature. To address this problem, the proposed model uses several advanced techniques: a meta-learning ensemble technique, transfer learning, and data augmentation. Meta-learning optimizes the model’s learning process, allowing it to adapt quickly to new and unseen datasets. Transfer learning leverages pre-trained models such as Inception, ResNet50, and DenseNet121 to enhance the model’s feature extraction ability. Data augmentation artificially generates new training images, increasing the size and diversity of the dataset. Meta-ensemble learning combines the outputs of multiple CNNs, improving the model’s classification accuracy. The BUSI dataset is first pre-processed; multiple CNNs with different architectures and pre-trained models are then trained and evaluated. Next, a meta-learning algorithm is applied to optimize the learning process, and ensemble learning is used to combine the outputs of the CNNs. The evaluation results indicate that the model is highly effective, with high accuracy. Finally, the proposed model’s performance is compared with state-of-the-art approaches in terms of accuracy, precision, recall, and F1 score.

1. Introduction

Breast cancer affects women worldwide, with a high incidence rate and significant morbidity and mortality. Tumors are of two main types: benign and malignant. Benign tumors are less dangerous because the condition stays within the affected area and does not spread to other parts of the body. A malignant tumor, also called a cancerous tumor, spreads to other parts of the body and damages them, which makes it extremely dangerous. Malignant tumors, often known as cancer, arise when body cells grow abnormally and destroy the body’s internal tissues. If a malignant tumor is not treated early, it can soon become incurable and cause death. According to the World Health Organization’s (WHO) estimates from 2018, malignant tumors are a leading cause of mortality [ 1 ]. When someone is told they have a tumor, it is widely assumed that the patient’s life is at risk. This is not the case unless the tumor is determined to be malignant. In the present research, we examine the accuracy of tumor classification using meta-learning techniques. If a benign tumor is discovered, the concern is largely unnecessary because it is not extremely dangerous. Simple modifications to diet and lifestyle can help manage it. A benign tumor can usually be removed by simple surgery because the immune system surrounds it with a protective sac that isolates it from the rest of the body. Malignant tumors can sometimes develop from benign tumors; however, this seldom happens. For this reason, medical professionals encourage their patients to get regular exams. Malignant tumors multiply uncontrollably and spread quickly to other regions of the body; thus, if a medical professional detects one in a patient, prompt care is needed. Classifying a tumor according to its nature is a difficult task.
Initial features of a tumor, such as its elasticity, shape, and associated discomfort, may aid in its diagnosis and categorization. The negative effects of a tumor can be minimized with early identification. There are several types of cancer, including breast cancer, brain cancer, marrow cancer, and blood cancer. We use deep learning to classify breast cancer. Breast cancer is one of the most rapidly increasing illnesses in women. Every year, a significant number of new breast cancer cases are recorded. According to the World Health Organization’s (WHO) study from January 2018, over 627,000 women worldwide died from breast cancer, accounting for 15% of all cancer-related deaths among women. The overall incidence of breast cancer is higher in developed countries. The most important step in preventing breast cancer-related deaths is the early detection and categorization of the disease. Early-stage breast cancer can be assessed through screening, which detects cancer when only minor breast symptoms have developed. Techniques for breast screening include mammography and clinical breast examination. In mammography, medical professionals use low-intensity X-rays to look for abnormalities in the breasts. Ultrasound and MRI scans are two further imaging methods used to look for signs of breast cancer [ 2 ].

The WHO recommends mammography screening for breast cancer in developed countries where individuals have access to resources and are concerned about their health. Women between 50 and 69 who are well-educated and resourced should periodically get a routine exam. Mammography is less effective and less economical in underdeveloped nations with few resources and poor health conditions. Raising awareness in these settings can help with the early diagnosis of breast tumors [ 3 ]. Invasive ductal carcinoma, ductal carcinoma in situ, and invasive lobular carcinoma are the three most prevalent kinds of breast cancer, as shown in Figure 1 .

Figure 1. Most common types of breast cancer.

The most frequent type of breast cancer is invasive ductal carcinoma (IDC), which originates in the milk ducts of the breast and accounts for around 80% of all cases of breast cancer. The milk ducts are the “pipes” that convey milk from the milk-producing lobules of the breast to the nipple. Unlike ductal cancer that remains inside the milk ducts, invasive cancer has “invaded”, or spread to, the surrounding tissues of the breast. A carcinoma is any cancer that develops from the skin or from the tissues that cover internal organs, such as breast tissue. Invasive ductal carcinoma therefore refers to cancer that has broken through the wall of the milk duct and started to invade the surrounding breast tissue. Over time, IDC can spread to the lymph nodes and to other areas of the body [ 4 ].

According to a report by the American Cancer Society, invasive breast cancer affects approximately 12 percent of women in the USA, and the majority of these cases are classified as IDC. IDC can occur at any stage of life; however, it is more prevalent among older women. Research from the American Cancer Society states that almost two-thirds of invasive breast cancers are found in women aged 55 or older. It is therefore vital to determine the type of a tumor as early as feasible. A biopsy of the target organ is used to confirm the presence of tumors. There are several ways to determine the kind of tissue being examined. In this paper, we use different deep learning approaches to classify the types of tumors and compare their performance [ 5 ]. The results of these approaches help in early tumor classification, reducing the number of cases and improving outcomes for those affected by the disease. The main contributions of this paper are as follows.

  • Development of an optimized breast cancer detection and early diagnosis model using a meta-learning algorithm integrated with a deep learning technique;
  • Use of meta-learning algorithms, which are designed to excel at few-shot learning, where the goal is to learn a new task from only a small amount of training data, whereas traditional machine learning algorithms typically require much more data to learn a new task effectively;
  • Proposal of a strong ensemble classifier with a meta-learning algorithm for the accurate identification of different metastases in breast tumors;
  • Demonstration of the experimental outcomes on the BUSI dataset.

The paper is structured as follows: Sections 1 and 2 provide an overview of the background and importance of breast cancer detection. Section 3 presents the proposed meta-learning algorithm for breast cancer detection. Section 4 reports the results of the proposed methodology, and Section 5 concludes the article.

2. Related Work

Breast cancer is a leading cause of cancer-related deaths among women worldwide, making early detection crucial for improving patient outcomes. Deep learning models have been used to detect and classify breast cancer in mammography and histopathological images. Several studies have shown promising results in this area, including work using attention-based models, convolutional neural networks with small SE blocks, and multi-task learning. Additionally, deep residual networks have been shown to be effective in accurately classifying breast cancer.

Ref. [ 5 ] used a large dataset of breast cancer images in their study and achieved high accuracy in classifying benign and malignant cases. Another study [ 6 ] demonstrated promising results in accurately segmenting breast ultrasound images, an important step in diagnosing breast cancer.

In [ 7 ], a CNN model was effective in classifying malignant and benign breast cancer cases. The authors of [ 8 ] propose a hidden Markov model (HMM)-based approach for estimating pedestrian walking direction in smart cities and compare the performance of various HMM-based models across datasets. This method outperformed traditional machine learning methods in accurately estimating pedestrian walking direction. The study also identified the critical features contributing to pedestrian walking direction estimation. The proposed method could be implemented in real-time pedestrian monitoring systems in smart cities, potentially improving pedestrian safety in urban areas significantly.

In [ 3 , 9 ], feature selection and extraction techniques were combined with DL models to improve prediction accuracy. This strategy outperforms traditional machine learning methods in prediction accuracy. The study also identified the essential characteristics strongly linked to the clinical outcome of breast cancer, and it lays the groundwork for future research into accurate and reliable prediction models for breast cancer clinical outcomes. They discussed the advantages of deep learning in detecting and classifying cancer through mammography, including the potential to achieve higher accuracy than traditional methods. They also highlighted the importance of larger datasets for training deep learning models and the use of transfer learning and data augmentation techniques to improve model performance. In addition, they discussed the use of deep learning in breast histology for analyzing tissue structures and identifying patterns associated with breast cancer [ 10 , 11 ]. The model combines an attention mechanism with a CNN to focus on the most informative portions of the image for classification. The authors tested the suggested model on a publicly available dataset and found that it outperformed traditional ML approaches and other DL models. The model’s attention mechanism enables the detection of the most significant regions in mammography images, potentially reducing false positives and unnecessary biopsies [ 12 ]. Through automated image processing, the researchers hoped to improve the accuracy of breast cancer diagnosis. They separated a publicly available dataset of breast histopathology images into training and testing sets. Next, they trained a CNN with small SE blocks and tested its effectiveness in differentiating malignant from non-cancerous images. The suggested model outperformed numerous state-of-the-art deep learning models in terms of AUC [ 13 ].
They show that their multi-task learning strategy outperforms single-task learning approaches in classification accuracy, and they speculate that this approach could be helpful in other medical image classification tasks. The paper discusses the potential of deep learning models for enhancing the accuracy and efficiency of breast cancer diagnosis and the need to adopt multi-task learning approaches for best performance [ 14 ]. Their research presents a breast cancer classification method based on deep residual networks. They emphasize the importance of accurately identifying breast cancer from mammography images for early diagnosis and treatment. They extract features and classify images using a deep learning approach, specifically residual networks. The suggested system is trained and assessed on a large dataset of mammography images. The findings show that the proposed strategy is successful and outperforms existing methods for breast cancer classification. The work adds to current attempts to improve breast cancer diagnosis. Ref. [ 15 ] offers a deep learning strategy for classifying breast cancer using a multiple-model scheme. To boost classification accuracy, the authors used transfer learning to adapt pre-trained models to the breast cancer dataset and merged several models. According to the findings, the suggested method outperforms existing deep learning models for breast cancer classification. The work shows that deep learning techniques can be used to increase the accuracy of breast cancer diagnosis in clinical practice. Ref. [ 16 ] proposes a method that uses a meta-learning framework to learn the optimal weight initialization for different CNN models. The CNN models are pre-trained on other datasets to capture various features. The weight initialization learned by meta-learning is then used to fine-tune the CNNs on a breast cancer histopathology image dataset.
The experiment demonstrates the effectiveness of the proposed multi-model scheme with meta-learning for breast cancer classification. Breast cancer is a severe health concern, and research efforts focus on improving diagnosis and treatment. ML and DL techniques have shown great potential in accurately diagnosing breast cancer [ 1 , 17 ]. The proposed method involves training multiple CNN models with different hyperparameters and architectures and then using an ensemble learning technique to combine the outputs of these models for improved accuracy. They achieved a high classification accuracy of 97.4%. The results suggest that the proposed method has great potential for improving breast cancer diagnosis, which can ultimately lead to better patient outcomes. The method proposed in ref. [ 18 ] leverages meta-learning, which aims to learn how to learn, to improve the model’s ability to adapt to new tasks. The authors used multiple pre-trained models as base learners and a feature fusion technique to combine the features extracted from the different models. The fused features are then used to train a meta-learner to improve the classification accuracy. The experimental findings show that the suggested method beats state-of-the-art breast cancer classification methods in accuracy and stability. The work offers a promising approach for enhancing breast cancer classification performance by combining meta-learning and feature fusion techniques. The work of ref. [ 19 ] entails training and combining numerous deep learning models to obtain more accurate results. The authors evaluated their approach on a dataset of breast cancer histopathology images, with promising results. This method supports automated diagnostic systems while reducing human error in breast cancer diagnosis. The process proposed in ref. [ 20 ] entails training multiple convolutional neural network models on different subsets of the data and then combining their predictions with an ensemble learning strategy.
They also used data augmentation techniques to improve generalization performance. The findings suggest that combining deep learning and ensemble learning can significantly improve breast cancer classification accuracy, which has important implications for clinical decision making and treatment planning. Ref. [ 21 ] compared the multi-classification performance of conventional machine learning and deep learning approaches on breast cancer histopathology images. According to their findings, deep learning models outperformed traditional machine learning models in accuracy. Ref. [ 22 ] suggested a dual-branch CNN for breast cancer image classification in 2020. Their model extracts global and local image features via the network’s two branches. To learn the high-level features of the images, the global branch used a pre-trained ResNet50 network, whereas the local branch used a multi-scale feature extraction module. The model obtained high classification accuracy on a breast cancer histopathology image dataset, demonstrating the dual-branch CNN’s usefulness for breast cancer classification.

Refs. [ 22 , 23 ] examine the difficulties in mammography image interpretation and DL approaches to improve the accuracy and efficiency of breast cancer diagnosis. Various deep learning models and architectures are used for mammography analysis. The review also discusses the various datasets used in the studies and the measures used to assess model performance. It continues with a discussion of the limitations and future directions of deep learning for mammography analysis, such as the need for larger and more diverse datasets and the importance of model interpretability and explainability. This review is valuable for researchers and practitioners using deep learning approaches to diagnose breast cancer. The method in [ 24 ] is based on a bag-of-features convolutional neural network ensemble (BOF-CNNs). It extracts information from mammograms and classifies them as malignant or benign using a combination of deep learning and standard image processing techniques. The BOF-CNNs method extracts features from various patches of an input image, which are then aggregated using the bag-of-features method. The BOF-CNN ensemble is trained to classify these features and deliver a final diagnosis. The DDSM dataset includes both benign and malignant mammograms. The results reveal that the ensemble of BOF-CNNs outperforms other state-of-the-art approaches in terms of accuracy, sensitivity, and specificity in breast cancer classification. The proposed system is robust and adaptable to different types of mammography, making it a potential tool for automated breast cancer detection. Ref. [ 25 ] offers a breast cancer classification strategy combining deep learning and multiple kernel learning. The proposed method extracts features from histopathology images using several convolutional neural networks (CNNs) and then integrates these features using multiple kernel learning to improve classification performance.
The findings demonstrated that the suggested strategy outperformed existing methods. The research explores the possibilities of merging DL and multiple kernel learning for BC classification, emphasizing the significance of establishing robust and reliable approaches. Ref. [ 26 ] used deep learning, feature selection, and extraction approaches to predict the clinical prognosis of breast cancer. According to their findings, integrating these methodologies can increase the accuracy of breast cancer outcome prediction. Refs. [ 4 , 27 ] proposed a method that incorporates multiple medical images. Their method used a deep CNN and transfer learning for accurate classification. Ref. [ 28 ] used multi-level CNNs and ensemble learning to develop a breast cancer classification method. Their approach improves classification accuracy by combining multiple CNN models. Ref. [ 29 ] developed a modified CNN model for breast cancer classification, incorporating data augmentation and ensemble learning to enhance classification accuracy. Ref. [ 30 ] used deep learning and rule-based feature selection for breast cancer classification. Their approach used a deep CNN for feature extraction and a rule-based feature selection method for classification. Ref. [ 31 ] used transfer learning and adversarial training to classify mammogram images. They used a deep CNN model for classification. Ref. [ 32 ] reviewed various DL models and discussed their accuracy. Ref. [ 33 ] reviewed and compared deep learning-based breast cancer classification methods. They discussed different models and their accuracy for classification. Ref. [ 34 ] developed a multi-modal breast cancer classification method that combines mammography and ultrasound images. Their approach utilizes an ensemble deep learning model to improve classification accuracy. Ref. [ 35 ] proposed a method based on a CNN and adaptive feature fusion.
They achieved high accuracy using a weighted feature fusion strategy that combines the features extracted from different CNNs. Ref. [ 36 ] suggested a semi-supervised multi-view CNN-based technique for classifying breast cancer. The technique obtained high accuracy by including labeled and unlabeled samples in the training phase. Ref. [ 37 ] created a DL framework for BC classification that incorporated mammography and ultrasound images. The approach obtained high accuracy by leveraging features collected from multiple levels of a CNN. Ref. [ 38 ] proposed deep learning and feature selection-based breast cancer classification techniques. They achieved high accuracy using a genetic algorithm to choose the most important features from mammography images. Ref. [ 39 ] surveyed mammography image classification methods, describing existing state-of-the-art approaches and indicating new research avenues. Ref. [ 40 ] created a DL technique for BC classification that merged mammography and ultrasound images, obtaining good accuracy by combining the information derived from the different modalities. Ref. [ 41 ] introduced a breast cancer classification approach that combines deep learning and feature selection, obtaining high accuracy by identifying the most discriminative features in mammography images. Ref. [ 42 ] conducted another study on BC using the Xception model, a DL architecture. The study found that the Xception model accurately classified breast cancer histopathology images.

This transfer learning approach performs effectively, even with small datasets. In the proposed study, the base learners are combined using the meta-learning technique.

The current literature review primarily highlights two issues: first, the limitation posed by imbalanced datasets, and second, the recurring issue of overfitting observed in numerous studies. Our research aims to provide solutions to these prevalent problems.

3. Material and Methods

Early detection and diagnosis of breast cancer can boost survival rates dramatically. DL-based CAD systems have recently shown promising results in the automatic classification of BC. This work proposes a meta-learning strategy for breast cancer classification using multiple convolutional neural networks (CNNs). The Breast Ultrasound Images (BUSI) dataset was used to assess the performance of the proposed technique. The BUSI dataset contains many benign and malignant breast ultrasound images with varying characteristics, which makes them difficult to classify. The proposed method improves breast cancer classification performance and gives a more reliable and accurate diagnosis.

3.1. Proposed Methodology

The proposed classification approach for breast cancer is detailed in this section. Figure 2 gives a brief overview of the proposed method. As an initial phase, the breast cancer images are pre-processed to improve their quality and adjust their size, using contrast enhancement and noise removal. Subsequently, deep features are extracted from the processed images using several base CNN models pre-trained on the ImageNet dataset. The prediction results of the base CNNs are then passed to a meta-learner for the final classification of the breast cancer images. The base classifiers and the meta-learner are trained on the BUSI dataset, which consists of benign and malignant breast cancer images. The trained meta-model is then used to classify unseen breast cancer images as benign or malignant.

Figure 2. Proposed architecture for binary class breast cancer classification.
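The pipeline described in this section, in which base CNN predictions are passed to a meta-learner, can be illustrated with a small stacking sketch. This is not the authors' implementation: the meta-learner here is assumed to be a plain logistic regression, the probabilities in `base_probs` are made-up stand-ins for the outputs of three base CNNs, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta_learner(base_probs, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-learner by batch gradient descent.

    base_probs: one row per image; each row holds the malignant probability
                predicted by each base CNN (hypothetical values below).
    labels:     0 = benign, 1 = malignant.
    """
    n_models = len(base_probs[0])
    n = len(base_probs)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_probs, labels):
            p = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            err = p - y  # gradient of the log-loss w.r.t. the logit
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def meta_predict(weights, bias, row):
    """Return 1 (malignant) or 0 (benign) for one row of base predictions."""
    p = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
    return 1 if p >= 0.5 else 0

# Hypothetical outputs of three base CNNs for six training images.
base_probs = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.8],
    [0.2, 0.1, 0.3],
    [0.1, 0.3, 0.2],
    [0.3, 0.2, 0.1],
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_meta_learner(base_probs, labels)
predictions = [meta_predict(weights, bias, row) for row in base_probs]
print(predictions)
```

In practice the meta-learner would be fitted on predictions for held-out images rather than the images the base models were trained on, so that it does not simply learn to trust overfitted base outputs.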

3.2. Dataset

We used the Breast Ultrasound Images (BUSI) dataset, which contains two classes: benign and malignant. The two classes are highly imbalanced. An imbalanced dataset can significantly affect the performance of a classification model: when deep networks are trained on imbalanced data, classification bias appears. A data augmentation technique was therefore used to increase the size of both classes. The new dataset contains 10,000 breast cancer images, 5000 benign and 5000 malignant, and is depicted in Figure 3 below. The dataset also includes associated annotations for each image, including tumor location and size. The BUSI dataset has proved especially useful for designing and testing computer-aided diagnosis (CAD) systems, which use ML and DL algorithms to help radiologists interpret medical images. It is an excellent resource for BC detection and diagnosis researchers and healthcare providers. By training and assessing machine learning models on this dataset, researchers can design more accurate and efficient CAD systems. The images were classified into benign and malignant categories based on biopsy results. The dataset was split into training, validation, and test sets containing 70%, 10%, and 20% of the images, respectively.

Figure 3. Breast cancer images, benign and malignant.
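The 70%/10%/20% split of the balanced 10,000-image dataset can be sketched as follows. The text does not say how the split was drawn, so a per-class (stratified) random split is assumed here; the file names and the `stratified_split` helper are hypothetical.

```python
import random

def stratified_split(items_by_class, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split each class separately so every subset keeps the class balance."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = int(round(fractions[0] * len(shuffled)))
        n_val = int(round(fractions[1] * len(shuffled)))
        train += [(x, label) for x in shuffled[:n_train]]
        val += [(x, label) for x in shuffled[n_train:n_train + n_val]]
        test += [(x, label) for x in shuffled[n_train + n_val:]]
    return train, val, test

# Placeholder file names standing in for the augmented dataset (5000 per class).
dataset = {
    "benign": [f"benign_{i:04d}.png" for i in range(5000)],
    "malignant": [f"malignant_{i:04d}.png" for i in range(5000)],
}

train, val, test = stratified_split(dataset)
print(len(train), len(val), len(test))  # 7000 1000 2000
```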

3.3. Pre-Processing

This work focuses on breast cancer classification using meta-learning methodologies and multiple convolutional neural networks (CNNs). It seeks to improve CNN performance with a meta-learning strategy that trains multiple CNNs on the BUSI dataset and then combines their predictions for higher accuracy. They used the BreaKHis dataset, which contains images of breast cancer tumours. This publicly available dataset is divided into three subsets: training, validation, and testing. Seventy percent of the images are designated for training, ten percent for validation, and twenty percent for testing.

To implement the meta-learning approach, the work first trains several CNNs on distinct subsets of the training dataset. Each CNN is configured with its own set of hyperparameters, such as learning rate, batch size, and optimizer. The trained CNNs are then used to make predictions on the testing dataset, and these predictions are aggregated using a weighted average. The approach’s performance was evaluated using several criteria, including accuracy, sensitivity, and specificity. The results reveal that the proposed method outperforms numerous baseline models for breast cancer classification with reasonable accuracy. The approach improves CNN performance for breast cancer classification and can be extended to other medical imaging tasks that require precise and reliable predictions.
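The weighted-average aggregation mentioned above can be sketched in a few lines. The probabilities and weights below are invented for illustration; the text does not state how the weights were chosen (validation accuracy is one common choice, assumed here only as an example).

```python
def weighted_average(prob_lists, weights):
    """Combine per-model malignant probabilities into one score per image."""
    total = sum(weights)
    n_images = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_images)
    ]

# Hypothetical malignant probabilities from three CNNs for four test images.
model_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.4, 0.7],
    [0.7, 0.3, 0.5, 0.1],
]
# Hypothetical model weights; they need not sum to 1 because we normalize.
model_weights = [0.5, 0.3, 0.2]

ensemble_probs = weighted_average(model_probs, model_weights)
ensemble_labels = [1 if p >= 0.5 else 0 for p in ensemble_probs]
print(ensemble_labels)  # [1, 0, 1, 0]
```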

3.4. Classification Models and Fine Tuning

In this study, a meta-learning strategy was used, with various convolutional neural networks (CNNs) serving as the base models. These models were initially trained on the extensive ImageNet dataset, a large collection of general images. They were then fine-tuned on the breast ultrasound dataset. The following CNN architectures were used as base models.

3.4.1. Inception V3

Inception V3 is a deep CNN architecture that uses factorized convolutions to reduce the number of parameters. Inception is an image classification framework based on deep convolutional neural networks. It was first presented in 2014 and has since grown in popularity as a tool for image recognition. The Inception model used in this study was pre-trained on the ImageNet dataset, which contains millions of images and is widely used as a benchmark for image recognition. To fine-tune the model on the dataset, we added a dense layer with “relu” activation, a dropout layer, and a softmax layer with seven different outputs at the end of the architecture. A stochastic gradient-based optimizer (Adam) with a momentum of 0.9 and a learning rate of 0.0001 is used to fine-tune this design on 10,000 image samples for 30 epochs.

3.4.2. ResNet50

ResNet50 is a deep CNN design that uses residual connections to mitigate the vanishing gradient problem. It is a deep residual network architecture that was introduced in 2015. It incorporates skip connections that allow gradients to flow through the network more freely. The ResNet50 model used in this study was also pre-trained on the ImageNet dataset. To enhance performance, we modified ResNet50 by adding a dense layer with “relu” activation, a dropout layer, and a softmax layer with different outputs. The modified ResNet50 is then fine-tuned on 10,000 images for 30 epochs using an Adam optimizer with a momentum of 0.9 and a learning rate of 0.0001.

3.4.3. DenseNet121

DenseNet121 is a deep CNN design that uses dense connections to improve information flow between layers. It is a deep convolutional neural network architecture introduced in 2016. It is intended to improve gradient flow and reduce the number of network parameters by introducing dense blocks in which each layer can directly access the feature maps of all preceding layers. We incorporated DenseNet121 as a base model after some adjustments: adding a dense layer with “relu” activation, a dropout layer, and a softmax layer with different outputs. The modified architecture was then fine-tuned on 10,000 images for 30 epochs, using an Adam optimizer with a momentum of 0.9 and a learning rate of 0.0001.
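All three base models are fine-tuned with Adam at a learning rate of 0.0001 and a "momentum" of 0.9. As a rough illustration of what those two numbers control, the sketch below applies the standard Adam update to a toy one-parameter problem. Reading the stated momentum as Adam's first-moment decay rate `beta1` is an assumption, and `beta2`, `eps`, and the toy loss are common defaults or placeholders that do not come from the text.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    beta1 = 0.9 is assumed to correspond to the 'momentum of 0.9' in the text;
    beta2 and eps are the usual defaults, not values stated by the authors.
    """
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

def toy_grad(w):
    """Gradient of the toy loss (w - 3)^2, standing in for a network loss."""
    return 2.0 * (w - 3.0)

w, m, v = 0.0, 0.0, 0.0
history = []
for t in range(1, 1001):
    w, m, v = adam_step(w, toy_grad(w), m, v, t)
    history.append(w)

print(history[0], history[-1])
```

With a learning rate this small, each step moves the parameter by roughly 0.0001 regardless of the gradient's scale, which is one reason such a value is a cautious choice for fine-tuning pre-trained weights.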

3.5. Feature Extraction

Feature extraction derives relevant features from raw data so that they can be used for classification or other analysis tasks. It is significant in breast cancer classification using medical images because it identifies the properties of the images that suggest malignant tissue. Two strategies are used in breast cancer classification: handcrafted feature extraction and DL feature extraction. Handcrafted feature extraction derives features based on prior knowledge about the properties of malignant and benign tissue, such as texture, shape, and intensity. These features are subsequently used to train a BC classifier. Deep learning-based feature extraction, on the other hand, uses a deep CNN to learn the relevant features from medical images automatically. This technique has demonstrated promising results in breast cancer classification because it can find complicated patterns and correlations in data that handcrafted features may be unable to capture.

The performance of feature extraction techniques is typically evaluated using metrics such as accuracy, sensitivity, specificity, and F1 score. These metrics provide a measure of the effectiveness of the feature extraction technique in identifying cancer cells. Feature extraction is an essential step in breast cancer classification using medical images.

Deep learning uses multiple layers to extract features from images. A CNN model uses several types of layer: convolutional, pooling, nonlinear, and fully connected. The convolutional layer receives the image as a matrix of pixel values. It slides a small matrix called a filter (or kernel) across the input matrix, starting at one corner. At each position, the filter’s values are multiplied element-wise by the pixel values they cover, and the products are summed to produce a single value. The filter is then moved to the next position and the operation is repeated until it reaches the opposite corner; the resulting grid of values is the output of the convolution.

Equation (1) displays the convolution process. In Equation (1), t denotes the input, u the kernel function, and m and n the rows and columns of the image matrix, respectively. When new layers are added, convolutional neural networks typically place a pooling layer after a convolutional layer. Most neural networks use a pooling layer to reduce the size of their feature maps while retaining essential data [25].
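
The sliding-filter operation and the pooling step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, no padding is used, the stride is 1, and the kernel is not flipped (the cross-correlation form that CNN libraries commonly call convolution). Here `image` plays the role of the input t and `kernel` the role of u, with m and n indexing output rows and columns.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for m in range(out_h):
        row = []
        for n in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[m + i][n + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for m in range(0, len(feature_map) - size + 1, size):
        row = []
        for n in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[m + i][n + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(conv2d_valid(image, kernel))
```

Max pooling is used here only as one common choice; the text does not state which pooling operation the models apply.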

The ReLU (rectified linear unit) activation function introduces nonlinearity. ReLU is a per-pixel operation that replaces every negative value in the feature map with zero. The output equals the input when the input is greater than 0 and is zero whenever the input is 0 or less. ReLU is simple to train and offers high performance, making it the activation function used by most neural networks.

Equation (2) displays the ReLU activation function with “j” as the input. After a sequence of convolutional, ReLU, and pooling layers, a fully connected layer is required. Before being passed to the fully connected layer, the output of the last pooling or convolutional layer is flattened [26]; flattening turns a three-dimensional matrix into a vector. The fully connected layer produces an n-dimensional vector, where n is the number of categories in the dataset. The output layer applies a softmax activation function for classification, converting the output vector of the fully connected layer into values between 0 and 1 that sum to 1. Feature extraction here refers to extracting discriminative features from the input data using the CNN. We optimize the CNN structure and hyperparameters through a grid search to identify the optimal configuration.
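
The three operations named above (ReLU, flattening, and softmax) can be written as a short plain-Python sketch. The function names are illustrative, and the max-subtraction inside `softmax` is a standard numerical-stability step that the text does not mention.

```python
import math


def relu(j):
    """Return j when it is positive and 0 otherwise."""
    return max(0.0, j)


def flatten(volume):
    """Turn a three-dimensional nested list into a flat vector."""
    return [v for plane in volume for row in plane for v in row]


def softmax(scores):
    """Map raw scores to values in (0, 1) that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    volume = [[[1, -2], [3, 4]], [[-5, 6], [7, 8]]]
    activated = [relu(v) for v in flatten(volume)]
    print(activated)
    print(softmax([2.0, 0.5]))
```

For the two-class problem in this study, the softmax output would have two entries, one for benign and one for malignant.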

3.6. Performance Metrics

In this study, we propose a novel strategy for classifying breast cancer that uses meta-learning with several CNN models as base models. The approach uses three well-known CNN architectures as base models: Inception V3, ResNet50, and DenseNet121. The proposed method outperformed the individual CNN models in accuracy, sensitivity, specificity, and F1 score. These results indicate that meta-learning with several base models is an efficient method for classifying breast cancer, because it combines the strengths of several CNN models to outperform any single one. Using multiple base models also reduces the risk of overfitting and increases the model’s generalization capacity.

The proposed method for breast cancer classification is promising, with substantial implications for improving the accuracy and speed of classification. Future studies could investigate using more CNN models as base models, incorporating different types of medical images, and applying other meta-learning techniques. Performance was measured by accuracy (90% for the proposed method), precision, recall, and F1 score. Accuracy was measured as the proportion of correctly classified samples out of the total number of samples.

Sensitivity (recall) was calculated as the proportion of true positives out of all actual positive samples.

Precision was calculated as the proportion of true positives out of all samples predicted as positive.

Specificity was defined as the proportion of true negatives out of all actual negative samples.

The F1 score was calculated as the harmonic mean of precision and recall.
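
The definitions above can be checked with a short plain-Python sketch. The confusion-matrix counts used here are an assumption: they are back-calculated from the rounded meta-model recall values in Table 3 and the 1000 test images per class in Table 1, with malignant treated as the positive class; they are not counts reported by the authors. The function name `classification_metrics` is also illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the metrics defined above from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,  # also called sensitivity
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


if __name__ == "__main__":
    # Hypothetical counts (see the assumption stated in the lead-in).
    metrics = classification_metrics(tp=840, fp=50, tn=950, fn=160)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

Under these assumed counts, the specificity for the malignant class equals the recall for the benign class, since both are the share of benign images classified as benign.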

3.7. Meta-Learning

Meta-learning approaches learn a meta-model that can predict the performance of other models on specific tasks. The meta-model is typically trained on a set of meta-features that describe the characteristics of the models, such as their complexity, accuracy, and generalization ability. In this work, the meta-model was trained on the predictions of the three base models on the validation set and then used to predict the final classification of the test set, as described in Algorithm 1 below. Our results show that this approach improved on the predictions of the individual base models.

Algorithm 1: Proposed Method for Breast Cancer Classification
Input: Breast cancer images (A, B), where B = {b | b ∈ {Benign, Malignant}}
Output: The model that classifies a breast image a ∈ A
Resize dataset images to 300 × 300 pixels
Apply image data augmentation to address overfitting
Normalize images
Set of pre-trained CNN models X = {ResNet50, DenseNet121, InceptionV3}
for each model x ∈ X do
    for epoch = 1 to 30 do
        for each mini-batch (Ai, Bi) ∈ (Atrain, Btrain) do
            Update the model parameters
        end for
        if the validation accuracy has not increased over the previous five epochs then
            Stop training
        end if
    end for
end for
for all a ∈ Atest do
    Feed the combined outputs of all models into logistic regression for the final classification
end for
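
The final step of the algorithm, feeding the combined base-model outputs into logistic regression, can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the validation probabilities are invented, the function names are hypothetical, and plain batch gradient descent with an arbitrary learning rate stands in for whatever solver was actually used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_probs, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on stacked base-model probabilities.

    base_probs: one row per image, one column per base model.
    labels: 0 for benign, 1 for malignant.
    """
    n_models = len(base_probs[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_probs, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            err = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            for k in range(n_models):
                grad_w[k] += err * row[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_meta(weights, bias, row):
    """Return 1 (malignant) or 0 (benign) for one row of base-model outputs."""
    p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
    return 1 if p >= 0.5 else 0


# Invented malignant probabilities from three base models on a validation set.
VAL_PROBS = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.3], [0.8, 0.6, 0.9],
             [0.3, 0.2, 0.1], [0.7, 0.9, 0.6], [0.1, 0.4, 0.2]]
VAL_LABELS = [1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    w, b = fit_meta_learner(VAL_PROBS, VAL_LABELS)
    print(predict_meta(w, b, [0.85, 0.9, 0.8]))
```

Training the meta-learner on validation-set predictions rather than training-set predictions keeps it from simply inheriting the base models' overfitting.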

In this approach, as shown in Figure 4 below, three CNN architectures (ResNet50, DenseNet121, and InceptionV3) are ensembled. None of the studies in our literature review employed meta-learning to classify benign and malignant images. An optimizer is used to update the model’s weights during training; in this research, the Adam optimizer is used. Training performance is evaluated using the accuracy measure, and the loss is computed with binary cross-entropy, the loss function most frequently used in binary classification. The lower the loss, the better the model performs. Each model’s output is passed to the meta-learner, a logistic regression classifier that makes the final prediction.
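
For reference, binary cross-entropy can be computed as in the following plain-Python sketch. The function name and the clipping constant `eps` are assumptions made for illustration; deep learning frameworks provide their own implementations.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


if __name__ == "__main__":
    labels = [1, 0]  # 1 = malignant, 0 = benign
    print(binary_cross_entropy(labels, [0.9, 0.1]))  # confident and correct
    print(binary_cross_entropy(labels, [0.6, 0.4]))  # less confident
```

The loss shrinks as the predicted probability for the true class approaches 1, which is why a falling loss indicates a better model.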

Figure 4. Architecture of the proposed meta-learning of CNN models.

4. Experimental Setting and Performance Evaluation

We thoroughly assess and analyze the performance outcomes obtained using various model configurations to demonstrate the effectiveness of our meta ensemble model in screening benign and malignant breast cancer. We will now go on to the experimental conditions, performance indicators, quantitative and qualitative findings analysis, and discussion.

Table 1 shows the distribution of images across both classes in the whole dataset. The dataset is divided into training, validation, and test sets using a 70:10:20 split. These images are then used to train and evaluate the meta ensemble model as well as the individual sub-models. Because the dataset is small, we use image augmentation to expand the training set, improve training effectiveness, and guard against overfitting; augmentation is also thought to improve the generalizability of models. The augmentation features and other hyperparameters are summarized below.

Table 1. Distribution of samples in the dataset from both classes, benign and malignant (number of samples).

Class        Training    Validation    Testing    Total Images
Benign       3500        500           1000       5000
Malignant    3500        500           1000       5000
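
A 70:10:20 split of this kind can be sketched in plain Python as follows. The function name `split_indices`, the fixed random seed, and the idea of applying the split separately to each class are assumptions for illustration; the text does not describe how the split was implemented.

```python
import random


def split_indices(n_samples, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle sample indices and cut them into train, validation, and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * ratios[0])
    n_val = round(n_samples * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


if __name__ == "__main__":
    # Applied to one class of 5000 images.
    train, val, test = split_indices(5000)
    print(len(train), len(val), len(test))
```

Splitting each class of 5000 images this way gives the per-class counts listed in Table 1.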

We implemented the models with the TensorFlow and Keras functional APIs and trained and evaluated them on Google Colab, which offers free GPU access. The model configuration and augmentation features are shown in Table 2. For training and validation, we used the Adam optimizer with momentum and an initial learning rate of 0.0001, together with the binary cross-entropy loss function. Binary cross-entropy is a natural choice for a binary classification task such as differentiating between malignant and benign breast cancer, as it speeds up model convergence. We also used the ModelCheckpoint and ReduceLROnPlateau callbacks from Keras.

Table 2. Model configuration and augmentation features.

Parameter            Value
Max epochs           30
Batch size           32
Optimizer            Adam
Loss function        Binary cross-entropy
Learning rate        0.0001
Rotation range       Random with factor (0.5)
Shuffling            Yes
Flip                 Nearest
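
The interaction between the 30-epoch limit in the configuration and the five-epoch stopping rule in Algorithm 1 can be sketched in plain Python. The function names are hypothetical and the validation accuracies are invented; in practice a framework callback would normally handle this logic.

```python
def should_stop(val_accuracies, patience=5):
    """Return True once the last `patience` epochs have not beaten the earlier best."""
    if len(val_accuracies) <= patience:
        return False
    best_before = max(val_accuracies[:-patience])
    return max(val_accuracies[-patience:]) <= best_before


def epochs_trained(val_accuracy_per_epoch, max_epochs=30, patience=5):
    """Return the number of epochs run before stopping or reaching max_epochs."""
    history = []
    for epoch, acc in enumerate(val_accuracy_per_epoch[:max_epochs], start=1):
        history.append(acc)
        if should_stop(history, patience):
            return epoch
    return len(history)


if __name__ == "__main__":
    # Invented validation accuracies that plateau at 0.80.
    accs = [0.70, 0.75, 0.80, 0.79, 0.80, 0.78, 0.79, 0.80, 0.81]
    print(epochs_trained(accs))
```

With these invented values, training stops before the late improvement to 0.81 is seen, which illustrates the trade-off in choosing the patience value.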

Results Analysis

Table 3 shows the performance of the individual CNN models and the proposed meta-model in classifying benign and malignant images for breast cancer diagnosis. Each model was evaluated on accuracy, precision, recall, and F1 score. Inception V3 achieved an accuracy of 0.83 across benign and malignant images, a precision of 0.78 for benign and 0.91 for malignant images, a recall of 0.93 for benign and 0.74 for malignant images, and an F1 score of 0.85 for benign and 0.82 for malignant images.

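Table 3. The performance results obtained from the CNN models and the proposed meta-model. Accuracy is reported once per model and applies to both classes.

Model                  Class        Accuracy    Precision    Recall    F1 Score
Inception V3           Benign       0.83        0.78         0.93      0.85
Inception V3           Malignant    0.83        0.91         0.74      0.82
ResNet50               Benign       0.88        0.84         0.94      0.89
ResNet50               Malignant    0.88        0.93         0.82      0.88
DenseNet121            Benign       0.84        0.81         0.89      0.85
DenseNet121            Malignant    0.84        0.88         0.79      0.83
Ensemble Meta-Model    Benign       0.90        0.86         0.95      0.90
Ensemble Meta-Model    Malignant    0.90        0.94         0.84      0.89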

ResNet50 achieved an accuracy of 0.88 across benign and malignant images, a precision of 0.84 for benign and 0.93 for malignant images, a recall of 0.94 for benign and 0.82 for malignant images, and an F1 score of 0.89 for benign and 0.88 for malignant images.

DenseNet121 achieved an accuracy of 0.84 across benign and malignant images, a precision of 0.81 for benign and 0.88 for malignant images, a recall of 0.89 for benign and 0.79 for malignant images, and an F1 score of 0.85 for benign and 0.83 for malignant images. Its training and validation accuracies are relatively high and close to each other, which suggests that the model has learned the underlying patterns in the training data and generalizes well to unseen data.

The proposed meta-model outperformed the individual CNN models in accuracy, precision, recall, and F1 score. Its results are as follows.

The meta-model achieved an accuracy of 0.90 across benign and malignant images. Precision was 0.86 for benign and 0.94 for malignant images, and recall was 0.95 for benign and 0.84 for malignant images (Table 3). The F1 score was 0.90 for benign and 0.89 for malignant images.

The results suggest that the proposed meta-learning ensemble of CNNs could be a promising approach for improving the accuracy and reliability of breast cancer diagnosis. Accurately classifying images in every category is crucial for an efficient diagnosis system, and the meta-model performs well at separating benign from malignant cases. The use of data augmentation and dropout regularization also contributed to these results.

We also monitored the learning curves of every model. The models show a moderate learning trend throughout training and a fairly consistent decline in validation loss (Figure 5). The models were first trained on the BUSI dataset for both the benign and malignant classes and then tested on breast cancer images, achieving the reported accuracy after 30 epochs of training (Figure 6). The training and validation losses of the meta-model converge far more effectively than those of the CNN sub-models. Although our dataset is small, the learning curves show little overfitting, which is mostly attributable to the data augmentation and dropout regularization used in the stacked ensemble model. Training the meta-model helps ensure that the final model generalizes well to new, unseen data.

Figure 5. Training and validation accuracy achieved using the three sub-models and the meta-model.

Figure 6. Training and validation loss using the three sub-models and the meta-model.

Figure 7 summarizes the performance of the CNN models and the meta-model in classifying breast cancer as benign or malignant. A confusion matrix is a table used to evaluate the performance of a classification model on a dataset. It is also known as an error matrix or a contingency table. The confusion matrix summarizes the model’s predictions as counts of true positives, true negatives, false positives, and false negatives. It is a useful tool for understanding the model’s performance and identifying where it makes mistakes.

Figure 7. Performance of the CNN models and the meta-model in classifying breast cancer as benign or malignant.

To better understand how well the investigated models separate the classes, we use receiver operating characteristic (ROC) curves, as depicted in Figure 8. An ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) over a range of threshold values applied to the probability outputs of the deep learning models. TPR is the probability of correctly classifying a malignant image as malignant. In contrast, FPR represents the risk of false alarms, in which a benign image is incorrectly classified as malignant.
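
The construction of an ROC curve from model probabilities can be sketched in plain Python as below. The labels and scores are invented for illustration, the function names are hypothetical, and the trapezoidal area is one common way to summarize the curve; none of this reproduces the authors' plotting code.

```python
def roc_points(y_true, y_score):
    """Return (FPR, TPR) pairs, one per distinct threshold, with malignant = 1."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Invented labels (1 = malignant) and predicted malignant probabilities.
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    print(auc_trapezoid(roc_points(labels, scores)))
```

Lowering the threshold moves along the curve from (0, 0) towards (1, 1): more malignant images are caught, at the cost of more benign images raising false alarms.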

Figure 8. ROC curves for the ensemble meta-model and different CNN sub-models.

5. Conclusions

This paper presents a novel approach to breast cancer classification that achieved state-of-the-art results on the BUSI dataset, with an accuracy of 90%. Combining multiple CNN models (Inception V3, ResNet50, and DenseNet121) in a meta-learning framework allowed for better generalization and improved accuracy, particularly in detecting malignant tumors. The paper demonstrates the potential of meta-learning and ensemble techniques for improving the accuracy and efficiency of breast cancer diagnosis, and the approach could be extended to medical imaging datasets for other types of cancer or medical conditions. We suggest several avenues for future work. One is to explore other meta-learning algorithms, such as model-agnostic meta-learning or reinforcement learning, and compare their performance with the approach presented here. Another is to investigate the impact of incorporating clinical data, such as patient history or biopsy results, into the classification model. We also note that the dataset used in this study is limited in the number of samples and the diversity of cases. Overall, the proposed approach has demonstrated promising results for breast cancer classification and may be applicable to other medical imaging datasets.

Meta-learning involves many trainable parameters, which increases model complexity. In the future, designing novel architectures could reduce the number of trainable parameters while maintaining or improving performance. The aim is to find simpler, more efficient structures that capture the underlying patterns in the data with fewer parameters.

Acknowledgments

The authors sincerely appreciate the support from Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R235), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Funding Statement

Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R235), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Author Contributions

Conceptualization, M.D.A., M.A.K. and M.M.Y.; methodology, U.F.K., A.S., H.E. and M.I.K.; software, M.D.A. and M.M.Y.; validation, A.A.-R., M.A.K., U.F.K. and A.S.; formal analysis, M.I.K. and H.E.; investigation, M.M.Y. and U.F.K.; resources, M.M.Y. and M.A.K.; data curation, A.A.-R. and M.I.K.; writing—original draft preparation, M.D.A., A.S. and U.F.K.; writing—review and editing, M.I.K., U.F.K., M.A.K. and M.M.Y.; visualization, A.S. and M.A.K.; supervision, M.A.K.; project administration, U.F.K., A.A.-R. and M.A.K.; funding acquisition, A.A.-R. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Conflicts of Interest

The authors declare no conflict of interest.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


  • Open access
  • Published: 03 August 2024

Breast cancer clinical trial participation among diverse patients at a comprehensive cancer center

  • Emily L. Podany   ORCID: orcid.org/0000-0002-9700-3616 1 , 2 ,
  • Shaun Bulsara 1 ,
  • Katherine Sanchez 1 ,
  • Kristen Otte 1 ,
  • Matthew J. Ellis 1 , 3 &
  • Maryam Kinik 1  

npj Breast Cancer volume 10, Article number: 70 (2024)


  • Breast cancer
  • Public health

This study was designed to determine the enrollment patterns in breast cancer clinical trials (CCTs) of patients with diverse backgrounds in an equal access setting and to evaluate the factors contributing to low rates of clinical trial accrual in patients of low socioeconomic status (SES). We performed a retrospective review of a prospectively maintained database of new patients seen at the Dan L. Duncan Comprehensive Cancer Center from 5/2015 to 9/2021, which included 3043 patients screened for breast CCTs. We compared the rate of CCT availability, eligibility, and enrollment between two patient populations: Smith Clinic, where most patients are of low SES and uninsured, and Baylor St. Luke’s Medical Center (BSLMC), with predominantly insured, higher income patients. We performed logistic regression to evaluate whether differences in age, clinic, race, trial type, and primary language may underlie the differences in CCT enrollment. More patients had a CCT available to them at Smith Clinic (53.7% vs 44.7%, p < 0.001). However, eligible Smith Clinic patients were more likely to decline CCT enrollment than those at BSLMC (61.3% declined vs 39.4%, p < 0.001). On multivariate analysis, Black patients had a significantly higher rate of CCT refusal overall (OR = 0.26, 95% CI 0.12–0.56, p < 0.001) and at BSLMC alone (OR = 0.20, 95% CI 0.060–0.60, p = 0.006). Our data show that it is likely an oversimplification to assume that equal access will lead to the elimination of CCT disparities. Efforts to diversify CCTs must include consideration of structural and institutional inequities as well as social needs.


Introduction

Black cancer patients in the United States have both increased overall cancer mortality and increased cancer-specific mortality [1, 2, 3]. In breast cancer, Black women have a 41% higher risk of dying from breast cancer when compared with White women and present on average at a later stage [2, 4, 5]. Structural inequities pertaining to access to care, diagnosis timing, and treatment delay affect Black women disproportionately [6, 7]. While these are socioeconomic predictors of the observed poor outcomes, it is also well documented that Black women have a higher incidence of more aggressive breast cancer subtypes (i.e., triple negative breast cancer (TNBC)) than any other ethnic or racial groups [8]. It is critical to understand the biological basis of the observed poor outcomes of breast cancer among Black women. As we work to design precision-driven interventions for prevention, timely diagnosis, and treatment, achieving cancer health equity is not feasible without improving diversity in cancer clinical trials (CCTs).

In an ideal world, the populations studied in CCTs would be representative of the diversity of patients seen in clinic, and CCTs would be used as a tool to decrease inequity. Unfortunately, well-documented disparities exist within CCTs. The National Institutes of Health (NIH) Revitalization Act of 1993 aimed to increase the number of women and underrepresented racial groups in clinical research through mandated inclusion, yet numbers remain low [9]. In 2014, only approximately 1% of NCI-sponsored clinical trials were primarily focused on racial and ethnic minorities [10]. Studies have shown that the average enrollment of Black Americans in CCTs is at best between 5 and 7%, despite Black Americans making up more than 13% of the general population of the United States [11, 12, 13].

When access to CCTs is not a barrier to enrollment, the rate of clinical trial participation by racial and ethnic minorities, especially those of low SES, has not been well studied. Data often comes from safety-net hospitals or private institutions, but rarely are both serving the same catchment area. The Dan L. Duncan Comprehensive Cancer Center (DLDCCC) in Houston, Texas provides access to breast CCTs at two clinical sites: Smith Clinic (SC), within the safety-net Harris Health System, and Baylor St. Luke’s Medical Center (BSLMC). We hypothesized that the racial and socioeconomic gap in clinical trial enrollment would be at least partially improved by similar access to breast CCTs at the two sites.

Eligibility for CCTs

Of the 3043 patients screened for breast CCTs, 366 patients were found to be eligible for a CCT, and some patients were eligible for multiple CCTs. There were 431 total offers to CCTs (Fig. 1).

Fig. 1. We identified 3043 new patients seen at DLDCCC clinical sites between 5/2015 and 9/2021, 366 of whom were eligible for CCT. The majority of patients at each site were eligible for neoadjuvant trials and patients were often eligible for more than one CCT.

The patient demographics of the 3043 new patients seen at the DLDCCC and screened for breast CCT eligibility from 5/2015 to 9/2021 are shown in Table 1. Notably, 50% of these patients were White at BSLMC in comparison to 11% at SC, and 74% listed English as their primary language at BSLMC versus 47% at SC. Patients at SC were on average younger and more frequently presented with TNBC than patients at BSLMC.

More patients at SC had a trial available to them (752/1400, 53.7%) versus at BSLMC (734/1643, 44.7%, p < 0.001) (Table 1). Patients at SC were also more likely to be eligible for CCTs (191/1400, 13.6%) than patients at BSLMC (175/1643, 10.7%, p = 0.011).

Enrollment patterns in CCTs

Despite higher eligibility, patients at SC were less likely to accept these CCT offers (74/191 accepted, 38.7%) than patients at BSLMC (106/175 accepted, 60.6%, p < 0.001) (Table 1). This difference in acceptance was significant on univariate but not multivariate analysis (Table 2). Age was not found to be significantly associated with trial enrollment.
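
The acceptance percentages quoted above follow directly from the reported counts, as the short Python sketch below shows. The crude odds ratio it prints is an unadjusted illustration computed only from those counts; it is not a value reported in this article, whose odds ratios come from multivariate logistic regression. The function names are hypothetical.

```python
def percent(count, total):
    """Return count / total as a percentage."""
    return 100.0 * count / total


def odds_ratio(yes_a, no_a, yes_b, no_b):
    """Unadjusted odds ratio of group A versus group B."""
    return (yes_a / no_a) / (yes_b / no_b)


# Counts quoted in the text: accepted offers and eligible patients per site.
sc_accepted, sc_eligible = 74, 191
bslmc_accepted, bslmc_eligible = 106, 175

sc_rate = percent(sc_accepted, sc_eligible)
bslmc_rate = percent(bslmc_accepted, bslmc_eligible)
crude_or = odds_ratio(sc_accepted, sc_eligible - sc_accepted,
                      bslmc_accepted, bslmc_eligible - bslmc_accepted)

print(f"SC acceptance: {sc_rate:.1f}%")
print(f"BSLMC acceptance: {bslmc_rate:.1f}%")
print(f"Crude odds ratio of acceptance (SC vs BSLMC): {crude_or:.2f}")
```

An odds ratio below 1 here means lower odds of acceptance at SC than at BSLMC before any adjustment for race, trial type, or other covariates.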

Univariate analysis showed that Black patients, Hispanic/Latino patients, and Spanish-speaking patients were significantly more likely to decline CCT participation. However, on overall multivariate analysis, only the Black patient category was associated with a significantly higher rate of enrollment refusal (odds ratio (OR) = 0.26, 95% CI 0.12–0.56, p < 0.001) (Table 2). On the multivariate analyses within each clinical site, patients were significantly more likely to accept biobanking trials than other trial types at SC (OR = 16.90, 95% CI 2.13–363.77, p = 0.018) and at BSLMC (OR = 20.10, 95% CI 3.37–395.53, p = 0.007). Patients at SC were also more likely to enroll in preventive trials (OR = 7.88, 95% CI 1.53–59.39, p = 0.020). Primary language was not found to be a determining factor in trial enrollment or refusal at either site. Black patients at BSLMC were less likely to enroll in CCTs on multivariate analysis (OR = 0.20, 95% CI 0.060–0.60, p = 0.006); this association was not significant in the SC subset (OR = 0.41, 95% CI 0.11–1.53, p = 0.180) (Supplemental Tables 1 and 2).

While patients at SC have equal opportunity when it comes to access to clinical trials, only 38.7% of trial-eligible patients at this site consented to enrollment, compared with 60.6% at BSLMC. Overall, Black patients were less likely to consent to trial enrollment, and the rate of trial refusal was lowest for biobanking trials across both sites and for preventive trials at SC compared to other trials. Speaking a primary language other than English was not found to be a major barrier to enrollment in our population.

Our data show that it is likely an oversimplification to assume that equal access will lead to a complete elimination of CCT disparities. At SC, which serves a more diverse population with a higher percentage of low income and uninsured patients, the patients were significantly more likely to be eligible for breast CCTs. As noted in Table 1, patients at SC had a higher rate of TNBC (29.3% versus 14.3% at BSLMC), which we hypothesize may be one reason for the higher rate of eligibility. More SC patients were eligible for neoadjuvant trials and biobanking trials than at BSLMC (Fig. 1), possibly also due to the higher TNBC rates in this population.

Despite the higher rate of eligibility at SC, these patients were significantly more likely to decline the CCT. Our findings in a highly racially and ethnically diverse patient population support the literature showing that Black patients are less likely to agree to participate in clinical trials. The causes of the discrepancy between eligibility and enrollment are multifactorial and complex. Studies have shown equal willingness among patients of different races to participate in clinical trials [14, 15, 16, 17], yet disparities in enrollment persist. In 2008, Ford [18] identified three categories of reasons for low accrual: awareness, opportunity, and acceptance/refusal barriers or promoters.

Awareness barriers include lack of knowledge about the purpose and availability of CCTs [13, 19]. Cancer health literacy has been found in some studies to be significantly lower in Black patients [14, 20, 21], though others found that the role of factual knowledge did not make a significant difference in accrual [22]. The FDA in 2020 published guidelines and potential approaches to increase the diversity of clinical trial populations, including making diversity of enrollment a priority, involving the community, and educating potential participants [23]. When CCTs do not recruit a diverse patient population and fail to be made available to racial or ethnic minorities [17], the results cannot be assumed to be generalizable to the community at large.

Opportunity barriers include limitations due to socioeconomic status and ineligibility. Research has shown us that CCT participants are less likely to be Black and more likely to be of a higher socioeconomic status [24, 25, 26, 27, 28]. Black patients are more likely to be deemed ineligible for clinical trials [13, 16, 29]. This is partially due to a higher rate of comorbidities such as hypertension, vision loss, or diabetes, as well as benign neutropenia, a condition that has not been shown to increase risk of infection [16]. However, studies have also shown that Black Americans are more likely to be deemed ineligible due to perceived noncompliance or mental status, and that subjective judgements on eligibility more often favor White patients [29].

Barriers to acceptance include an understandable mistrust in a medical system that has historically caused harm to people of color, perceived financial burden, logistical difficulties including transportation, and family or cultural pressures. When Black American patients are asked about their reasons for opting out of CCT, studies show us that a lack of trust is one of the most common factors influencing their decision [13, 22]. Barriers relating to logistics or finances are seen more often in safety-net hospitals and clinics [19].

At BSLMC, where the population is less diverse, we noted a difference in CCT enrollment by race in multivariate analysis. However, this finding was not significant at SC, which has a more diverse population (Supplemental Tables 1 and 2). This is an exploratory finding that future studies can further elucidate, but it may indicate that a more diverse clinic environment encourages CCT enrollment. This could be due to higher trust in the clinic, greater awareness of clinical trials, or physicians offering trials more equitably. A limitation of our finding is the low number of patients in each category, and future studies would need to clarify these findings with a larger patient population.

Although it is imperative that we continue to shine a light on these important issues, we must be ready to envision and enact both local and national policy changes. Moving forward, we are focusing on community engagement, patient education, and dialogue with our patients to explore specific interventions designed to improve our Black patient population’s views of trial enrollment. Interventions have been attempted around the country with varying levels of success. These include patient navigation systems 30,31,32; patient education videos 12,13; the recent ACCURE trial, which included multiple levels of intervention such as electronic medical record changes and specific physician roles 33; diversifying staff; ensuring trial resources are available in multiple languages; and offering financial incentives 23. The reason for trial refusal was unfortunately not uniformly captured in the clinical trial database or in patients’ electronic medical records. This is a limitation of our study: we do not have specific patients’ refusal reasons. In a follow-up study that is currently being conducted, we have designed a patient education intervention to collect specific information on patients’ attitudes towards clinical trial enrollment and refusal. This follow-up study will serve as a roadmap for designing patient- and community-targeted outreach programs to improve our trial enrollment.

Cancer clinical trials have maintained restrictive eligibility criteria that inevitably exclude a large population of patients 34. It is crucial that efforts to improve cancer clinical trial diversity continue on all fronts, including rethinking clinical trial design and challenging long-standing beliefs about eligibility criteria. Prevention and treatment alike need to be considered when designing an equitable future for cancer care, and as others have shown, these efforts must include consideration of structural and institutional inequities as well as social needs. Research and data collection are only the first steps in a necessary journey toward equity in cancer care.

Study population and data collection

This is a retrospective cohort study of new patients seen from 5/2015 to 9/2021, which included 3043 patients screened for breast CCTs at DLDCCC clinical sites. The populations receiving care at the two clinical sites differ greatly. At SC, half of the patients earn less than $25,000 annually, 60% are uninsured and use a county financial assistance program known as the “Gold Card,” and 65% are not proficient in English. Fifty-nine percent of SC patients self-identify as Hispanic and 29% self-identify as Black, with White patients making up 10% of the population. At BSLMC, over 95% of the patients have federal and commercial insurance and 68% are White, 13% are Black, and 3% are Hispanic. We collected information on age at the time of screening for CCTs, patient-reported race, and primary spoken language.

The study was conducted according to the ethical guidelines set forth in the Declaration of Helsinki and in concordance with the Health Insurance Portability and Accountability Act. The study was approved by the institutional review board (IRB) of Baylor College of Medicine. The requirement of patient informed consent was waived by the IRB as the data was deidentified prior to analysis.

DLDCCC is an active participant in several cooperative group consortia including the Translational Breast Cancer Research Consortium (TBCRC), Southwest Oncology Group (SWOG), and NRG Oncology (from the parent organizations of NSABP, RTOG, and GOG), and it is frequently the leading site for national investigator-initiated clinical trials (IITs). We collected information on whether there were trials available for the patients’ diagnosis, trial eligibility, and the type of trial the patients were screened for. We designated five categories of trials based on the intent of the trial (e.g., scalp cooling trial to prevent chemotherapy-associated hair loss) and the stage of therapy at which the trial was offered (e.g., COMPASS RD, an adjuvant trial). The categories were preventive, neoadjuvant, adjuvant, metastatic, and biobanking. The same CCTs were open at both sites under the DLDCCC.

Statistical analysis

We first performed Chi-square tests to determine whether there were differences in trial availability, trial eligibility, and trial acceptance rate according to DLDCCC clinical sites. We then performed univariate and multivariate logistic regression to evaluate whether differences in age, clinic site, race, trial type, and primary language may be underlying the observed differences in CCT enrollment rates. We performed logistic regression on the overall dataset as well as by clinic. We calculated odds ratios with 95% confidence intervals to measure the strength of association between the predictors and enrollment. P-values less than 0.05 were considered statistically significant. Analysis was performed using R version 4.1.0.
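
The two simplest calculations named above can be illustrated with a minimal sketch. The study's analysis was run in R and is not reproduced here; the Python below is only an illustration that uses made-up counts (not study data) and hypothetical function names, `chi_square_2x2` and `odds_ratio_ci`. It assumes a 2×2 table, a Pearson chi-square test without continuity correction, and a Wald 95% confidence interval for the odds ratio. For a single binary predictor, this odds ratio is the same quantity a univariate logistic regression would estimate.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval (z = 1.96 gives about 95%)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts for illustration only; these are NOT data from the study.
# Rows: clinic site A, clinic site B. Columns: enrolled, not enrolled.
enrolled_a, declined_a = 30, 70
enrolled_b, declined_b = 45, 55

chi2, p_value = chi_square_2x2(enrolled_a, declined_a, enrolled_b, declined_b)
odds_ratio, lower, upper = odds_ratio_ci(enrolled_a, declined_a, enrolled_b, declined_b)

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A multivariate model such as the one described above adjusts for several predictors at once and requires an iterative fit, which this sketch does not attempt.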

Data availability

The participants in this study did not give written consent for their data to be shared. Due to the clinical nature of the dataset, it is not available publicly.

References

1. Singh, G. K. & Jemal, A. Socioeconomic and racial/ethnic disparities in cancer mortality, incidence, and survival in the United States, 1950-2014: over six decades of changing patterns and widening inequalities. J. Environ. Public Health 2017, 2819372 (2017).

2. USCS Data Visualizations - CDC. https://gis.cdc.gov/cancer/USCS/#/AtAGlance/

3. Reports on Cancer - Cancer Statistics. https://seer.cancer.gov/statistics/reports.html

4. Siegel, R. L., Miller, K. D., Wagle, N. S. & Jemal, A. Cancer statistics, 2023. CA Cancer J. Clin. 73, 17–48 (2023).

5. AACR Cancer Disparities Progress Report 2022 Cancer Progress Report. https://cancerprogressreport.aacr.org/disparities/

6. Hossain, F. et al. Neighborhood social determinants of triple negative breast cancer. Front Public Health 7, 18 (2019).

7. Wang, F. et al. Racial/ethnic disparities in all-cause mortality among patients diagnosed with triple-negative breast cancer. Cancer Res. 81, 1163–1170 (2021).

8. Clarke, C. A. et al. Age-specific incidence of breast cancer subtypes: understanding the black-white crossover. J. Natl Cancer Inst. 104, 1094–1101 (2012).

9. Murthy, V. H., Krumholz, H. M. & Gross, C. P. Participation in cancer clinical trials: race-, sex-, and age-based disparities. JAMA 291, 2720–2726 (2004).

10. Chen, M. S., Lara, P. N., Dang, J. H. T., Paterniti, D. A. & Kelly, K. Twenty years post-NIH Revitalization Act: enhancing minority participation in clinical trials (EMPaCT): laying the groundwork for improving minority clinical trial accrual: renewing the case for enhancing minority participation in cancer clinical trials. Cancer 120, 1091–1096 (2014).

11. Al Hadidi, S., Mims, M., Miller-Chism, C. N. & Kamble, R. Participation of African American persons in clinical trials supporting U.S. Food and Drug Administration approval of cancer drugs. Ann. Intern. Med. 173, 320–321 (2020).

12. Awidi, M. & Al Hadidi, S. Participation of black Americans in cancer clinical trials: current challenges and proposed solutions. JCO Oncol. Pr. 17, 265–271 (2021).

13. Swaby, J., Kaninjing, E. & Ogunsanya, M. African American participation in cancer clinical trials. Ecancermedicalscience 15, 1307 (2021).

14. Echeverri, M. et al. Cancer health literacy and willingness to participate in cancer research and donate bio-specimens. Int. J. Environ. Res. Public Health 15, 2091 (2018).

15. Byrne, M. M., Tannenbaum, S. L., Glück, S., Hurley, J. & Antoni, M. Participation in cancer clinical trials: why are patients not participating? Med Decis. Mak. 34, 116–126 (2014).

16. Langford, A. T. et al. Racial/ethnic differences in clinical trial enrollment, refusal rates, ineligibility, and reasons for decline among patients at sites in the National Cancer Institute’s Community Cancer Centers Program. Cancer 120, 877–884 (2014).

17. Wendler, D. et al. Are racial and ethnic minorities less willing to participate in health research? PLoS Med. 3, 0201–0210 (2006).

18. Ford, J. G. et al. Barriers to recruiting underrepresented populations to cancer clinical trials: a systematic review. Cancer 112, 228–242 (2008).

19. Hernandez, N. D. et al. African American cancer survivors’ perspectives on cancer clinical trial participation in a safety-net hospital: considering the role of the social determinants of health. J. Cancer Educ. 37, 1589–1597 (2022).

20. Wong, S. T. et al. Using visual displays to communicate risk of cancer to women from diverse race/ethnic backgrounds. Patient Educ. Couns. 87, 327–335 (2012).

21. Watson, M. et al. Family history of breast cancer: what do women understand and recall about their genetic risk? J. Med. Genet. 35, 731–738 (1998).

22. Meng, J., McLaughlin, M., Pariera, K. & Murphy, S. A comparison between Caucasians and African Americans in willingness to participate in cancer clinical trials: the roles of knowledge, distrust, information sources, and religiosity. J. Health Commun. 21, 669–677 (2016).

23. Enhancing the Diversity of Clinical Trial Populations—Eligibility Criteria, Enrollment Practices, and Trial Designs Guidance for Industry | FDA. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/enhancing-diversity-clinical-trial-populations-eligibility-criteria-enrollment-practices-and-trial

24. El-Rayes, B. F. et al. Impact of race, age, and socioeconomic status on participation in pancreatic cancer clinical trials. Pancreas 39, 967–971 (2010).

25. Sharrocks, K., Spicer, J., Camidge, D. R. & Papa, S. The impact of socioeconomic status on access to cancer clinical trials. Br. J. Cancer 111, 1684–1687 (2014).

26. Kwiatkowski, K., Coe, K., Bailar, J. C. & Swanson, G. M. Inclusion of minorities and women in cancer clinical trials, a decade later: Have we improved? Cancer 119, 2956–2963 (2013).

27. Unger, J. M. et al. Patient income level and cancer clinical trial participation. J. Clin. Oncol. 31, 536–542 (2013).

28. Noor, A. M. et al. Effect of patient socioeconomic status on access to early-phase cancer trials. J. Clin. Oncol. 31, 224–230 (2013).

29. Penberthy, L. et al. Barriers to therapeutic clinical trials enrollment: differences between African-American and white cancer patients identified at the time of eligibility assessment. Clin. Trials 9, 788–797 (2012).

30. Battaglia, T. A. et al. Translating research into practice: Protocol for a community-engaged, stepped wedge randomized trial to reduce disparities in breast cancer treatment through a regional patient navigation collaborative. Contemp. Clin. Trials 93, 106007 (2020).

31. Cartmell, K. B. et al. Patient barriers to cancer clinical trial participation and navigator activities to assist. Adv. Cancer Res. 146, 139–166 (2020).

32. Fouad, M. N. et al. Patient Navigation As a Model to Increase Participation of African Americans in Cancer Clinical Trials. J. Oncol. Pr. 12, 556–563 (2016).

33. The Lancet Oncology. Racial disparities in cancer care: can we close the gap? Lancet Oncol. 22, 1643 (2021).

34. Liu, R. et al. Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature 592, 629–633 (2021).

Acknowledgements

This work was not funded. This work has previously been presented in part in poster format at the San Antonio Breast Cancer Symposium in December 2021 and the ASCO QI symposium in October 2022, as well as in an online-only abstract at the 2022 ASCO meeting in June 2022. We would like to acknowledge and thank all the patients whose deidentified data was used in this study.

Author information

Authors and affiliations.

Baylor College of Medicine, Lester and Sue Smith Breast Center, Houston, TX, USA

Emily L. Podany, Shaun Bulsara, Katherine Sanchez, Kristen Otte, Matthew J. Ellis & Maryam Kinik

Washington University in St. Louis, St. Louis, MO, USA

Emily L. Podany

The Institute for Proteogenomic Discovery, Houston, TX, USA

Matthew J. Ellis

Contributions

E.P. performed data collection, data cleaning, data analysis, and wrote the manuscript. S.B. performed data analysis and reviewed the manuscript. K.S. performed data cleaning and reviewed the manuscript. K.O. performed data collection and reviewed the manuscript. M.E. performed data collection and reviewed the manuscript. M.K. performed data collection, data cleaning, study planning, IRB submission, data analysis, and reviewed the manuscript.

Corresponding author

Correspondence to Emily L. Podany.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Tables 1 and 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Podany, E.L., Bulsara, S., Sanchez, K. et al. Breast cancer clinical trial participation among diverse patients at a comprehensive cancer center. npj Breast Cancer 10, 70 (2024). https://doi.org/10.1038/s41523-024-00672-0

Received: 01 September 2023

Accepted: 10 July 2024

Published: 03 August 2024

DOI: https://doi.org/10.1038/s41523-024-00672-0
