Machine Learning Algorithms for Breast Cancer Detection in Mammography Images: A Comparative Study

Breast cancer is the most common type of cancer in women worldwide, representing approximately 12% of reported new cases and 6.5% of cancer deaths in 2018. Mammography screening is extremely important for the early detection of breast cancer. The assessment of mammograms is a complex task with significant variability due to professional experience and human error, which creates an opportunity for assistive tools to improve both reliability and accuracy. The use of deep learning in medical image analysis has increased, assisting specialists in the early detection, diagnosis, treatment, and prognosis of diseases. In this article, we compare the performance of XGBoost and VGG16 on the task of breast cancer detection using digital mammograms from the CBIS-DDSM dataset. In addition, we compare prediction accuracy between full mammogram images and patches extracted from the original images based on regions of interest (ROIs) annotated by experts. We also perform experiments with transfer learning and data augmentation to exploit data diversity and the ability to extract features and learn from raw, unprocessed data. Experimental results show that XGBoost achieves an AUC of 68.29%, while VGG16 achieves approximately the same performance, with an AUC of 68.24%.
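To illustrate what extracting a patch from an expert-annotated ROI might look like, the following is a minimal sketch, not the procedure used in this study. It assumes the ROI is available as a binary mask with the same height and width as the mammogram and crops the mask's bounding box; the function name extract_roi_patch and the margin parameter are hypothetical and introduced only for illustration.

```python
import numpy as np


def extract_roi_patch(image, mask, margin=0):
    """Crop the bounding box of the nonzero mask region from the image.

    `margin` (a hypothetical parameter) pads the box by that many pixels
    on each side, clipped to the image borders.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask contains no ROI pixels")
    top = max(int(rows.min()) - margin, 0)
    bottom = min(int(rows.max()) + 1 + margin, image.shape[0])
    left = max(int(cols.min()) - margin, 0)
    right = min(int(cols.max()) + 1 + margin, image.shape[1])
    return image[top:bottom, left:right]


# Toy stand-in for a mammogram and its ROI mask (assumed data, not CBIS-DDSM).
image = np.arange(100, dtype=np.uint16).reshape(10, 10)
mask = np.zeros((10, 10), dtype=np.uint8)
mask[3:6, 4:8] = 1  # ROI covers rows 3-5 and columns 4-7

patch = extract_roi_patch(image, mask, margin=1)
print(patch.shape)  # (5, 6)
```

A bounding-box crop keeps some surrounding tissue as context; whether a fixed-size patch or a tight crop is preferable depends on the input size the classifier expects.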