Breast Cancer Severity Degree Prediction Using Data Mining Techniques in the Gaza Strip








Abstract

Data mining has become a fundamental methodology for computing applications in medicine. It is the process of extracting valuable information from raw data sets by analyzing and summarizing them from different perspectives. Medical data mining is a set of methods that extract valuable and novel information from healthcare databases to help doctors reach the best diagnosis. Cancer and diabetes have been among the deadliest diseases in the Gaza Strip over the last few years. Because diagnosing these diseases involves expensive and lengthy tests, data mining is particularly useful here. Extending previous research on breast cancer detection, we propose a model that helps determine the degree of risk of the disease and supports best practices, reducing time and cost with the aim of improving health outcomes, based on data collected from hospitals in the Gaza Strip. The model applies classification techniques, namely support vector machines, artificial neural networks, and k-nearest neighbors, to the collected breast cancer data in order to predict the severity of breast cancer. We also applied association rules to identify the attributes most strongly related to high-severity breast cancer. After evaluating and testing these classification techniques on the breast cancer dataset, we obtained an accuracy of 77%, which is an acceptable rate for predicting breast cancer severity. Additionally, we were able to list the attributes most related to high-severity breast cancer.
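
The association-rule step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each patient record is a set of "attribute=value" items and ranks single attributes by the support and confidence of the rule "attribute => severity=high". The records, attribute names, and thresholds below are hypothetical and are not taken from the Gaza Strip dataset; the abstract does not say which rule-mining algorithm was used.

```python
# Toy records: hypothetical attribute values, NOT the study's data.
records = [
    {"tumor_size=large", "lymph_nodes=positive", "severity=high"},
    {"tumor_size=large", "lymph_nodes=negative", "severity=high"},
    {"tumor_size=small", "lymph_nodes=negative", "severity=low"},
    {"tumor_size=large", "lymph_nodes=positive", "severity=high"},
    {"tumor_size=small", "lymph_nodes=positive", "severity=low"},
    {"tumor_size=small", "lymph_nodes=negative", "severity=low"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for r in records if itemset <= r) / len(records)


def confidence(antecedent, consequent, records):
    """Estimate P(consequent | antecedent) from the records."""
    s_antecedent = support(antecedent, records)
    if s_antecedent == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), records) / s_antecedent


def rules_for(target, records, min_support=0.3, min_confidence=0.8):
    """Return (item, support, confidence) for rules `item => target`
    that pass both thresholds, strongest rules first."""
    prefix = target.split("=")[0] + "="
    # Candidate antecedents: every item that is not a value of the target attribute.
    items = sorted({i for r in records for i in r if not i.startswith(prefix)})
    rules = []
    for item in items:
        s = support({item, target}, records)
        c = confidence({item}, {target}, records)
        if s >= min_support and c >= min_confidence:
            rules.append((item, s, c))
    return sorted(rules, key=lambda rule: (-rule[2], -rule[1]))


for item, s, c in rules_for("severity=high", records):
    print(f"{item} => severity=high  support={s:.2f} confidence={c:.2f}")
```

On real data, candidate antecedents would usually include combinations of attributes (as in Apriori), which this single-item sketch leaves out for brevity.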


Modules


Algorithms

SVM algorithm
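
As a rough illustration of the SVM algorithm named above, the following sketch trains a linear SVM from scratch by batch sub-gradient descent on the regularized hinge loss. It is written under several assumptions: the source does not describe its SVM implementation, kernel, or parameters; the two features and six data points are hypothetical toy values; and severity is reduced to two classes (-1 = low, +1 = high), whereas a multi-degree severity scale would need an extension such as one-vs-rest.

```python
def fit_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
    by batch sub-gradient descent. Labels must be -1 or +1."""
    n, d = len(X), len(X[0])
    # Mean-center the features so the bias term stays well behaved.
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mean[j] for j in range(d)] for row in X]
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        hinge_w = [0.0] * d  # sum of y_i * x_i over margin violators
        hinge_b = 0.0        # sum of y_i over margin violators
        for xi, yi in zip(Xc, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    hinge_w[j] += yi * xi[j]
                hinge_b += yi
        # Sub-gradient step: regularizer gradient minus averaged hinge term.
        w = [wj - lr * (lam * wj - hw / n) for wj, hw in zip(w, hinge_w)]
        b = b + lr * hinge_b / n
    return w, b, mean


def predict(model, x):
    """Return +1 or -1 for a single feature vector."""
    w, b, mean = model
    score = sum(wj * (xj - mj) for wj, xj, mj in zip(w, x, mean)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: two made-up numeric features per patient.
X = [(1, 1), (1, 2), (2, 1), (5, 5), (5, 4), (4, 5)]
y = [-1, -1, -1, 1, 1, 1]

model = fit_linear_svm(X, y)
train_accuracy = sum(predict(model, x) == t for x, t in zip(X, y)) / len(X)
print("training accuracy:", train_accuracy)
```

In practice a library implementation with kernel support would normally be preferred over this hand-written version; the sketch is only meant to show the margin-based update that defines the method.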


Software And Hardware

• Hardware: Processor: i3 or i5; RAM: 4 GB; Hard disk: 16 GB
• Software: Operating system: Windows 2000/XP/7/8/10; Anaconda, Jupyter, Spyder, Flask; Frontend: Python; Backend: MySQL