Project-6: Breast-Cancer Wisconsin Diagnostic
Aim of the Project
Predict whether a patient has breast cancer by classifying the tumor as Malignant or Benign, using the UCI Breast Cancer Wisconsin (Diagnostic) dataset.
Life Cycle of the Project
Collected the dataset from Kaggle, where the UCI data is openly available. Cleaned and explored the data using Pandas and Seaborn. Used Pandas, Matplotlib and NumPy for data pre-processing, reducing redundancy by handling missing values and removing duplicates. Used Label Encoder to convert the categorical diagnosis label (Malignant / Benign) into numeric form, and Standard-Scaler to standardize the features. Trained 5 algorithms on the data and selected the model with the best accuracy:
- Logistic Regression
- K Nearest Neighbors Classifier
- Support Vector Classifier (SVC)
- Decision Tree Classifier
- Random Forest Classifier
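The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn's bundled copy of the Wisconsin Diagnostic data (`load_breast_cancer`) in place of the Kaggle CSV, and the 80/20 split, `random_state` values and default hyperparameters are all assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled scikit-learn dataset stands in for the Kaggle CSV.
data = load_breast_cancer()
X = data.data

# Recreate string labels ("malignant" / "benign"), then encode them as integers.
labels = data.target_names[data.target]
encoder = LabelEncoder()
y = encoder.fit_transform(labels)  # classes are sorted: benign -> 0, malignant -> 1

# Assumption: a stratified 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The five algorithms listed above, with default hyperparameters.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K Nearest Neighbors": KNeighborsClassifier(),
    "Support Vector Classifier": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

scores = {}
for name, model in models.items():
    # Fit the scaler on the training data only, then train the classifier.
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    scores[name] = pipe.score(X_test, y_test)  # accuracy on the held-out set

best = max(scores, key=scores.get)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
print("Best model by accuracy:", best)
```

Wrapping the scaler and classifier in one pipeline keeps the scaling statistics from leaking test-set information into training. The accuracies printed by this sketch will not necessarily match the figures reported for the project.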
Results from the Project
Check out the detailed project overview in the GitHub repository.
Technologies Used | Python | Seaborn | NumPy | Pandas | Scikit-learn | Flask | Matplotlib |
Model Performance
- Model Accuracy =
- Precision = 97.82%
- Recall = 95.71%
- F1 Score = 96.77%
- ROC-AUC Score = 97.14%
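For reference, precision, recall and F1 are all derived from confusion-matrix counts. The sketch below shows the standard definitions. The function name and the counts in the example are hypothetical and are not taken from this project's results.

```python
def classification_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: malignant cases correctly predicted as malignant
    fp: benign cases wrongly predicted as malignant
    fn: malignant cases wrongly predicted as benign
    """
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=2, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In a diagnostic setting, recall is often the metric to watch most closely, since a false negative means a malignant tumor goes undetected.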