Commit 153dc0ef authored by J D N S De Silva

RF Model train File

parent 568213fe
{
"cells": [
{
"cell_type": "markdown",
"id": "b1456aec",
"metadata": {},
"source": [
"## Title - **Stroke Prediction**\n",
"## Used Algorithm - **Random Forest Classification**\n",
"## Dataset - \n",
"* The dataset is the [Stroke Prediction Dataset](https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset), hosted on Kaggle\n",
"* The Kaggle usability score of the dataset is 10.00\n",
"* This dataset has 12 columns (`id`, 10 input features, and the `stroke` target) and 5110 instances\n",
"\n",
"## Accuracy\n",
"* Accuracy on Isolation Forest : 90%\n",
"* Accuracy on Random Forest : 93%"
]
},
{
"cell_type": "markdown",
"id": "8a19db0e",
"metadata": {},
"source": [
"### Importing the dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1123a658",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns"
]
},
{
"cell_type": "markdown",
"id": "2a902969",
"metadata": {},
"source": [
"### Data collecting process"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "111d33c0",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"data = pd.read_csv('../Trained models/healthcare-dataset-stroke-data.csv')"
]
},
{
"cell_type": "markdown",
"id": "9fc14b95",
"metadata": {},
"source": [
"### Explore the dataset"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "740f14ff",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Index(['id', 'gender', 'age', 'hypertension', 'heart_disease', 'ever_married',\n",
" 'work_type', 'Residence_type', 'avg_glucose_level', 'bmi',\n",
" 'smoking_status', 'stroke'],\n",
" dtype='object')\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>hypertension</th>\n",
" <th>heart_disease</th>\n",
" <th>ever_married</th>\n",
" <th>work_type</th>\n",
" <th>Residence_type</th>\n",
" <th>avg_glucose_level</th>\n",
" <th>bmi</th>\n",
" <th>smoking_status</th>\n",
" <th>stroke</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9046</td>\n",
" <td>Male</td>\n",
" <td>67.0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>Yes</td>\n",
" <td>Private</td>\n",
" <td>Urban</td>\n",
" <td>228.69</td>\n",
" <td>36.6</td>\n",
" <td>formerly smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>51676</td>\n",
" <td>Female</td>\n",
" <td>61.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>Self-employed</td>\n",
" <td>Rural</td>\n",
" <td>202.21</td>\n",
" <td>NaN</td>\n",
" <td>never smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>31112</td>\n",
" <td>Male</td>\n",
" <td>80.0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>Yes</td>\n",
" <td>Private</td>\n",
" <td>Rural</td>\n",
" <td>105.92</td>\n",
" <td>32.5</td>\n",
" <td>never smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>60182</td>\n",
" <td>Female</td>\n",
" <td>49.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>Private</td>\n",
" <td>Urban</td>\n",
" <td>171.23</td>\n",
" <td>34.4</td>\n",
" <td>smokes</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1665</td>\n",
" <td>Female</td>\n",
" <td>79.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>Self-employed</td>\n",
" <td>Rural</td>\n",
" <td>174.12</td>\n",
" <td>24.0</td>\n",
" <td>never smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id gender age hypertension heart_disease ever_married \\\n",
"0 9046 Male 67.0 0 1 Yes \n",
"1 51676 Female 61.0 0 0 Yes \n",
"2 31112 Male 80.0 0 1 Yes \n",
"3 60182 Female 49.0 0 0 Yes \n",
"4 1665 Female 79.0 1 0 Yes \n",
"\n",
" work_type Residence_type avg_glucose_level bmi smoking_status \\\n",
"0 Private Urban 228.69 36.6 formerly smoked \n",
"1 Self-employed Rural 202.21 NaN never smoked \n",
"2 Private Rural 105.92 32.5 never smoked \n",
"3 Private Urban 171.23 34.4 smokes \n",
"4 Self-employed Rural 174.12 24.0 never smoked \n",
"\n",
" stroke \n",
"0 1 \n",
"1 1 \n",
"2 1 \n",
"3 1 \n",
"4 1 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(data.columns)\n",
"data.head()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7c8d2381",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(5110, 12)\n",
" id age hypertension heart_disease \\\n",
"count 5110.000000 5110.000000 5110.000000 5110.000000 \n",
"mean 36517.829354 43.226614 0.097456 0.054012 \n",
"std 21161.721625 22.612647 0.296607 0.226063 \n",
"min 67.000000 0.080000 0.000000 0.000000 \n",
"25% 17741.250000 25.000000 0.000000 0.000000 \n",
"50% 36932.000000 45.000000 0.000000 0.000000 \n",
"75% 54682.000000 61.000000 0.000000 0.000000 \n",
"max 72940.000000 82.000000 1.000000 1.000000 \n",
"\n",
" avg_glucose_level bmi stroke \n",
"count 5110.000000 4909.000000 5110.000000 \n",
"mean 106.147677 28.893237 0.048728 \n",
"std 45.283560 7.854067 0.215320 \n",
"min 55.120000 10.300000 0.000000 \n",
"25% 77.245000 23.500000 0.000000 \n",
"50% 91.885000 28.100000 0.000000 \n",
"75% 114.090000 33.100000 0.000000 \n",
"max 271.740000 97.600000 1.000000 \n"
]
}
],
"source": [
"# Print the shape and summary statistics of the data\n",
"# data = data.sample(frac=0.1, random_state = 48)\n",
"print(data.shape)\n",
"print(data.describe())"
]
},
{
"cell_type": "markdown",
"id": "401acdab",
"metadata": {},
"source": [
"#### Fill missing values"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "016ecaa0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 5110 entries, 0 to 5109\n",
"Data columns (total 12 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 id 5110 non-null int64 \n",
" 1 gender 5110 non-null object \n",
" 2 age 5110 non-null float64\n",
" 3 hypertension 5110 non-null int64 \n",
" 4 heart_disease 5110 non-null int64 \n",
" 5 ever_married 5110 non-null object \n",
" 6 work_type 5110 non-null object \n",
" 7 Residence_type 5110 non-null object \n",
" 8 avg_glucose_level 5110 non-null float64\n",
" 9 bmi 5110 non-null float64\n",
" 10 smoking_status 5110 non-null object \n",
" 11 stroke 5110 non-null int64 \n",
"dtypes: float64(3), int64(4), object(5)\n",
"memory usage: 479.2+ KB\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>hypertension</th>\n",
" <th>heart_disease</th>\n",
" <th>ever_married</th>\n",
" <th>work_type</th>\n",
" <th>Residence_type</th>\n",
" <th>avg_glucose_level</th>\n",
" <th>bmi</th>\n",
" <th>smoking_status</th>\n",
" <th>stroke</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9046</td>\n",
" <td>Male</td>\n",
" <td>67.0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>Yes</td>\n",
" <td>Private</td>\n",
" <td>Urban</td>\n",
" <td>228.69</td>\n",
" <td>36.600000</td>\n",
" <td>formerly smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>51676</td>\n",
" <td>Female</td>\n",
" <td>61.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>Self-employed</td>\n",
" <td>Rural</td>\n",
" <td>202.21</td>\n",
" <td>28.893237</td>\n",
" <td>never smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>31112</td>\n",
" <td>Male</td>\n",
" <td>80.0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>Yes</td>\n",
" <td>Private</td>\n",
" <td>Rural</td>\n",
" <td>105.92</td>\n",
" <td>32.500000</td>\n",
" <td>never smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>60182</td>\n",
" <td>Female</td>\n",
" <td>49.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>Private</td>\n",
" <td>Urban</td>\n",
" <td>171.23</td>\n",
" <td>34.400000</td>\n",
" <td>smokes</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1665</td>\n",
" <td>Female</td>\n",
" <td>79.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>Yes</td>\n",
" <td>Self-employed</td>\n",
" <td>Rural</td>\n",
" <td>174.12</td>\n",
" <td>24.000000</td>\n",
" <td>never smoked</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id gender age hypertension heart_disease ever_married \\\n",
"0 9046 Male 67.0 0 1 Yes \n",
"1 51676 Female 61.0 0 0 Yes \n",
"2 31112 Male 80.0 0 1 Yes \n",
"3 60182 Female 49.0 0 0 Yes \n",
"4 1665 Female 79.0 1 0 Yes \n",
"\n",
" work_type Residence_type avg_glucose_level bmi \\\n",
"0 Private Urban 228.69 36.600000 \n",
"1 Self-employed Rural 202.21 28.893237 \n",
"2 Private Rural 105.92 32.500000 \n",
"3 Private Urban 171.23 34.400000 \n",
"4 Self-employed Rural 174.12 24.000000 \n",
"\n",
" smoking_status stroke \n",
"0 formerly smoked 1 \n",
"1 never smoked 1 \n",
"2 never smoked 1 \n",
"3 smokes 1 \n",
"4 never smoked 1 "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
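],
"source": [
"# Work on a copy so the raw frame is left untouched.\n",
"df = data.copy()\n",
"# bmi is the only column with missing values (4909 of 5110 present); fill them with the column mean.\n",
"df['bmi'] = df['bmi'].fillna(df['bmi'].mean())\n",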
"df.info()\n",
"df.head()"
]
},
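{
"cell_type": "markdown",
"id": "e5a0c7f1",
"metadata": {},
"source": [
"A minimal sketch of an alternative to filling `bmi` with the mean of the whole column: learn the fill value from training rows only and reuse it on held-out rows, so that test-set statistics do not influence training. This is an illustration, not the method used in this notebook. The toy values, the split settings, the median strategy, and the names `toy`, `train_part`, `test_part`, and `bmi_imputer` are all assumptions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5a0c7f2",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Toy stand-in for the stroke data; the values are made up for illustration.\n",
"toy = pd.DataFrame({\n",
"    'bmi': [36.6, np.nan, 32.5, 34.4, 24.0, np.nan, 27.4, 22.8],\n",
"    'stroke': [1, 1, 1, 1, 0, 0, 0, 0],\n",
"})\n",
"\n",
"# Hypothetical split settings; stratify keeps both classes in each part.\n",
"train_part, test_part = train_test_split(\n",
"    toy, test_size=0.25, random_state=0, stratify=toy['stroke']\n",
")\n",
"train_part = train_part.copy()\n",
"test_part = test_part.copy()\n",
"\n",
"# Learn the fill value from the training rows only, then reuse it on the test rows.\n",
"bmi_imputer = SimpleImputer(strategy='median')\n",
"train_part[['bmi']] = bmi_imputer.fit_transform(train_part[['bmi']])\n",
"test_part[['bmi']] = bmi_imputer.transform(test_part[['bmi']])\n",
"\n",
"# Number of missing bmi values left in each part.\n",
"print(train_part['bmi'].isna().sum(), test_part['bmi'].isna().sum())"
]
},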
{
"cell_type": "markdown",
"id": "67f97afc",
"metadata": {},
"source": [
"### Encode categorical strings as integer codes"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "74074e6f",
"metadata": {},
"outputs": [],
"source": [
"# Map each categorical column to integer codes.\n",
"# The codes match the earlier string replacements, but are stored as real integers\n",
"# and applied per column rather than across the whole frame.\n",
"category_codes = {\n",
"    'gender': {'Male': 0, 'Female': 1, 'Other': 2},\n",
"    'ever_married': {'No': 0, 'Yes': 1},\n",
"    'work_type': {'Self-employed': 0, 'Private': 1, 'Govt_job': 2,\n",
"                  'children': 3, 'Never_worked': 4},\n",
"    'Residence_type': {'Rural': 0, 'Urban': 1},\n",
"    'smoking_status': {'smokes': 0, 'never smoked': 1,\n",
"                       'formerly smoked': 2, 'Unknown': 3},\n",
"}\n",
"\n",
"data = df.copy()\n",
"for column, codes in category_codes.items():\n",
"    data[column] = data[column].map(codes)"
]
},
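{
"cell_type": "markdown",
"id": "f3b8d2a1",
"metadata": {},
"source": [
"The integer codes above are arbitrary labels: the ordering they imply (for example `Private` = 1 below `Govt_job` = 2) carries no meaning. Tree-based models can usually cope with this, but a minimal sketch of one-hot encoding is given here as an alternative that avoids the implied order. It is an illustration only and is not used later in this notebook. The toy frame and the names `toy` and `encoded` are assumptions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f3b8d2a2",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy frame using category names that appear in the dataset.\n",
"toy = pd.DataFrame({\n",
"    'work_type': ['Private', 'Self-employed', 'Govt_job', 'children', 'Never_worked'],\n",
"    'smoking_status': ['smokes', 'never smoked', 'formerly smoked', 'Unknown', 'never smoked'],\n",
"})\n",
"\n",
"# One 0/1 indicator column per category, so no ordering is implied.\n",
"encoded = pd.get_dummies(toy, columns=['work_type', 'smoking_status'], dtype=int)\n",
"print(encoded.columns.tolist())"
]
},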
{
"cell_type": "markdown",
"id": "791343e1",
"metadata": {},
"source": [
"##### Updated dataset"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d21cf5f8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>hypertension</th>\n",
" <th>heart_disease</th>\n",
" <th>ever_married</th>\n",
" <th>work_type</th>\n",
" <th>Residence_type</th>\n",
" <th>avg_glucose_level</th>\n",
" <th>bmi</th>\n",
" <th>smoking_status</th>\n",
" <th>stroke</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9046</td>\n",
" <td>0</td>\n",
" <td>67.0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>228.69</td>\n",
" <td>36.600000</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>51676</td>\n",
" <td>1</td>\n",
" <td>61.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>202.21</td>\n",
" <td>28.893237</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>31112</td>\n",
" <td>0</td>\n",
" <td>80.0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>105.92</td>\n",
" <td>32.500000</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>60182</td>\n",
" <td>1</td>\n",
" <td>49.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>171.23</td>\n",
" <td>34.400000</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1665</td>\n",
" <td>1</td>\n",
" <td>79.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>174.12</td>\n",
" <td>24.000000</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5105</th>\n",
" <td>18234</td>\n",
" <td>1</td>\n",
" <td>80.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>83.75</td>\n",
" <td>28.893237</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5106</th>\n",
" <td>44873</td>\n",
" <td>1</td>\n",
" <td>81.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>125.20</td>\n",
" <td>40.000000</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5107</th>\n",
" <td>19723</td>\n",
" <td>1</td>\n",
" <td>35.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>82.99</td>\n",
" <td>30.600000</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5108</th>\n",
" <td>37544</td>\n",
" <td>0</td>\n",
" <td>51.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>166.29</td>\n",
" <td>25.600000</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5109</th>\n",
" <td>44679</td>\n",
" <td>1</td>\n",
" <td>44.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>85.28</td>\n",
" <td>26.200000</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5110 rows × 12 columns</p>\n",
"</div>"
],
"text/plain": [
" id gender age hypertension heart_disease ever_married work_type \\\n",
"0 9046 0 67.0 0 1 1 1 \n",
"1 51676 1 61.0 0 0 1 0 \n",
"2 31112 0 80.0 0 1 1 1 \n",
"3 60182 1 49.0 0 0 1 1 \n",
"4 1665 1 79.0 1 0 1 0 \n",
"... ... ... ... ... ... ... ... \n",
"5105 18234 1 80.0 1 0 1 1 \n",
"5106 44873 1 81.0 0 0 1 0 \n",
"5107 19723 1 35.0 0 0 1 0 \n",
"5108 37544 0 51.0 0 0 1 1 \n",
"5109 44679 1 44.0 0 0 1 2 \n",
"\n",
" Residence_type avg_glucose_level bmi smoking_status stroke \n",
"0 1 228.69 36.600000 2 1 \n",
"1 0 202.21 28.893237 1 1 \n",
"2 0 105.92 32.500000 1 1 \n",
"3 1 171.23 34.400000 0 1 \n",
"4 0 174.12 24.000000 1 1 \n",
"... ... ... ... ... ... \n",
"5105 1 83.75 28.893237 1 0 \n",
"5106 1 125.20 40.000000 1 0 \n",
"5107 0 82.99 30.600000 1 0 \n",
"5108 0 166.29 25.600000 2 0 \n",
"5109 1 85.28 26.200000 3 0 \n",
"\n",
"[5110 rows x 12 columns]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data"
]
},
{
"cell_type": "markdown",
"id": "988e3c0e",
"metadata": {},
"source": [
"### Checking the distribution of the target variable\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "dd942b3b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.051224027977782347\n",
"Stroke Cases: 249\n",
"Not_Stroke: 4861\n"
]
}
],
"source": [
"# Count stroke and non-stroke cases in the dataset\n",
"\n",
"Stroke = data[data['stroke'] == 1]\n",
"Not_Stroke = data[data['stroke'] == 0]\n",
"\n",
"# Ratio of stroke rows to non-stroke rows (not the share of the whole dataset)\n",
"outlier_fraction = len(Stroke) / len(Not_Stroke)\n",
"print(outlier_fraction)\n",
"\n",
"print('Stroke Cases: {}'.format(len(data[data['stroke'] == 1])))\n",
"print('Not_Stroke: {}'.format(len(data[data['stroke'] == 0])))"
]
},
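{
"cell_type": "markdown",
"id": "a9c4e6b1",
"metadata": {},
"source": [
"The value 0.0512 printed above is the ratio of stroke rows to non-stroke rows. The short calculation below, using the two counts printed above, also shows the share of stroke rows in the whole dataset and the accuracy reached by always predicting `0` (no stroke). That majority-class figure may be a useful reference point when reading accuracy numbers on data this imbalanced. The variable names are illustrative and not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9c4e6b2",
"metadata": {},
"outputs": [],
"source": [
"# Counts taken from the output of the previous cell.\n",
"stroke_cases = 249\n",
"non_stroke_cases = 4861\n",
"total = stroke_cases + non_stroke_cases\n",
"\n",
"ratio_to_negatives = stroke_cases / non_stroke_cases  # the value printed above\n",
"share_of_dataset = stroke_cases / total  # equals the mean of the stroke column\n",
"majority_baseline = non_stroke_cases / total  # accuracy of always predicting 0\n",
"\n",
"print(round(ratio_to_negatives, 4))  # 0.0512\n",
"print(round(share_of_dataset, 4))  # 0.0487\n",
"print(round(majority_baseline, 4))  # 0.9513"
]
},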
{
"cell_type": "markdown",
"id": "6e5ac5b6",
"metadata": {},
"source": [
"### Visualizing the dataset"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "cf5c2e23",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAqoAAAJdCAYAAAD6ElXLAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAA6lElEQVR4nO3dd7hlZXn38e+PAUTAhhqDdA1KUOnSxIYNKxYiWGLEMkFFLK8mJsYSTDMkGgs6jgTQWDAq6pgQwQoqiDNDGYpCeEFlxDcRKSIqOHPu94+9jm4Op8Gcs9da53w/Xvs6e6317GffezvM3OdeT0lVIUmSJHXNRm0HIEmSJE3GRFWSJEmdZKIqSZKkTjJRlSRJUieZqEqSJKmTTFQlSZLUSSaqkiRJ2mBJDklyWZIrkrxpkuv3SPLFJBcmuSTJkTP26TqqkiRJ2hBJlgCXA08A1gIrgedV1aVDbf4SuEdV/XmS+wKXAb9fVbdO1a8VVUmSJG2ofYErqurKJvE8BTh0QpsC7pYkwJbAdcC66To1UZUkSdKG2ga4euh4bXNu2PuBPwSuAS4CXlNVY9N1uvFcRqj58Ztrr3R8xgSb3/+RbYegHvE/IN0RaTuAjhoUwTTRrbes7cQXM9+5wqb3feCfAkuHTi2vquVDx5N9DxNjehJwAXAw8EDgy0m+WVU/n+p9TVQlSZI0rSYpXT5Nk7XAdkPH2zKonA47EviHGkyQuiLJVcAuwHen6tREVZIkqe/G1rcdwUpg5yQ7AT8GjgCeP6HNj4DHAd9Mcj/gwcCV03VqoipJkqQNUlXrkhwNnA4sAU6sqkuSHNVcXwa8Azg5yUUMhgr8eVVdO12/Lk/VA45RvT3HqOqO8D8g3RGdGHDYQY5RnVxnxqj+z2Xz+lfdJvd7cCuf01n/kiRJ6iRv/UuSJPXd2LSrPPWWFVVJkiR1khVVSZKknpth3fzesqIqSZKkTrKiKkmS1HeOUZUkSZJGx4qqJElS3zlGVZIkSRodK6qSJEl9N7a+7QjmhRVVSZIkdZIVVUmSpL5zjKokSZI0OlZUJUmS+m6BrqNqoipJktRzbqEqSZIkjZAVVUmSpL5boLf+rahKkiSpk6yoSpIk9Z1jVCVJkqTRsaIqSZLUd26hKkmSJI2OFVVJkqS+c4yqJEmSNDpWVCVJkvrOdVQlSZKk0TFRHZEkZ09x/uQkh406HkmStIDU2Pw+WmKiOiJVdWDbMUiSJPWJY1RHJMkvqmrLJAHeBxwMXAWk3cgkSVLvOUZVc+RZwIOBhwEvB6y0SpIkTcJEdfQeBXyyqtZX1TXA1yZrlGRpklVJVp3w0U+ONkJJktQrVevn9dEWb/23o2ZsULUcWA7wm2uvnLG9JEnSQmNFdfTOAo5IsiTJ1sBj2w5IkiT13AKd9W9FdfQ+x2Ai1UXA5cCZ7YYjSZLUTSaqI1JVWzY/Czi65XAkSdJCskBn/ZuoSpIk9V2Lt+fnk2NUJUmS1ElWVCVJkvpurL0lpOaTFVVJkiR1khVVSZKkvnOMqiRJkjQ6VlQlSZL6boEuT2VFVZIkSZ1kRVWSJKnvHKMqSZIkjY4VVUmSpL5zjKokSZI0OlZUJUmS+s6KqiRJkjQ6VlQlSZJ6rmp92yHMCyuqkiRJ6iQrqpIkSX3nGFVJkiRpckkOSXJZkiuSvGmS629MckHzuDjJ+iRbTdenFVVJkqS+a3lnqiRLgOOBJwBrgZVJVlTVpeNtquo44Lim/dOB11XVddP1a0VVkiRJG2pf4IqqurKqbgVOAQ6dpv3zgE/O1KkVVUmSpL5rf4zqNsDVQ8drgf0ma5hkc+AQ4OiZOjVRlSRJ6rt5vvWfZCmwdOjU8qpaPtxksqim6O7pwLdnuu0PJqqSJEmaQZOULp+myVpgu6HjbYFrpmh7BLO47Q8mqpIkSf3X/q3/lcDOSXYCfswgGX3+xEZJ7gE8GnjhbDo1UZUkSdIGqap1SY4GTgeWACdW1S
VJjmquL2uaPgs4o6punk2/JqqSJEl91/LyVABVdRpw2oRzyyYcnwycPNs+XZ5KkiRJnWRFVZIkqe/aH6M6L6yoSpIkqZOsqPbA5vd/ZNshdM4vr/lm2yF00rv3fmvbIXTSLVMu5be43ZSFWYHZULvfuqTtEDrp4k3989JpVlQlSZKk0bGiKkmS1HcdmPU/H6yoSpIkqZOsqEqSJPWdY1QlSZKk0bGiKkmS1HeOUZUkSZJGx4qqJElS3zlGVZIkSRodK6qSJEl95xhVSZIkaXSsqEqSJPXdAh2jaqIqSZLUdws0UfXWvyRJkjrJiqokSVLfVbUdwbywoipJkqROsqIqSZLUd45RlSRJkkbHiqokSVLfWVGVJEmSRseKqiRJUt+5haokSZI0OlZUJUmS+s4xqpIkSdLoWFGVJEnqO3emkiRJkkbHiqokSVLfOUZVkiRJGh0rqpIkSX1nRVWSJEkaHSuqkiRJfefOVJIkSdLoWFGVJEnquRpzHVVNIcnnk6xOckmSpc25lya5PMk3knw4yfub8/dN8tkkK5vHI9qNXpIkqZusqM6Nl1TVdUnuCqxM8p/AW4C9gJuArwEXNm3fA7y7qr6VZHvgdOAP2whakiQtEAt01r+J6tw4JsmzmufbAX8MnFlV1wEk+TTwoOb644Fdk4y/9u5J7lZVNw132FRmlwJstOQebLTRFvP8ESRJUm8t0MlUJqobKMljGCSfB1TVL5N8A7iMqaukGzVtfzVdv1W1HFgOsMmm2yzMgSeSJEnTcIzqhrsHcH2TpO4C7A9sDjw6yb2SbAw8Z6j9GcDR4wdJ9hhlsJIkaQEaq/l9tMREdcN9Cdg4yRrgHcB3gB8DfwecC3wFuBS4sWl/DLBPkjVJLgWOGn3IkiRJ3eet/w1UVbcAT554PsmqqlreVFQ/x6CSSlVdCxw+2iglSdKCtkAnU1lRnT9vT3IBcDFwFfD5VqORJEnqGSuq86Sq3tB2DJIkaZGwoipJkiSNjhVVSZKkvquFuZKlFVVJkiR1khVVSZKkvnOMqiRJkjQ6VlQlSZL6rsXdo+aTFVVJkiRtsCSHJLksyRVJ3jRFm8ckuSDJJUnOnKlPK6qSJEl9V+2OUU2yBDgeeAKwFliZZEVVXTrU5p7AB4BDqupHSX5vpn6tqEqSJGlD7QtcUVVXVtWtwCnAoRPaPB84tap+BFBV/ztTpyaqkiRJfTdW8/uY2TbA1UPHa5tzwx4E3CvJN5KsTvKimTr11r8kSZKmlWQpsHTo1PKqWj7cZJKXTcxwNwb2Bh4H3BU4J8l3quryqd7XRFWSJKnnap7XUW2S0uXTNFkLbDd0vC1wzSRtrq2qm4Gbk5wF7A5Mmah661+SJEkbaiWwc5KdkmwKHAGsmNDmC8Ajk2ycZHNgP+B703VqRVWSJKnvWl5HtarWJTkaOB1YApxYVZckOaq5vqyqvpfkS8AaYAw4oaounq5fE1VJkqS+a3l5KoCqOg04bcK5ZROOjwOOm22f3vqXJElSJ1lRlSRJ6ju3UJUkSZJGx4qqJElS383z8lRtsaIqSZKkTrKiKkmS1HeOUZUkSZJGx4qqJElS33VgHdX5YEVVkiRJnWRFVZIkqe8coypJkiSNjhVVSZKknqsFuo6qiap66d17v7XtEDrpdauPbTuETjpwtxe3HUIn7bjJvdoOoZO+t/H6tkPopBt/86u2Q9AiZKIqSZLUd45RlSRJkkbHiqokSVLfWVGVJEmSRseKqiRJUt+5M5UkSZI0OlZUJUmS+m6BjlE1UZUkSeq5WqCJqrf+JUmS1ElWVCVJkvrOiqokSZI0OlZUJUmS+m7M5akkSZKkkbGiKkmS1HeOUZUkSZJGx4qqJElS31lRlSRJkkbHiqokSVLPVVlRlSRJkkbGiqokSVLfOUZVkiRJGh0rqpIkSX1nRVWSJEkaHSuqkiRJPVdWVCVJkqTRsaIqSZLUd1ZUJUmSpNGxoipJktR3Y20HMD9MVCVJknrOyV
SSJEnSCFlRlSRJ6jsrqvMnyY5JLh7h++2R5Cnz0O+xSR4/1/1KkiQtRouuoppkY2APYB/gtLnsu6reOpf9SZIkzcoCnUzViYpqY0mSDye5JMkZSR6S5Lzxi0l2TrK6ef6DJO9M8t3m8QfN+fsm+WySlc3jEc35tydZnuQM4KPAscDhSS5IcniSLZKc2Lzm/CSHNq97cZJTk3wpyX8n+cfm/JIkJye5OMlFSV7XnD85yWHN88c1fV3U9H2Xodj/Osl5zbVdRvYNS5Ik9UiXKqo7A8+rqpcn+XdgT+DGJHtU1QXAkcDJQ+1/XlX7JnkR8C/A04D3AO+uqm8l2R44HfjDpv3ewEFV9askLwb2qaqjAZL8HfC1qnpJknsC303yleZ1ezSx3AJcluR9wO8B21TVQ5vX33P4gyTZrIn1cVV1eZKPAq9o4gS4tqr2SvJK4A3Ay+70tyZJkhY9Z/3Pv6uahBRgNbAjcAJwZJIlwOHAJ4baf3Lo5wHN88cD709yAbACuHuSuzXXVlTVr6Z47ycCb2pe9w1gM2D75tpXq+rGqvo1cCmwA3Al8IAk70tyCPDzCf09uPk8lzfHHwEeNXT91Amf83aSLE2yKsmqsbGbpwhbkiRp4epSRfWWoefrgbsCnwXeBnwNWF1VPxtqU5M83wg4YGJCmgRgumwvwHOq6rIJr9tvkrg2rqrrk+wOPAl4FfBc4CUT+pvOeJ/rmeL/g6paDiwH2GTTbRbmr0mSJGluOEZ19Joq5unAB4GTJlw+fOjnOc3zM4Cjxxsk2WOKrm8C7jZ0fDrw6jQZbZI9p4sryX2Ajarqs8BbgL0mNPk+sOP42Fngj4Ezp+tTkiRJt9WliupUPg48m0ESOuwuSc5lkGw/rzl3DHB8kjUMPttZwFGT9Pl1fner/++BdzAYP7qmSVZ/wGDM61S2AU5KMp7o/8Xwxar6dZIjgU83qwysBJbN+EklSZLuhIU6RjVV3f5gSd4A3KOq3jJ07gcMJkNd21pgI+St/9v7260f23YInfS61ce2HUInHbjbi9sOoZN23ORebYfQSb+u9W2H0Ek3rp9qmsfidtaPvzrTcL+RuO5Zj57XXGGrz5054+ds5u28B1gCnFBV/zDh+mOALwBXNadOrapp/+HqdEU1yeeABwIHtx2LJElSZ7U8RrWZ+H488ARgLbAyyYqqunRC029W1XR3rW+j04lqVT1rivM7jjgUSZIkTW1f4IqquhIgySnAoQxWTLrTOj2ZSpIkSTOrsfl9zMI2wNVDx2ubcxMdkOTCJP+V5CEzdWqiKkmSpGkNr+/ePJZObDLJyyaOmz0P2KGqdgfeB3x+pvft9K1/SZIkzcI8j1EdXt99CmuB7YaOtwWumdDHz4een5bkA0nuM93keCuqkiRJ2lArgZ2T7JRkU+AIBruE/laS3x9as35fBnnoz27X0xArqpIkST03y3Gk8/f+VeuSHM1gE6UlwIlVdUmSo5rry4DDgFckWQf8CjiiZlgn1URVkiRJG6yqTgNOm3Bu2dDz9wPvvyN9mqhKkiT1XcsV1fniGFVJkiR1khVVSZKknmt7jOp8MVGVJEnquYWaqHrrX5IkSZ1kRVWSJKnnrKhKkiRJI2RFVZIkqe8qbUcwL6yoSpIkqZOsqEqSJPWcY1QlSZKkEbKiKkmS1HM15hhVSZIkaWSsqEqSJPWcY1QlSZKkEbKiKkmS1HPlOqqSJEnS6FhRlSRJ6jnHqEqSJEkjZEVVkiSp51xHVZIkSRohK6o9UG0H0EG3+K1M6sDdXtx2CJ109pqT2w6hkw7b65i2Q+ikc264vO0QOulhd9+h7RA0jVqg/yxaUZUkSVInWVGVJEnquYU6RtVEVZIkqecWaqLqrX9JkiR1khVVSZKknnMylSRJkjRCVlQlSZJ6zjGqkiRJ0ghZUZUkSeq5KiuqkiRJ0shYUZUkSeq5Gms7gvlhRVWSJEmdZEVVkiSp58YcoypJkiSNjhVVSZKknnPWvy
RJkjRCVlQlSZJ6zp2pJEmSpBGyoipJktRzVW1HMD+sqEqSJKmTrKhKkiT1nGNUJUmSpBGyoipJktRzC3VnKhNVSZKknnPBf0mSJGmErKhKkiT1nMtTSZIkSSNkRVWSJKnnFupkKiuqkiRJ6iQrqpIkST3nrH9JkiRpCkkOSXJZkiuSvGmadg9Psj7JYTP1aUVVkiSp59qe9Z9kCXA88ARgLbAyyYqqunSSdu8ETp9Nv61UVJPsmOTieeh3jyRPuYOv+UGS+zTPz57rmCRJkhaBfYErqurKqroVOAU4dJJ2rwY+C/zvbDpdMLf+k2wM7AHcoUR1WFUdOGcBSZIkjchYZV4fs7ANcPXQ8drm3G8l2QZ4FrBstp+rzUR1SZIPJ7kkyRlJ7prkgUm+lGR1km8m2QUgydOTnJvk/CRfSXK/5vzbkyxPcgbwUeBY4PAkFyQ5fLI3TXLv5v3OT/IhIEPXftH83DrJWU0/Fyd5ZHP+iUnOSXJekk8n2bI5/9YkK5u2y5OkOX9MkkuTrElySnNuiyQnNu3PTzLZbxskWZpkVZJVY2M3z803LkmSdCcM5yXNY+nEJpO8bOKAhH8B/ryq1s/2fdtMVHcGjq+qhwA3AM8BlgOvrqq9gTcAH2jafgvYv6r2ZFBK/rOhfvYGDq2q5wNvBT5VVXtU1aemeN+3Ad9q+loBbD9Jm+cDp1fVHsDuwAXN8IC/Ah5fVXsBq4DXN+3fX1UPr6qHAncFntacfxOwZ1XtBhzVnHsz8LWqejjwWOC4JFtMDKCqllfVPlW1z0Yb3e6yJEnSb1Vlnh+/y0uax/IJIawFths63ha4ZkKbfYBTkvwAOAz4QJJnTve52pxMdVVVXdA8Xw3sCBwIfLopSALcpfm5LfCpJFsDmwJXDfWzoqp+dQfe91HAswGq6j+TXD9Jm5XAiUk2AT5fVRckeTSwK/DtJr5NgXOa9o9N8mfA5sBWwCXAF4E1wMeTfB74fNP2icAzkryhOd6MQbL8vTvwGSRJkrpkJbBzkp2AHwNHMCj8/VZV7TT+PMnJwH9U1een67TNRPWWoefrgfsBNzRVzIneB7yrqlYkeQzw9qFrd+a++LRz46rqrCSPAp4K/FuS44DrgS9X1fOG2ybZjEHld5+qujrJ2xkknzSvfxTwDOAtSR7CoDT+nKq67E7ELUmSdDtt70xVVeuSHM1gNv8S4MSquiTJUc31WY9LHdalyVQ/B65K8kcAGdi9uXYPBtk5wJ9M08dNwN1meJ+zgBc07/Fk4F4TGyTZAfjfqvow8K/AXsB3gEck+YOmzeZJHsTvktJrmzGrhzXXNwK2q6qvMxiqcE9gSwb/B756aBzrnjPEK0mS1HlVdVpVPaiqHlhVf9ucWzZZklpVL66qz8zUZ5cSVRgkkC9NciGD2+fjE43ezmBIwDeBa6d5/deBXaebTAX8NfCoJOcxuA3/o0naPIbBuNTzGYydfU9V/RR4MfDJJGsYJK67VNUNwIeBixjc3l/Z9LEE+FiSi4DzgXc3bd8BbAKsaZboesc0n0eSJGlGNc+PtqTaXiFWM9p40238P2mCt239mLZD6KQVv7l65kaL0NlrTm47hE46bK9j2g6hk759w+Vth9BJD7v7Dm2H0ElfX/vlTuxd+p37P3tec4X9rzm1lc/pzlSSJEk91/YY1fmyYBPVJEcCr5lw+ttV9ao24pEkSdIds2AT1ao6CTip7TgkSZLmW1lRlSRJUheNtR3APOnarH9JkiQJsKIqSZLUe8XCvPVvRVWSJEmdZEVVkiSp58YW6IrrVlQlSZLUSVZUJUmSem7MMaqSJEnS6FhRlSRJ6jln/UuSJEkjZEVVkiSp59yZSpIkSRohK6qSJEk95xhVSZIkaYSsqEqSJPWcY1QlSZKkEbKiKkmS1HNWVCVJkqQRsqIqSZLUc876lyRJkkbIiqokSVLPjS3MgqoVVUmSJHWTFVVJkqSeG1ugY1RNVCVJknqu2g5gnnjrX5IkSZ
1kRVW9dFMW6tLGG2bHTe7VdgiddNhex7QdQid95rz3th1CJx2+92vbDqGTrlv/y7ZD0DQW6r+KVlQlSZLUSVZUJUmSem4sC3MylRVVSZIkdZIVVUmSpJ5z1r8kSZI0QlZUJUmSes5Z/5IkSdIIWVGVJEnqubGFOenfiqokSZK6yYqqJElSz42xMEuqVlQlSZLUSVZUJUmSes51VCVJkqQRsqIqSZLUc876lyRJkkbIiqokSVLPuTOVJEmSNEJWVCVJknpuoc76N1GVJEnqOSdTSZIkSSNkRVWSJKnnnEwlSZIkTSHJIUkuS3JFkjdNcv3QJGuSXJBkVZKDZurTiqokSVLPtV1RTbIEOB54ArAWWJlkRVVdOtTsq8CKqqokuwH/DuwyXb9WVCVJkrSh9gWuqKorq+pW4BTg0OEGVfWLqhpfoGALZrFYgRVVSZKknqv2Z/1vA1w9dLwW2G9ioyTPAv4e+D3gqTN1akVVkiRJ00qytBlXOv5YOrHJJC+7XcW0qj5XVbsAzwTeMdP7WlGVJEnqufkeo1pVy4Hl0zRZC2w3dLwtcM00/Z2V5IFJ7lNV107VzoqqJEmSNtRKYOckOyXZFDgCWDHcIMkfJEnzfC9gU+Bn03VqRVWSJKnn2p71X1XrkhwNnA4sAU6sqkuSHNVcXwY8B3hRkt8AvwIOH5pcNSkTVUmSJG2wqjoNOG3CuWVDz98JvPOO9GmiKkmS1HMzrvPUU45RlSRJUid1MlFN8pgk/9F2HMOS7Jjk4q73KUmSFp+xzO+jLZ1MVCVJkqRZJapJPp9kdZJLmgVfX5HkH4euvzjJ+5rnb0ny/SRfTvLJJG+Ypt+HJ1mT5Jwkx01WXUzy9uE+klycZMfm+Yua11+Y5N+aczsk+Wpz/qtJtm/O/1Hz2guTnNWcW9K878qm/Z/O8vuY9HVJPpXkKUPtTk7ynDvzPsML646N3TybsCRJ0iI1Ns+Ptsy2ovqSqtob2Ac4BjgVePbQ9cOBTyXZh8HSA3s21/eZod+TgKOq6gBg/R0JPMlDgDcDB1fV7sBrmkvvBz5aVbsBHwfe25x/K/Ckpu0zmnMvBW6sqocDDwdenmSnWbz9VK87hcF3QbOG2OMYzH67w+9TVcurap+q2mejjbaYRUiSJEkLy2wT1WOSXAh8h8GuAzsBVybZP8m9gQcD3wYOAr5QVb+qqpuAL07VYZJ7AnerqrObU5+4g7EfDHxmfDeDqrquOX/AUF//1sREE9/JSV7OYH0vgCcyWM/rAuBc4N7AzrN476le91/AwUnuAjwZOKuqfrUB7yNJkjSjhVpRnXF5qiSPAR4PHFBVv0zyDWAz4FPAc4HvA5+rqhrfbWCWZtt2HbdNqDcbev1sVmMogKo6Ksl+wFOBC5Ls0fTx6qo6fZaxjJvydc338yQGldVPTtd+fAiDJEmSbm82FdV7ANc3SeouwP7N+VOBZwLPY5C0AnwLeHqSzZJsySApnFRVXQ/clGS8vyOmaPoDYC/47XZb47fMvwo8t6nokmSr5vzZQ329oImJJA+sqnOr6q3AtQwqw6cDr0iySdPmQUlmc599utedAhwJPLJpN1N7SZKkDVLz/GjLbBb8/xJwVJI1wGUMbv9TVdcnuRTYtaq+25xbmWQFcCHwQ2AVcOM0fb8U+HCSm4FvTNH2s/zutvlK4PLmvS5J8rfAmUnWA+cDL2YwhvbEJG8EfsogaQQ4LsnODKqbX21iXAPsCJzXVIN/yiD5nskJ07zuDOCjwIqqunUW7SVJkjZIm0tIzafMsMXqHe8w2bKqfpFkc+AsYGlVnTdd2+b5m4Ctq+o1k7VdzDbedJuFuuHEnfb6+z+q7RA66ar6ZdshdNKtdYfmai4anznvvTM3WoQO3/u1bYfQSdet9++XyXxj7Vc6kSL+4w4vnNdc4c9++LFWPud8bKG6PMmuDMaSfmSqJLXx1CR/0cTxQwYVUUmSJN0BbU54mk9znqhW1fMnnktyPPCICaffU1
Un8bvxrZ2Q5GEMVgsYdktV7ddGPJIkSYvVfFRUb6eqXjWK95kLVXURsEfbcUiSJM3WQh0j6BaqkiRJ6qSRVFQlSZI0f8YWaE3ViqokSZI6yYqqJElSzy3UWf9WVCVJktRJVlQlSZJ6bmGOULWiKkmSpI6yoipJktRzjlGVJEmSRsiKqiRJUs+Npe0I5ocVVUmSJHWSFVVJkqSec2cqSZIkaYSsqEqSJPXcwqynWlGVJElSR1lRlSRJ6rmFuo6qiaokSVLPOZlKkiRJGiErqpIkST23MOupVlQlSZLUUVZUJUmSem6hTqayoipJkqROsqIqSZLUc876lyRJkkbIiqokSVLPLcx6qolqL6TtADpo91uXtB1CJ31v4/Vth9BJ59xwedshdNLhe7+27RA66VOr/6XtEDpp6wcc0nYIWoRMVCVJknrOWf+SJEnSCFlRlSRJ6rlaoKNUrahKkiSpk6yoSpIk9ZxjVCVJkqQRsqIqSZLUc+5MJUmSJI2QFVVJkqSeW5j1VCuqkiRJ6igrqpIkST3nGFVJkiRphKyoSpIk9ZzrqEqSJKmTap7/NxtJDklyWZIrkrxpkusvSLKmeZydZPeZ+jRRlSRJ0gZJsgQ4HngysCvwvCS7Tmh2FfDoqtoNeAewfKZ+vfUvSZLUcx249b8vcEVVXQmQ5BTgUODS8QZVdfZQ++8A287UqRVVSZIkTSvJ0iSrhh5LJzTZBrh66Hhtc24qLwX+a6b3taIqSZLUc7MdR3qn+69azvS36jPZyyZtmDyWQaJ60Ezva6IqSZKkDbUW2G7oeFvgmomNkuwGnAA8uap+NlOnJqqSJEk914ExqiuBnZPsBPwYOAJ4/nCDJNsDpwJ/XFWXz6ZTE1VJkiRtkKpal+Ro4HRgCXBiVV2S5Kjm+jLgrcC9gQ8kAVhXVftM16+JqiRJUs+NVftbqFbVacBpE84tG3r+MuBld6RPZ/1LkiSpk6yoSpIk9Vz79dT5YUVVkiRJnWRFVZIkqefGFmhN1YqqJEmSOsmKqiRJUs/N985UbbGiKkmSpE6yoipJktRzHdiZal5YUZUkSVInmahuoCQ7Jrn4Tr72/kk+M9cxSZKkxWWMmtdHW7z136KqugY4rO04JEmSusiK6tzYOMlHkqxJ8pkkmyf5QZK/S3JOklVJ9kpyepL/m+Qo2LBqrCRJ0ria5/+1xUR1bjwYWF5VuwE/B17ZnL+6qg4AvgmczKB6uj9w7EwdJlnaJLirxsZunp+oJUnSgjA2z4+2mKjOjaur6tvN848BBzXPVzQ/LwLOraqbquqnwK+T3HO6DqtqeVXtU1X7bLTRFvMStCRJUpc5RnVuTKyJjx/f0vwcG3o+fux3L0mS5kSVC/5ratsnOaB5/jzgW20GI0mStBCYqM6N7wF/kmQNsBXwwZbjkSRJi4jLU2lSVfUDYNdJLu041OZkBpOpxo/Hr10LPHS+YpMkSeozE1VJkqSecwtVSZIkaYSsqEqSJPVcm4vyzycrqpIkSeokK6qSJEk91+bM/PlkRVWSJEmdZEVVkiSp59yZSpIkSRohK6qSJEk95zqqkiRJ0ghZUZUkSeo511GVJEmSRsiKqiRJUs+5jqokSZI0QlZUJUmSem6hrqNqoipJktRz3vqXJEmSRsiKqiRJUs+5PJUkSZI0QlZUJUmSem5sgU6msqIqSZKkTrKiKkmS1HMLs55qRVWSJEkdZUVVkiSp51xHVZIkSRohK6qSJEk9Z0VVkiRJGiErqpIkST1XrqMqSZIkjY4V1R5I0nYInXPxpmNth9BJN/7mV22H0EkPu/sObYfQSdet/2XbIXTS1g84pO0QOuknV36p7RA0DceoSpIkSSNkRVWSJKnnyoqqJEmSNDpWVCVJknrOWf+SJEnSFJIckuSyJFckedMk13dJck6SW5K8YTZ9WlGVJEnqubZn/SdZAhwPPAFYC6xMsqKqLh1qdh1wDPDM2fZrRVWSJKnnqmpeH7OwL3BFVV1ZVb
cCpwCHTojxf6tqJfCb2X4uE1VJkiRNK8nSJKuGHksnNNkGuHroeG1zboN461+SJKnn5vvWf1UtB5ZP02Sy3Yk2OCgrqpIkSdpQa4Htho63Ba7Z0E6tqEqSJPVcBxb8XwnsnGQn4MfAEcDzN7RTE1VJkiRtkKpal+Ro4HRgCXBiVV2S5Kjm+rIkvw+sAu4OjCV5LbBrVf18qn5NVCVJknpurAML/lfVacBpE84tG3r+/xgMCZg1x6hKkiSpk6yoSpIk9VwHxqjOCyuqkiRJ6iQrqpIkST3XhTGq88GKqiRJkjrJiqokSVLPOUZVkiRJGiErqpIkST3nGFVJkiRphKyoSpIk9ZxjVCVJkqQRsqIqSZLUc45RlSRJkkbIiqokSVLPOUZVkiRJGiErqpIkST1XNdZ2CPPCRFWSJKnnxrz1v3gleW2Sze/ga3ZMcvF8xSRJkrTQmajOzmuBSRPVJEtGG4okSdJtVdW8PtpiojpBki2S/GeSC5NcnORtwP2Bryf5etPmF0mOTXIucECS1zdtL07y2kn6fECS85M8PMkDk3wpyeok30yyy2g/oSRJUj84RvX2DgGuqaqnAiS5B3Ak8NiqurZpswVwcVW9NcnezfX9gADnJjkTuL55/YOBU4Ajq+qCJF8Fjqqq/06yH/AB4OARfj5JkrTAOEZ18bgIeHySdyZ5ZFXdOEmb9cBnm+cHAZ+rqpur6hfAqcAjm2v3Bb4AvLBJUrcEDgQ+neQC4EPA1pMFkWRpklVJVo2tv3nOPpwkSVJfWFGdoKoub6qkTwH+PskZkzT7dVWtb55nmu5uBK4GHgFcwuAXgxuqao9ZxLEcWA6w6V22XZi/JkmSpDnR5jjS+WRFdYIk9wd+WVUfA/4J2Au4CbjbFC85C3hmks2TbAE8C/hmc+1W4JnAi5I8v6p+DlyV5I+a90qS3efv00iSJPWXFdXbexhwXJIx4DfAK4ADgP9K8pOqeuxw46o6L8nJwHebUydU1flJdmyu35zkacCXk9wMvAD4YJK/AjZhMH71whF8LkmStECNLdCKqonqBFV1OnD6hNOrgPcNtdlywmveBbxrwrkfAA9tnt8APHzo8iFzFrAkSdICZaIqSZLUc+Wsf0mSJGl0rKhKkiT1nLP+JUmSpBGyoipJktRz7kwlSZIkjZAVVUmSpJ5zjKokSZI0QlZUJUmSem6h7kxlRVWSJEmdZEVVkiSp5xbqGFUTVUmSpJ5zeSpJkiRphKyoSpIk9dxCvfVvRVWSJEmdZEVVkiSp51yeSpIkSRohK6qSJEk9V876lyRJkkbHiqokSVLPOUZVkiRJGiErqpIkST3nOqqSJEnSCFlRlSRJ6jln/UuSJEkjZEVVkiSp5xyjKkmSJE0hySFJLktyRZI3TXI9Sd7bXF+TZK+Z+rSiKkmS1HNtV1STLAGOB54ArAVWJllRVZcONXsysHPz2A/4YPNzSlZUJUmStKH2Ba6oqiur6lbgFODQCW0OBT5aA98B7plk6+k6NVGVJEnquZrnR5KlSVYNPZZOCGEb4Oqh47XNuTva5ja89d8Dt96yNm3HMC7J0qpa3nYcXeP3Mjm/l8n5vdye38nk/F4m5/dye+tu/fEocoXpvvPJ3n/ieITZtLkNK6q6oyb+BqUBv5fJ+b1Mzu/l9vxOJuf3Mjm/l+5ZC2w3dLwtcM2daHMbJqqSJEnaUCuBnZPslGRT4AhgxYQ2K4AXNbP/9wdurKqfTNept/4lSZK0QapqXZKjgdOBJcCJVXVJkqOa68uA04CnAFcAvwSOnKlfE1XdUY4Jmpzfy+T8Xibn93J7fieT83uZnN9LB1XVaQyS0eFzy4aeF/CqO9Jn2l53S5IkSZqMY1QlSZLUSSaqkiRJ6iQTVUmSJHWSiaq0AZJs0XYM6o8kd03y4LbjkKS+cNa/ppTk2dNdr6pTRxVL1yQ5EDgB2BLYPsnuwJ9W1Svbjax9SR4EvBHYgaG/Y6rq4NaC6oAkTwf+CdgU2C
nJHsCxVfWMVgNrSZKLmHxHmjCYHLzbiENqXZJ/r6rnTvLdLNrvZFjzd8sHgftV1UOT7AY8o6r+puXQNI+c9a8pJTmpefp7wIHA15rjxwLfqKppE9mFLMm5wGHAiqraszl3cVU9tN3I2pfkQmAZsBpYP36+qla3FlQHJFkNHMzgv53xPzNrFmvykWSH6a5X1Q9HFUtXJNm6qn4y1XezGL+TYUnOZPBL8If8e3fxsKKqKVXVkQBJ/gPYdXz3iCRbA8e3GVsXVNXVyW22LV4/VdtFZl1VfbDtIDpoXVXdOOHPzKI1nHQ1idnOVfWVJHdlkf7bNP537Ph3k+TuLNLvYgqbV9V3J/w3tK6tYDQajlHVbOw4YYuz/wEe1FYwHXF1c/u/kmya5A3A99oOqiO+mOSVSbZOstX4o+2gOuDiJM8HliTZOcn7gLPbDqptSV4OfAb4UHNqW+DzrQXUAUn+NMn/AGsY3JlYDaxqN6pOuDbJA2mGRSQ5DJh2+031n7f+NaMk7wd2Bj7J4C+II4ArqurVrQbWoiT3Ad4DPJ7B+LEzgNdU1c9aDawDklw1yemqqgeMPJgOSbI58GbgiQz+zJwOvKOqft1qYC1LcgGwL3Du0O3ci6rqYa0G1qIk/w0cUFXXth1LlyR5AIMdqQ4ErgeuAl6w2IdELHQmqpqVZmLVI5vDs6rqc23GI/VZkiXAFlX187ZjaVuSc6tqvyTnV9WeSTYGzlusY3cBknwJeHZV/bLtWLokyd5VtbpZbWWjqropydOr6ottx6b5Y6Iq3QlJ3jvJ6RuBVVX1hVHH0yVJNgFeATyqOfUNBpMfftNaUB2Q5BPAUQzGMq8G7gG8q6qOazWwliX5R+AG4EXAq4FXApdW1ZvbjKtNSfYETgLOBW4ZP19Vx7QWVAckOQ/4k6q6qDk+AnhdVe3XbmSaTyaqmlKSb1XVQUluYvKlUu7eUmitS7Ic2AX4dHPqOcAlwHbAlVX12pZCa12SE4BNgI80p/4YWF9VL2svqvYluaCq9kjyAmBv4M+B1Yu5cgiQZCPgpdx2SMQJtYj/cUryXeBbwEXA2Pj5qvrIlC9aBJpb/58BXgAcxOCXm6dV1Y2tBqZ5ZaIq3QlJvgY8sarWNccbMxin+gTgoqratc342pTkwqrafaZzi02SS4A9gE8A76+qM/1eIMmzgNOq6pYZGy8SSc6uqgPbjqOLmrVUPw9cDTyzqn7VbkSab876l+6cbYDhXam2AO5fVesZulW3SK1vZuYCv62CuHTXYFb7Dxj8WTmrWZJp0Y9RBZ4BXJ7k35I8tfmlb7H7epKlrpwxkOSiJGuSrGFQUd0K2BE4tzmnBcyKqnQnJHkp8FcMxl+GwXjMv2OwMsLbq+qN7UXXriSPYzC+7koG380OwJFV9fVWA+ugJBuPV+UXs2Zc85OBwxnc0v3yYh4q0qyccbt/nBfryhluDrG4mahKd1KS+zMYf/l9BlWytVV1VrtRdUOSuwAPZpCoft/bugNJngo8BNhs/FxVHdteRN3RJKuHAEcCj6yq+7YcUmuaTQ9eySBpL+CbwDJvc0OzXfX4CjTfrKoL24xH889EVboTkrwMeA2DxckvAPYHzlnM+9knObiqvtYsZXY7VXXqqGPqkiTLgM0ZbEF8AoMteL9bVS9tNbCWJTmEwdrMj2Vwh+JTwBmLudKc5N8ZDAv5eHPqecA9q+q57UXVviSvAV4OjP9d8ixgeVW9r72oNN9MVKU7IclFwMOB7zQzuXcB/rqqDm85tNYk+euqeluSkya5XFX1kpEH1SFJ1lTVbkM/twROraonth1bm5KcApwC/JeV9wEnJE6uGY96QFXd3BxvwaBAsKhXzljoHLQu3Tm/rqpfJyHJXarq+0ke3HZQbaqqtzU/j2w7lo4av237y2bYyM+AnVqMpxOq6ohmDOIjga80t703rqqbWg6tTecn2b+qvgOQZD/g2y3H1AXhthMz1zfntICZqEp3ztok92SwTMqXk1
wPXNNqRB3R3J47CbgJ+DCwF/Cmqjqj1cDa9x/Nn5njgPMYjD08odWIOiDJy4GlDGZyP5DBcJplwOPajKsNzZ2aYrAO8YuS/Kg53gG4tM3YOuJEBjP9x3dGfCbwr+2Fo1Hw1r+0gZI8msEuQ1+qqlvbjqdt47cokzwJeBXwFuCkqtqr5dA6o5lstpkLlQ82QgD2Bc6tqj2bcxdV1cNaDawFzm6fWrMxxP7ArxlMMguD7bzPbzUwzTsrqtIGqqoz246hY8ZvxT2FQYJ6YZJFf3suyebA/wG2r6qXJ9k+ySOr6j/ajq1lt1TVreN/RJp1VBdlBWUxJ6IzqaqxJP9cVQcwuCOhRcIF/yXNtdVJzmCQqJ6e5G4MbQO5iJ3EYDOIA5rjtcDftBdOZ5yZ5C+BuyZ5AoNtib/YckzqpjOSPMdffBcXb/1LmlPNLbo9gCur6oYk9wa2qapFvYNMklVVtU+S84ducTuTe/Dn5aXAExlU408HTij/cdIESW5isGb1OgZDAMJgRZG7txqY5pW3/iXNqeYW3f8Au7od5m3c2sxoL4Bmm9lFvxxTVY0xmHT34bZjUbdV1d3ajkGj5z8ikuZUkncy2ArzUn63lEwBi33XrrcBXwK2S/Jx4BHAi1uNqEVDM9wn5dqYmijJV6vqcTOd08LirX9JcyrJZcBuLt5+e80wiP0Z3LL8TlVd23JIrXGGu2YryWYMdnX7OvAYfjdh8+4MNor4w5ZC0whYUZU0165ksA6kieqQJI8ALqiq/0zyQuAvk7xnsSZks/3cSc5pZnpr8fpT4LXA/YHVNGNTGazV/P72wtIoOOtf0lz7JXBBkg8lee/4o+2gOuCDDHal2h14I/BD4KPthtQLm7UdgNpVVe+pqp2AvwX2aJ6fxOCX4nNaDU7zzkRV0lxbAbwDOJtB9WP8sdita2ayHwq8t6reAzg5ZGaOT9O4w6rq50kOAp4AnMzgF0AtYN76lzSnquojzez27avqsrbj6ZCbkvwF8ELgUUmWMBgiIWl2xidnPhVYVlVfSPL2FuPRCFhRlTSnkjwduIDBDHeS7JFkRatBdcPhDMbtvrSq/h+wDXBcuyH1gou7a9yPk3wIeC5wWrMVsXnMAuesf0lzKslq4GDgG4t973bNTrMCwM5V9ZWmGr9xVd3UXHtoVV3cboTqgmYb4kOAi6rqv5NsDTysqs5oOTTNI2/9S5pr66rqxgm7HC7a34iTfKuqDmp21Rn+HtxVB0jycmApsBXwQGBbYBnwOACTVI2rql8Cpw4d/wT4SXsRaRRMVCXNtYuTPB9YkmRn4BgGE6sWpao6qPnpxKnJvQrYFzgXoKmU/V67IUnqCsd2SJprrwYewmA85ieAG4HXtBpRi5JsNd2j7fg64JaqunX8oNl2d9FW4CXdlhVVSXPtqVX1ZuDN4yeS/BHw6fZCatVqBolXgO2B65vn9wR+BOzUWmTdcGaSvwTumuQJwCuBL7Yck6SOcDKVpDmV5Lyq2mumc4tNkmXAiqo6rTl+MvD4qvo/7UbWriQbAS8FnsgggT8dOKH8x0kSJqqS5kiTeD2FwdIxnxq6dHdg16rat5XAOiLJ6qrae8K5VVW1T1sxdU0zFGLbqlrTdiySusFb/5LmyjXAKuAZ3HYnqpuA17USUbdcm+SvgI8xGArwQuBn7YbUviTfYPBnZmMG6+/+NMmZVfX6NuOS1A1WVCXNmWa3pY9W1QvajqVrmmrh24BHMUhUzwKOrarrWg2sZUnOr6o9k7wM2K6q3pZkTVXt1nZsktpnRVXSnKmq9UnunWTT4ZncgiYhnXL1gyTvq6pXjzCkrti4Wbj9uQxNwJMkMFGVNPd+CHy72Tb15vGTVfWu9kLqhUe0HUBLjmUwgerbVbUyyQOA/245JkkdYaIqaa5d0zw2AlzkXtOqqk8ztHRZVV0JPKe9iCR1iWNUJc2LJFtU1c0ztxQs3iW8kmwLvI9BRbmAbwGvqaq1rQYmqRPcmUrSnEpyQJ
JLge81x7sn+UDLYfVB2g6gJScBK4D7A9swWOz/pFYjktQZJqqS5tq/AE+iWXqpqi5kMNN9UWt255ru3HtGGE6X3LeqTqqqdc3jZOC+bQclqRtMVCXNuaq6esKp9a0E0i1/Md25JkFbjK5N8sIkS5qH68tK+i0nU0maa1cnORCoJJsCx9AMA1iMhnbs2ibJe4cu3R1Y105UnfIS4P3AuxmMUT27OSdJTqaSNLeS3IfBbezHM7hrczqDyTGLskqWZHdgDwbLML116NJNwNer6vo24pKkPjBRlaR55o5dU0vyEQa/yNzQHN8L+OeqsqoqyTGqkuZWkgck+WKSnyb53yRfaBZxX7Sqaj1w72YohG5rt/EkFaCpMO/ZXjiSusQxqpLm2ieA44FnNcdHAJ8E9mstom5wx67JbZTkXuNDIJJshf82SWr4l4GkuZaq+reh448lObq1aLrDHbsm98/A2Uk+0xz/EfC3LcYjqUMcoyppTiX5B+AG4BQGs7gPB+7CoMpKVV3XWnDqpCS7Agcz2PTgq1V1acshSeoIE1VJcyrJVdNcrqpalONVk9wX+DPgIcBm4+er6uDWguqAJNtPdr6qfjTqWCR1j7f+Jc2pqtqp7Rg66uPAp4CnAUcBfwL8tNWIuuE/GVTeAe4K7ARcxiChl7TIOetf0pxKsirJK5Pcs+1YOubeVfWvwG+q6sxm+aX92w6qbVX1sKrarXnsDOwLfKvtuCR1g4mqpLl2BLANsCrJKUmelCRtB9UBv2l+/iTJU5PsCWzbZkBdVFXnAQ9vOw5J3eAYVUnzIslGDG5zfxAYA04E3rNYJ1MleRrwTWA74H0MtlD966pa0WpgLUvy+qHDjYC9GFSfn9RSSJI6xERV0pxLshuD/dqfzGAL1Y8DBwF/XFV7tBiaOibJ24YO1wE/AD5bVb9uJyJJXWKiKmlOJVnNYHmqE4BTq+qWoWunVtWz24qtTUkexKC6fL+qemiTzD+jqv6m5dAkqbNMVCXNqWZNzD2BHRhaWaSqjm0tqA5IcibwRuBDVbVnc+7iqnpou5G1I8kX+d1s/9upqmeMMBxJHeXyVJLm2rsYVFTPA26ZvumisnlVfXfCvLJ1bQXTAf/UdgCSus9EVdJc27aqDmk7iA66NskDaaqISQ4DftJuSO2pqjPbjkFS95moSpprZyd5WFVd1HYgHfMqYDmwS5IfA1cBL2g3pPYluYjbDwG4EVgF/E1V/Wz0UUnqCseoSpoTQwnHxsDOwJUMbv2Hwdapu7UYXuuS3AU4DNgR2Ar4OYPvZbGP3f1HYD3wiebUEQz+zNwIHFRVT28rNknts6Iqaa48re0AOu4L/G7s7jXthtIpj6iqRwwdX5Tk21X1iCQvbC0qSZ1goippTlTVD9uOoeMcuzu5LZPsV1XnAiTZF9iyubaYJ5tJwkRVkkbFsbuTexlwYpLx5PQm4KVJtgD+vr2wJHWBY1QlaR45dnd2ktyDwb9JN0w4/ydV9ZF2opLUNhNVSZpHSXaY7rpDJqaX5Lyq2qvtOCS1w1v/kjSPTEQ3WGZuImmh2qjtACRJmoa3/aRFzERVktRlVlSlRcxEVZLUZd9uOwBJ7XEylSSpNUleP8npG4HVVXXBiMOR1DFWVCVJbdoHOArYpnksBR4DfDjJn7UYl6QOsKIqSWpNktOB51TVL5rjLYHPAM9iUFXdtc34JLXLiqokqU3bA7cOHf8G2KGqfsVgYwRJi5jrqEqS2vQJ4DtJvtAcPx34ZLOF6qXthSWpC7z1L0lqVZK9gYMYLEX1rapa1XJIkjrCRFWS1Jok7wE+VVVntx2LpO5xjKokqU3nAX+V5IokxyXZp+2AJHWHFVVJUuuSbAU8BzgC2L6qdm45JEkdYEVVktQFfwDsAuwIfL/dUCR1hRVVSVJrkrwTeDbwf4FPAZ+rqhtaDUpSZ7g8lSSpTVcBBwIPAO4C7JaEqjqr3bAkdYGJqiSpTeuBrwHbAhcA+wPnAAe3GJOkjn
CMqiSpTccADwd+WFWPBfYEftpuSJK6wkRVktSmX1fVrwGS3KWqvg88uOWYJHWEt/4lSW1am+SewOeBLye5Hrim1YgkdYaz/iVJnZDk0cA9gC9V1a1txyOpfSaqkiRJ6iTHqEqSJKmTTFQlSZLUSSaqkiRJ6iQTVUmSJHWSiaokSZI66f8Diqhxex7dLmkAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 864x648 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Correlation matrix\n",
"corrmat = data.corr()\n",
"fig = plt.figure(figsize = (12, 9))\n",
"\n",
"sns.heatmap(corrmat, vmax = .8, square = True)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "b9ba5dd2",
"metadata": {},
"source": [
"### Splitting the features and target"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "ad95615d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(5110, 10)\n",
"(5110,)\n"
]
}
],
"source": [
"# Separating the features (X) and the target (Y) from the dataset\n",
"X=data.drop(['stroke','id'], axis=1)\n",
"Y=data[\"stroke\"]\n",
"print(X.shape)\n",
"print(Y.shape)\n",
"# Getting just the values for processing (a NumPy array without column labels)\n",
"X_data=X.values\n",
"Y_data=Y.values"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "987d85cc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([['0', 67.0, 0, ..., 228.69, 36.6, '2'],\n",
" ['1', 61.0, 0, ..., 202.21, 28.893236911794673, '1'],\n",
" ['0', 80.0, 0, ..., 105.92, 32.5, '1'],\n",
" ...,\n",
" ['1', 35.0, 0, ..., 82.99, 30.6, '1'],\n",
" ['0', 51.0, 0, ..., 166.29, 25.6, '2'],\n",
" ['1', 44.0, 0, ..., 85.28, 26.2, '3']], dtype=object)"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_data"
]
},
{
"cell_type": "markdown",
"id": "acdf7977",
"metadata": {},
"source": [
"### Splitting data into train and test"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "5c0ce1f4",
"metadata": {},
"outputs": [],
"source": [
"# Using scikit-learn to split data into training and testing sets\n",
"from sklearn.model_selection import train_test_split\n",
"# Split the data into training and testing sets\n",
"X_train, X_test, Y_train, Y_test = train_test_split(X_data, Y_data, test_size = 0.2, random_state = 42)"
]
},
{
"cell_type": "markdown",
"id": "967d0e29",
"metadata": {},
"source": [
"### Model training"
]
},
{
"cell_type": "markdown",
"id": "b036fd27",
"metadata": {},
"source": [
"#### Isolation Forest"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "b41d16ee",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import classification_report, accuracy_score,precision_score,recall_score,f1_score,matthews_corrcoef\n",
"from sklearn.metrics import confusion_matrix"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "53a3cd66",
"metadata": {},
"outputs": [],
"source": [
"# Building the Isolation Forest model (an unsupervised anomaly detector, used here as a classifier)\n",
"from sklearn.ensemble import IsolationForest\n",
"ifc=IsolationForest(max_samples=len(X_train),\n",
" contamination=outlier_fraction,random_state=1)\n",
"ifc.fit(X_train)\n",
"scores_pred = ifc.decision_function(X_train)\n",
"y_pred = ifc.predict(X_test)\n",
"\n",
"\n",
"# Remap the predictions: 1 (inlier) -> 0 (no stroke), -1 (outlier) -> 1 (stroke)\n",
"y_pred[y_pred == 1] = 0\n",
"y_pred[y_pred == -1] = 1\n",
"\n",
"n_errors = (y_pred != Y_test).sum()"
]
},
{
"cell_type": "markdown",
"id": "ab450ea4",
"metadata": {},
"source": [
"### Model evaluation"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "60c5a8a9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"the Model used is Isolation Forest\n",
"The accuracy is 0.9090019569471625\n",
"The precision is 0.2459016393442623\n",
"The recall is 0.24193548387096775\n",
"The F1-Score is 0.24390243902439024\n",
"The Matthews correlation coefficient is0.19550087369423888\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAqEAAALJCAYAAACEBfppAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAA4k0lEQVR4nO3debhuZV038O8PUBBBZRBFcEAlDRzIGZxTA02FTF8pLUIKS19zytJyLkrLuSQ9Doiz4BCYbwpRKg4JqKiAISQOCMrgAJIy7d/7x7OObo777LPB86xn77M/H6/n2s9zP2ut+97bC64f33vd96ruDgAAjGmzWQ8AAIDVRxEKAMDoFKEAAIxOEQoAwOgUoQAAjE4RCgDA6BShwEZXVTeoqg9X1Y+q6uhf4jpPqKrjNubYZqWq7l9VZ856HADLRdknFFavqvrdJM9KcscklyY5Nclh3f2pX/K6v5fkaUn26e6rftlxLndV1Ul27+6zZz0WgJVCEgqrVFU9K8lrkvxtkpsluVWSw5PsvxEuf+skX1sNBehSVNUWsx4DwHKjCIVVqKpunOSlSZ7a3R/s7su6+8ru/nB3P2c4Zsuqek1VnTe8XlNVWw7fPaiqzq2qZ1fVBVV1flUdPHz3kiQvTPL4qvpxVR1SVS+uqnfO6/82VdVri7Oq+oOq+npVXVpV51TVE+a1f2reeftU1cnDNP/JVbXPvO8+XlV/XVWfHq5zXFXtuJ7ff+34/3ze+A+oqkdU1deq6vtV9Zfzjr9XVX22qn44HPtPVXX94btPDod9afh9Hz/v+n9RVd9NcsTatuGc2w193G34fIuquqiqHvTL/P8KsJIoQmF12jvJVkk+tMgxf5XkPkn2SnLXJPdK8vx53988yY2T7JLkkCSvr6rtuvtFmaSr7+vubbr7LYsNpKpumOR1SR7e3dsm2SeT2wLWPW77JB8Zjt0hyauSfKSqdph32O8mOTjJTkmun+TPFun65pn8DXbJpGh+U5InJrl7kvsneWFV3XY49uokz0yyYyZ/u4ckeUqSdPcDhmPuOvy+75t3/e0zSYUPnd9xd/9Pkr9I8q6q2jrJEUne1t0fX2S8AJsURSisTjskuWgD0+VPSPLS7r6guy9M8pIkvzfv+yuH76/s7v+X5MdJ7nAdxzOX5E5VdYPuPr+7T1/gmN9MclZ3v6O7r+ru9yT57ySPmnfMEd39te7+SZKjMimg1+fKTO5/vTLJezMpMF/b3ZcO/Z+e5C5J0t2f7+7/Gvr9RpI3JnngEn6nF3X35cN4rqG735TkrCSfS7JzJkU/wKqhCIXV6eIkO27gXsVbJPnmvM/fHNp+do11itj/TbLNtR1Id1+W5PFJ/jjJ+VX1kaq64xLGs3ZMu8z7/N1rMZ6Lu/vq4f3aIvF7877/ydrzq+pXqupfq+q7VXVJJknvglP981zY3T/dwDFvSnKnJP/Y3Zdv4FiATYoiFFanzyb5aZIDFjnmvEymkte61dB2XVyWZOt5n28+/8vu/lh3PyyTRPC/MynONjSetWP6znUc07Xxz5mMa/fuvlGSv0xSGzhn0a1HqmqbTBaGvSXJi4fbDQBWDUUorELd/aNM7oN8/bAgZ+uqul5VPbyq/n447D1Jnl9VNx0W+LwwyTvXd80NODXJA6rqVsOiqOet/aKqblZVjx7uDb08k2n9qxe4xv9L8itV9btVtUVVPT7JHkn+9TqO6drYNsklSX48pLR/ss7330ty2184a3GvTfL57v7DTO51fcMvPUqAFUQRCqtUd78qkz1Cn5/kwiTfTvJ/k/zLcMjfJDklyZeTfCXJF4a269LX8UneN1zr87lm4bhZkmdnknR+P5N7LZ+ywDUuTvLI4diLk/x5kkd290XXZUzX0p9lsujp0kxS2vet8/2Lkxw5rJ7/Pxu6WFXtn2S/TG5BSCb/P9xt7a4AAKuBzeoBABidJBQAgNEpQgEAGJ0iFACA0SlCAQAY3WIbVc
/UlRd93YopYEl2uPVDZz0EYIW45LKvb2iP36lbDjXO9Xa87cz/DpJQAABGpwgFAGB0ilAAAEa3bO8JBQDYJM0t9GTi1UcSCgDA6CShAABj6rlZj2BZkIQCAPALqurpVXVaVZ1eVc8Y2ravquOr6qzh53bzjn9eVZ1dVWdW1b4bur4iFACAa6iqOyX5oyT3SnLXJI+sqt2TPDfJCd29e5IThs+pqj2SHJhkzyT7JTm8qjZfrA/T8QAAY5pbEdPxv5rkv7r7f5Okqj6R5LeS7J/kQcMxRyb5eJK/GNrf292XJzmnqs7OpID97Po6kIQCALCu05I8oKp2qKqtkzwiyS2T3Ky7z0+S4edOw/G7JPn2vPPPHdrWSxIKADCiXgYLk6rq0CSHzmta091r1n7o7q9W1cuTHJ/kx0m+lOSqxS65QNuijydVhAIArDJDwblmA8e8JclbkqSq/jaTdPN7VbVzd59fVTsnuWA4/NxMktK1dk1y3mLXNx0PAMAvqKqdhp+3SvKYJO9JcmySg4ZDDkpyzPD+2CQHVtWWVbVbkt2TnLTY9SWhAABjWhkLk5LkA1W1Q5Irkzy1u39QVS9LclRVHZLkW0kelyTdfXpVHZXkjEym7Z/a3Ys+Gqq6F52un5krL/r68hwYsOzscOuHznoIwApxyWVfX+jexVFdce5XZl7jXH/XO8/87yAJBQAY0zJYmLQcuCcUAIDRKUIBABid6XgAgDHNLbpeZ9WQhAIAMDpJKADAmCxMSiIJBQBgBhShAACMznQ8AMCYVs4Tk6ZKEgoAwOgkoQAAI2oLk5JIQgEAmAFFKAAAozMdDwAwJguTkkhCAQCYAUUoAACjMx0PADAmq+OTSEIBAJgBSSgAwJjmrp71CJYFSSgAAKNThAIAMDrT8QAAY7IwKYkkFACAGZCEAgCMyROTkkhCAQCYAUUoAACjMx0PADAmC5OSSEIBAJgBSSgAwJgsTEoiCQUAYAYUoQAAjM50PADAiLqvnvUQlgVJKAAAo5OEAgCMyRZNSSShAADMgCIUAIDRmY4HABiTfUKTSEIBAJgBSSgAwJgsTEoiCQUAYAYUoQAAjM50PADAmOY8MSmRhAIAMAOKUAAARmc6HgBgTFbHJ5GEAgAwA5JQAIAxeWJSEkkoAAAzoAgFAGB0puMBAMZkYVISSSgAADMgCQUAGJOFSUkkoQAAzIAiFACA0ZmOBwAYk+n4JJJQAABmQBIKADCi7qtnPYRlQRIKAMDoFKEAAIzOdDwAwJgsTEoiCQUAYAYkoQAAY/Ls+CSSUAAAZkARCgDA6BShAABjmpub/WsJquqZVXV6VZ1WVe+pqq2qavuqOr6qzhp+bjfv+OdV1dlVdWZV7buh6ytCAQC4hqraJcmfJrlHd98pyeZJDkzy3CQndPfuSU4YPqeq9hi+3zPJfkkOr6rNF+tDEQoAwEK2SHKDqtoiydZJzkuyf5Ijh++PTHLA8H7/JO/t7su7+5wkZye512IXV4QCAIyp52b+qqpDq+qUea9DrzHE7u8keUWSbyU5P8mPuvu4JDfr7vOHY85PstNwyi5Jvj3vEucObetliyYAgFWmu9ckWbO+74d7PfdPsluSHyY5uqqeuMgla6FuFhuDIhQAYEwr44lJD01yTndfmCRV9cEk+yT5XlXt3N3nV9XOSS4Yjj83yS3nnb9rJtP362U6HgCAdX0ryX2qauuqqiQPSfLVJMcmOWg45qAkxwzvj01yYFVtWVW7Jdk9yUmLdSAJBQDgGrr7c1X1/iRfSHJVki9mMn2/TZKjquqQTArVxw3Hn15VRyU5Yzj+qd199WJ9KEIBAMa0Qh7b2d0vSvKidZovzyQVXej4w5IcttTrm44HAGB0klAAgDGtjIVJUycJBQBgdIpQAABGZzoeAGBMpuOTSEIBAJgBSSgAwJhWyBZN0yYJBQBgdIpQAABGZzoeAGBMFiYlkYQCADADklAAgDFZmJREEgoAwAwoQgEAGJ3peACAMV
mYlEQSCgDADEhCAQDGZGFSEkkoAAAzoAgFAGB0puMBAMZkYVISSSgAADOgCAUAYHSm4wEAxmQ6PokkFACAGZCEAgCMqXvWI1gWJKEAAIxOEQoAwOhMxwMAjMnCpCSSUAAAZkASCgAwJkloEkkoAAAzoAgFAGB0puMBAMbUpuMTSSgAADMgCQUAGJOFSUkkoQAAzIAiFACA0ZmOBwAYU/esR7AsSEIBABidJBQAYEwWJiWRhAIAMAOKUAAARmc6HgBgTKbjk0hCAQCYAUkoAMCYPDs+iSQUAIAZUIQCADA60/EAACPqOU9MSiShAADMgCIUAIDRmY4HABiTfUKTSEIBAJgBSSgAwJjsE5pEEgoAwAwoQgEAGJ3peACAMdknNIkkFACAGZCEAgCMyRZNSSShAADMgCIUAIDRmY4HABiT6fgkklAAAGZAEgoAMKa2RVMiCQUAYB1VdYeqOnXe65KqekZVbV9Vx1fVWcPP7ead87yqOruqzqyqfTfUhyIUAIBr6O4zu3uv7t4ryd2T/G+SDyV5bpITunv3JCcMn1NVeyQ5MMmeSfZLcnhVbb5YH4pQAIAxzc3N/nXtPCTJ/3T3N5Psn+TIof3IJAcM7/dP8t7uvry7z0lydpJ7LXZRRSgAAIs5MMl7hvc36+7zk2T4udPQvkuSb88759yhbb0sTAIAGNMyeHZ8VR2a5NB5TWu6e80Cx10/yaOTPG9Dl1ygbdFfVBEKALDKDAXnLxSdC3h4ki909/eGz9+rqp27+/yq2jnJBUP7uUluOe+8XZOct9iFTccDALA+v5OfT8UnybFJDhreH5TkmHntB1bVllW1W5Ldk5y02IUVoSwr7zjqX3LAE/84+z/hyXnH+z6UJPnYf5yY/Z/w5Nz5fo/IaV/92i+cc/53L8g9H/pbOeLd7x97uMAystlmm+XEz3w4R73/zT9re/If/34+/8V/z+dO/mhe+jd/McPRwTw9N/vXElTV1kkeluSD85pfluRhVXXW8N3LkqS7T09yVJIzknw0yVO7++rFrm86nmXjrK9/Ix849qN5z5tfk+ttcb388bOfnwfsc6/c/ra3zmv+9gV5yT+8bsHzXv66Nbn/fe4x8miB5eZPnnpwvnbm/2TbbbdJktz/AffJIx75sOx970fkiiuuyI433WHGI4SVpbv/N8kO67RdnMlq+YWOPyzJYUu9/lST0Kq6X1UdPLy/6RDPwoK+/o1v5y573jE32GqrbLHF5rnHXnfOCZ/8TG53m1tlt1vvuuA5J3zyM9n1FjfP7Xa79cijBZaTW9zi5tl3vwfnyLe972dth/zhE/LqV74hV1xxRZLkogsvntXwgAVMrQitqhcl+Yv8fDXV9ZK8c1r9sfLd/ra3zue/dFp++KNL8pOf/jQnfvbkfPd7F673+P/9yU/z1ncenac86QkjjhJYjl729y/IC//qZZmbt//h7XffLfvsc8/8x8c/mP/30ffkbne7ywxHCPPM9exfy8A0p+N/K8mvJflCknT3eVW17RT7Y4W73W1ulSc94XH5o2f8Zba+wQ3yK7e/bTbffP0PW3j9W96R33v8b2XrrW8w4iiB5Wa//X49F114cU499bTc7/73/ln7Fltsnpvc5Eb59Qc9Jne/+13ytnf8Y+6y5wNnOFJgvmkWoVd0d1dVJ0lV3XBDJ8zfs+rwV/5N/vD3f2eKw2M5+u1H7ZvfftTkcbOvecPbcvOddlzvsV85/cwc/5+fyqsOf0su/fFlqapsef3r53cf++ixhgssA/fe++55+G8+JA/b90HZaqsts+222+RNb3lVzvvOd3PssR9Lknz+819Oz81lhx23z8UXfX/GI2a162v/xKJN0jSL0KOq6o1JblJVf5TkSUnevNgJ8/esuvKiry+PrJhRXfyDH2aH7W6S8797QU74xKfzzje+ar3Hvv2fX/Gz969/yzuz9Q22UoDCKvSSF/1DXvKif0iS3O/+986fPv2P8keHPCtPOuR388AH7p
1Pnfi53P72u+V617+eAhSWkakVod39iqp6WJJLktwhyQuTfHJa/bFpeOZf/k1+eMkl2WKLLfJXz35KbnyjbfPvn/h0/u7V/5zv//BHecpzXpQ77n7brHn1khffAavUO95+dA5/w8vzXyf/W6644sr88aHPmfWQgHmqezqBY1W9tbufNO/zNkmO6e4Fl/WvSxIKLNUOt37orIcArBCXXPb1hR4vOarLDvv9mdc4N/yrt8/87zDNLZq+U1X/nCRVtV2S42J1PAAAme50/Auq6uVV9YYkd0/ysu7+wLT6AwBYEZb4xKJN3UYvQqvqMfM+npTkBcPPrqrHdPcHFz4TAIDVYhpJ6KPW+fzFTDaqf1SSzjWfPwoAwCq00YvQ7j54Y18TAGCTsUyeWDRr03xs565V9aGquqCqvldVH6iqhR8ADgDAqjLN1fFHJDk2yS2S7JLkw0MbAMDqNTc3+9cyMM0i9KbdfUR3XzW83pbkplPsDwCAFWKaRehFVfXEqtp8eD0xycVT7A8AgBVims+Of1KSf0ry6kxWxX9maAMAWL0sTEoypSK0qjZP8rfd/ehpXB8AgJVtKkVod19dVTetqut39xXT6AMAYEXyxKQk052O/0aST1fVsUkuW9vY3a+aYp8AAKwA0yxCzxtemyXZdmhzEwQAAFMtQs/o7qPnN1TV46bYHwDA8mdhUpLpbtH0vCW2AQCwymz0JLSqHp7kEUl2qarXzfvqRkmu2tj9AQCsJL1Mnlg0a9OYjj8vySlJHp3k8/PaL03yzCn0BwDACrPRi9Du/lKSL1XVu7v7yqq6XpI7JflOd/9gY/cHAMDKs9HvCa2qN1TVnkMBeuMkX0ry9iRfrKrf2dj9AQCsKHM9+9cyMI2FSffv7tOH9wcn+Vp33znJ3ZP8+RT6AwBghZlGETr/CUkPS/IvSdLd351CXwAArEDTWJj0w6p6ZJLvJLlvkkOSpKq2SHKDKfQHALByLJPp8FmbRhH65CSvS3LzJM+Yl4A+JMlHptAfAAArzDRWx38tyX4LtH8sycc2dn8AACtK2yc0me4TkwAAYEGKUAAARjeNe0KTJFW1W3efs6E2AIBVxcKkJNNNQj+wQNv7p9gfAAArxEZPQqvqjkn2THLjqnrMvK9ulGSrjd0fAMBK0pLQJNOZjr9DkkcmuUmSR81rvzTJH02hPwAAVphpbNF0TJJjqmrv7v7sxr4+AAAr39QWJiX5dlV9KJOnJnWSTyV5enefO8U+AQCWN9PxSaa7MOmIJMcmuUWSXZJ8eGgDAGCVm2YSulN3zy8631ZVz5hifwAAy9+cJyYl001CL6yqJ1bV5sPriUkunmJ/AACsENMsQp+U5P8k+W6S85M8dmgDAGCVm9p0fHd/K8mjp3V9AIAVycKkJNPZrP6Fi3zd3f3XG7tPAABWlmkkoZct0HbDJIck2SGJIhQAWL0koUmms1n9K9e+r6ptkzw9ycFJ3pvkles7DwCA1WMq94RW1fZJnpXkCUmOTHK37v7BNPoCAGDlmcY9of+Q5DFJ1iS5c3f/eGP3AQCwUnWbjk+ms0XTszN5StLzk5xXVZcMr0ur6pIp9AcAwAozjXtCp7n3KADAymZhUpLpblYPAAALUoQCADC6qT0xCQCABZiOTyIJBQBgBhShAACMznQ8AMCI2nR8EkkoAAAzIAkFABiTJDSJJBQAgBlQhAIAMDrT8QAAY5qb9QCWB0koAACjU4QCAIyo53rmr6WoqptU1fur6r+r6qtVtXdVbV9Vx1fVWcPP7eYd/7yqOruqzqyqfTd0fUUoAAALeW2Sj3b3HZPcNclXkzw3yQndvXuSE4bPqao9khyYZM8k+yU5vKo2X+ziilAAAK6hqm6U5AFJ3pIk3X1Fd/8wyf5JjhwOOzLJAcP7/ZO8t7sv7+5zkpyd5F6L9aEIBQAY01zP/FVVh1bVKfNeh64zytsmuTDJEVX1xap6c1XdMMnNuvv8JBl+7jQcv0uSb887/9yhbb
2sjgcAWGW6e02SNYscskWSuyV5Wnd/rqpem2HqfT1qoW4WG4MkFABgTHPL4LVh5yY5t7s/N3x+fyZF6feqauckGX5eMO/4W847f9ck5y3WgSIUAIBr6O7vJvl2Vd1haHpIkjOSHJvkoKHtoCTHDO+PTXJgVW1ZVbsl2T3JSYv1YToeAICFPC3Ju6rq+km+nuTgTALMo6rqkCTfSvK4JOnu06vqqEwK1auSPLW7r17s4opQAIARLXWfzlnr7lOT3GOBrx6ynuMPS3LYUq9vOh4AgNFJQgEAxuTZ8UkkoQAAzIAiFACA0ZmOBwAY0UpZmDRtklAAAEanCAUAYHSm4wEAxmR1fBJJKAAAMyAJBQAYUUtCk0hCAQCYAUUoAACjMx0PADAm0/FJJKEAAMyAJBQAYEQWJk1IQgEAGJ0iFACA0ZmOBwAYk+n4JJJQAABmQBIKADAiC5MmJKEAAIxOEQoAwOhMxwMAjMh0/IQkFACA0UlCAQBGJAmdkIQCADA6RSgAAKMzHQ8AMKauWY9gWZCEAgAwOkkoAMCILEyakIQCADA6RSgAAKMzHQ8AMKKeszApkYQCADADilAAAEZnOh4AYERWx09IQgEAGJ0kFABgRO2JSUkkoQAAzIAiFACA0ZmOBwAYkYVJE5JQAABGJwkFABiRJyZNSEIBABidIhQAgNGZjgcAGFH3rEewPEhCAQAYnSQUAGBEFiZNSEIBABidIhQAgNGZjgcAGJHp+AlJKAAAo5OEAgCMyBZNE5JQAABGpwgFAGB0puMBAEZkYdKEJBQAgNFJQgEARtQtCU0koQAAzIAiFACA0ZmOBwAYUc/NegTLgyQUAIDRKUIBABid6XgAgBHNWR2fRBIKAMACquobVfWVqjq1qk4Z2ravquOr6qzh53bzjn9eVZ1dVWdW1b4bur4iFABgRN0189e18ODu3qu77zF8fm6SE7p79yQnDJ9TVXskOTDJnkn2S3J4VW2+2IUVoQAALNX+SY4c3h+Z5IB57e/t7su7+5wkZye512IXUoQCAKwyVXVoVZ0y73XoAod1kuOq6vPzvr9Zd5+fJMPPnYb2XZJ8e9655w5t62VhEgDAiHpu9guTuntNkjUbOOy+3X1eVe2U5Piq+u9Fjl3ol+rFLi4JBQDgF3T3ecPPC5J8KJPp9e9V1c5JMvy8YDj83CS3nHf6rknOW+z6GyxCq+pxVbXt8P75VfXBqrrbtf1FAABIumf/2pCquuG8+u+GSX4jyWlJjk1y0HDYQUmOGd4fm+TAqtqyqnZLsnuSkxbrYynT8S/o7qOr6n5J9k3yiiT/nOTeSzgXAICV52ZJPlRVyaRefHd3f7SqTk5yVFUdkuRbSR6XJN19elUdleSMJFcleWp3X71YB0spQtde4DeT/HN3H1NVL74uvw0AAMtfd389yV0XaL84yUPWc85hSQ5bah9LKUK/U1VvTPLQJC+vqi3jXlIAgOtkOSxMWg6WUkz+nyQfS7Jfd/8wyfZJnjPNQQEAsGlbShK6c5KPdPflVfWgJHdJ8vZpDgoAYFPl2fETS0lCP5Dk6qq6fZK3JNktybunOioAADZpSylC57r7qiSPSfKa7n5mJukoAABcJ0uZjr+yqn4nye8nedTQdr3pDQkAYNPVpuOTLC0JPTjJ3kkO6+5zhg1I3zndYQEAsCnbYBLa3Wck+dN5n89J8rJpDgoAYFO1lCcWrQYbLEKravckf5dkjyRbrW3v7ttOcVwAAGzCljIdf0Qmj+m8KsmDM9me6R3THBQAAJu2pSxMukF3n1BV1d3fTPLiqjoxyYumPDYAgE2OfUInllKE/rSqNktyVlX93yTfSbLTdIcFAMCmbClF6DOSbJ3J4qS/TvLrSQ6a4pgAADZZtmiaWMrq+JOHtz/OZLsmAAD4pay3CK2qDydZ7yYC3f3oqYwIAIBN3mJJ6CtGGwUAwCphn9CJ9Rah3f2JJKmqGyb5SXfPDZ83T7LlOMMDAGBTtJR9Qk/IZG
HSWjdI8u/TGQ4AAKvBUlbHb9XdP177obt/XFVbL3YCAAALs0/oxFKK0Muq6m7d/YUkqaq7J/nJdIeV7Hibh027C2AT8b9XXj7rIQBwLS11n9Cjq+q84fPOSR4/tREBAGzC7BM6saR9QqvqjknukKSS/Hd3Xzn1kQEAsMlaShKaoeg8bcpjAQBglVhSEQoAwMZhYdLEUrZoAgCAjWqDRWhNPLGqXjh8vlVV3Wv6QwMA2PT0MngtB0tJQg9PsneS3xk+X5rk9VMbEQAAm7yl3BN67+6+W1V9MUm6+wdVdf0pjwsAgE3YUorQK4fnxXeSVNVNk8xNdVQAAJsoC5MmljId/7okH0qyU1UdluRTSf52qqMCAGCTtpTN6t9VVZ9P8pBMNqs/oLu/OvWRAQBsgjwxaWKDRWhV3SrJ/yb58Py27v7WNAcGAMCmayn3hH4kk/tBK8lWSXZLcmaSPac4LgAANmFLmY6/8/zPVXW3JE+e2ogAADZhVndPXOsnJnX3F5LccwpjAQBglVjKPaHPmvdxsyR3S3Lh1EYEALAJ61iYlCztntBt572/KpN7RD8wneEAALAaLFqEDpvUb9PdzxlpPAAArALrLUKraovuvmpYiAQAwEYw17MewfKwWBJ6Uib3f55aVccmOTrJZWu/7O4PTnlsAABsopZyT+j2SS5O8uv5+X6hnUQRCgDAdbJYEbrTsDL+tPy8+FxLkAwAcB3MWR2fZPEidPMk2yQL/qUUoQAAXGeLFaHnd/dLRxsJAMAqYJ/QicWemOQvBADAVCxWhD5ktFEAALCqrHc6vru/P+ZAAABWg7lZD2CZWCwJBQCAqVjKPqEAAGwkFiZNSEIBABidIhQAgNGZjgcAGJGFSROSUAAARicJBQAYkSR0QhIKAMDoFKEAAIzOdDwAwIjsEzohCQUAYHSSUACAEc0JQpNIQgEAmAFFKAAAozMdDwAwojkLk5JIQgEAmAFFKADAiHoZvJaiqjavqi9W1b8On7evquOr6qzh53bzjn1eVZ1dVWdW1b5Lub4iFACAhTw9yVfnfX5ukhO6e/ckJwyfU1V7JDkwyZ5J9ktyeFVtvqGLK0IBALiGqto1yW8mefO85v2THDm8PzLJAfPa39vdl3f3OUnOTnKvDfWhCAUAGNHcMnhV1aFVdcq816HrDPM1Sf58OHytm3X3+Uky/NxpaN8lybfnHXfu0LYoq+MBAFaZ7l6TZM1C31XVI5Nc0N2fr6oHLeFyCy333+Ctp4pQAADmu2+SR1fVI5JsleRGVfXOJN+rqp27+/yq2jnJBcPx5ya55bzzd01y3oY6MR0PADCiuaqZvxbT3c/r7l27+zaZLDj6j+5+YpJjkxw0HHZQkmOG98cmObCqtqyq3ZLsnuSkDf0dJKEAACzFy5IcVVWHJPlWksclSXefXlVHJTkjyVVJntrdV2/oYopQAIARLXWfzuWguz+e5OPD+4uTPGQ9xx2W5LBrc23T8QAAjE4RCgDA6EzHAwCMaG7Dh6wKklAAAEYnCQUAGNHc4jskrRqSUAAARqcIBQBgdKbjAQBGNLfgo9ZXH0koAACjk4QCAIxoJT0xaZokoQAAjE4RCgDA6EzHAwCMyD6hE5JQAABGJwkFABiRZ8dPSEIBABidIhQAgNGZjgcAGJF9QickoQAAjE4SCgAwIls0TUhCAQAYnSIUAIDRmY4HABiRfUInJKEAAIxOEQoAwOhMxwMAjMh0/IQkFACA0UlCAQBG1PYJTSIJBQBgBhShAACMznQ8AMCILEyakIQCADA6SSgAwIgkoROSUAAARqcIBQBgdKbjAQBG1LMewDIhCQUAYHSSUACAEc15YlISSSgAADOgCAUAYHSm4wEARmSf0AlJKAAAo5OEAgCMSBI6IQkFAGB0ilAAAEZnOh4AYESemDQhCQUAYHSKUAAARmc6HgBgRB7bOSEJBQBgdJJQAIAR2Sd0QhIKAMDoFKEAAIzOdDwAwIjsEzohCQUAYHSSUA
CAEc3JQpNIQgEAmAFFKAAAozMdDwAwIvuETkhCAQAYnSQUAGBEliVNSEIBABidIhQAgNGZjgcAGJGFSROSUAAArqGqtqqqk6rqS1V1elW9ZGjfvqqOr6qzhp/bzTvneVV1dlWdWVX7bqgPRSgAwIjmavavJbg8ya93912T7JVkv6q6T5LnJjmhu3dPcsLwOVW1R5IDk+yZZL8kh1fV5ot1oAgFAOAaeuLHw8frDa9Osn+SI4f2I5McMLzfP8l7u/vy7j4nydlJ7rVYH4pQAIBVpqoOrapT5r0OXeCYzavq1CQXJDm+uz+X5GbdfX6SDD93Gg7fJcm3551+7tC2XhYmAQCMaG4Z7BTa3WuSrNnAMVcn2auqbpLkQ1V1p0UOX2iSf9FfVBIKAMB6dfcPk3w8k3s9v1dVOyfJ8POC4bBzk9xy3mm7JjlvsesqQgEARtTL4LUhVXXTIQFNVd0gyUOT/HeSY5McNBx2UJJjhvfHJjmwqrasqt2S7J7kpMX6MB0PAMC6dk5y5LDCfbMkR3X3v1bVZ5McVVWHJPlWksclSXefXlVHJTkjyVVJnjpM56+XIhQAgGvo7i8n+bUF2i9O8pD1nHNYksOW2ociFABgRJ6YNOGeUAAARqcIBQBgdKbjAQBGtBz2CV0OJKEAAIxOEgoAMCI56IQkFACA0SlCAQAYnel4AIAR2Sd0QhIKAMDoJKEAACOyRdOEJBQAgNEpQgEAGJ3peACAEZmMn5CEAgAwOkkoAMCIbNE0IQkFAGB0ilAAAEZnOh4AYERtaVISSSgAADMgCQUAGJGFSROSUAAARqcIBQBgdKbjAQBGNGdhUhJJKAAAMyAJBQAYkRx0QhIKAMDoFKEAAIzOdDwAwIgsTJqQhAIAMDpFKAAAozMdDwAwIo/tnJCEAgAwOkUoy9pmm22WEz99bN539JuSJEcc+bqc+JkP58TPfDhfPv0TOfEzH57xCIHl4E1rXpnzzv1STv3iCT9re+ELnpVvnnNKTjn5uJxy8nF5+H6/PsMRws/1MvjfcmA6nmXtT57yBznzzP/JtttukyQ5+KA//dl3f/O3z8sll1w6q6EBy8jb335UDj/8iBxxxGuv0f7a170pr3r1G2c0KmAxklCWrVvc4ubZd78H5+1HHrXg97/1mN/M+4/+15FHBSxHJ37qc/n+D34462EA14IilGXrZX///Lzw+S/P3Nwv3sK9z33vmQsvuChf/59vjD8wYMV4yp8cnC98/vi8ac0rc5Ob3HjWw4Ekk4VJs34tB1MtQqvqV6rqhKo6bfh8l6p6/jT7ZNOw734PzoUXXpxTTz1twe8f+7hH5f1Hux8UWL83vPHt+ZU77pO73+M38t3vXpB/+PsXznpIwDzTTkLflOR5Sa5Mku7+cpID13dwVR1aVadU1SlXXHnJlIfGcnaf+9w9D3/EQ/Ll0z+Rt77ttXnAA/fOmje/Mkmy+eab51GP3jcf/MBHZjxKYDm74IKLMjc3l+7Om9/yrtzznnvNekiQxMKktaZdhG7d3Set03bV+g7u7jXdfY/uvsf1r3ejKQ+N5ewlL35F9rjD/XKXPR+YJ/3B0/PJT3w2h/7hs5MkD3rwffO1r/1PzjvvuzMeJbCc3fzmO/3s/QH7Pzynn37mDEcDrGvaq+MvqqrbJZOSu6oem+T8KffJJu63H/vIfMBUPDDPO9/x+jzwAXtnxx23zze+fkpe8tJX5IEP3Cd3vese6e5885vn5k+e8hezHiYwT3VPL5KtqtsmWZNknyQ/SHJOkid09zc3dO6Nt7nd8siKgWXvsit+OushACvEVVd8p2Y9hoNu89szr3GO/MYHZv53mHYSul13P7Sqbphks+6+tKoelWSDRSgAAJuuqS9Mqqo7d/dlQwF6YBKr4wGAVWuue+av5WDaSehjk7y/qp6Q5H5Jfj/Jb0y5TwAAlrmpFqHd/fUh/fyXJN9O8hvd/ZNp9gkAwPI3lSK0qr6SXGMTqu2TbJ7kc1WV7r
7LNPoFAFjulsdk+OxNKwl95JSuCwDAJmAqRej8LZiq6q5J7j98PLG7vzSNPgEAVoI5WWiS6T87/ulJ3pVkp+H1zqp62jT7BABg+Zv26vhDkty7uy9Lkqp6eZLPJvnHKfcLAMAyNu0itJJcPe/z1UMbAMCq1Kbjk0y/CH1rJiviPzR8PiDJW6bcJwAAy9zUitCq2izJ55J8IpON6ivJwd39xWn1CQDAyjC1IrS756rqld29d5IvTKsfAICVZG7WA1gmpv3s+OOq6reryn2gAAD8zLTvCX1WkhsmuaqqfprJlHx3942m3C8AwLJkn9CJaT87fttpXh8AgJVp2pvVn7CUNgAAVpepJKFVtVWSrZPsWFXb5ed7g94oyS2m0ScAwEpgn9CJaU3HPznJMzIpOD8/r/3SJK+fUp8AAKwQ0ypCP5PkqCSP7e5/rKqDkvx2km8kefeU+gQAWPZs0TQxrXtC35jk8qEAfUCSv0tyZJIfJVkzpT4BAFghplWEbt7d3x/ePz7Jmu7+QHe/IMntp9QnAAAbQVXdsqr+s6q+WlWnV9XTh/btq+r4qjpr+LndvHOeV1VnV9WZVbXvhvqYWhFaVWun+h+S5D/mfTftvUkBAJat7p75awmuSvLs7v7VJPdJ8tSq2iPJc5Oc0N27Jzlh+JzhuwOT7JlkvySHV9Xmi3UwrSL0PUk+UVXHJPlJkhOHAd4+kyl5AACWqe4+v7u/MLy/NMlXk+ySZP9MbrHM8POA4f3+Sd7b3Zd39zlJzk5yr8X6mEoq2d2HDfuB7pzkuP55yb1ZkqdNo08AgJVgpT0xqapuk+TXknwuyc26+/xkUqhW1U7DYbsk+a95p507tK3X1KbGu/u/Fmj72rT6AwBgaarq0CSHzmta092/sHi8qrZJ8oEkz+juS6pq3UN+dugCbYtW2+7PBABYZYaCc9Edi6rqepkUoO/q7g8Ozd+rqp2HFHTnJBcM7ecmueW803dNct5i15/qYzsBALimuWXw2pCaRJ5vSfLV7n7VvK+OTXLQ8P6gJMfMaz+wqrasqt2S7J7kpMX6kIQCALCu+yb5vSRfqapTh7a/TPKyJEdV1SFJvpXkcUnS3adX1VFJzshkZf1Tu/vqxTpQhAIAjGglPDu+uz+Vhe/zTCbbby50zmFJDltqH6bjAQAYnSIUAIDRmY4HABjRStsndFokoQAAjE4SCgAwoiU+u32TJwkFAGB0ilAAAEZnOh4AYERLeWLRaiAJBQBgdIpQAABGZzoeAGBEK+GxnWOQhAIAMDpJKADAiDwxaUISCgDA6BShAACMznQ8AMCIPLZzQhIKAMDoJKEAACOyMGlCEgoAwOgUoQAAjM50PADAiDwxaUISCgDA6CShAAAjmrNFUxJJKAAAM6AIBQBgdKbjAQBGZDJ+QhIKAMDoJKEAACPyxKQJSSgAAKNThAIAMDrT8QAAIzIdPyEJBQBgdJJQAIARtScmJZGEAgAwA4pQAABGZzoeAGBEFiZNSEIBABidIhQAgNGZjgcAGFGbjk8iCQUAYAYkoQAAI7JP6IQkFACA0SlCAQAYnel4AIAR2Sd0QhIKAMDoJKEAACOyMGlCEgoAwOgUoQAAjM50PADAiCxMmpCEAgAwOkkoAMCIPDt+QhIKAMDoFKEAAIzOdDwAwIjm7BOaRBIKAMAMSEIBAEZkYdKEJBQAgNEpQgEAGJ3peACAEVmYNCEJBQBgdJJQAIARWZg0IQkFAGB0ilAAAEZnOh4AYEQWJk1IQgEAuIaqemtVXVBVp81r276qjq+qs4af28377nlVdXZVnVlV+y6lD0UoAADreluS/dZpe26SE7p79yQnDJ9TVXskOTDJnsM5h1fV5hvqQBEKADCiXgb/2+AYuz+Z5PvrNO+f5Mjh/ZFJDpjX/t7uvry7z0lydpJ7bagPRSgAwCpTVYdW1SnzXocu4bSbdff5STL83Glo3yXJt+cdd+7QtigLkwAARrQcFiZ195okazbS5WqhLj
Z0kiQUAICl+F5V7Zwkw88LhvZzk9xy3nG7JjlvQxdThAIAsBTHJjloeH9QkmPmtR9YVVtW1W5Jdk9y0oYuZjoeAGBEK+GxnVX1niQPSrJjVZ2b5EVJXpbkqKo6JMm3kjwuSbr79Ko6KskZSa5K8tTuvnqDffQyuC9hITfe5nbLc2DAsnPZFT+d9RCAFeKqK76z0P2Lo7rtjr828xrn6xd9ceZ/B0koAMCIuudmPYRlwT2hAACMThEKAMDoTMcDAIxobgUsTBqDJBQAgNFJQgEARrRcdyYamyQUAIDRKUIBABid6XgAgBFZmDQhCQUAYHSSUACAEVmYNCEJBQBgdIpQAABGZzoeAGBEc6bjk0hCAQCYAUUoAACjMx0PADCitk9oEkkoAAAzIAkFABiRfUInJKEAAIxOEQoAwOhMxwMAjGjOwqQkklAAAGZAEgoAMCILkyYkoQAAjE4RCgDA6EzHAwCMaM50fBJJKAAAMyAJBQAYkYVJE5JQAABGpwgFAGB0puMBAEbkiUkTklAAAEYnCQUAGJGFSROSUAAARqcIBQBgdKbjAQBG5IlJE5JQAABGJwkFABhR26IpiSQUAIAZUIQCADA60/EAACOyMGlCEgoAwOgUoQAAjM50PADAiDy2c0ISCgDA6CShAAAjsk/ohCQUAIDRKUIBABid6XgAgBFZmDQhCQUAYHSSUACAEUlCJyShAACMThEKAMDoTMcDAIzIZPyEJBQAgNGVm2NZSarq0O5eM+txAMuff1/A8iYJZaU5dNYDAFYM/76AZUwRCgDA6BShAACMThHKSuP+LmCp/PsCljELkwAAGJ0kFACA0SlCAQAYnSKUjaaquqpeOe/zn1XVizdwzgFVtcd6vrtDVX28qk6tqq9W1Zqhfa+qesR1GN/bquqx1/Y8YLaq6q+q6vSq+vLw74N7V9Uzqmrra3md21TVadMaJ3DtKELZmC5P8piq2vFanHNAkgWL0CSvS/Lq7t6ru381yT8O7XslWbAIrSqPooVNSFXtneSRSe7W3XdJ8tAk307yjCQLFqFVtfloAwSuM0UoG9NVmaxGfea6X1TVravqhCHJOKGqblVV+yR5dJJ/GNKN261z2s5Jzl37obu/UlXXT/LSJI8fznl8Vb24qtZU1XFJ3r5QXwuM56+HZHSzqnpOVZ08HP+Sjfj3AH55Oye5qLsvT5LuvijJY5PcIsl/VtV/JklV/biqXlpVn0uyd1U9q6pOG17PWPeiVXXbqvpiVd2zqm5XVR+tqs9X1YlVdcfxfj1YvRShbGyvT/KEqrrxOu3/lOTtQ5LxriSv6+7PJDk2yXOGtPN/1jnn1Un+o6r+raqeWVU36e4rkrwwyfuGc943HHv3JPt39+8u1Nf8i1bV3yfZKcnBmaQquye5VyYJ692r6gEb4e8AbBzHJbllVX2tqg6vqgd29+uSnJfkwd394OG4GyY5rbvvneQnmfzzfe8k90nyR1X1a2svWFV3SPKBJAd398mZ/Mfz07r77kn+LMnhY/1ysJopQtmouvuSJG9P8qfrfLV3kncP79+R5H5LuNYRSX41ydFJHpTkv6pqy/Ucfmx3/2QJfb0gyU26+8k92Z/sN4bXF5N8IckdMylKgWWgu3+cyX9kHprkwiTvq6o/WODQqzMpLJPJP/Mf6u7LhvM/mOT+w3c3TXJMkid296lVtU2SfZIcXVWnJnljJukrMGXun2MaXpNJQXfEIscsaYPa7j4vyVuTvHVYUHCn9Rx62RL7OjmTtHP77v5+kkryd939xqWMBxhfd1+d5ONJPl5VX0ly0AKH/XQ4Lpn8c70+P8rkntL7Jjk9kzDmh92910YbMLAkklA2uqG4OyrJIfOaP5PkwOH9E5J8anh/aZJtF7pOVe1XVdcb3t88yQ5JvrPYORvoK0k+muRlST5SVdsm+ViSJw1pSKpql6raaQm/JjCCYZeM+bMTeyX5Zhb/98AnkxxQVVtX1Q2T/FaSE4fvrshkQeTvV9
XvDrM351TV44b+qqruuvF/E2BdilCm5ZVJ5q+S/9MkB1fVl5P8XpKnD+3vTfKcYYHAuguTfiPJaVX1pUyKxed093eT/GeSPdYuTFqg7/X1lSTp7qOTvCmT+1FPzGTq/rNDwvL+LF7gAuPaJsmRVXXG8M/0HklenMl9nP+2dmHSfN39hSRvS3JSks8leXN3f3He95dlsuL+mVW1fyb/sXrI8O+a05PsP9XfCEjisZ0AAMyAJBQAgNEpQgEAGJ0iFACA0SlCAQAYnSIUAIDRKUKBa6Wqrh62xzqtqo6uqq1/iWu9raoeO7x/c1XtscixD6qqfa5DH9+oqh03fGRSVX9QVf90bfsA4NpThALX1k+6e6/uvlMmG3//8fwvq2rz63LR7v7D7j5jkUMelMnjFQHYBChCgV/GiUluP6SU/1lV707ylaravKr+oapOrqovV9WTk589jeafho3HP5LkZ0+nqqqPV9U9hvf7VdUXqupLVXVCVd0mk2L3mUMKe/+qumlVfWDo4+Squu9w7g5VddzwAIQ3Zj2PcFy3jwW+f1RVfW64zr9X1c2G9gcOYzh1+G7bqtq5qj45LyG+/y/2CMB8nh0PXCdVtUWSh2fyKNQkuVeSO3X3OVV1aJIfdfc9q2rLJJ+uquOS/FqSOyS5c5KbJTkjyVvXue5NM3mi1QOGa23f3d+vqjck+XF3v2I47t1JXt3dn6qqW2XyVK1fTfKiJJ/q7pdW1W8mOXSBsf9CHwv8ip9Kcp/u7qr6wyR/nuTZSf4syVO7+9PD415/OvTxse4+bEiCr/MtCgCrhSIUuLZuUFWnDu9PTPKWTKbJT+ruc4b230hyl7X3eya5cZLdkzwgyXu6++ok51XVfyxw/fsk+eTaa3X399czjodm8vjWtZ9vVFXbDn08Zjj3I1X1g+vYx65J3ldVOye5fpK1v9unk7yqqt6V5IPdfW5VnZzkrVV1vST/0t2nLnA9AOYxHQ9cW2vvCd2ru5/W3VcM7ZfNO6aSPG3ecbt193HDdxt6VnAt4Zhk8u+vvef1sUt3X7oR+/jHJP/U3XdO8uQkWyVJd78syR8muUGS/6qqO3b3JzMpfr+T5B1V9ftLGD/AqqYIBabhY0n+ZEgGU1W/UlU3TPLJJAcO94zunOTBC5z72SQPrKrdhnPXTpVfmmTbeccdl+T/rv1QVXsNbz+Z5AlD28OTbHct+pjvxpkUlUly0Lx+btfdX+nulyc5Jckdq+rWSS7o7jdlkgzfbYHrATCPIhSYhjdncr/nF6rqtCRvzOT2nw8lOSvJV5L8c5JPrHtid1+YyT2WH6yqLyV53/DVh5P81tqFSUn+NMk9hoVPZ+Tnq/RfkuQBVfWFTG4L+Na16GO+Fyc5uqpOTHLRvPZnDIuPvpTkJ0n+LZOV+6dW1ReT/HaS1274TwSwulX3Uma9AABg45GEAgAwOkUoAACjU4QCADA6RSgAAKNThAIAMDpFKAAAo1OEAgAwuv8PPKxSJCWVeTsAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 864x864 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Isolation Forest: 93\n",
"0.9090019569471625\n",
" precision recall f1-score support\n",
"\n",
" 0 0.95 0.95 0.95 960\n",
" 1 0.25 0.24 0.24 62\n",
"\n",
" accuracy 0.91 1022\n",
" macro avg 0.60 0.60 0.60 1022\n",
"weighted avg 0.91 0.91 0.91 1022\n",
"\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 648x504 with 0 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Evaluation of the model\n",
"# Printing every score of the classifier\n",
"\n",
"from sklearn.metrics import confusion_matrix\n",
"n_outliers = len(Stroke)\n",
"print(\"the Model used is {}\".format(\"Isolation Forest\"))\n",
"acc= accuracy_score(Y_test,y_pred)\n",
"print(\"The accuracy is {}\".format(acc))\n",
"prec= precision_score(Y_test,y_pred)\n",
"print(\"The precision is {}\".format(prec))\n",
"rec= recall_score(Y_test,y_pred)\n",
"print(\"The recall is {}\".format(rec))\n",
"f1= f1_score(Y_test,y_pred)\n",
"print(\"The F1-Score is {}\".format(f1))\n",
"MCC=matthews_corrcoef(Y_test,y_pred)\n",
"print(\"The Matthews correlation coefficient is {}\".format(MCC))\n",
"\n",
"# Plotting the confusion matrix\n",
"LABELS = ['Not Stroke', 'Stroke']\n",
"conf_matrix = confusion_matrix(Y_test, y_pred)\n",
"plt.figure(figsize=(12, 12))\n",
"sns.heatmap(conf_matrix, xticklabels=LABELS,\n",
" yticklabels=LABELS, annot=True, fmt=\"d\");\n",
"plt.title(\"Confusion matrix\")\n",
"plt.ylabel('True class')\n",
"plt.xlabel('Predicted class')\n",
"plt.show()\n",
"\n",
"# Run classification metrics\n",
"plt.figure(figsize=(9, 7))\n",
"print('{}: {}'.format(\"Isolation Forest\", n_errors))\n",
"print(accuracy_score(Y_test, y_pred))\n",
"print(classification_report(Y_test, y_pred))"
]
},
{
"cell_type": "markdown",
"id": "084eda9a",
"metadata": {},
"source": [
"#### Random Forest Classifier"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "c41d3b76",
"metadata": {},
"outputs": [],
"source": [
"# Building the Random Forest classifier\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"# random forest model creation\n",
"rfc = RandomForestClassifier()\n",
"rfc.fit(X_train,Y_train)\n",
"# predictions\n",
"y_pred = rfc.predict(X_test)"
]
},
{
"cell_type": "markdown",
"id": "7db12efc",
"metadata": {},
"source": [
"### Model evaluation"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "6284b00a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The model used is Random Forest classifier\n",
"The accuracy is 0.9393346379647749\n",
"The precision is 0.0\n",
"The recall is 0.0\n",
"The F1-Score is 0.0\n",
"The Matthews correlation coefficient is 0.0\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\Mishane\\anaconda3\\lib\\site-packages\\sklearn\\metrics\\_classification.py:1245: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.\n",
" _warn_prf(average, modifier, msg_start, len(result))\n",
"C:\\Users\\Mishane\\anaconda3\\lib\\site-packages\\sklearn\\metrics\\_classification.py:870: RuntimeWarning: invalid value encountered in double_scalars\n",
" mcc = cov_ytyp / np.sqrt(cov_ytyt * cov_ypyp)\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAqEAAALJCAYAAACEBfppAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAzJElEQVR4nO3dd7ilVXk34N/DAAqKXZRiQSUq1qgx9hKMJUEhxoKRxBgV88UYsSbGGvORmC/FEjURK3YgqKAmkYSoYAcRIyVWDA6MFCsSFJjzfH/sPeQ4nJk54Oy1z8zct9e+zt7vfvde64zXzPXwe9613uruAADASNvNewIAAGx7FKEAAAynCAUAYDhFKAAAwylCAQAYThEKAMBwilBgs6uqnarqQ1X1w6o66uf4nidW1XGbc27zUlX3r6qvzHseACtF2ScUtl1V9VtJnpPkdkkuSnJqkkO7+5M/5/f+dpJnJrlPd1/+885zpauqTrJ3d3993nMB2FJIQmEbVVXPSfLqJH+R5CZJbp7kDUn23wxff4skX90WCtDlqKrt5z0HgJVGEQrboKq6bpJXJHlGd7+/uy/u7su6+0Pd/fzpOdeoqldX1bnTx6ur6hrT9x5UVaur6rlVdX5VramqJ0/f+7MkL03y+Kr6cVU9papeXlXvWjT+Lauq1xVnVfW7VfXNqrqoqs6qqicuOv7JRZ+7T1WdNG3zn1RV91n03ser6s+r6lPT7zmuqm60gd9/3fxfsGj+B1TVr1XVV6vqe1X1p4vOv2dVfaaqfjA993VVteP0vROmp31p+vs+ftH3/3FVfSfJ29Ydm37m1tMx7jZ9vXtVXVhVD/p5/n8F2JIoQmHbdO8k10zygY2c86Ik90py1yR3SXLPJC9e9P5Nk1w3yR5JnpLk9VV1/e5+WSbp6hHdfe3ufsvGJlJV10ry2iSP6O5dktwnk8sC1j/vBkk+Mj33hkn+LslHquqGi077rSRPTrJrkh2TPG8jQ980kz+DPTIpmt+U5KAkd09y/yQvrapbTc9dm+TZSW6UyZ/dvkn+IEm6+wHTc+4y/X2PWPT9N8gkFT548cDd/Y0kf5zk3VW1c5K3JXl7d398I/MF2KooQmHbdMMkF26iXf7EJK/o7vO7+4Ikf5bktxe9f9n0/cu6+5+T/DjJba/mfBaS3LGqduruNd19+hLn/HqSr3X3O7v78u5+b5L/SvLIRee8rbu/2t2XJDkykwJ6Qy7L5PrXy5K8L5MC8zXdfdF0/NOT3DlJuvsL3f3Z6bjfSvLGJA9cxu/0su7+6XQ+P6O735Tka0k+l2S3TIp+gG2GIhS2Td9NcqNNXKu4e5L/XvT6v6fHrviO9YrY/0ly7as6ke6+OMnjk/x+kjVV9ZGqut0y5rNuTnssev2dqzCf73b32unzdUXieYvev2Td56vqF6rqw1X1nar6USZJ75Kt/kUu6O6fbOKcNyW5Y5K/7+6fbuJcgK2KIhS2TZ9J8pMkB2zknHMzaSWvc/Ppsavj4iQ7L3p908VvdvdHu/tXM0kE/yuT4mxT81k3p3Ou5pyuin/IZF57d/d1kvxpktrEZza69UhVXTuThWFvSfLy6eUGANsMRShsg7r7h5lcB/n66YKcnatqh6p6RFX9v+lp703y4qq68XSBz0uTvGtD37kJpyZ5QFXdfLoo6oXr3qiqm1TVo6bXhv40k7b+2iW+45+T/EJV/VZVbV9Vj0+yT5IPX805XRW7JPlRkh9PU9r/s9775yW51ZU+tXGvSfKF7n5qJte6/uPPPUuALYgiFLZR3f13mewR+uIkFyT5dpI/TPLB6Sn/N8nJSf4zyZeTnDI9dnXG+rckR0y/6wv52cJxuyTPzSTp/F4m11r+wRLf8d0k+03P/W6SFyTZr7svvDpzuoqel8mip4sySWmPWO/9lyc5fLp6/nGb+rKq2j/JwzO5BCGZ/P9wt3W7AgBsC2xWDwDAcJJQAACGU4QCADCcIhQAgO
EUoQAADLexjarn6rILv2nFFLAsO+1+/3lPAdhCXH7pOZva43fmVkKNs8ONbjX3PwdJKAAAwylCAQAYThEKAMBwK/aaUACArdLCUncm3vZIQgEAGE4SCgAwUi/MewYrgiQUAIDhFKEAAAynHQ8AMNKCdnwiCQUAYA4koQAAA7WFSUkkoQAAzIEiFACA4bTjAQBGsjApiSQUAIA5kIQCAIxkYVISSSgAAHOgCAUAYDjteACAkRbWznsGK4IkFACA4SShAAAjWZiURBIKAMAcKEIBABhOOx4AYCR3TEoiCQUAYA4koQAAA7WFSUkkoQAAzIEiFACA4bTjAQBGsjApiSQUAIA5UIQCADCcdjwAwEhWxyeRhAIAMAeSUACAkRbWznsGK4IkFACA4RShAAAMpx0PADCShUlJJKEAAMyBJBQAYCR3TEoiCQUAYA4UoQAADKcdDwAwkoVJSSShAADMgSQUAGAkC5OSSEIBAJgDRSgAAMNpxwMADNS9dt5TWBEkoQAADCcJBQAYyRZNSSShAADMgSIUAIDhtOMBAEayT2gSSSgAAHMgCQUAGMnCpCSSUAAA5kARCgDAcNrxAAAjLbhjUiIJBQBgDhShAAAMpx0PADCS1fFJJKEAAMyBJBQAYCR3TEoiCQUAYA4UoQAADKcdDwAwkoVJSSShAADMgSQUAGAkC5OSSEIBAJgDRSgAAMNpxwMAjKQdn0QSCgDAHEhCAQAG6l477ymsCJJQAACGU4QCADCcdjwAwEgWJiWRhAIAMAeSUACAkdw7PokkFACAOVCEAgAwnHY8AMBIFiYlkYQCADAHilAAAIbTjgcAGMnq+CSSUAAA5kASCgAwkoVJSSShAADMgSIUAIDhtOMBAEayMCmJJBQAgDmQhAIAjGRhUhJJKAAAc6AIBQBgOO14AICRtOOTSEIBAJgDSSgAwEi2aEoiCQUAYA4UoQAADKcdDwAwkoVJSSShAADMgSQUAGAkC5OSSEIBAJgDRSgAAMNpxwMAjGRhUhJJKAAAcyAJBQAYycKkJJJQAACWUFXPrqrTq+q0qnpvVV2zqm5QVf9WVV+b/rz+ovNfWFVfr6qvVNXDNvX9ilAAAH5GVe2R5I+S3KO775hkVZIDk/xJkuO7e+8kx09fp6r2mb5/hyQPT/KGqlq1sTEUoQAAIy0szP+xPNsn2amqtk+yc5Jzk+yf5PDp+4cnOWD6fP8k7+vun3b3WUm+nuSeG/tyRSgAwDamqg6uqpMXPQ5e/H53n5Pkb5KcnWRNkh9293FJbtLda6bnrEmy6/QjeyT59qKvWD09tkEWJgEAbGO6+7Akh23o/em1nvsn2SvJD5IcVVUHbeQra6lhNjYHRSgAwEhbxj6hD0lyVndfkCRV9f4k90lyXlXt1t1rqmq3JOdPz1+d5GaLPr9nJu37DdKOBwBgfWcnuVdV7VxVlWTfJGcmOTbJk6bnPCnJMdPnxyY5sKquUVV7Jdk7yec3NoAkFABgpN5ol3pF6O7PVdU/JTklyeVJvphJ+/7aSY6sqqdkUqg+dnr+6VV1ZJIzpuc/o7vXbmwMRSgAAFfS3S9L8rL1Dv80k1R0qfMPTXLocr9fOx4AgOEkoQAAI20ZC5NmThIKAMBwklAAgJEkoUkkoQAAzIEiFACA4bTjAQBGau34RBIKAMAcSEIBAEayMCmJJBQAgDlQhAIAMJx2PADASN3znsGKIAkFAGA4SSgAwEgWJiWRhAIAMAeKUAAAhtOOBwAYSTs+iSQUAIA5kIQCAIzk3vFJJKEAAMyBIhQAgOG04wEABuoFd0xKJKEAAMyBIhQAgOG04wEARrJPaBJJKAAAcyAJBQAYyT6hSSShAADMgSIUAIDhtOMBAEayT2gSSSgAAHMgCQUAGMkWTUkkoQAAzIEiFACA4bTjAQBG0o5PIgkFAGAOJKEAACO1LZoSSSgAAHOgCAUAYDjteACAkSxMSiIJBQBgDiShAAAjuXd8EkkoAABzoAgFAGA4RS
gryjuP/GAOOOj3s/8Tn553HvGBK46/+6hjst+BT83+T3x6/vb1b7ni+JvecUQe8bjfy34HPjWf+twX5jFlYIV52EMflNNPOyH/dcYn84LnP2Pe04Er64X5P1YA14SyYnztm9/K0cf+a9775ldnh+13yO8/98V5wH3umfPOvzAf++Rn8/53vCE77rhjvvv9HyRJvnHWf+dfjv9EjnnXP+b8C7+Xpz7rhfnI+96cVatWzfcXAeZmu+22y2tfc2ge/mtPyOrVa/LZz/xzPvTh43LmmV+b99SA9cw0Ca2q+1XVk6fPb1xVe81yPLZs3/zWt3PnO9wuO13zmtl++1W5x13vlONP+HSO+OBH8pSDHpcdd9wxSXLD618vSfIfJ342j9j3gdlxxx2z5+43zc333D1fPvOrc/wNgHm75y/9Yr7xjW/lrLPOzmWXXZYjjzwmj3rkw+Y9LWAJMytCq+plSf44yQunh3ZI8q5ZjceW7za3ukW+8KXT8oMf/iiX/OQnOfEzJ+U7512Qb519Tr7wpdPyhKcdkt99xvPz5TO/kiQ5/4Lv5qY3ufEVn7/JrjfK+RdcOK/pAyvA7nvcNN9efe4Vr1efsya7737TOc4IlrDQ83+sALNsx/9Gkl9MckqSdPe5VbXLDMdjC3frW948v/fEx+Zph/xpdt5pp/zCbW6VVatWZe3atfnRRT/Oew57VU4786t53kv+Mv961NvSufJfokrNYebASlF15X8D2n26YUWaZTv+0p78ze8kqaprbeoDVXVwVZ1cVSe/+R3vneHUWKl+85EPy1Fve10Of8Nf57rX2SW3uNkeucmuN8pDHnjfVFXutM9tU1X5/g9+mJvc+Eb5znkXXPHZ886/MDe+8Q3nOHtg3s5ZvSY323P3K17vucduWbPmvDnOCK6sFxbm/lgJZlmEHllVb0xyvap6WpJ/T/LmjX2guw/r7nt09z2e+jtPmOHUWKnWLTpa853zc/wnPpVHPOSB+ZX73zuf/8KpSZJvnb06l11+ea5/vevmwfe7V/7l+E/k0ksvzepzv5OzV5+bO93+F+Y3eWDuTjr51NzmNnvllre8WXbYYYc87nH750MfPm7e0wKWMLN2fHf/TVX9apIfJbltkpcmOWFW47F1ePaf/t/84Ec/yvbbb58XPfcPct3r7JJH7/fQvPgvXpUDDvr97LDD9vmLFz83VZXb3OoWediv3D+PeuLTs/2qVXnRc/7AynjYxq1duzbPOuTF+eePvCerttsubz/8iJxxhgWLsBLVrK6Vqaq3dvfvLXp97STHdPe+y/n8ZRd+00U8wLLstPv95z0FYAtx+aXnzH3xwMWH/s7ca5xrvegdc/9zmGU7/pyq+ockqarrJzkuVscDAJDZtuNfUlV/VVX/mOTuSV7Z3UfPajwAgC3CCrlj0bxt9iK0qh696OXnk7xk+rOr6tHd/f7NPSYAAFuWWSShj1zv9Rcz2aj+kZls16QIBQDYxm32IrS7n7y5vxMAYKuxQu5YNG+zvG3nnlX1gao6v6rOq6qjq2rPWY0HAMCWY5ar49+W5NgkuyfZI8mHpscAALZdCwvzf6wAsyxCb9zdb+vuy6ePtye58QzHAwBgCzHLIvTCqjqoqlZNHwcl+e4MxwMAYAsxs31Ck/xektcleVUmq+I/PT0GALDtsjApyYyK0KpaleQvuvtRs/h+AAC2bDMpQrt7bVXduKp27O5LZzEGAMAWyR2Tksy2Hf+tJJ+qqmOTXLzuYHf/3QzHBABgCzDLIvTc6WO7JLtMj7kIAgCAmRahZ3T3UYsPVNVjZzgeAMDKZ2FSktlu0fTCZR4DAGAbs9mT0Kp6RJJfS7JHVb120VvXSXL55h4PAGBL0ivkjkXzNot2/LlJTk7yqCRfWHT8oiTPnsF4AABsYTZ7EdrdX0rypap6T3dfVlU7JLljknO6+/ubezwAALY8m/2a0Kr6x6q6w7QAvW6SLyV5R5IvVtUTNvd4AABblIWe/2MFmM
XCpPt39+nT509O8tXuvlOSuyd5wQzGAwBgCzOLInTxHZJ+NckHk6S7vzODsQAA2ALNYmHSD6pqvyTnJLlvkqckSVVtn2SnGYwHALDlWCHt8HmbRRH69CSvTXLTJIcsSkD3TfKRGYwHAMAWZhar47+a5OFLHP9oko9u7vEAALYobZ/QZLZ3TAIAgCUpQgEAGG4W14QmSapqr+4+a1PHAAC2KRYmJZltEnr0Esf+aYbjAQCwhdjsSWhV3S7JHZJct6oeveit6yS55uYeDwBgS9KS0CSzacffNsl+Sa6X5JGLjl+U5GkzGA8AgC3MLLZoOibJMVV17+7+zOb+fgAAtnwzW5iU5NtV9YFM7prUST6Z5FndvXqGYwIArGza8UlmuzDpbUmOTbJ7kj2SfGh6DACAbdwsk9Bdu3tx0fn2qjpkhuMBAKx8C+6YlMw2Cb2gqg6qqlXTx0FJvjvD8QAA2ELMsgj9vSSPS/KdJGuSPGZ6DACAbdzM2vHdfXaSR83q+wEAtkgWJiWZzWb1L93I293df765xwQAYMsyiyT04iWOXSvJU5LcMIkiFADYdklCk8xms/q/Xfe8qnZJ8qwkT07yviR/u6HPAQCw7ZjJNaFVdYMkz0nyxCSHJ7lbd39/FmMBALDlmcU1oX+d5NFJDktyp+7+8eYeAwBgS9WtHZ/MZoum52Zyl6QXJzm3qn40fVxUVT+awXgAAGxhZnFN6Cz3HgUA2LJZmJRktpvVAwDAkhShAAAMN7M7JgEAsATt+CSSUAAA5kARCgDAcNrxAAADtXZ8EkkoAABzIAkFABhJEppEEgoAwBwoQgEAGE47HgBgpIV5T2BlkIQCADCcJBQAYCBbNE1IQgEAGE4RCgDAcNrxAAAjaccnkYQCADAHklAAgJFs0ZREEgoAwBwoQgEAGE47HgBgIPuETkhCAQAYThIKADCShUlJJKEAAMyBIhQAgOG04wEABrIwaUISCgDAcIpQAACG044HABjJ6vgkklAAAJZQVderqn+qqv+qqjOr6t5VdYOq+req+tr05/UXnf/Cqvp6VX2lqh62qe9XhAIADNQL838s02uS/Gt33y7JXZKcmeRPkhzf3XsnOX76OlW1T5IDk9whycOTvKGqVm3syxWhAAD8jKq6TpIHJHlLknT3pd39gyT7Jzl8etrhSQ6YPt8/yfu6+6fdfVaSrye558bGUIQCAGxjqurgqjp50ePg9U65VZILkrytqr5YVW+uqmsluUl3r0mS6c9dp+fvkeTbiz6/enpsgyxMAgAYaQUsTOruw5IctpFTtk9ytyTP7O7PVdVrMm29b0AtNczG5iAJBQBgfauTrO7uz01f/1MmRel5VbVbkkx/nr/o/Jst+vyeSc7d2ACKUACAgea9KGk5C5O6+ztJvl1Vt50e2jfJGUmOTfKk6bEnJTlm+vzYJAdW1TWqaq8keyf5/MbG0I4HAGApz0zy7qraMck3kzw5kwDzyKp6SpKzkzw2Sbr79Ko6MpNC9fIkz+jutRv7ckUoAABX0t2nJrnHEm/tu4HzD01y6HK/XxEKADDSCliYtBK4JhQAgOEkoQAAA12FOxZt1SShAAAMpwgFAGA47XgAgIG04yckoQAADCcJBQAYSBI6IQkFAGA4RSgAAMNpxwMAjNQ17xmsCJJQAACGk4QCAAxkYdKEJBQAgOEUoQAADKcdDwAwUC9YmJRIQgEAmANFKAAAw2nHAwAMZHX8hCQUAIDhJKEAAAO1OyYlkYQCADAHilAAAIbTjgcAGMjCpAlJKAAAw0lCAQAGcsekCUkoAADDKUIBABhOOx4AYKDuec9gZZCEAgAwnCQUAGAgC5MmJKEAAAynCAUAYDjteACAgbTjJyShAAAMJwkFABjIFk0TklAAAIZThAIAMJx2PADAQBYmTUhCAQAYThIKADBQtyQ0kYQCADAHilAAAIbTjgcAGKgX5j2DlUESCgDAcIpQAACG044HABhower4JJJQAADmQBIKAD
CQfUInJKEAAAynCAUAYDjteACAgXpBOz6RhAIAMAebLEKr6rFVtcv0+Yur6v1VdbfZTw0AYOvTPf/HSrCcJPQl3X1RVd0vycOSHJ7kH2Y7LQAAtmbLKULXTn/+epJ/6O5jkuw4uykBALC1W87CpHOq6o1JHpLkr6rqGnEtKQDA1WJh0sRyisnHJflokod39w+S3CDJ82c5KQAAtm7LSUJ3S/KR7v5pVT0oyZ2TvGOWkwIA2Fq5d/zEcpLQo5OsrarbJHlLkr2SvGemswIAYKu2nCJ0obsvT/LoJK/u7mdnko4CAMDVspx2/GVV9YQkv5PkkdNjO8xuSgAAW6/Wjk+yvCT0yUnuneTQ7j6rqvZK8q7ZTgsAgK3ZJpPQ7j4jyR8ten1WklfOclIAAFurlXLHonnbZBFaVXsn+csk+yS55rrj3X2rGc4LAICt2HLa8W/L5Dadlyd5cCbbM71zlpMCAGDrtpyFSTt19/FVVd3930leXlUnJnnZjOcGALDVsU/oxHKK0J9U1XZJvlZVf5jknCS7znZaAABszZZThB6SZOdMFif9eZJfSfKkGc4JAGCrZYumieWsjj9p+vTHmWzXBAAAP5cNFqFV9aEkG9xEoLsfNZMZAQCw1dtYEvo3w2YBALCNsE/oxAaL0O7+RJJU1bWSXNLdC9PXq5JcY8z0AADYGi1nn9DjM1mYtM5OSf59NtMBAGBbsJzV8dfs7h+ve9HdP66qnTf2AQAAlmaf0InlFKEXV9XduvuUJKmquye5ZLbTSm5+m/1mPQQAAHOy3H1Cj6qqc6evd0vy+JnNCABgK2af0Ill7RNaVbdLctskleS/uvuymc8MAICt1nKS0EyLztNmPBcAALYRyypCAQDYPCxMmljOFk0AALBZbbIIrYmDquql09c3r6p7zn5qAABbn14Bj5VgOUnoG5LcO8kTpq8vSvL6mc0IAICt3nKuCf3l7r5bVX0xSbr7+1W144znBQDAVmw5Rehl0/vFd5JU1Y2TLMx0VgAAWykLkyaW045/bZIPJNm1qg5N8skkfzHTWQEAsFVbzmb1766qLyTZN5PN6g/o7jNnPjMAgK2QOyZNbLIIraqbJ/mfJB9afKy7z57lxAAA2Hot55rQj2RyPWgluWaSvZJ8JckdZjgvAAC2Ystpx99p8euquluSp89sRgAAWzGruyeu8h2TuvuUJL80g7kAALCNWM41oc9Z9HK7JHdLcsHMZgQAsBXrWJiULO+a0F0WPb88k2tEj57NdAAA2BZstAidblJ/7e5+/qD5AACwDdhgEVpV23f35dOFSAAAbAYLPe8ZrAwbS0I/n8n1n6dW1bFJjkpy8bo3u/v9M54bAABbqeVcE3qDJN9N8iv53/1CO4kiFACAq2VjReiu05Xxp+V/i891BMkAAFfDgtXxSTZehK5Kcu1kyT8pRSgAAFfbxorQNd39imEzAQDYBtgndGJjd0zyJwQAwExsrAjdd9gsAADYpmywHd/d3xs5EQCAbcHCvCewQmwsCQUAgJlYzj6hAABsJhYmTUhCAQAYThEKAMBw2vEAAANZmDQhCQUAYDhJKADAQJLQCUkoAADDKUIBABhOOx4AYCD7hE5IQgEAGE4SCgAw0IIgNIkkFACAOVCEAgAwnHY8AMBACxYmJZGEAgAwB5JQAICBet4TWCEkoQAADKcIBQBgOO14AICBFuY9gRVCEgoAwHCKUAAAhtOOBwAYaKHsE5pIQgEAmANFKADAQL0CHstRVauq6otV9eHp6xtU1b9V1demP6+/6NwXVtXXq+orVfWw5Xy/IhQAgKU8K8mZi17/SZLju3vvJMdPX6eq9klyYJI7JHl4kjdU1apNfbkiFACAn1FVeyb59SRvXnR4/ySHT58fnuSARcff190/7e6zknw9yT03NYYiFABgoIUV8Kiqg6vq5EWPg9eb5quTvCA/u63pTbp7TZJMf+46Pb5Hkm8vOm/19NhGWR0PALCN6e7Dkhy21HtVtV
+S87v7C1X1oGV83VLL/Td56akiFABgoIWVv0PTfZM8qqp+Lck1k1ynqt6V5Lyq2q2711TVbknOn56/OsnNFn1+zyTnbmoQ7XgAAK7Q3S/s7j27+5aZLDj6j+4+KMmxSZ40Pe1JSY6ZPj82yYFVdY2q2ivJ3kk+v6lxJKEAACzHK5McWVVPSXJ2kscmSXefXlVHJjkjyeVJntHdazf1ZYpQAICBFpa8hHJl6u6PJ/n49Pl3k+y7gfMOTXLoVflu7XgAAIaThAIADLTcOxZt7SShAAAMpwgFAGA47XgAgIG2gH1Ch5CEAgAwnCQUAGCghU2fsk2QhAIAMJwiFACA4bTjAQAGsk/ohCQUAIDhJKEAAAPZomlCEgoAwHCKUAAAhtOOBwAYyD6hE5JQAACGU4QCADCcdjwAwEDa8ROSUAAAhpOEAgAM1PYJTSIJBQBgDhShAAAMpx0PADCQhUkTklAAAIaThAIADCQJnZCEAgAwnCIUAIDhtOMBAAbqeU9ghZCEAgAwnCQUAGCgBXdMSiIJBQBgDhShAAAMpx0PADCQfUInJKEAAAwnCQUAGEgSOiEJBQBgOEUoAADDaccDAAzkjkkTklAAAIZThAIAMJx2PADAQG7bOSEJBQBgOEkoAMBA9gmdkIQCADCcIhQAgOG04wEABrJP6IQkFACA4SShAAADLchCk0hCAQCYA0UoAADDaccDAAxkn9AJSSgAAMNJQgEABrIsaUISCgDAcIpQAACG044HABjIwqQJSSgAAMNJQgEABlqoec9gZZCEAgAwnCIUAIDhtOMBAAZasFNoEkkoAABzIAkFABhIDjohCQUAYDhFKAAAw2nHAwAM5I5JE5JQAACGU4QCADCcdjwAwED2CZ2QhAIAMJwkFABgIDnohCQUAIDhFKEAAAynHQ8AMJB9QickoQAADCcJBQAYyBZNE5JQAACGU4QCADCcdjwAwECa8ROSUAAAhpOEAgAMZIumCUkoAADDKUIBABhOOx4AYKC2NCmJJBQAgDmQhAIADGRh0oQkFACA4RShAAAMpx0PADDQgoVJSSShAADMgSQUAGAgOeiEJBQAgOEUoQAADKcdDwAwkIVJE5JQAACGU4QCADCcdjwAwEBu2zkhCQUAYDhFKCvWda67S950+Kty4uc/nBM+96Hc/Zfukpe84nk58fMfzvGf+kDe+q7X5jrX3WXe0wRWmIc99EE5/bQT8l9nfDIveP4z5j0duJJeAf9bCRShrFh//soX5mP//snc/577Zd/7PTpf++o3c8LHPp0H3Xv/7Hvf38g3vv6tPPPZT5v3NIEVZLvttstrX3No9nvkQbnTXR6cxz/+gNz+9nvPe1rAEhShrEjX3uVaudd97pH3vPPoJMlll12WH/3wonziY5/O2rVrkySnnPyl7L77Tec5TWCFuecv/WK+8Y1v5ayzzs5ll12WI488Jo965MPmPS1gCYpQVqRb3PJm+e6F38ur33Bojjvh6PzNa1+RnXbe6WfOOfCgR+c//v3EOc0QWIl23+Om+fbqc694vfqcNf5jlRVnYQU8VoKZFqFV9QtVdXxVnTZ9feeqevEsx2TrsP2qVbnTXfbJ4W85Ig99wG/mkv+5JM989lOveP9Zz3161l6+Nkcf+aE5zhJYaarqSse6V8b1b8DPmnUS+qYkL0xyWZJ0938mOXBDJ1fVwVV1clWd/D+Xfn/GU2MlO/fc87Lm3PPyxS/8Z5Lkw8cclzvdeZ8kyWOfsH8e8rAH5hlPe8E8pwisQOesXpOb7bn7Fa/33GO3rFlz3hxnBFc270VJ28rCpJ27+/PrHbt8Qyd392HdfY/uvsfOO15/xlNjJbvg/Atz7urv5Na3uWWS5H4PvFe++pVv5MH73i9/+Kyn5nef8IxccslP5jtJYMU56eRTc5vb7JVb3vJm2WGHHfK4x+2fD334uHlPC1jCrDerv7Cqbp1MSu6qekySNTMek63Ei/740Lz+Tf8vO+y4Q87+1uoc8g
cvyr987MjsuOMOed8H35IkOeWkL+WPn/Nnc54psFKsXbs2zzrkxfnnj7wnq7bbLm8//IicccZX5z0tYAk1y2tlqupWSQ5Lcp8k309yVpIndvd/b+qzu11vn5WRFQMr3gX/88N5TwHYQlx+6TlXvnB4sCfd8jfnXuMc/q2j5/7nMOsk9Prd/ZCqulaS7br7oqp6ZJJNFqEAAGy9Zr4wqaru1N0XTwvQA5NYHQ8AbLMWuuf+WAlmnYQ+Jsk/VdUTk9wvye8keeiMxwQAYIWbaRHa3d+cpp8fTPLtJA/t7ktmOSYAACvfTIrQqvpy8jObUN0gyaokn6uqdPedZzEuAMBKtzKa4fM3qyR0vxl9LwAAW4GZFKGLt2Cqqrskuf/05Ynd/aVZjAkAsCVYkIUmmf2945+V5N1Jdp0+3lVVz5zlmAAArHyzXh3/lCS/3N0XJ0lV/VWSzyT5+xmPCwDACjbrIrSSrF30eu30GADANqm145PMvgh9ayYr4j8wfX1AkrfMeEwAAFa4mRWhVbVdks8l+UQmG9VXkid39xdnNSYAAFuGmRWh3b1QVX/b3fdOcsqsxgEA2JIszHsCK8Ss7x1/XFX9ZlW5DhQAgCvM+prQ5yS5VpLLq+onmbTku7uvM+NxAQBWJPuETsz63vG7zPL7AQDYMs16s/rjl3MMAIBty0yS0Kq6ZpKdk9yoqq6f/90b9DpJdp/FmAAAWwL7hE7Mqh3/9CSHZFJwfmHR8YuSvH5GYwIAsIWYVRH66SRHJnlMd/99VT0pyW8m+VaS98xoTACAFc8WTROzuib0jUl+Oi1AH5DkL5McnuSHSQ6b0ZgAAGwGVXWzqvpYVZ1ZVadX1bOmx29QVf9WVV+b/rz+os+8sKq+XlVfqaqHbWqMWRWhq7r7e9Pnj09yWHcf3d0vSXKbGY0JAMDmcXmS53b37ZPcK8kzqmqfJH+S5Pju3jvJ8dPXmb53YJI7JHl4kjdU1aqNDTCzIrSq1rX6903yH4vem/XepAAAK1Z3z/2xjDmu6e5Tps8vSnJmkj2S7J9JdzvTnwdMn++f5H3d/dPuPivJ15Pcc2NjzKoIfW+ST1TVMUkuSXJiklTVbTJpyQMAMCdVdXBVnbzocfBGzr1lkl9M8rkkN+nuNcmkUE2y6/S0PZJ8e9HHVk+PbdBMUsnuPnS6H+huSY7r/y25t0vyzFmMCQCwJVgJd0zq7sOyjHU6VXXtJEcnOaS7f7SRO7Ev9cZGf9GZtca7+7NLHPvqrMYDAGDzqaodMilA393d758ePq+qduvuNVW1W5Lzp8dXJ7nZoo/vmeTcjX3/TO+YBADAlqcmkedbkpzZ3X+36K1jkzxp+vxJSY5ZdPzAqrpGVe2VZO8kn9/YGBYJAQAMtIXsE3rfJL+d5MtVder02J8meWWSI6vqKUnOTvLYJOnu06vqyCRnZLKy/hndvXZjAyhCAQD4Gd39ySx9nWcy2floqc8cmuTQ5Y6hCAUAGMi94ydcEwoAwHCKUAAAhtOOBwAYaCXsE7oSSEIBABhOEgoAMNBy7t2+LZCEAgAwnCIUAIDhtOMBAAbaQu6YNHOSUAAAhlOEAgAwnHY8AMBAbts5IQkFAGA4SSgAwEDumDQhCQUAYDhFKAAAw2nHAwAM5LadE5JQAACGk4QCAAxkYdKEJBQAgOEUoQAADKcdDwAwkDsmTUhCAQAYThIKADDQgi2akkhCAQCYA0UoAADDaccDAAykGT8hCQUAYDhJKADAQO6YNCEJBQBgOEUoAADDaccDAAykHT8hCQUAYDhJKADAQO2OSUkkoQAAzIEiFACA4bTjAQAGsjBpQhIKAMBwilAAAIbTjgcAGKi145NIQgEAmANJKADAQPYJnZCEAgAwnCIUAIDhtOMBAAayT+iEJBQAgOEkoQAAA1mYNCEJBQBgOEUoAADDaccDAAxkYdKEJBQAgOEkoQAAA7l3/IQkFACA4RShAAAMpx0PADDQgn1Ck0hCAQCYA0koAM
BAFiZNSEIBABhOEQoAwHDa8QAAA1mYNCEJBQBgOEkoAMBAFiZNSEIBABhOEQoAwHDa8QAAA1mYNCEJBQBgOEUoAADDaccDAAxkdfyEJBQAgOEkoQAAA1mYNCEJBQBgOEUoAADDaccDAAxkYdKEJBQAgOEkoQAAA3UvzHsKK4IkFACA4RShAAAMpx0PADDQgoVJSSShAADMgSQUAGCgdsekJJJQAADmQBEKAMBw2vEAAANZmDQhCQUAYDhJKADAQBYmTUhCAQAYThEKAMBw2vEAAAMtaMcnkYQCADAHilAAAIbTjgcAGKjtE5pEEgoAwBxIQgEABrJP6IQkFACA4RShAAAMpx0PADDQgoVJSSShAADMgSQUAGAgC5MmJKEAAAynCAUAYDjteACAgRa045NIQgEAmANJKADAQBYmTUhCAQAYThEKAMBw2vEAAAO5Y9KEJBQAgOEkoQAAA1mYNCEJBQBgOEUoAADDaccDAAzkjkkTklAAAIaThAIADNS2aEoiCQUAYA4UoQAADKcdDwAwkIVJE5JQAACGU4QCADCcdjwAwEBu2zkhCQUAYDhJKADAQPYJnZCEAgAwnCIUAIDhtOMBAAayMGlCEgoAwHCSUACAgSShE5JQAACGU4QCAHAlVfXwqvpKVX29qv5kc3+/djwAwEBbQjO+qlYleX2SX02yOslJVXVsd5+xucaQhAIAsL57Jvl6d3+zuy9N8r4k+2/OAVZsErrmB2fUvOfAylNVB3f3YfOeB7Dy+feCleryS8+Ze41TVQcnOXjRocPW+/uyR5JvL3q9Oskvb845SELZ0hy86VMAkvj3Ajaouw/r7nsseqz/H2xLFcqb9UoCRSgAAOtbneRmi17vmeTczTmAIhQAgPWdlGTvqtqrqnZMcmCSYzfnACv2mlDYANd3Acvl3wu4mrr78qr6wyQfTbIqyVu7+/TNOUbZtR8AgNG04wEAGE4RCgDAcIpQNpuq6qr620Wvn1dVL9/EZw6oqn028N5tq+rjVXVqVZ1ZVYdNj9+1qn7taszv7VX1mKv6OWC+qupFVXV6Vf3n9N+DX66qQ6pq56v4PbesqtNmNU/gqlGEsjn9NMmjq+pGV+EzByRZsghN8tokr+ruu3b37ZP8/fT4XZMsWYRWlcV2sBWpqnsn2S/J3br7zkkekskG2ockWbIInd5uEFjhFKFsTpdnshr12eu/UVW3qKrjp0nG8VV186q6T5JHJfnrabpx6/U+tlsm+5QlSbr7y9NtIl6R5PHTzzy+ql5eVYdV1XFJ3rHUWEvM58+nyeh2VfX8qjppev6fbcY/D+Dnt1uSC7v7p0nS3RcmeUyS3ZN8rKo+liRV9eOqekVVfS7JvavqOVV12vRxyPpfWlW3qqovVtUvVdWtq+pfq+oLVXViVd1u3K8H2y5FKJvb65M8saquu97x1yV5xzTJeHeS13b3pzPZc+z507TzG+t95lVJ/qOq/qWqnl1V15vev/alSY6YfuaI6bl3T7J/d//WUmMt/tKq+n9Jdk3y5ExSlb0zuUfuXZPcvaoesBn+HIDN47gkN6uqr1bVG6rqgd392kw2zX5wdz94et61kpzW3b+c5JJM/n7/cpJ7JXlaVf3iui+sqtsmOTrJk7v7pEz+4/mZ3X33JM9L8oZRvxxsyxShbFbd/aMk70jyR+u9de8k75k+f2eS+y3ju96W5PZJjkryoCSfraprbOD0Y7v7kmWM9ZIk1+vup/dkf7KHTh9fTHJKkttlUpQCK0B3/ziT/8g8OMkFSY6oqt9d4tS1mRSWyeTv/Ae6++Lp59+f5P7T926c5JgkB3X3qVV17ST3SXJUVZ2a5I2ZpK/AjLl+jll4dSYF3ds2cs6yNqjt7nOTvDXJW6cLCu64gVMvXuZYJ2WSdt6gu7+Xyb1x/7K737ic+QDjdffaJB9P8vGq+nKSJy1x2k+m5yVL3/N6nR9mck3pfZOcnkkY84PuvutmmzCwLJJQNrtpcXdkkqcsOv
zpTG75lSRPTPLJ6fOLkuyy1PdU1cOraofp85smuWGSczb2mU2MlST/muSVST5SVbtkcieI35umIamqPapq12X8msAA010yFncn7prkv7PxfwdOSHJAVe1cVddK8htJTpy+d2kmCyJ/p6p+a9q9OauqHjsdr6rqLpv/NwHWpwhlVv42yeJV8n+U5MlV9Z9JfjvJs6bH35fk+dMFAusvTHpoktOq6kuZFIvP7+7vJPlYkn3WLUxaYuwNjZUk6e6jkrwpk+tRT8ykdf+ZacLyT9l4gQuMde0kh1fVGdO/0/skeXkm13H+y7qFSYt19ylJ3p7k80k+l+TN3f3FRe9fnMmK+2dX1f6Z/MfqU6b/1pyeZP+Z/kZAErftBABgDiShAAAMpwgFAGA4RSgAAMMpQgEAGE4RCgDAcIpQ4CqpqrXT7bFOq6qjqmrnn+O73l5Vj5k+f3NV7bORcx9UVfe5GmN8q6putOkzk6r63ap63VUdA4CrThEKXFWXdPddu/uOmWz8/fuL36yqVVfnS7v7qd19xkZOeVAmt1cEYCugCAV+Hicmuc00pfxYVb0nyZeralVV/XVVnVRV/1lVT0+uuBvN66Ybj38kyRV3p6qqj1fVPabPH15Vp1TVl6rq+Kq6ZSbF7rOnKez9q+rGVXX0dIyTquq+08/esKqOm94A4Y3ZwC0c1x9jifcfWVWfm37Pv1fVTabHHzidw6nT93apqt2q6oRFCfH9rzwiAIu5dzxwtVTV9kkekcmtUJPknknu2N1nVdXBSX7Y3b9UVddI8qmqOi7JLya5bZI7JblJkjOSvHW9771xJne0esD0u27Q3d+rqn9M8uPu/pvpee9J8qru/mRV3TyTu2rdPsnLknyyu19RVb+e5OAl5n6lMZb4FT+Z5F7d3VX11CQvSPLcJM9L8ozu/tT0dq8/mY7x0e4+dJoEX+1LFAC2FYpQ4KraqapOnT4/MclbMmmTf767z5oef2iSO6+73jPJdZPsneQBSd7b3WuTnFtV/7HE998ryQnrvqu7v7eBeTwkk9u3rnt9naraZTrGo6ef/UhVff9qjrFnkiOqarckOyZZ97t9KsnfVdW7k7y/u1dX1UlJ3lpVOyT5YHefusT3AbCIdjxwVa27JvSu3f3M7r50evziRedUkmcuOm+v7j5u+t6m7hVcyzgnmfz7de9FY+zR3RdtxjH+PsnruvtOSZ6e5JpJ0t2vTPLUJDsl+WxV3a67T8ik+D0nyTur6neWMX+AbZoiFJiFjyb5P9NkMFX1C1V1rSQnJDlwes3obkkevMRnP5PkgVW11/Sz61rlFyXZZdF5xyX5w3Uvququ06cnJHni9Ngjklz/Koyx2HUzKSqT5EmLxrl1d3+5u/8qyclJbldVt0hyfne/KZNk+G5LfB8AiyhCgVl4cybXe55SVacleWMml/98IMnXknw5yT8k+cT6H+zuCzK5xvL9VfWlJEdM3/pQkt9YtzApyR8lucd04dMZ+d9V+n+W5AFVdUomlwWcfRXGWOzlSY6qqhOTXLjo+CHTxUdfSnJJkn/JZOX+qVX1xSS/meQ1m/4jAti2Vfdyul4AALD5SEIBABhOEQoAwHCKUAAAhlOEAgAwnCIUAIDhFKEAAAynCAUAYLj/D43LQH90EA/eAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 864x864 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Random Forest: 62\n",
"0.9393346379647749\n",
" precision recall f1-score support\n",
"\n",
" 0 0.94 1.00 0.97 960\n",
" 1 0.00 0.00 0.00 62\n",
"\n",
" accuracy 0.94 1022\n",
" macro avg 0.47 0.50 0.48 1022\n",
"weighted avg 0.88 0.94 0.91 1022\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\Mishane\\anaconda3\\lib\\site-packages\\sklearn\\metrics\\_classification.py:1245: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.\n",
" _warn_prf(average, modifier, msg_start, len(result))\n",
"C:\\Users\\Mishane\\anaconda3\\lib\\site-packages\\sklearn\\metrics\\_classification.py:1245: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.\n",
" _warn_prf(average, modifier, msg_start, len(result))\n",
"C:\\Users\\Mishane\\anaconda3\\lib\\site-packages\\sklearn\\metrics\\_classification.py:1245: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.\n",
" _warn_prf(average, modifier, msg_start, len(result))\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 648x504 with 0 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#Evaluating the classifier\n",
"#printing every score of the classifier\n",
"#scoring in any thing\n",
"from sklearn.metrics import classification_report, accuracy_score,precision_score,recall_score,f1_score,matthews_corrcoef\n",
"from sklearn.metrics import confusion_matrix\n",
"n_outliers = len(Stroke)\n",
"n_errors = (y_pred != Y_test).sum()\n",
"print(\"The model used is Random Forest classifier\")\n",
"acc= accuracy_score(Y_test,y_pred)\n",
"print(\"The accuracy is {}\".format(acc))\n",
"prec= precision_score(Y_test,y_pred)\n",
"print(\"The precision is {}\".format(prec))\n",
"rec= recall_score(Y_test,y_pred)\n",
"print(\"The recall is {}\".format(rec))\n",
"f1= f1_score(Y_test,y_pred)\n",
"print(\"The F1-Score is {}\".format(f1))\n",
"MCC=matthews_corrcoef(Y_test,y_pred)\n",
"print(\"The Matthews correlation coefficient is {}\".format(MCC))\n",
"\n",
"\n",
"#printing the confusion matrix\n",
"LABELS = ['Not Stroke', 'Stroke']\n",
"conf_matrix = confusion_matrix(Y_test, y_pred)\n",
"plt.figure(figsize=(12, 12))\n",
"sns.heatmap(conf_matrix, xticklabels=LABELS, yticklabels=LABELS, annot=True, fmt=\"d\");\n",
"plt.title(\"Confusion matrix\")\n",
"plt.ylabel('True class')\n",
"plt.xlabel('Predicted class')\n",
"plt.show()\n",
"\n",
"# Run classification metrics\n",
"plt.figure(figsize=(9, 7))\n",
"print('{}: {}'.format(\"Random Forest\", n_errors))\n",
"print(accuracy_score(Y_test, y_pred))\n",
"print(classification_report(Y_test, y_pred))"
]
},
{
"cell_type": "markdown",
"id": "b07d65d8",
"metadata": {},
"source": [
"### Saving the model"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "02faf4e6",
"metadata": {},
"outputs": [],
"source": [
"import pickle\n",
"filename = 'finalized_model_disease_classification_LR.sav'\n",
"pickle.dump(rfc, open(filename, 'wb'))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
},
"vscode": {
"interpreter": {
"hash": "d9a6414fa631c028c434667d182c0b79dc634ffcd06f52fc304061a2c0b9ef26"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}