Kasuni IT19154954
2022-005
Commits
55b876db
Commit 55b876db authored Oct 09, 2022 by Kasuni IT19154954
parent
c24e0cd8
Showing 1 changed file with 1053 additions and 0 deletions
Heart_Disease_Prediction.ipynb
0 → 100644
{
"cells": [
{
"cell_type": "markdown",
"id": "cbe9d5a0",
"metadata": {},
"source": [
"# Importing Dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c99e5801",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.metrics import accuracy_score"
]
},
{
"cell_type": "markdown",
"id": "f7b729a7",
"metadata": {},
"source": [
"# Data collection and processing\n",
"## Loading the CSV data into a pandas DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f65bcd5a",
"metadata": {},
"outputs": [],
"source": [
"heart_data = pd.read_csv('heart_disease_data.csv')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9124a24e",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>sex</th>\n",
" <th>cp</th>\n",
" <th>trestbps</th>\n",
" <th>chol</th>\n",
" <th>fbs</th>\n",
" <th>restecg</th>\n",
" <th>thalach</th>\n",
" <th>exang</th>\n",
" <th>oldpeak</th>\n",
" <th>slope</th>\n",
" <th>ca</th>\n",
" <th>thal</th>\n",
" <th>target</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>63</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>145</td>\n",
" <td>233</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>150</td>\n",
" <td>0</td>\n",
" <td>2.3</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>37</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>130</td>\n",
" <td>250</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>187</td>\n",
" <td>0</td>\n",
" <td>3.5</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>41</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>130</td>\n",
" <td>204</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>172</td>\n",
" <td>0</td>\n",
" <td>1.4</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>56</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>120</td>\n",
" <td>236</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>178</td>\n",
" <td>0</td>\n",
" <td>0.8</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>57</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>120</td>\n",
" <td>354</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>163</td>\n",
" <td>1</td>\n",
" <td>0.6</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age sex cp trestbps chol fbs restecg thalach exang oldpeak slope \\\n",
"0 63 1 3 145 233 1 0 150 0 2.3 0 \n",
"1 37 1 2 130 250 0 1 187 0 3.5 0 \n",
"2 41 0 1 130 204 0 0 172 0 1.4 2 \n",
"3 56 1 1 120 236 0 1 178 0 0.8 2 \n",
"4 57 0 0 120 354 0 1 163 1 0.6 2 \n",
"\n",
" ca thal target \n",
"0 0 1 1 \n",
"1 0 2 1 \n",
"2 0 2 1 \n",
"3 0 2 1 \n",
"4 0 2 1 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# PRINT THE FIRST FIVE ROWS OF THE DATASET\n",
"heart_data.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "5954f17a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>sex</th>\n",
" <th>cp</th>\n",
" <th>trestbps</th>\n",
" <th>chol</th>\n",
" <th>fbs</th>\n",
" <th>restecg</th>\n",
" <th>thalach</th>\n",
" <th>exang</th>\n",
" <th>oldpeak</th>\n",
" <th>slope</th>\n",
" <th>ca</th>\n",
" <th>thal</th>\n",
" <th>target</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>298</th>\n",
" <td>57</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>140</td>\n",
" <td>241</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>123</td>\n",
" <td>1</td>\n",
" <td>0.2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>299</th>\n",
" <td>45</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>110</td>\n",
" <td>264</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>132</td>\n",
" <td>0</td>\n",
" <td>1.2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>300</th>\n",
" <td>68</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>144</td>\n",
" <td>193</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>141</td>\n",
" <td>0</td>\n",
" <td>3.4</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>301</th>\n",
" <td>57</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>130</td>\n",
" <td>131</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>115</td>\n",
" <td>1</td>\n",
" <td>1.2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>302</th>\n",
" <td>57</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>130</td>\n",
" <td>236</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>174</td>\n",
" <td>0</td>\n",
" <td>0.0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age sex cp trestbps chol fbs restecg thalach exang oldpeak \\\n",
"298 57 0 0 140 241 0 1 123 1 0.2 \n",
"299 45 1 3 110 264 0 1 132 0 1.2 \n",
"300 68 1 0 144 193 1 1 141 0 3.4 \n",
"301 57 1 0 130 131 0 1 115 1 1.2 \n",
"302 57 0 1 130 236 0 0 174 0 0.0 \n",
"\n",
" slope ca thal target \n",
"298 1 0 3 0 \n",
"299 1 0 3 0 \n",
"300 1 2 3 0 \n",
"301 1 1 3 0 \n",
"302 1 1 2 0 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# PRINT LAST FIVE ROWS OF THE DATASET\n",
"heart_data.tail()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ce7fe832",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(303, 14)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# CHECK THE NUMBER OF ROWS AND COLUMNS IN THE DATASET\n",
"heart_data.shape"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "9acfef84",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 303 entries, 0 to 302\n",
"Data columns (total 14 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 age 303 non-null int64 \n",
" 1 sex 303 non-null int64 \n",
" 2 cp 303 non-null int64 \n",
" 3 trestbps 303 non-null int64 \n",
" 4 chol 303 non-null int64 \n",
" 5 fbs 303 non-null int64 \n",
" 6 restecg 303 non-null int64 \n",
" 7 thalach 303 non-null int64 \n",
" 8 exang 303 non-null int64 \n",
" 9 oldpeak 303 non-null float64\n",
" 10 slope 303 non-null int64 \n",
" 11 ca 303 non-null int64 \n",
" 12 thal 303 non-null int64 \n",
" 13 target 303 non-null int64 \n",
"dtypes: float64(1), int64(13)\n",
"memory usage: 33.3 KB\n"
]
}
],
"source": [
"# GETTING BASIC INFO ON THE DATASET\n",
"heart_data.info()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "791164d0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"age 0\n",
"sex 0\n",
"cp 0\n",
"trestbps 0\n",
"chol 0\n",
"fbs 0\n",
"restecg 0\n",
"thalach 0\n",
"exang 0\n",
"oldpeak 0\n",
"slope 0\n",
"ca 0\n",
"thal 0\n",
"target 0\n",
"dtype: int64"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# CHECKING FOR MISSING VALUES\n",
"heart_data.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "35c10cd1",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>sex</th>\n",
" <th>cp</th>\n",
" <th>trestbps</th>\n",
" <th>chol</th>\n",
" <th>fbs</th>\n",
" <th>restecg</th>\n",
" <th>thalach</th>\n",
" <th>exang</th>\n",
" <th>oldpeak</th>\n",
" <th>slope</th>\n",
" <th>ca</th>\n",
" <th>thal</th>\n",
" <th>target</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>54.366337</td>\n",
" <td>0.683168</td>\n",
" <td>0.966997</td>\n",
" <td>131.623762</td>\n",
" <td>246.264026</td>\n",
" <td>0.148515</td>\n",
" <td>0.528053</td>\n",
" <td>149.646865</td>\n",
" <td>0.326733</td>\n",
" <td>1.039604</td>\n",
" <td>1.399340</td>\n",
" <td>0.729373</td>\n",
" <td>2.313531</td>\n",
" <td>0.544554</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>9.082101</td>\n",
" <td>0.466011</td>\n",
" <td>1.032052</td>\n",
" <td>17.538143</td>\n",
" <td>51.830751</td>\n",
" <td>0.356198</td>\n",
" <td>0.525860</td>\n",
" <td>22.905161</td>\n",
" <td>0.469794</td>\n",
" <td>1.161075</td>\n",
" <td>0.616226</td>\n",
" <td>1.022606</td>\n",
" <td>0.612277</td>\n",
" <td>0.498835</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>29.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>94.000000</td>\n",
" <td>126.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>71.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>47.500000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>120.000000</td>\n",
" <td>211.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>133.500000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>2.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>55.000000</td>\n",
" <td>1.000000</td>\n",
" <td>1.000000</td>\n",
" <td>130.000000</td>\n",
" <td>240.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>153.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.800000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>2.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>61.000000</td>\n",
" <td>1.000000</td>\n",
" <td>2.000000</td>\n",
" <td>140.000000</td>\n",
" <td>274.500000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>166.000000</td>\n",
" <td>1.000000</td>\n",
" <td>1.600000</td>\n",
" <td>2.000000</td>\n",
" <td>1.000000</td>\n",
" <td>3.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>77.000000</td>\n",
" <td>1.000000</td>\n",
" <td>3.000000</td>\n",
" <td>200.000000</td>\n",
" <td>564.000000</td>\n",
" <td>1.000000</td>\n",
" <td>2.000000</td>\n",
" <td>202.000000</td>\n",
" <td>1.000000</td>\n",
" <td>6.200000</td>\n",
" <td>2.000000</td>\n",
" <td>4.000000</td>\n",
" <td>3.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age sex cp trestbps chol fbs \\\n",
"count 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 \n",
"mean 54.366337 0.683168 0.966997 131.623762 246.264026 0.148515 \n",
"std 9.082101 0.466011 1.032052 17.538143 51.830751 0.356198 \n",
"min 29.000000 0.000000 0.000000 94.000000 126.000000 0.000000 \n",
"25% 47.500000 0.000000 0.000000 120.000000 211.000000 0.000000 \n",
"50% 55.000000 1.000000 1.000000 130.000000 240.000000 0.000000 \n",
"75% 61.000000 1.000000 2.000000 140.000000 274.500000 0.000000 \n",
"max 77.000000 1.000000 3.000000 200.000000 564.000000 1.000000 \n",
"\n",
" restecg thalach exang oldpeak slope ca \\\n",
"count 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 \n",
"mean 0.528053 149.646865 0.326733 1.039604 1.399340 0.729373 \n",
"std 0.525860 22.905161 0.469794 1.161075 0.616226 1.022606 \n",
"min 0.000000 71.000000 0.000000 0.000000 0.000000 0.000000 \n",
"25% 0.000000 133.500000 0.000000 0.000000 1.000000 0.000000 \n",
"50% 1.000000 153.000000 0.000000 0.800000 1.000000 0.000000 \n",
"75% 1.000000 166.000000 1.000000 1.600000 2.000000 1.000000 \n",
"max 2.000000 202.000000 1.000000 6.200000 2.000000 4.000000 \n",
"\n",
" thal target \n",
"count 303.000000 303.000000 \n",
"mean 2.313531 0.544554 \n",
"std 0.612277 0.498835 \n",
"min 0.000000 0.000000 \n",
"25% 2.000000 0.000000 \n",
"50% 2.000000 1.000000 \n",
"75% 3.000000 1.000000 \n",
"max 3.000000 1.000000 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# STATISTICAL MEASURES ABOUT THE DATA\n",
"heart_data.describe()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "25be35a3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 165\n",
"0 138\n",
"Name: target, dtype: int64"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# CHECKING THE DISTRIBUTION OF TARGET VARIABLE\n",
"heart_data['target'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "0985647e",
"metadata": {},
"outputs": [],
"source": [
"# 1 --> Defective Heart\n",
"# 0 --> Healthy"
]
},
{
"cell_type": "markdown",
"id": "9783fe5a",
"metadata": {},
"source": [
"## Splitting the Features and Target"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "b42d74b4",
"metadata": {},
"outputs": [],
"source": [
"X = heart_data.drop(columns='target')\n",
"Y = heart_data['target']"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "73d640e5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" age sex cp trestbps chol fbs restecg thalach exang oldpeak \\\n",
"0 63 1 3 145 233 1 0 150 0 2.3 \n",
"1 37 1 2 130 250 0 1 187 0 3.5 \n",
"2 41 0 1 130 204 0 0 172 0 1.4 \n",
"3 56 1 1 120 236 0 1 178 0 0.8 \n",
"4 57 0 0 120 354 0 1 163 1 0.6 \n",
".. ... ... .. ... ... ... ... ... ... ... \n",
"298 57 0 0 140 241 0 1 123 1 0.2 \n",
"299 45 1 3 110 264 0 1 132 0 1.2 \n",
"300 68 1 0 144 193 1 1 141 0 3.4 \n",
"301 57 1 0 130 131 0 1 115 1 1.2 \n",
"302 57 0 1 130 236 0 0 174 0 0.0 \n",
"\n",
" slope ca thal \n",
"0 0 0 1 \n",
"1 0 0 2 \n",
"2 2 0 2 \n",
"3 2 0 2 \n",
"4 2 0 2 \n",
".. ... .. ... \n",
"298 1 0 3 \n",
"299 1 0 3 \n",
"300 1 2 3 \n",
"301 1 1 3 \n",
"302 1 1 2 \n",
"\n",
"[303 rows x 13 columns]\n"
]
}
],
"source": [
"print(X)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "6e358ee2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1\n",
"1 1\n",
"2 1\n",
"3 1\n",
"4 1\n",
" ..\n",
"298 0\n",
"299 0\n",
"300 0\n",
"301 0\n",
"302 0\n",
"Name: target, Length: 303, dtype: int64\n"
]
}
],
"source": [
"print(Y)"
]
},
{
"cell_type": "markdown",
"id": "ac33cbba",
"metadata": {},
"source": [
"## Splitting the Data into Training Data & Test Data"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "0254915d",
"metadata": {},
"outputs": [],
"source": [
"X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=0.2,stratify=Y,random_state=2)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "fbf07416",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(303, 13) (242, 13) (61, 13)\n"
]
}
],
"source": [
"print(X.shape, X_train.shape, X_test.shape)"
]
},
{
"cell_type": "markdown",
"id": "c7923434",
"metadata": {},
"source": [
"## Model Training: Logistic Regression"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "f019f978",
"metadata": {},
"outputs": [],
"source": [
"model = LogisticRegression()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "6250666e",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\SHEHAN\\anaconda3\\lib\\site-packages\\sklearn\\linear_model\\_logistic.py:814: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
"STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
"\n",
"Increase the number of iterations (max_iter) or scale the data as shown in:\n",
" https://scikit-learn.org/stable/modules/preprocessing.html\n",
"Please also refer to the documentation for alternative solver options:\n",
" https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
" n_iter_i = _check_optimize_result(\n"
]
},
{
"data": {
"text/plain": [
"LogisticRegression()"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# TRAINING THE LOGISTIC REGRESSION MODEL WITH TRAINING DATA\n",
"model.fit(X_train,Y_train)"
]
},
{
"cell_type": "markdown",
"id": "42117c2b",
"metadata": {},
"source": [
"## Model Evaluation: Accuracy Score"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "9569f252",
"metadata": {},
"outputs": [],
"source": [
"# ACCURACY ON TRAINING DATA (accuracy_score expects y_true first, then y_pred)\n",
"X_train_prediction = model.predict(X_train)\n",
"training_data_accuracy = accuracy_score(Y_train, X_train_prediction)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "6cc793e3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accuracy of Training Data: 0.871900826446281\n"
]
}
],
"source": [
"print('Accuracy of Training Data:',training_data_accuracy)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "d6e63403",
"metadata": {},
"outputs": [],
"source": [
"# ACCURACY ON THE TEST DATA (accuracy_score expects y_true first, then y_pred)\n",
"X_test_prediction = model.predict(X_test)\n",
"test_data_accuracy = accuracy_score(Y_test, X_test_prediction)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "0cf3a924",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accuracy on the Test Data: 0.8524590163934426\n"
]
}
],
"source": [
"print('Accuracy on the Test Data:',test_data_accuracy)"
]
},
{
"cell_type": "markdown",
"id": "1bea8e84",
"metadata": {},
"source": [
"## Building the Predictive System"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9b0f8fd",
"metadata": {},
"outputs": [],
"source": [
"input_data = (62,0,0,140,268,0,0,160,0,3.6,0,2,2)\n",
"\n",
"# CHANGE THE INPUT DATA TO A NUMPY ARRAY\n",
"input_data_as_numpy_array = np.asarray(input_data)\n",
"\n",
"# RESHAPE THE NUMPY ARRAY TO ONE ROW, SINCE WE ARE PREDICTING FOR A SINGLE SAMPLE\n",
"input_data_reshaped = input_data_as_numpy_array.reshape(1,-1)\n",
"\n",
"prediction = model.predict(input_data_reshaped)\n",
"print(prediction)\n",
"\n",
"if prediction[0] == 0:\n",
"    print('The person does not have heart disease')\n",
"else:\n",
"    print('The person has heart disease')"
]
},
{
"cell_type": "markdown",
"id": "c908d135",
"metadata": {},
"source": [
"## Save the Model with Pickle"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "95690126",
"metadata": {},
"outputs": [],
"source": [
"import pickle"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "231643c2",
"metadata": {},
"outputs": [],
"source": [
"filename = 'Heart_disease_trained_model.pkl'\n",
"# USE A CONTEXT MANAGER SO THE FILE IS CLOSED AFTER WRITING\n",
"with open(filename, 'wb') as f:\n",
"    pickle.dump(model, f)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
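
The notebook ends by pickling the trained model but never loads it back. The following is a minimal sketch of the full save/load round trip, not the author's code. It assumes a plain dictionary (`stand_in_model`, hypothetical, with made-up values) in place of the fitted `LogisticRegression`, and writes to a temporary directory rather than the notebook's working directory. The same `pickle.dump` / `pickle.load` calls should apply to the real model object.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted model; the values are illustrative only
# and are NOT the coefficients learned in the notebook.
stand_in_model = {"name": "stand-in", "coef": [0.1, -0.2, 0.3], "intercept": 0.05}


def save_model(model, path):
    """Serialize `model` to `path`; the with-block closes the file afterwards."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Deserialize a model from `path`. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Assumption: a temporary directory is used here so the sketch leaves no files behind
# in the project; the notebook itself writes to the current working directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Heart_disease_trained_model.pkl")
    save_model(stand_in_model, path)
    file_was_written = os.path.exists(path)
    restored = load_model(path)

# The restored object should compare equal to the one that was saved.
print(restored == stand_in_model)
```

Opening the file in a `with` block makes sure the handle is flushed and closed even if serialization raises. Because unpickling can execute arbitrary code, a saved model file should only be loaded when its origin is trusted.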