Commit 2d01e32f authored by Mihiranga G.L.V - IT18500790

Merge branch 'master' into IT18500790

parents 130cf6f3 6941d901
@@ -10,3 +10,37 @@
Passengers cannot get information about train stopping stations, facilities, and the duration of the trip. Railway passengers cannot get the exact information they need about the places they plan to visit.
When booking a ticket, passengers cannot choose the seat they want, and there is no process to suggest the most preferred seating place. This may cause personal conflicts between closely seated passengers due to differences in personal interests.
It is difficult to find attractive places located near the trip route, so unpopular attractions are missed by many travelers. This affects both traveling passengers and the tourism industry.
**Individual Research Questions**
***IT18500790***
Some travelers do not know much about trains and attractive destinations to choose from. At present, there is no application to guide tourists to the destinations of their choice, so passengers need to spend more time finding places to visit, wasting time that could be spent on their enjoyment.
***IT18085822***
Passengers cannot get information about train stopping stations, facilities, and the duration of the trip. Railway passengers cannot get the exact information they need about the places they plan to visit.
***IT18001280***
Providing a facility to view available seats and to suggest the most suitable seating place for a particular passenger, together with the option to choose a seating place according to their preference.
***IT18148282***
Suggesting the best places to visit according to the passenger's personal trip plan, using the passenger's relevant information to generate the most suitable suggestions.
**Individual Objectives**
***IT18500790***
Providing the most suitable train plan to the passenger according to their needs by using a machine learning algorithm.
***IT18085822***
A machine learning-based chatbot app to interact with the user 24x7, providing relevant information such as train facilities and details of the places suggested by the trip schedule, in response to user queries.
***IT18001280***
Sequentially predict the most suitable seat for the passenger by using a machine learning algorithm.
***IT18148282***
Suggest the best places to visit for the passenger using a machine learning algorithm by gathering relevant data from the railway passengers.
**Solution**
Passengers can select their train schedule for traveling, but sometimes they miss out on places of their choice. This can be reduced by predicting the trip schedule they want.
Presently, passengers who want information about a particular train have no easy way to get it; the system solves this by introducing a new machine learning-based chatbot app. Users can get information about a specific location through the chatbot application.
The system provides a facility to view available seats and suggests the most suitable seating place for a particular passenger, along with the option to choose a seating place according to their preference (a condensed code sketch follows below).
It also suggests the best places to visit according to the passenger's personal trip plan, using the passenger's relevant information to generate the most suitable suggestions.
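As a rough sketch of the seat-suggestion component: it mirrors the notebook included later in this commit, and assumes the Railway_Passenger_Final.csv file with the columns used there.

```python
import pandas as pd
from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Columns follow Railway_Passenger_Final.csv as used in the notebook below.
data = pd.read_csv("Railway_Passenger_Final.csv")
X = data.drop(["SeatLine"], axis=1)   # Age, Country, Disability, Class, GenderNo, LineNo
y = data["SeatLine"]                  # seat line to suggest

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)
model = RandomForestClassifier().fit(X_train, y_train)
print("Accuracy:", metrics.accuracy_score(y_test, model.predict(X_test)))
```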
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "b2b62b44",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"from tqdm.notebook import tqdm\n",
"from collections import Counter\n",
"from sklearn import metrics\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.model_selection import train_test_split"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "11016a95",
"metadata": {},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings('ignore')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "cec68912",
"metadata": {},
"outputs": [],
"source": [
"data = pd.read_csv(\"Railway_Passenger_Final.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0870d81a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Country</th>\n",
" <th>Disability</th>\n",
" <th>Class</th>\n",
" <th>GenderNo</th>\n",
" <th>LineNo</th>\n",
" <th>SeatLine</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>38</td>\n",
" <td>169</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>33</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>52</td>\n",
" <td>186</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>6</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>52</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>6</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>33</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7387</th>\n",
" <td>56</td>\n",
" <td>9</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7388</th>\n",
" <td>48</td>\n",
" <td>165</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7389</th>\n",
" <td>16</td>\n",
" <td>77</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>4</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7390</th>\n",
" <td>14</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7391</th>\n",
" <td>38</td>\n",
" <td>77</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>9</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>7392 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" Age Country Disability Class GenderNo LineNo SeatLine\n",
"0 38 169 0 1 1 3 6\n",
"1 33 165 0 2 1 1 4\n",
"2 52 186 0 1 2 6 8\n",
"3 52 165 0 2 2 6 8\n",
"4 33 165 0 2 1 1 4\n",
"... ... ... ... ... ... ... ...\n",
"7387 56 9 1 2 2 1 1\n",
"7388 48 165 1 1 1 5 6\n",
"7389 16 77 0 1 1 4 10\n",
"7390 14 165 0 1 1 5 4\n",
"7391 38 77 0 1 1 5 9\n",
"\n",
"[7392 rows x 7 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "dc782f98",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Country</th>\n",
" <th>Disability</th>\n",
" <th>Class</th>\n",
" <th>GenderNo</th>\n",
" <th>LineNo</th>\n",
" <th>SeatLine</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [Age, Country, Disability, Class, GenderNo, LineNo, SeatLine]\n",
"Index: []"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[pd.isnull(data).any(axis=1)]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b7d860da",
"metadata": {},
"outputs": [],
"source": [
"Y = data.SeatLine.copy()\n",
"X = data.drop(['SeatLine'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b85ff8d6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4 1928\n",
"7 1842\n",
"5 1015\n",
"6 915\n",
"3 742\n",
"8 679\n",
"2 76\n",
"9 68\n",
"10 64\n",
"1 63\n",
"Name: SeatLine, dtype: int64"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data['SeatLine'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d498a8ce",
"metadata": {},
"outputs": [],
"source": [
"def pearson(X,Y):\n",
" correlation_matrix = np.corrcoef(X,Y)\n",
" return correlation_matrix[0,1]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c041cb4c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-0.0035806135588234236\n",
"-0.009099476534655794\n",
"0.01109999125101634\n",
"0.7410300176783485\n"
]
}
],
"source": [
"print(pearson(X.Age, Y))\n",
"print(pearson(X.Country, Y))\n",
"print(pearson(X.Disability, Y))\n",
"print(pearson(X.GenderNo, Y))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "094a84fa",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-0.09118144742552847\n",
"-0.627520395988806\n",
"0.0036254166615416767\n",
"0.6412361822996381\n"
]
}
],
"source": [
"print(np.cov(X.Age, Y)[0,1])\n",
"print(np.cov(X.Country, Y)[0,1])\n",
"print(np.cov(X.Disability, Y)[0,1])\n",
"print(np.cov(X.GenderNo, Y)[0,1])"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c001994a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Age</th>\n",
" <th>Country</th>\n",
" <th>Disability</th>\n",
" <th>Class</th>\n",
" <th>GenderNo</th>\n",
" <th>LineNo</th>\n",
" <th>SeatLine</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>38</td>\n",
" <td>169</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>33</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>52</td>\n",
" <td>186</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>6</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>52</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>6</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>33</td>\n",
" <td>165</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Age Country Disability Class GenderNo LineNo SeatLine\n",
"0 38 169 0 1 1 3 6\n",
"1 33 165 0 2 1 1 4\n",
"2 52 186 0 1 2 6 8\n",
"3 52 165 0 2 2 6 8\n",
"4 33 165 0 2 1 1 4"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.head()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "0de26781",
"metadata": {},
"outputs": [],
"source": [
"finalFeaturedDataset = data[['Age', 'Country','Disability','Class','GenderNo']]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "0042371f",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import MinMaxScaler\n",
"scaler = MinMaxScaler(feature_range=(0,1)) \n",
"\n",
"#assign scaler to column:\n",
"data = pd.DataFrame(scaler.fit_transform(finalFeaturedDataset), columns=finalFeaturedDataset.columns)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "c5b4937c",
"metadata": {},
"outputs": [],
"source": [
"X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.20, random_state=123)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "bb0a6886",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.svm import SVC, LinearSVC"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "da921c77",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "1dad2d9530bd43c7b6abdadac29fdfa4",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0/10 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#X_train, X_test, Y_train, Y_test,Y_pred\n",
"\n",
"linear_svc = LinearSVC()\n",
"for i in tqdm(range(10)):\n",
" linear_svc.fit(X_train, Y_train)\n",
"\n",
"Y_pred = linear_svc.predict(X_test)\n",
"\n",
"acc_linear_svc = metrics.accuracy_score(Y_test, Y_pred)\n",
"pre_linear_svc = metrics.precision_score(Y_test,Y_pred, average='macro')\n",
"recall_linear_svc = metrics.recall_score(Y_test,Y_pred, average='macro')\n",
"f1_linear_svc = metrics.f1_score(Y_test,Y_pred, average='macro')"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "a79ef29a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accuracy: 0.2738336713995943\n",
"Precision: 0.35832646520146516\n",
"Recall: 0.11236006475404055\n",
"f1-score: 0.06488542557685116\n"
]
}
],
"source": [
"print(\"Accuracy:\", metrics.accuracy_score(Y_test, Y_pred))\n",
"print(\"Precision:\", metrics.precision_score(Y_test, Y_pred,average='macro'))\n",
"print(\"Recall:\", metrics.recall_score(Y_test, Y_pred,average='macro'))\n",
"print(\"f1-score:\", metrics.f1_score(Y_test,Y_pred, average='macro'))"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "ee385170",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.neighbors import KNeighborsClassifier"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "a1bb8f06",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0df9475381644df594ce15d28dfe72a5",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0/10 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#X_train, X_test, Y_train, Y_test\n",
"\n",
"knn = KNeighborsClassifier(n_neighbors = 2)\n",
"for i in tqdm(range(10)):\n",
" knn.fit(X_train, Y_train) \n",
"Y_pred = knn.predict(X_test)\n",
"\n",
"\n",
"acc_knn = metrics.accuracy_score(Y_test, Y_pred)\n",
"pre_knn = metrics.precision_score(Y_test,Y_pred, average='macro')\n",
"recall_knn = metrics.recall_score(Y_test,Y_pred, average='macro')\n",
"f1_knn = metrics.f1_score(Y_test,Y_pred, average='macro')"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "f25572c9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accuracy: 0.9350912778904665\n",
"Precision: 0.8078615566477673\n",
"Recall: 0.717261850491169\n",
"f1-score: 0.7114544346026009\n"
]
}
],
"source": [
"print(\"Accuracy:\", metrics.accuracy_score(Y_test, Y_pred))\n",
"print(\"Precision:\", metrics.precision_score(Y_test, Y_pred,average='macro'))\n",
"print(\"Recall:\", metrics.recall_score(Y_test, Y_pred,average='macro'))\n",
"print(\"f1-score:\", metrics.f1_score(Y_test,Y_pred, average='macro'))"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "ac6748e8",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.ensemble import RandomForestClassifier"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "2f48f36e",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "d2d315141cb1411ebf65727eae04efc9",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0/10 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rf = RandomForestClassifier()\n",
"for i in tqdm(range(10)):\n",
" rf.fit(X_train,Y_train)\n",
"Y_pred = rf.predict(X_test)\n",
"\n",
"acc_rf = metrics.accuracy_score(Y_test, Y_pred)\n",
"pre_rf = metrics.precision_score(Y_test,Y_pred, average='macro')\n",
"recall_rf = metrics.recall_score(Y_test,Y_pred, average='macro')\n",
"f1_rf = metrics.f1_score(Y_test,Y_pred, average='macro')"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "fc1ebd91",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accuracy: 0.9445571331981069\n",
"Precision: 0.7413751655908479\n",
"Recall: 0.7385777558448817\n",
"f1-score: 0.7368101892594321\n"
]
}
],
"source": [
"print(\"Accuracy:\", metrics.accuracy_score(Y_test, Y_pred))\n",
"print(\"Precision:\", metrics.precision_score(Y_test, Y_pred,average='macro'))\n",
"print(\"Recall:\", metrics.recall_score(Y_test, Y_pred,average='macro'))\n",
"print(\"f1-score:\", metrics.f1_score(Y_test,Y_pred, average='macro'))"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "e7086868",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'acc_log' is not defined",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-24-d117186c3a1d>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 1\u001b[0m results = pd.DataFrame({\n\u001b[0;32m 2\u001b[0m \u001b[1;34m'Model'\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;33m[\u001b[0m\u001b[1;34m'Support Vector Machines'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'KNN'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'Logistic Regression'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Random Forest'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[1;34m'Accuracy'\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0macc_linear_svc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0macc_knn\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0macc_log\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0macc_rf\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 4\u001b[0m \u001b[1;34m'Precission'\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mpre_linear_svc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpre_knn\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpre_log\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpre_rf\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;34m'Recall'\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mrecall_linear_svc\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mrecall_knn\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mrecall_log\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mrecall_rf\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mNameError\u001b[0m: name 'acc_log' is not defined"
]
}
],
"source": [
"results = pd.DataFrame({\n",
" 'Model': ['Support Vector Machines', 'KNN', 'Random Forest'],\n",
" 'Accuracy': [acc_linear_svc, acc_knn, acc_rf],\n",
" 'Precission': [pre_linear_svc, pre_knn, pre_log, pre_rf],\n",
" 'Recall': [recall_linear_svc, recall_knn, recall_log, recall_rf],\n",
" 'F1-Score': [f1_linear_svc, f1_knn, f1_log, f1_rf]})\n",
"\n",
"result_df = results.sort_values(by='Accuracy', ascending=False)\n",
"result_df = result_df.set_index('Accuracy')\n",
"result_df"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "edd5c2db",
"metadata": {},
"outputs": [],
"source": [
"results= pd.DataFrame({'Model': ['S V M', 'KNN', 'Logistic R','Random Forest'], 'Score': [acc_linear_svc, acc_knn, acc_log, acc_rf ]})\n",
"\n",
"ax = results.plot.bar(x='Model', y='Score', rot=90)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f3148b25",
"metadata": {},
"outputs": [],
"source": [
"#Save the Decision Tree strained modelusing pickle\n",
"import pickle\n",
"with open('ab_classifier_Random_forest', 'wb') as picklefile:\n",
" pickle.dump(rf,picklefile)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "698412e2",
"metadata": {},
"outputs": [],
"source": [
"with open('ab_classifier_Random_forest', 'rb') as training_model:\n",
" model6 = pickle.load(training_model)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f5964b95",
"metadata": {},
"outputs": [],
"source": [
"def start_questionnaire():\n",
" my_predictors = []\n",
" parameters=['Age', 'Count','Country','Disability','Class','GenderNo']\n",
" \n",
" print('Input Passenger Information:')\n",
" \n",
" age = input(\"Passenger age: >>> \") \n",
" my_predictors.append(age)\n",
" count = input(\"Passenger Count: >>> \") \n",
" my_predictors.append(count)\n",
" country = input(\"Passenger Country: >>> \") \n",
" my_predictors.append(country)\n",
" disability = input(\"Any Disability: >>> \")\n",
" my_predictors.append(disability)\n",
" classNo = input(\"Choice Class: >>> \")\n",
" my_predictors.append(classNo)\n",
" gender = input(\"Passenger Gender: >>> \")\n",
" my_predictors.append(gender)\n",
" \n",
" my_data = dict(zip(parameters, my_predictors))\n",
" my_df = pd.DataFrame(my_data, index=[0])\n",
" scaler = MinMaxScaler(feature_range=(1,6))\n",
" \n",
" # assign scaler to column:\n",
" my_df_scaled = pd.DataFrame(scaler.fit_transform(my_df), columns=my_df.columns)\n",
" my_y_pred = model6.predict(my_df)\n",
" print('\\n')\n",
" print('Result:')\n",
" print(my_y_pred);"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1b355f4",
"metadata": {},
"outputs": [],
"source": [
"start_questionnaire()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e622dcf5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
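Outside the notebook, the Random Forest pickled above (ab_classifier_Random_forest) can be loaded and queried directly. A minimal sketch, assuming that pickle file exists and using the training column order; the example row values are hypothetical:

```python
import pickle

import pandas as pd

# Load the Random Forest pickled by the notebook above.
with open("ab_classifier_Random_forest", "rb") as f:
    model = pickle.load(f)

# One hypothetical passenger; column names and order must match the training frame.
passenger = pd.DataFrame(
    [[38, 169, 0, 1, 1, 3]],
    columns=["Age", "Country", "Disability", "Class", "GenderNo", "LineNo"],
)
print("Suggested seat line:", model.predict(passenger)[0])
```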
import random
import json

import torch

from model import NeuralNet
from nltk_utils import bag_of_words, tokenize

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

with open('intents.json', 'r') as json_data:
    intents = json.load(json_data)

FILE = "data.pth"
data = torch.load(FILE)

input_size = data["input_size"]
hidden_size = data["hidden_size"]
output_size = data["output_size"]
all_words = data['all_words']
tags = data['tags']
model_state = data["model_state"]

model = NeuralNet(input_size, hidden_size, output_size).to(device)
model.load_state_dict(model_state)
model.eval()

bot_name = "Sam"
print("Let's chat! (type 'quit' to exit)")
while True:
    # sentence = "do you use credit cards?"
    sentence = input("You: ")
    if sentence == "quit":
        break

    sentence = tokenize(sentence)
    X = bag_of_words(sentence, all_words)
    X = X.reshape(1, X.shape[0])
    X = torch.from_numpy(X).to(device)

    output = model(X)
    _, predicted = torch.max(output, dim=1)

    tag = tags[predicted.item()]

    probs = torch.softmax(output, dim=1)
    prob = probs[0][predicted.item()]
    if prob.item() > 0.75:
        for intent in intents['intents']:
            if tag == intent["tag"]:
                print(f"{bot_name}: {random.choice(intent['responses'])}")
    else:
        print(f"{bot_name}: I do not understand...")
{
"intents": [
{
"tag": "greeting",
"patterns": [
"Hi",
"Hey",
"How are you",
"Is anyone there?",
"Hello",
"Good day"
],
"responses": [
"Hey :-)",
"Hello, thanks for visiting",
"Hi there, what can I do for you?",
"Hi there, how can I help?"
]
},
{
"tag": "goodbye",
"patterns": ["Bye", "See you later", "Goodbye"],
"responses": [
"See you later, thanks for visiting",
"Have a nice day",
"Bye! Come back again soon."
]
},
{
"tag": "thanks",
"patterns": ["Thanks", "Thank you", "That's helpful", "Thank's a lot!"],
"responses": ["Happy to help!", "Any time!", "My pleasure"]
},
{
"tag": "anuradhapura-places",
"patterns": ["what are the places i can visit in anuradhapura?", "what are the places I can see in Anuradhapura?", "what are the locations I can see in Anuradhapura", "Why am i going to anuradhapuraya?","What are the tourist places in anuradhapura?"],
"responses": ["You can see Sigiriya, Ruwanweliseya, Thuparamaya, Isurumuniya and many other historical places in Anuradhapura"]
},
{
"tag": "create-sigiriya",
"patterns": ["who made sigiriya?", "who created sigiriya?", "who built sigiriya?","who built lion rock?"],
"responses": ["Sigiriya was built by King Kashyapa"]
},
{
"tag": "see-sigiriya",
"patterns": ["what can i see in sigiriya?", "why am i go to sigiriya?", "what are the beautiful places in sigiriya?","why am i go to lion rock?"],
"responses": ["You can see ancient ponds and wall art in the Sigiriya"]
},
{
"tag": "important-sigiriya",
"patterns": ["what is the important of the sigiriya?", "tell me about sigiriya?", "why people like sigiriya?","why is the important of the sigiriya for us?", "tell me about lion rock?","what is sigiriya?","what is the special of the sigiriya?"],
"responses": ["Sigiriya is one of the most valuable historical monuments of Sri Lanka. Referred by locals as the Eighth Wonder of the World this ancient palace and fortress complex has significant archaeological importance and attracts thousands of tourists every year. It is probably the most visited tourist destination of Sri Lanka."]
},
{
"tag": "when-sigiriya",
"patterns": ["when create sigiriya?", "when built sigiriya?", "which year made sigiriya"],
"responses": ["Since the 3th century BC Sigiriya was used as a monastery and after eight centuries it was turned into a royal palace"]
},
{
"tag": "old-sigiriya",
"patterns": ["how old sigiriya?", "how many years sigiriya?"],
"responses": ["Archeological excavations have proven that Sigiriya and its surrounding territories were inhabited for more than 4000 years."]
},
{
"tag": "crowd-sigiriya",
"patterns": ["how many peoples comes to the sigiriya in the one day?", "how many crowd visit to the sigiriya in a day?"],
"responses": ["Around 2000 people come to visit Sigiriya daily."]
},
{
"tag": "ticket-sigiriya",
"patterns": ["what is the ticket price of sigiriya", "entrance fee of sigiriya?"],
"responses": ["You should by a ticket which price of US$30 or 4620 LKR for tourists, or 50 LKR for Sri Lankan citizens."]
},
{
"tag": "heritage-sigiriya",
"patterns": ["is sigiriya world heritage?", "when sigiriya become the heritage?"],
"responses": ["Sigiriya is a UNESCO listed World Heritage Site since 1982."]
},
{
"tag": "station-sigiriya",
"patterns": ["what is the nearest railway station to the sigiriya", "how long so far to closest railway station from sigiriya?","where is sigiriya"],
"responses": ["Habarna is the closet railway station to Sigiriya. It's 15km away from Sigiriya."]
},
{
"tag": "heigh-sigiriya",
"patterns": ["what is the height of sigiriya", "what is the peak of sigiriya?","elevation of sigiriya?","elevation of sigiriya"],
"responses": ["1,144 feet (349 metres) above sea level and is some 600 feet (180 metres) above the surrounding plain."]
},
{
"tag": "why-sigiriya",
"patterns": ["why create sigiriya", "what is reason for make sigiriya?","what is the main purpose of sigiriya","why built sigiriya"],
"responses": ["1In India he raised an army with the intention of returning and retaking the throne of Sri Lanka, which he considered to be rightfully his. Expecting the inevitable return of Moggallana, Kashyapa is said to have built his palace on the summit of Sigiriya as a fortress as well as a pleasure palace."]
}
]
}
import torch
import torch.nn as nn


class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.l1 = nn.Linear(input_size, hidden_size)
        self.l2 = nn.Linear(hidden_size, hidden_size)
        self.l3 = nn.Linear(hidden_size, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        out = self.relu(out)
        out = self.l3(out)
        # no activation and no softmax at the end
        return out
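A quick shape check for NeuralNet. The sizes below are hypothetical, and softmax is applied outside the model (as chat.py does), since forward returns raw logits:

```python
import torch

from model import NeuralNet

# Hypothetical sizes: 54-word vocabulary, 8 hidden units, 13 intent tags.
net = NeuralNet(input_size=54, hidden_size=8, num_classes=13)
x = torch.rand(1, 54)                 # one bag-of-words vector
logits = net(x)                       # raw scores; no softmax inside the model
print(logits.shape)                   # torch.Size([1, 13])
probs = torch.softmax(logits, dim=1)  # probabilities, as computed in chat.py
print(probs.sum().item())             # ~1.0
```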
import numpy as np
import nltk
# nltk.download('punkt')
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()


def tokenize(sentence):
    """
    split sentence into array of words/tokens
    a token can be a word or punctuation character, or number
    """
    return nltk.word_tokenize(sentence)


def stem(word):
    """
    stemming = find the root form of the word
    examples:
    words = ["organize", "organizes", "organizing"]
    words = [stem(w) for w in words]
    -> ["organ", "organ", "organ"]
    """
    return stemmer.stem(word.lower())


def bag_of_words(tokenized_sentence, words):
    """
    return bag of words array:
    1 for each known word that exists in the sentence, 0 otherwise
    example:
    sentence = ["hello", "how", "are", "you"]
    words = ["hi", "hello", "I", "you", "bye", "thank", "cool"]
    bog = [ 0 , 1 , 0 , 1 , 0 , 0 , 0]
    """
    # stem each word
    sentence_words = [stem(word) for word in tokenized_sentence]
    # initialize bag with 0 for each word
    bag = np.zeros(len(words), dtype=np.float32)
    for idx, w in enumerate(words):
        if w in sentence_words:
            bag[idx] = 1

    return bag
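These helpers can be tried on their own; a small sketch (the punkt download noted in the file may be needed once, and the vocabulary list here is made up):

```python
from nltk_utils import bag_of_words, stem, tokenize

tokens = tokenize("Who built Sigiriya?")
print(tokens)                        # ['Who', 'built', 'Sigiriya', '?']
print([stem(t) for t in tokens])     # lower-cased, stemmed tokens

vocab = ["who", "built", "ticket", "price"]   # hypothetical all_words list
print(bag_of_words(tokens, vocab))   # [1. 1. 0. 0.]
```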
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Document</title>
</head>
<body>
</body>
</html>
\ No newline at end of file
import json

import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

from nltk_utils import bag_of_words, tokenize, stem
from model import NeuralNet

with open('intents.json', 'r') as f:
    intents = json.load(f)

all_words = []
tags = []
xy = []
for intent in intents['intents']:
    tag = intent['tag']
    # add to tag list
    tags.append(tag)
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = tokenize(pattern)
        # add to our words list
        all_words.extend(w)
        # add to xy pair
        xy.append((w, tag))

ignore_words = ['?', '.', '!']
all_words = [stem(w) for w in all_words if w not in ignore_words]
all_words = sorted(set(all_words))
tags = sorted(set(tags))
print(tags)

X_train = []
y_train = []
for (pattern_sentence, tag) in xy:
    # X: bag of words for each pattern_sentence
    bag = bag_of_words(pattern_sentence, all_words)
    X_train.append(bag)
    # y: PyTorch CrossEntropyLoss needs only class labels, not one-hot
    label = tags.index(tag)
    y_train.append(label)

X_train = np.array(X_train)
y_train = np.array(y_train)


class ChatDataset(Dataset):
    def __init__(self):
        self.n_samples = len(X_train)
        self.x_data = X_train
        self.y_data = y_train

    # support indexing such that dataset[i] can be used to get i-th sample
    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    # we can call len(dataset) to return the size
    def __len__(self):
        return self.n_samples


num_epochs = 1000
learning_rate = 0.001
batch_size = 8
input_size = len(X_train[0])
hidden_size = 8
output_size = len(tags)
print(input_size, output_size)

dataset = ChatDataset()
train_loader = DataLoader(dataset=dataset,
                          batch_size=batch_size,
                          shuffle=True,
                          num_workers=0)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = NeuralNet(input_size, hidden_size, output_size).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    for (words, labels) in train_loader:
        words = words.to(device)
        labels = labels.to(dtype=torch.long).to(device)

        # Forward pass
        outputs = model(words)
        # if y would be one-hot, we must apply
        # labels = torch.max(labels, 1)[1]
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (epoch+1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

print(f'final loss: {loss.item():.4f}')

data = {
    "model_state": model.state_dict(),
    "input_size": input_size,
    "hidden_size": hidden_size,
    "output_size": output_size,
    "all_words": all_words,
    "tags": tags
}

FILE = "data.pth"
torch.save(data, FILE)

print(f'training complete. file saved to {FILE}')
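To try the bot end to end, train.py has to run first so that data.pth exists for chat.py to load. A hypothetical driver script, assuming both files sit in the working directory:

```python
import subprocess
import sys

# Train the intent classifier; this writes data.pth.
subprocess.run([sys.executable, "train.py"], check=True)

# Start the interactive chat loop, which loads data.pth and intents.json.
subprocess.run([sys.executable, "chat.py"], check=True)
```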