23-153 / AAGGY
Commit 81cdf8d9
Authored Nov 03, 2023 by Sajana_it20194130
Parent: 86b0672c
Showing 1 changed file with 42 additions and 0 deletions
NLP.py  0 → 100644  (+42 -0)
# Natural Language Processing for Predicting IoT Network Anomalies

# Importing the libraries
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import joblib
import numpy as np

# Importing the dataset (tab-separated; quoting=3 is csv.QUOTE_NONE, so quotes are kept as literal text)
dataset = pd.read_csv('DatasetRF.tsv', delimiter='\t', quoting=3)

# Cleaning the texts
import re
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

# Build the stemmer and the stopword set once instead of on every word
ps = PorterStemmer()
stop_words = set(stopwords.words('english'))
corpus = []
# Iterate over every row of the 'traffic' column so corpus stays aligned with y
for raw_log in dataset['traffic']:
    # Keep only letters and digits, then lowercase and tokenize on whitespace
    log = re.sub('[^a-zA-Z0-9]', ' ', raw_log)
    log = log.lower()
    log = log.split()
    # Drop stopwords and stem the remaining tokens
    log = [ps.stem(word) for word in log if word not in stop_words]
    log = ' '.join(log)
    corpus.append(log)

# Creating the Bag of Words model (vocabulary capped at the 140 most frequent terms)
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(max_features=140)
X = cv.fit_transform(corpus).toarray()
# Target values come from the second column of the dataset
y = dataset.iloc[:, 1].values

# Training the Random Forest Regression model on the whole dataset
from sklearn.ensemble import RandomForestRegressor
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(X, y)

# Saving the trained model and the fitted vectorizer
joblib.dump(regressor, 'random_forest_model.joblib')
joblib.dump(cv, 'count_vectorizer.joblib')
\ No newline at end of file
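The script saves both the regressor and the fitted vectorizer, which suggests they are meant to be reloaded later to score new traffic logs. The commit does not include that step, so the following is only a minimal sketch of how it might look, assuming new log lines are cleaned exactly as in the training loop. The helper names `clean_log`, `load_artifacts`, and `score_logs` are hypothetical and not part of the repository; the two file names are the ones written by `NLP.py`.

```python
import re
from typing import Callable, Iterable, List, Set


def clean_log(text: str, stop_words: Set[str], stem: Callable[[str], str]) -> str:
    """Apply the same cleaning steps as the training loop in NLP.py (hypothetical helper)."""
    # Replace every character that is not a letter or digit with a space, then lowercase.
    text = re.sub(r"[^a-zA-Z0-9]", " ", text).lower()
    # Drop stopwords, stem what remains, and rejoin into one string.
    return " ".join(stem(word) for word in text.split() if word not in stop_words)


def load_artifacts(model_path: str = "random_forest_model.joblib",
                   vectorizer_path: str = "count_vectorizer.joblib"):
    """Load the regressor and vectorizer saved by NLP.py (hypothetical helper)."""
    import joblib  # imported lazily so clean_log stays usable without joblib installed
    return joblib.load(model_path), joblib.load(vectorizer_path)


def score_logs(lines: Iterable[str], model, vectorizer,
               stop_words: Set[str], stem: Callable[[str], str]) -> List[float]:
    """Return one predicted value per raw log line (hypothetical helper)."""
    cleaned = [clean_log(line, stop_words, stem) for line in lines]
    # transform (not fit_transform): reuse the vocabulary learned during training.
    features = vectorizer.transform(cleaned).toarray()
    return [float(value) for value in model.predict(features)]
```

To use it with the saved artifacts, one would call `model, cv = load_artifacts()` and then `score_logs(new_lines, model, cv, set(stopwords.words('english')), PorterStemmer().stem)`, with `stopwords` and `PorterStemmer` imported from NLTK as in the training script. Passing the stopword set and the stemmer in as arguments keeps the cleaning identical between training and inference, which matters because the vectorizer only recognizes tokens in the stemmed form it saw during fitting.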