Commit 74bc48f7 authored by Chamila Dilshan

Code 4

parent df9a97b9
plt.figure(figsize=(20,15))
plt.xticks(rotation=90)
ax=sns.countplot(x="Category", data=df)
plt.grid()
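cat = df['Category'].value_counts() # Series: index holds category names, values hold counts
plt.figure(figsize=(20,12))
plt.pie(cat.values, labels=cat.index, autopct='%.2f%%') # avoids relying on reset_index() column names, which differ across pandas versions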
plt.title('Category Distribution')
plt.show()
df['Resume']
0 Skills * Programming Languages: Python (pandas...
1 Education Details \r\nMay 2013 to May 2017 B.E...
2 Areas of Interest Deep Learning, Control Syste...
3 Skills • R • Python • SAP HANA • Table...
4 Education Details \r\n MCA YMCAUST, Faridab...
...
957 Computer Skills: • Proficient in MS office (...
958 ❖ Willingness to accept the challenges. ❖ ...
959 PERSONAL SKILLS • Quick learner, • Eagerne...
960 COMPUTER SKILLS & SOFTWARE KNOWLEDGE MS-Power ...
961 Skill Set OS Windows XP/7/8/8.1/10 Database MY...
Name: Resume, Length: 962, dtype: object
Basic Data Cleaning
Remove blank lines and \r\n line breaks.
df['Resume'] = df['Resume'].apply(lambda x: re.sub(r'\n{2,}', "", x)) # remove blank empty lines
df['Resume'] = df['Resume'].apply(lambda x: re.sub(r'\r\n', "", x)) # remove newlines \r\n
df['Resume']
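The two substitutions above replace line breaks with an empty string, so words on adjacent lines can end up joined, and a blank line written as \r\n\r\n is not matched by \n{2,}. A minimal sketch of one alternative follows, assuming the goal is to flatten each resume to a single line while keeping words separated. The helper name clean_resume_text and the sample string are hypothetical and not part of this commit.

```python
import re

def clean_resume_text(text):
    """Collapse line breaks in a resume string into single spaces.

    Hypothetical helper, not part of the original notebook.
    """
    # Normalize Windows (\r\n) and bare \r line endings to \n first,
    # so blank lines written as \r\n\r\n are caught by the next step.
    text = re.sub(r'\r\n?', '\n', text)
    # Replace each run of newlines with one space: this drops blank lines
    # without gluing the neighbouring words together.
    text = re.sub(r'\n+', ' ', text)
    # Squeeze runs of spaces/tabs left behind and trim both ends.
    return re.sub(r'[ \t]{2,}', ' ', text).strip()

# Made-up sample in the style of the Resume column shown above.
sample = "Education Details \r\nMay 2013 to May 2017\r\n\r\nB.E"
print(clean_resume_text(sample))
```

In the notebook this could be applied as df['Resume'] = df['Resume'].apply(clean_resume_text), assuming df is the DataFrame loaded earlier. Substituting a space rather than an empty string is a design choice made here, not something the commit states.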
\ No newline at end of file