Email-Spam-Detection-Using-Logistic-Regression

Project Overview:

This project builds an email spam detection system using machine learning techniques. The primary objective is to classify emails as spam or not spam (ham) with high accuracy. A logistic regression model, combined with feature engineering and TF-IDF vectorization, achieved 95% accuracy.

Key Features:

Checking and Removing Null Values

df.isnull().sum()
isnull() flags the missing values in the DataFrame, and sum() then totals those flags, giving the number of null values in each column.
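As a minimal sketch of the check-and-remove step, the snippet below counts nulls per column and then drops incomplete rows. The column names `Category` and `Message` and the sample rows are assumptions made for illustration, and `dropna()` is one possible way to remove the nulls, not necessarily the one used in the notebook.

```python
import pandas as pd

# Toy data; the column names and rows are hypothetical.
df = pd.DataFrame({
    "Category": ["ham", "spam", None],
    "Message": ["See you at lunch", None, "Win a free prize"],
})

# Count missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Drop every row that contains at least one null value.
clean = df.dropna().reset_index(drop=True)
print(len(clean))
```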

Converting Labels SPAM and HAM into '0' and '1'

LabelEncoder, a class in the sklearn.preprocessing module of scikit-learn, is used to convert the categorical (text) labels into numeric form.
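The encoding step might look like the sketch below. The `Category` column name and the lowercase label strings are assumptions. Note that LabelEncoder assigns integers in sorted order of the class names, so with the labels "ham" and "spam" it maps ham to 0 and spam to 1.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy labels; the column name "Category" is hypothetical.
df = pd.DataFrame({"Category": ["ham", "spam", "ham"]})

encoder = LabelEncoder()
df["Category"] = encoder.fit_transform(df["Category"])

# classes_ holds the original labels; a label's index is its integer code.
classes = encoder.classes_.tolist()
print(classes)
print(df["Category"].tolist())
```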

Data Preprocessing

The text data is preprocessed with the NLTK toolkit, using functions such as sent_tokenize and word_tokenize.
This adds three new columns to the DataFrame: number_characters, number_sentences, and number_words.
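One way to derive the three columns is sketched below. The helper `add_length_features` and the `Message` column name are hypothetical. So that the sketch runs without downloading NLTK tokenizer data, it takes the tokenizers as arguments and is demonstrated with simple regex stand-ins, which are not NLTK's tokenizers and may split text differently.

```python
import re
import pandas as pd

def add_length_features(df, text_col, sent_tokenize, word_tokenize):
    """Return a copy of df with character, sentence, and word counts."""
    out = df.copy()
    out["number_characters"] = out[text_col].apply(len)
    out["number_sentences"] = out[text_col].apply(lambda t: len(sent_tokenize(t)))
    out["number_words"] = out[text_col].apply(lambda t: len(word_tokenize(t)))
    return out

# Regex stand-ins for illustration only (assumption, not NLTK behavior).
def simple_sentences(text):
    # Split after ., ! or ? when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def simple_words(text):
    # Treat runs of word characters and single punctuation marks as tokens.
    return re.findall(r"\w+|[^\w\s]", text)

df = pd.DataFrame({"Message": ["Free prize! Call now.", "See you at lunch"]})
features = add_length_features(df, "Message", simple_sentences, simple_words)
print(features)
```

With NLTK installed and its tokenizer data downloaded, nltk.tokenize.sent_tokenize and nltk.tokenize.word_tokenize can be passed in place of the two stand-ins.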

Model Training

The model is trained using logistic regression on TF-IDF features of the email text.
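A minimal sketch of such a training step, assuming a scikit-learn pipeline that chains TfidfVectorizer and LogisticRegression with default settings. The toy messages and labels are made up for illustration, and the notebook's actual hyperparameters and data split may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up toy corpus: 1 = spam, 0 = ham.
messages = [
    "Win a free prize now",
    "Claim your free cash reward",
    "Are we still meeting for lunch",
    "Please send me the report",
]
labels = [1, 1, 0, 0]

# TF-IDF turns each message into a weighted term vector;
# logistic regression then learns a linear decision boundary on it.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(messages, labels)

preds = model.predict(["claim your free prize", "lunch tomorrow?"])
print(preds)
```

Wrapping both steps in one pipeline keeps the vectorizer fitted on the training messages only, so new emails are transformed with the same vocabulary.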

Model Efficiency

Accuracy is the metric used to evaluate the model.
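Accuracy is the fraction of predictions that match the true labels. The sketch below computes it with scikit-learn's accuracy_score on hand-written example labels, which are illustrative values and not results from this project.

```python
from sklearn.metrics import accuracy_score

# Illustrative labels only: 1 = spam, 0 = ham.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

# 4 of the 5 predictions are correct.
acc = accuracy_score(y_true, y_pred)
print(acc)  # 0.8
```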

Dependencies

Python 3.x
Pandas
NumPy
Scikit-learn
Jupyter
NLTK
WordCloud

How to Run

Clone the repository:

git clone https://github.com/raja045/Email-Spam-Detection-Using-Logistic-Regression.git

cd Email-Spam-Detection-Using-Logistic-Regression

Install the required packages

pip install -r requirements.txt

Launch Jupyter Notebook to see the step-by-step process of building the model:

jupyter notebook

Open SpamDetection.ipynb and run the cells sequentially.

