Welcome to the Car Price Prediction project repository! This project predicts car prices using several machine learning algorithms. The workflow covers data cleaning, exploratory data analysis (EDA), feature engineering, data preprocessing, model development, and evaluation, and the best-performing model turned out to be the Random Forest Regressor.
In this project, I've accomplished the following steps:
- Data Cleaning: I started by cleaning and preparing the dataset to remove inconsistencies and missing values.
- EDA: I conducted extensive Exploratory Data Analysis, including univariate, bivariate, and multivariate analysis as well as statistical testing, to gain insights into the data.
- Feature Engineering: I employed feature engineering techniques to create new features and enhance the predictive power of my models.
- PCA (Principal Component Analysis): I explored dimensionality reduction through PCA to reduce complexity and improve model performance.
- Data Preprocessing: I encoded categorical variables and scaled numerical features to prepare the data for modeling.
- Model Development: I developed several regression models, including Linear Regression, Decision Tree, AdaBoost, Gradient Boosting, and Random Forest Regressor.
- Model Evaluation: I evaluated each model using metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared to compare the models and select the best performer.
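The preprocessing and PCA steps listed above could look roughly like the following sketch. It is not the notebook's actual code: the toy data, the column names (`horsepower`, `curb_weight`, `engine_size`, `fuel_type`), and the choice of two principal components are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; these column names are illustrative and are
# not taken from CarPrice.csv.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "horsepower": rng.normal(110, 30, n),
    "curb_weight": rng.normal(2500, 400, n),
    "engine_size": rng.normal(130, 35, n),
    "fuel_type": rng.choice(["gas", "diesel"], n),
})

numeric_cols = ["horsepower", "curb_weight", "engine_size"]
categorical_cols = ["fuel_type"]

# Scale the numeric columns, then project them onto two principal components
# (the number of components is an assumption, not a value from the project).
numeric_steps = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
])

# One-hot encode the categorical column and combine both branches.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps scaling, PCA, and encoding in one object that can be fitted on the training split only and then reused on the test split.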
After evaluating all the models, the Random Forest Regressor emerged as the top performer, demonstrating excellent generalization capabilities.
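A comparison along these lines can be sketched as follows. This is a minimal illustration, not the notebook's code: it uses synthetic data from `make_regression` in place of the car dataset, and the hyperparameters and the 80/20 split are assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the preprocessed car data (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The five model families named in this README, with default-like settings.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "AdaBoost": AdaBoostRegressor(random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mean_squared_error(y_test, pred),
        "R2": r2_score(y_test, pred),
    }

# Print the models from lowest to highest MAE.
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["MAE"]):
    print(f"{name:18s} MAE={scores['MAE']:.2f} "
          f"MSE={scores['MSE']:.2f} R2={scores['R2']:.3f}")
```

Because the synthetic target here is linear by construction, the ranking this sketch prints says nothing about which model wins on the real car data; it only shows the mechanics of scoring every model on the same held-out split.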
- 💾 `CarPrice.csv`: Contains the dataset used for this project.
- 📓 `Car Price Prediction.ipynb`: Jupyter notebook with code for data cleaning, EDA, feature engineering, model development, and evaluation.
- 📄 `requirements.txt`: List of Python packages required to run the project code.
- 📄 `README.md`: You are currently reading it! The main project documentation.
- Clone this repository to your local machine.
- Create a virtual environment and install the required packages using `pip install -r requirements.txt`.
- Navigate to the `notebooks/` or `scripts/` directory to run the code for various project stages.
- Follow the step-by-step instructions in the Jupyter notebooks or Python scripts.
Contributions are welcome! If you have ideas for improvement or find any issues, please feel free to open an issue or submit a pull request.
I would like to acknowledge the open-source community and libraries like scikit-learn, pandas, and matplotlib for their invaluable contributions to this project.
Happy Coding! 🚀