This project applies Market Basket Analysis (MBA) to an Amazon product dataset to improve product recommendations. It identifies the top-selling products overall and the top products in each category, ranked by review count, then uses association rule mining to generate personalized recommendations and evaluates their effectiveness with quantitative metrics.
Market Basket Analysis is a data mining technique used to uncover associations between products purchased together by customers. In this project, we utilize MBA to identify patterns in Amazon product purchases, allowing us to generate personalized product recommendations for customers based on their shopping history.
The data comes from the Amazon product dataset on Kaggle.
- Data Collection: Transactional data including user IDs, product IDs, and quantities purchased.
- Data Cleaning: Removing duplicates, handling missing values.
- Data Transformation: Converting data into transaction-level and basket-level formats.
- Data Encoding: One-hot encoding for categorical variables.
- Transaction Filtering: Removing low-support and irrelevant items.
- Association Rule Mining: Mining frequent itemsets and association rules with the Apriori or FP-Growth algorithm.
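
The transformation and rule-mining steps above can be illustrated with a minimal sketch. It is not the project's code: the `rows` data, the function names, and the thresholds are all hypothetical, and the rule search is a brute-force count over item pairs rather than Apriori or FP-Growth. It only shows how transaction rows become baskets and how support, confidence, and lift are computed for a rule.

```python
from collections import Counter
from itertools import combinations

# Hypothetical cleaned transactions as (user_id, product_id) rows.
rows = [
    ("u1", "cable"), ("u1", "charger"), ("u1", "case"),
    ("u2", "cable"), ("u2", "charger"),
    ("u3", "cable"), ("u3", "case"),
    ("u4", "charger"), ("u4", "case"),
    ("u5", "cable"), ("u5", "charger"),
]


def build_baskets(rows):
    """Group transaction rows into one set of products per user."""
    baskets = {}
    for user_id, product_id in rows:
        baskets.setdefault(user_id, set()).add(product_id)
    return list(baskets.values())


def pair_rules(baskets, min_support=0.3, min_confidence=0.5):
    """Return rules A -> B over item pairs, sorted by lift (descending).

    Brute-force pair counting for illustration only; Apriori and
    FP-Growth find larger itemsets far more efficiently.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue  # transaction filtering: drop low-support pairs
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence < min_confidence:
                continue
            lift = confidence / (item_counts[consequent] / n)
            rules.append({
                "antecedent": antecedent,
                "consequent": consequent,
                "support": support,
                "confidence": confidence,
                "lift": lift,
            })
    return sorted(rules, key=lambda r: r["lift"], reverse=True)


baskets = build_baskets(rows)
rules = pair_rules(baskets)
for rule in rules:
    print(rule)
```

A lift above 1 suggests the two products are bought together more often than their individual popularity would predict; a lift below 1 suggests the opposite. On a real dataset, a library implementation of Apriori or FP-Growth would likely replace `pair_rules`.
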
- Summary Statistics
- Distribution of Numerical Variables
- Correlation Analysis
- Categorical Variables Analysis
- Text Analysis
- Selection of relevant attributes for analysis or modeling.
- Creation of new features such as WeightedSum and subcategory columns.
- Summarization by subcategory and selection of top products.
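
As a rough illustration of the feature-creation and summarization steps above, the sketch below derives a weighted score and picks the top product in each subcategory. The source does not define `WeightedSum`, so the formula here (an equal-weight blend of normalized rating and normalized review count), the field names, and the sample records are all assumptions.

```python
from collections import defaultdict

# Hypothetical product records; field names are assumptions.
products = [
    {"name": "USB-C cable", "subcategory": "Cables", "rating": 4.0, "review_count": 1000},
    {"name": "HDMI cable", "subcategory": "Cables", "rating": 4.5, "review_count": 400},
    {"name": "Wall charger", "subcategory": "Chargers", "rating": 5.0, "review_count": 2000},
    {"name": "Car charger", "subcategory": "Chargers", "rating": 3.5, "review_count": 500},
]


def add_weighted_sum(products, rating_weight=0.5, count_weight=0.5):
    """Attach an assumed 'weighted_sum' score to each product.

    Rating is scaled by the 5-star maximum and review count by the
    largest count in the data, so both terms fall in [0, 1].
    """
    max_count = max(p["review_count"] for p in products)
    for p in products:
        normalized_rating = p["rating"] / 5.0
        normalized_count = p["review_count"] / max_count
        p["weighted_sum"] = (
            rating_weight * normalized_rating + count_weight * normalized_count
        )
    return products


def top_by_subcategory(products, key="review_count", n=1):
    """Return the top-n products per subcategory, ranked by `key`."""
    groups = defaultdict(list)
    for p in products:
        groups[p["subcategory"]].append(p)
    return {
        subcategory: sorted(items, key=lambda p: p[key], reverse=True)[:n]
        for subcategory, items in groups.items()
    }


add_weighted_sum(products)
top_by_reviews = top_by_subcategory(products, key="review_count")
top_by_score = top_by_subcategory(products, key="weighted_sum")
for subcategory, items in top_by_reviews.items():
    print(subcategory, [p["name"] for p in items])
```

Ranking by `review_count` mirrors the review-count ranking described in the project summary; ranking by `weighted_sum` is shown only to illustrate how a derived feature could be used instead.
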
- Polynomial Features: Creating polynomial features from original features.
- Ridge Regression: Handling multicollinearity and reducing overfitting.
- Model Evaluation: Cross-validation, MSE calculation, and model accuracy.
- Visualization: Scatter plots, residual plots, and histograms.
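
The modeling steps above (polynomial features, ridge regression, cross-validated MSE) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's model: the source does not say which target is predicted, so the data is synthetic, and the helper names, the single input feature, and the fold scheme are hypothetical. A library such as scikit-learn would normally replace the hand-written solver.

```python
def polynomial_features(x, degree=2):
    """Expand one feature x into [1, x, x^2, ..., x^degree]."""
    return [x ** d for d in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w


def fit_ridge(features, targets, alpha=1.0):
    """Closed-form ridge: solve (X^T X + alpha * I) w = X^T y.

    The intercept column (index 0) is left unpenalized.
    """
    k = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) for j in range(k)]
           for i in range(k)]
    for i in range(1, k):
        xtx[i][i] += alpha
    xty = [sum(row[i] * y for row, y in zip(features, targets))
           for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, row):
    """Dot product of the weight vector and one feature row."""
    return sum(w * v for w, v in zip(weights, row))


def cross_val_mse(xs, ys, degree=2, alpha=1.0, folds=4):
    """Average held-out MSE over `folds` interleaved folds."""
    scores = []
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        test = [i for i in range(len(xs)) if i % folds == f]
        weights = fit_ridge(
            [polynomial_features(xs[i], degree) for i in train],
            [ys[i] for i in train],
            alpha,
        )
        errors = [
            (predict(weights, polynomial_features(xs[i], degree)) - ys[i]) ** 2
            for i in test
        ]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


# Synthetic data from a known quadratic: y = 1 + 2x + 0.5x^2.
xs = [float(i) for i in range(8)]
ys = [1.0 + 2.0 * x + 0.5 * x * x for x in xs]

for alpha in (0.0, 1.0, 10.0):
    print(f"alpha={alpha}: CV MSE = {cross_val_mse(xs, ys, alpha=alpha):.6f}")
```

Larger `alpha` values shrink the non-intercept coefficients, which helps when polynomial terms are strongly correlated with each other, at the cost of some bias. Comparing cross-validated MSE across several `alpha` values is one common way to choose the penalty.
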

- Prafull Raj | AP21110011016
- Atharva Narkhede | AP21110011028
- Rahul Bajaj | AP21110011022
- Vrijeshwar Singh | AP21110010922
- Aditya Dubey | AP21110010729