Customer segmentation is a crucial aspect of business strategy, allowing companies to understand their customers better and optimize marketing efforts. This project applies K-Means Clustering to segment mall customers based on their Annual Income and Spending Score, helping businesses tailor personalized experiences and improve customer retention.
- Description: The dataset consists of demographic and spending behavior data of mall customers.
- Features:
  - CustomerID: Unique identifier for each customer
  - Gender: Customer's gender
  - Age: Customer's age
  - Annual Income (k$): Customer's yearly income in thousands of dollars
  - Spending Score (1-100): A score assigned based on customer spending behavior
✔ Perform Exploratory Data Analysis (EDA) to identify patterns and insights.
✔ Implement K-Means Clustering to categorize customers into distinct groups.
✔ Visualize the customer segments to understand their characteristics.
✔ Provide actionable insights for businesses to enhance marketing strategies.
- Language: R
- Libraries Used: ggplot2, dplyr, tidyverse, cluster, factoextra
- Clustering Algorithm: K-Means
- Identified distinct customer groups based on income and spending patterns.
- Used the elbow method to determine the optimal number of clusters and visualized the resulting clusters with scatter plots.
- Provided strategic recommendations for targeting high-value customers and improving retention.
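
The elbow method mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the data is simulated stand-in data, and the range of k (1 to 10) and `nstart = 25` are assumptions chosen for the example. The idea is to run K-Means for several values of k, record the total within-cluster sum of squares, and look for the point where the curve stops dropping sharply.

```r
# Simulated stand-in for the scaled income/score matrix (NOT the project's data).
set.seed(42)
x <- scale(cbind(income = runif(200, 15, 140),
                 score  = runif(200, 1, 100)))

# Candidate numbers of clusters (assumed range for illustration).
k_values <- 1:10

# Total within-cluster sum of squares for each candidate k.
wss <- sapply(k_values, function(k) {
  kmeans(x, centers = k, nstart = 25)$tot.withinss
})

# The "elbow" is the k after which adding clusters yields little improvement.
plot(k_values, wss, type = "b",
     xlab = "Number of clusters (k)",
     ylab = "Total within-cluster sum of squares")
```

With factoextra loaded, `fviz_nbclust(x, kmeans, method = "wss")` should produce a comparable plot in one call.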
1️⃣ Install the required R packages:
install.packages(c("ggplot2", "dplyr", "tidyverse", "cluster", "factoextra"))
2️⃣ Load the dataset into R.
3️⃣ Perform data preprocessing and exploratory analysis.
4️⃣ Apply the K-Means Clustering Algorithm.
5️⃣ Visualize and interpret the customer segments.
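
Steps 2 to 5 might look roughly like the sketch below. Several things here are assumptions rather than details from this project: the file name `Mall_Customers.csv` is hypothetical, the simulated fallback data only exists so the snippet runs without the file, standardizing the features is a choice made for the example, and `centers = 5` is a placeholder rather than a reported result.

```r
library(ggplot2)

# Hypothetical file name; adjust to wherever your copy of the dataset lives.
data_path <- "Mall_Customers.csv"

if (file.exists(data_path)) {
  # check.names = FALSE keeps column names such as "Annual Income (k$)" intact.
  customers <- read.csv(data_path, check.names = FALSE)
} else {
  # Fallback: simulated stand-in data so the sketch runs without the file.
  set.seed(42)
  customers <- data.frame(
    CustomerID = 1:200,
    `Annual Income (k$)` = round(runif(200, 15, 140)),
    `Spending Score (1-100)` = sample(1:100, 200, replace = TRUE),
    check.names = FALSE
  )
}

# Preprocessing: keep the two clustering features and drop incomplete rows.
features <- na.omit(customers[, c("Annual Income (k$)", "Spending Score (1-100)")])
names(features) <- c("income", "score")

# Standardize so both features contribute comparably to the distance (assumed step).
scaled <- scale(features)

# K-Means with a placeholder k; pick k from the elbow plot in practice.
set.seed(123)
km <- kmeans(scaled, centers = 5, nstart = 25)
features$cluster <- factor(km$cluster)

# Scatter plot of the segments on the original (unscaled) axes.
p <- ggplot(features, aes(x = income, y = score, colour = cluster)) +
  geom_point() +
  labs(x = "Annual Income (k$)", y = "Spending Score (1-100)", colour = "Cluster")
print(p)
```

Running `aggregate(cbind(income, score) ~ cluster, data = features, FUN = mean)` afterwards gives per-cluster averages, which is one simple way to start interpreting the segments.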
Key visualizations include:
📌 Scatter plots showing customer segments
📌 Distribution of spending scores and income
📌 Cluster separation using principal component analysis (PCA)
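
A PCA-based cluster plot could be produced along the lines of the sketch below. It assumes clustering on three numeric features (age, income, spending score) using simulated stand-in data, and `centers = 5` is again a placeholder. With only two features, PCA merely rotates the scatter plot, so the projection is mainly useful once more than two features are involved.

```r
library(ggplot2)

# Simulated stand-in data (NOT the project's data).
set.seed(42)
sim <- data.frame(
  age    = sample(18:70, 200, replace = TRUE),
  income = round(runif(200, 15, 140)),
  score  = sample(1:100, 200, replace = TRUE)
)

# Standardize, then cluster with a placeholder k.
scaled <- scale(sim)
km <- kmeans(scaled, centers = 5, nstart = 25)

# Principal component analysis on the already standardized features.
pca <- prcomp(scaled)

# Keep the first two principal components and attach the cluster labels.
scores <- data.frame(pca$x[, 1:2], cluster = factor(km$cluster))

p <- ggplot(scores, aes(x = PC1, y = PC2, colour = cluster)) +
  geom_point() +
  labs(x = "PC1", y = "PC2", colour = "Cluster")
print(p)
```

factoextra's `fviz_cluster(km, data = scaled)` draws a similar plot and projects onto the first two principal components when the data has more than two columns.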
Developed by Gargi Mishra
📌 LinkedIn