Clustering-Comparison-between-methods

The dataset describes 1008 used cars with the following variables: brand, car model, resale price, mileage, seat capacity, vehicle type, fuel type, transmission, parking sensor, airbag, cruise control, keyless entry, alloy wheel, ABS, climate control, rear AC vent, and power steering.

The analysis proceeds as follows:

Choosing the distance measure is crucial for this dataset because it mixes categorical and numerical variables. We use the Gower distance; more details are provided in the documentation.
Hierarchical clustering is applied to the entire dataset, and the resulting clusters are profiled.
K-means and hierarchical clustering results are compared, with k-means applied only to the numerical variables.
The comparison is based on the following metrics: W/B ratio, within sum of squares, Calinski-Harabasz index, and Dunn index.
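
To illustrate why the Gower distance suits mixed data, here is a minimal Python sketch of the standard definition: numerical columns contribute a range-normalised absolute difference, categorical columns contribute 0 on a match and 1 otherwise, and the result is the average over all columns. This is not the repository's R code; the function name `gower_distance`, the column names, and the three records are hypothetical and chosen only for the example.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower distance between two records given as dicts.

    numeric_ranges maps each numeric column to its (max - min) range over
    the dataset; every other column is treated as categorical.
    """
    total = 0.0
    for col in a:
        if col in numeric_ranges:
            rng = numeric_ranges[col]
            # Range-normalised absolute difference, always in [0, 1].
            total += (abs(a[col] - b[col]) / rng) if rng else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if a[col] == b[col] else 1.0
    return total / len(a)


# Hypothetical records; values are illustrative, not from used_car_data.csv.
cars = [
    {"price": 5.0, "mileage": 20.0, "fuel": "petrol", "transmission": "manual"},
    {"price": 9.0, "mileage": 15.0, "fuel": "diesel", "transmission": "manual"},
    {"price": 7.0, "mileage": 25.0, "fuel": "petrol", "transmission": "automatic"},
]
numeric_cols = ["price", "mileage"]
ranges = {c: max(r[c] for r in cars) - min(r[c] for r in cars) for c in numeric_cols}

d01 = gower_distance(cars[0], cars[1], ranges)
d02 = gower_distance(cars[0], cars[2], ranges)
d12 = gower_distance(cars[1], cars[2], ranges)
print(d01, d02, d12)
```

Because every column contributes a value between 0 and 1, no single variable (such as resale price) dominates the distance purely because of its scale. In R, `cluster::daisy` offers a `metric = "gower"` option that computes the same kind of dissimilarity matrix.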
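
The four comparison metrics can be sketched in a few lines of Python for numerical data with Euclidean distance. The definitions below are assumptions, since the text does not spell them out: W/B ratio is taken as the average within-cluster pairwise distance divided by the average between-cluster pairwise distance, and the Dunn index as the smallest between-cluster pairwise distance divided by the largest within-cluster pairwise distance. The function name `cluster_metrics` and the toy points are hypothetical.

```python
from itertools import combinations
from math import dist


def cluster_metrics(points, labels):
    """Return WSS, Calinski-Harabasz, Dunn, and W/B ratio for a labelling."""
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)

    def centroid(pts):
        return tuple(sum(p[i] for p in pts) / len(pts) for i in range(dim))

    overall = centroid(points)
    # Within sum of squares: squared distance of each point to its centroid.
    wss = sum(dist(p, centroid(pts)) ** 2 for pts in clusters.values() for p in pts)
    # Between sum of squares: size-weighted squared centroid-to-mean distance.
    bss = sum(len(pts) * dist(centroid(pts), overall) ** 2 for pts in clusters.values())

    # Split all pairwise distances into within-cluster and between-cluster.
    within, between = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (within if lp == lq else between).append(dist(p, q))

    return {
        "wss": wss,
        "ch": (bss / (k - 1)) / (wss / (n - k)),
        "dunn": min(between) / max(within),
        "wb_ratio": (sum(within) / len(within)) / (sum(between) / len(between)),
    }


# Toy example: two well-separated clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
metrics = cluster_metrics(points, labels)
print(metrics)
```

Lower W/B ratio and within sum of squares indicate tighter clusters, while higher Calinski-Harabasz and Dunn values indicate better separation, so the four metrics should be read in those opposite directions when comparing k-means with hierarchical clustering.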

Repository files:
Clustering.pdf
README.md
clustering_cars.R
used_car_data.csv

Pratyush1296/Clustering-Comparison-between-methods
