auto-kmeans is a comprehensive Python framework designed to simplify and automate the process of clustering analysis using the K-means algorithm. This tool aims to provide users with an intuitive interface for performing robust K-means clustering on datasets, enabling both novices and experienced data scientists to extract meaningful insights from their data efficiently.
- Automated Clustering: Streamlines the K-means clustering process, automatically determining the optimal number of clusters. This is an implementation of this idea.
- Data Preprocessing: Includes data preprocessing functionalities to prepare datasets for clustering analysis.
- Visualization: Offers visualization tools to help interpret the results of the K-means algorithm.
- Flexibility: Easy to integrate with existing data analysis pipelines and workflows.
- Sample Datasets: Comes with sample datasets to help users get started and to demonstrate the framework's capabilities.
Clone the repository and install the required dependencies to get started with auto-kmeans:
git clone https://github.com/nguyhu01/auto-kmeans.git
cd auto-kmeans
pip install -r requirements.txt
After installation, import autokmeans in your Python script to perform automated K-means clustering on your dataset.
from autokmeans import AutoKMeans
# Initialize and fit the model to your data
auto_km = AutoKMeans(data, max_clusters, alpha_k)
auto_km.fit(data, clustering_features)
Contributions are welcome! Please read the contributing guidelines before submitting pull requests to the project.
This project is open-sourced under the MIT License. See the LICENSE file for more details.
Huy Nguyen - Initial work - nguyhu01