Numpy implementation of K-means clustering algorithm

grimmdaniel/k_means

k_means

Numpy implementation of k-means clustering algorithm.

Accepts either a Pandas DataFrame or a NumPy ndarray as input.

Parameters

  • __init__(): initializes the K_Means object with the number of clusters
    • k: int: number of clusters
  • fit_predict(): computes the cluster centers and predicts the cluster index for each sample; returns the labels, the centroids, and the number of iterations (see Usage)
    • data: input data, can be a Pandas DataFrame or Numpy ndarray object
    • dist_type: str: distance measure to use; one of euclidean (default), manhattan, or chebychev
    • max_iter: int: maximum number of iterations, default is 20
  • predict(): predicts the closest cluster for each sample in data
    • data: input data, can be a Pandas DataFrame or Numpy ndarray object
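
The three dist_type options and the max_iter cap listed above can be illustrated with a minimal NumPy sketch. This is not the repository's code: the names pairwise_distance and lloyd_kmeans, the seeded random initialization, and the convergence check are all assumptions made for illustration.

```python
import numpy as np


def pairwise_distance(points, centers, dist_type="euclidean"):
    """Return an (n_samples, k) matrix of distances from points to centers."""
    # Absolute coordinate differences, shape (n_samples, k, n_features).
    diff = np.abs(points[:, None, :] - centers[None, :, :])
    if dist_type == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if dist_type == "manhattan":
        return diff.sum(axis=2)
    if dist_type == "chebychev":  # spelling follows the README's option name
        return diff.max(axis=2)
    raise ValueError(f"unknown dist_type: {dist_type!r}")


def lloyd_kmeans(data, k, dist_type="euclidean", max_iter=20, seed=0):
    """Hypothetical stand-in for fit_predict: labels, centers, iterations."""
    points = np.asarray(data, dtype=float)  # also accepts a DataFrame
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct samples chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    iterations = 0
    for iterations in range(1, max_iter + 1):
        labels = pairwise_distance(points, centers, dist_type).argmin(axis=1)
        # Move each center to the mean of its members; keep empty clusters put.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    # Final assignment against the final centers.
    labels = pairwise_distance(points, centers, dist_type).argmin(axis=1)
    return labels, centers, iterations
```

With the default max_iter of 20, the loop stops early once the centers no longer move, so the returned iteration count can be smaller than the cap.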

Attributes

  • cluster_centers_: numpy ndarray: coordinates of cluster centers
  • inertia_: float: sum of squared distances of samples to their closest cluster center
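
As a rough illustration of the inertia_ definition above, the sum of squared distances can be computed as follows. The function name inertia is hypothetical, and the sketch assumes squared Euclidean distance; how the library computes inertia_ under other dist_type settings is not stated here.

```python
import numpy as np


def inertia(points, centers, labels):
    """Sum of squared Euclidean distances from each sample to its center."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # centers[labels] picks, for every sample, the center it is assigned to.
    diffs = points - centers[np.asarray(labels)]
    return float((diffs ** 2).sum())
```

A lower value means the samples sit closer to their assigned centers, which is why inertia is often compared across different choices of k.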

Usage:

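import numpy as np
from k_means.K_Means import K_Means

# data: a Pandas DataFrame or NumPy ndarray of shape (n_samples, n_features)
kmeans = K_Means(3)  # k = 3 clusters
labels, centroids, iterations = kmeans.fit_predict(data)

# assign new samples to the closest fitted cluster center
new_data = np.array([[-1, 1], [0, -2], [2, 0]])
kmeans.predict(new_data)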

Examples

[Image: res1]

[Image: res2]
