NumPy implementation of the k-means clustering algorithm.
Supports pandas DataFrames and NumPy ndarrays as input.
- __init__(): initializes the K-means object with the number of clusters
  - k: int: number of clusters
- fit_predict(): computes cluster centers and predicts the cluster index for each sample
  - data: input data, either a pandas DataFrame or a NumPy ndarray
  - dist_type: str: distance measure to use; one of euclidean (default), manhattan, chebychev
  - max_iter: int: maximum number of iterations, default is 20
- predict(): predicts the closest cluster for each sample in data
  - data: input data, either a pandas DataFrame or a NumPy ndarray
- cluster_centers_: numpy ndarray: coordinates of cluster centers
- inertia_: float: sum of squared distances of samples to their closest cluster center
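
The three distance measures and the inertia described above can be illustrated with a short NumPy sketch. This is not the package's own code: `pairwise_distances` and `inertia` are hypothetical helper names chosen for illustration, and the only assumption taken from the list above is the set of `dist_type` strings.

```python
import numpy as np

def pairwise_distances(data, centers, dist_type="euclidean"):
    """Distance from every sample to every center.

    Hypothetical helper, not part of the K_Means class.
    Returns an array of shape (n_samples, n_centers).
    """
    # diff[i, j] is the vector from center j to sample i
    diff = data[:, None, :] - centers[None, :, :]
    if dist_type == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if dist_type == "manhattan":
        return np.abs(diff).sum(axis=2)
    if dist_type == "chebychev":
        return np.abs(diff).max(axis=2)
    raise ValueError(f"unknown dist_type: {dist_type!r}")

def inertia(data, centers):
    """Sum of squared Euclidean distances of samples to their closest center."""
    d = pairwise_distances(data, centers, "euclidean")
    return float((d.min(axis=1) ** 2).sum())
```

For a sample at (3, 4) and a center at the origin, the three measures differ: Euclidean uses the straight-line length, Manhattan sums the absolute coordinate differences, and Chebyshev takes the largest one.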
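import numpy as np
from k_means.K_Means import K_Means

# data: a pandas DataFrame or NumPy ndarray, defined elsewhere
kmeans = K_Means(3)
labels, centroids, iterations = kmeans.fit_predict(data)
# Assign new samples to the closest fitted cluster center
new_data = np.array([[-1, 1], [0, -2], [2, 0]])
kmeans.predict(new_data)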
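
For readers who want a feel for what a call returning labels, centroids, and an iteration count might do internally, here is a minimal stand-alone sketch of Lloyd's algorithm. It is an illustration only, not the package's implementation: `lloyd_kmeans`, the `seed` parameter, the random-sample initialization, and the convergence check are all assumptions, and only the Euclidean distance is covered.

```python
import numpy as np

def lloyd_kmeans(data, k, max_iter=20, seed=0):
    """Hypothetical stand-alone sketch of Lloyd's algorithm (Euclidean only)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct samples chosen at random
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: index of the nearest centroid for each sample
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of each cluster; keep the old centroid if a cluster is empty
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop early once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids, iteration

# Two well-separated groups of points
points = np.array([[0, 0], [0, 1], [10, 10], [10, 11]])
labels, centroids, iterations = lloyd_kmeans(points, k=2)
```

`np.asarray` also accepts a pandas DataFrame with numeric columns, which is one simple way to handle both input types with the same code path.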