Who Invented K-Means?
A Popular Clustering Algorithm That Originated in the 1950s and 1960s
1- K-Means Original Paper
K-Means was discovered independently by several researchers during the 1950s and 1960s.
James MacQueen is credited as the first researcher to use the name K-Means, in a research paper published in 1967 by the University of California.
James MacQueen’s research paper proposed an algorithm for partitioning data into a set of clusters with small variance within each cluster. The original research paper can be found here:
Preceding K-Means Work Under Different Names
2- More on the K-Means Origins
The K-Means algorithm, though not yet under that name, goes back to 1956, when Hugo Steinhaus, a Polish mathematician, published a paper in the Bulletin of the Polish Academy of Sciences (Bulletin de l’Académie Polonaise des Sciences). The paper was titled “Sur la Division des Corps Matériels en Parties” (“On the Division of Material Bodies into Parts”).
Hugo Steinhaus’ original paper on K-Means can be found on Laurent Duval’s website here.
Another great resource about the history and evolution of the K-Means algorithm is “The K-Means Algorithm Evolution” by Joaquín Pérez Ortega et al., found here.
(K-Means) it was known by different names: dynamic clustering method, iterative minimum-distance clustering, nearest centroid sorting, and h-means, among others.
The K-Means Algorithm Evolution, April-2019
The evolution of the K-Means algorithm can be traced through the research papers below:
- Hugo Steinhaus – 1956: Sur la Division des Corps Matériels en Parties
- Stuart Lloyd – 1957: Least Squares Quantization in PCM
- James MacQueen – 1967: Some Methods for Classification and Analysis of Multivariate Observations
- RC Jancey – 1966: Multidimensional Group Analysis
K-Means History Summary
In summary, we have covered the history of the K-Means machine learning algorithm. K-Means was proposed and used under different names by scientists and mathematicians working in different domains.
Its historical evolution is well documented, and the name K-Means, which is commonly used in modern literature, was first coined by James MacQueen in 1967.