Balanced clustering

Balanced clustering is a special case of clustering where, in the strictest sense, cluster sizes are constrained to or , where is the number of points and is the number of clusters.[1] A typical algorithm is balanced k-means, which minimizes mean square error (MSE). Another type of balanced clustering called balance-driven clustering has a two-objective cost function that minimizes both the imbalance and the MSE. Typical cost functions are ratio cut[2] and Ncut.[3] Balanced clustering can be used for example in scenarios where freight has to be delivered to locations with cars. It is then preferred that each car delivers to an equal number of locations.

Software

There exists implementations for balanced k-means[4] and Ncut[5]

References

  1. M. I. Malinen and P. Fränti (August 2014). "Balanced k-Means for Clustering". Joint Int. Workshop on Structural, Syntactic, and Statistical Pattern Recognition (S+SSPR 2014), LNCS 8621.
  2. L. Hagen and A. B. Kahng (1992). "New spectral methods for ratio cut partitioning and clustering". IEEE Transactions on Computer-Aided Design.
  3. J. Shi and J. Malik (2000). "Normalized cuts and image segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence. 22 (8): 888–905. doi:10.1109/34.868688.
  4. M. I. Malinen and P. Fränti. "Balanced k-Means implementation". University of Eastern Finland.
  5. T. Cour, S. Yu and J. Shi. "Ncut implementation". University of Pennsylvania.

M.S. Levin (2017). "On Balanced Clustering (Indices, Models, Examples)". J. of Communications Technology and Electronics, 62(12): 1506–1515. doi: 10.1134/S1064226917120105

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.