Kamvar, Sepandar D. and Klein, Dan and Manning, Christopher D. (2002) Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach. Technical Report. Stanford.
Abstract
We present two results that arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms (single-link, complete-link, group-average, and Ward's method) are each equivalent to a hierarchical model-based method. This interpretation gives a theoretical explanation of the empirical behavior of these algorithms, as well as a principled approach to resolving practical issues, such as the number of clusters or the choice of method. Second, we show how a model-based approach can be used to extend these basic agglomerative algorithms. We introduce adjusted complete-link, Mahalanobis-link, and line-link as variants of the classical agglomerative methods, and demonstrate their utility.
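For readers unfamiliar with the classical heuristics named in the abstract, the following is a minimal sketch of greedy agglomerative clustering under three of the linkage rules (single-link, complete-link, group-average). It illustrates only the textbook heuristics, not the report's model-based formulation or its proposed variants, and it omits Ward's method. The function names `linkage_distance` and `agglomerate`, the Euclidean metric, and the sample points are assumptions made for this illustration.

```python
import math
from itertools import combinations


def linkage_distance(c1, c2, points, method):
    """Distance between two clusters (lists of point indices) under a linkage rule."""
    dists = [math.dist(points[i], points[j]) for i in c1 for j in c2]
    if method == "single":      # closest pair across the two clusters
        return min(dists)
    if method == "complete":    # farthest pair across the two clusters
        return max(dists)
    if method == "average":     # mean over all cross-cluster pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, k, method="single"):
    """Greedily merge the two closest clusters until only k remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(points))]  # start with singletons
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(clusters[p[0]], clusters[p[1]], points, method),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for method in ("single", "complete", "average"):
    print(method, agglomerate(points, 3, method))
```

The sketch is deliberately naive: it recomputes every cross-cluster distance on each iteration, which is acceptable for a handful of points but not for real data sets. The three rules differ only in how `linkage_distance` summarizes the pairwise distances, which is the choice the abstract describes as having a model-based interpretation.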
Item Type: | Techreport (Technical Report) | |
---|---|---|
Uncontrolled Keywords: | clustering, probabilistic models, model-based clustering, hierarchical clustering | |
Subjects: | Computer Science; Computer Science > Data Mining; Miscellaneous | |
Projects: | Miscellaneous | |
Related URLs: | Project Homepage | http://www-nlp.stanford.edu/ |
ID Code: | 529 | |
Deposited By: | Import Account | |
Deposited On: | 19 Feb 2002 16:00 | |
Last Modified: | 25 Dec 2008 09:35 | |