Klein, Dan and Kamvar, Sepandar D. and Manning, Christopher D. (2002) From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering. Technical Report. Stanford.
We present an improved method for clustering in the presence of very limited supervisory information, given as pairwise instance constraints. By allowing instance-level constraints to have space-level inductive implications, we are able to successfully incorporate constraints for a wide range of data set types. Our method greatly improves on the previously studied constrained k-means algorithm, generally requiring less than half as many constraints to achieve a given accuracy on a range of real-world data, while also being more robust when over-constrained. We additionally discuss an active learning algorithm which increases the value of constraints even further.
|Item Type:||Techreport (Technical Report)|
|Uncontrolled Keywords:||clustering, constrained clustering, prior knowledge|
|Subjects:||Computer Science > Data Mining|
|Related URLs:||Project Homepage||http://www-nlp.stanford.edu/|
|Deposited By:||Import Account|
|Deposited On:||19 Feb 2002 16:00|
|Last Modified:||25 Dec 2008 09:38|