Stanford InfoLab Publication Server

From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering

Klein, Dan and Kamvar, Sepandar D. and Manning, Christopher D. (2002) From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering. Technical Report. Stanford.

Abstract

We present an improved method for clustering in the presence of very limited supervisory information, given as pairwise instance constraints. By allowing instance-level constraints to have space-level inductive implications, we are able to successfully incorporate constraints for a wide range of data set types. Our method greatly improves on the previously studied constrained k-means algorithm, generally requiring less than half as many constraints to achieve a given accuracy on a range of real-world data, while also being more robust when over-constrained. We additionally discuss an active learning algorithm which increases the value of constraints even further.
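
The abstract does not spell out how an instance-level constraint acquires space-level implications, so the following is only a minimal sketch under a stated assumption: a must-link constraint is imposed by setting the distance between the two constrained points to zero, and all-pairs shortest paths are then recomputed so that points near either endpoint are pulled closer together as well. The function names (`euclidean_matrix`, `apply_must_links`) and the toy data are hypothetical and are not taken from the report.

```python
import math


def euclidean_matrix(points):
    """Return the full pairwise Euclidean distance matrix as nested lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def apply_must_links(dist, must_links):
    """Zero the distance for each must-link pair, then restore the triangle
    inequality with Floyd-Warshall all-pairs shortest paths.

    Assumption for illustration: this is one way a pairwise constraint can
    influence distances between points that were never constrained directly.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy; leave the input untouched
    for i, j in must_links:
        d[i][j] = d[j][i] = 0.0
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via = d[i][k] + d[k][j]
                if via < d[i][j]:
                    d[i][j] = via
    return d


# Toy data: four points on a line, forming two well-separated pairs.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
original = euclidean_matrix(points)

# A single must-link between the inner points (indices 1 and 2).
adjusted = apply_must_links(original, [(1, 2)])

print("distance 0-3 before:", original[0][3])
print("distance 0-3 after: ", adjusted[0][3])
```

In this toy example, points 0 and 3 are never mentioned in a constraint, yet their distance shrinks because each lies next to one endpoint of the must-link. A clustering algorithm run on the adjusted matrix would therefore see the constraint's effect across the neighborhood and not only on the constrained pair.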

Item Type: Techreport (Technical Report)
Uncontrolled Keywords: clustering, constrained clustering, prior knowledge
Subjects: Computer Science; Computer Science > Data Mining; Miscellaneous
Projects: Miscellaneous
Related URLs: Project Homepage: http://www-nlp.stanford.edu/
ID Code: 528
Deposited By: Import Account
Deposited On: 19 Feb 2002 16:00
Last Modified: 25 Dec 2008 09:38