Stanford InfoLab Publication Server

Clindex: Clustering for Similarity Queries in High-Dimensional Spaces.

Li, C. and Chang, E. and Garcia-Molina, H. and Wang, J. and Wiederhold, G. (1999) Clindex: Clustering for Similarity Queries in High-Dimensional Spaces. Technical Report. Stanford.



Abstract

Chen Li, Edward Chang, Hector Garcia-Molina, James Ze Wang, and Gio Wiederhold
Department of Computer Science, Stanford University

In this paper we present a clustering and indexing paradigm, called Clindex, for high-dimensional search spaces. The scheme is designed for approximate searches, where one wishes to find many of the data points near a target point but can tolerate missing a few of them. For such searches, our scheme finds near points with high recall in very few I/Os and performs significantly better than other approaches. It is based on finding clusters and then building a simple but efficient index for them. We analyze the tradeoffs involved in clustering and in building such an index structure, and present experimental results on a database of 30,000 images.

Keywords: similarity search, multidimensional indexes.
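
The general idea in the abstract (cluster the data first, then answer a query by reading only the clusters nearest the target) can be illustrated with a minimal sketch. This is not the Clindex algorithm itself: plain k-means stands in for the report's clustering step, a list of centroids stands in for its index, and the names `build_clusters`, `approx_search`, `recall`, and the parameter `n_probe` are hypothetical, chosen only for this illustration.

```python
import math
import random


def build_clusters(points, k, iters=10, seed=0):
    """Group points with plain k-means (an assumed stand-in, not Clindex's clustering)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def assign(cents):
        # Put each point in the bucket of its nearest centroid.
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, cents[c]))
            buckets[nearest].append(p)
        return buckets

    for _ in range(iters):
        buckets = assign(centroids)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a bucket is empty
                centroids[j] = tuple(sum(col) / len(bucket) for col in zip(*bucket))
    # Final assignment so the buckets match the final centroids.
    return centroids, assign(centroids)


def approx_search(query, centroids, buckets, n_probe=1, top=5):
    """Scan only the n_probe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    candidates = [p for c in order[:n_probe] for p in buckets[c]]
    return sorted(candidates, key=lambda p: math.dist(query, p))[:top]


def recall(approx, exact):
    """Fraction of the true nearest neighbors that the approximate search found."""
    return len(set(approx) & set(exact)) / len(exact)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [tuple(rng.random() for _ in range(8)) for _ in range(200)]
    q = tuple(rng.random() for _ in range(8))
    cents, bkts = build_clusters(data, k=4)
    exact = sorted(data, key=lambda p: math.dist(q, p))[:5]
    print("recall with 1 cluster probed:", recall(approx_search(q, cents, bkts, 1), exact))
```

Probing fewer clusters reads less data (a rough proxy for the I/O cost the abstract mentions) but may miss some true neighbors; probing all clusters degenerates to an exact scan. That is the recall versus cost tradeoff this kind of scheme exposes.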

Item Type: Techreport (Technical Report)
Uncontrolled Keywords: similarity search, multidimensional indexes
Subjects: Computer Science > Data Mining
Projects: Miscellaneous
Related URLs: Project Homepage: http://infolab.stanford.edu/
ID Code: 392
Deposited By: Import Account
Deposited On: 25 Feb 2000 16:00
Last Modified: 28 Dec 2008 09:31

