Stanford InfoLab Publication Server

Data Clouds: Summarizing Keyword Search Results over Structured Data

Koutrika, Georgia and Mohammadi Zadeh, Zahra and Garcia-Molina, Hector (2009) Data Clouds: Summarizing Keyword Search Results over Structured Data. In: 12th International Conference on Extending Database Technology (EDBT 2009), March 23-26 2009, Saint-Petersburg, Russia.




Keyword searches are attractive because they make it easy for users to search structured databases. Tag clouds, on the other hand, are popular for navigating and visualizing unstructured data because they can dynamically highlight the most significant concepts and hidden relationships in the underlying content. In this paper, we propose coupling the flexibility of keyword searches over structured data with the summarization and navigation capabilities of tag clouds to help users access a database. We propose using clouds over structured data (data clouds) to summarize the results of keyword searches and to guide users in refining their searches. The cloud presents the most significant words associated with the search results. Our keyword search model allows searching for entities that can span multiple tables in the database, rather than just tuples as existing keyword searches over databases do. We present several methods for computing scores both for the entities and for the terms in the search results. We describe algorithms for keyword searches with data clouds and present our system, CourseCloud, which offers a unified search and browse interface to a course database. We present experimental results showing (a) the appropriateness of the methods used for scoring terms, (b) the performance of the proposed algorithms, and (c) the effectiveness of CourseCloud compared to typical search and browse interfaces to a course database.
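As a rough illustration of the idea of a cloud of "most significant words" over search results, the sketch below scores each term by its frequency in the results, weighted by how rare it is in the whole collection. This is a generic TF-IDF-style weighting chosen for illustration only; it is not one of the scoring methods from the paper, and the function name `data_cloud`, its parameters, and the flat-text view of entities are all assumptions.

```python
import math
from collections import Counter


def data_cloud(results, corpus, top_k=5):
    """Return the top_k (term, score) pairs for a set of search results.

    Hypothetical sketch: `results` and `corpus` are lists of plain-text
    strings, each standing in for one entity flattened to text.
    """
    def tokens(text):
        # Lowercase, whitespace-split, keep purely alphabetic words.
        return [w for w in text.lower().split() if w.isalpha()]

    # Document frequency of each term over the whole collection.
    df = Counter()
    for doc in corpus:
        df.update(set(tokens(doc)))

    # Term frequency within the search results only.
    tf = Counter()
    for doc in results:
        tf.update(tokens(doc))

    # Frequency in the results times a smoothed inverse document frequency.
    n = len(corpus)
    scores = {t: f * math.log((1 + n) / (1 + df[t])) for t, f in tf.items()}

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Given the results of a keyword search over a small course collection, the returned terms could be rendered as a cloud, with each term linking to a refined search that adds it to the query.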

Item Type: Conference or Workshop Item (Paper)
ID Code: 896
Deposited By: Georgia Koutrika
Deposited On: 13 Dec 2008 12:00
Last Modified: 30 Dec 2008 09:31
