Information Finding Projects in the Stanford Digital Library

One of the major research thrusts of the Stanford Digital Library project is helping users find information. We have initiated a number of projects in this area, most of them related to our overarching theme of interoperability. We have looked at ways to use search tools across multiple sources that differ in syntax or language. We have also looked at tools that apply statistical or collaborative filtering to locate relevant articles.

FAB

FAB is an adaptive multi-agent information retrieval system which finds interesting pages on the web.

"An Adaptive Agent for Automated Web Browsing"

Marko Balabanovic

GlOSS

The Glossary-of-Servers Server (GlOSS) project is designed to locate the information sources most relevant to your query.

"Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies"

Luis Gravano
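
As a rough illustration of how a source-selection service such as GlOSS might rank databases, the sketch below keeps only a small summary per source (its document count and per-term document frequencies) and estimates how many documents in each source match a conjunctive query, assuming that terms occur independently of one another. The source names, the figures, and the function names are all hypothetical; this is a minimal sketch under those assumptions, not the project's code.

```python
from math import prod

# Hypothetical per-source summaries: total document count and, for each
# term, the number of documents containing it (all figures are invented).
SUMMARIES = {
    "cs-techreports": {"size": 1000, "df": {"digital": 200, "library": 100}},
    "medical-abstracts": {"size": 5000, "df": {"digital": 50, "library": 10}},
}

def estimate_matches(summary, terms):
    """Estimate how many documents contain ALL terms, assuming independence."""
    size = summary["size"]
    if size == 0:
        return 0.0
    # Each factor is the fraction of documents containing one term.
    return size * prod(summary["df"].get(t, 0) / size for t in terms)

def rank_sources(summaries, terms):
    """Return (source, estimate) pairs, most promising source first."""
    scored = [(name, estimate_matches(s, terms)) for name, s in summaries.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank_sources(SUMMARIES, ["digital", "library"])
for name, estimate in ranking:
    print(f"{name}: about {estimate:.1f} matching documents")
```

The point of the design is that the ranking needs only the compact summaries, so a query can be routed to promising sources without contacting every database first.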

Query Translator

Databases have different query syntax and different capabilities, even for simple Boolean queries. Translation allows a single query to be mapped into the native format appropriate for each database.

Chen-Chuan K. Chang
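
To make the translation idea concrete, the sketch below represents one Boolean query as a small tree and renders it in two imaginary native syntaxes. The tuple encoding, the syntax names `keyword` and `symbolic`, and the `translate` function are assumptions made for illustration. The sketch covers only differences in syntax; it does not handle sources whose capabilities differ, for example a source with no NOT operator.

```python
# A query is a nested tuple:
#   ("term", word), ("not", operand), or (op, left, right) with op in {"and", "or"}.
QUERY = (
    "and",
    ("term", "digital"),
    ("or", ("term", "library"), ("not", ("term", "museum"))),
)

# Hypothetical operator spellings for two imaginary search services.
SYNTAXES = {
    "keyword": {"and": " AND ", "or": " OR ", "not": "NOT "},
    "symbolic": {"and": " & ", "or": " | ", "not": "!"},
}

def translate(query, target):
    """Render the query tree in the operator spelling of the target syntax."""
    ops = SYNTAXES[target]
    kind = query[0]
    if kind == "term":
        return query[1]
    if kind == "not":
        return ops["not"] + translate(query[1], target)
    if kind in ("and", "or"):
        left = translate(query[1], target)
        right = translate(query[2], target)
        # Parenthesize every binary node so precedence never matters.
        return "(" + left + ops[kind] + right + ")"
    raise ValueError(f"unknown query node: {kind!r}")

for name in SYNTAXES:
    print(name, "->", translate(QUERY, name))
```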

SenseMaker

SenseMaker helps users iteratively reformulate their information needs through multi-dimensional organizing and active gathering of search results.

"SenseMaker: An Information-Exploration Interface Supporting the Contextual Evolution of a User's Interests"

Michelle Q Wang Baldonado

Grassroots

Grassroots is groupware for information finding that combines mail, news, and the web in a single environment with distribution lists.

"Grassroots: A System Providing a Uniform Framework for Communicating, Structuring, Sharing Information, and Organizing People"

Kenichi Kamiya
Martin Röscheisen

The Stanford Digital Library Metadata Architecture

Services need to provide:

metadata about their offerings, to help users decide when they should be invoked;
protocol metadata, to determine how they should be invoked; and
collection metadata, describing what they should be invoked upon.

The metadata architecture defines a system organization for supplying these metadata in a uniform, scalable way.

"Metadata for Digital Libraries: Architecture and Design Rationale"

Michelle Q Wang Baldonado
Chen-Chuan K. Chang
Luis Gravano
Andreas Paepcke

STARTS: Stanford Protocol Proposal for Internet Retrieval and Search

A set of informal standards negotiated among the major search vendors and users to facilitate interoperation.

Chen-Chuan K. Chang
Hector Garcia-Molina
Luis Gravano
Andreas Paepcke

BackRub

BackRub is a web crawler designed to store the connection graph of the web; in other words, it records which pages each web page links to. We are currently developing techniques that use this link data to improve web search engines and to understand the structure of the web.

Larry Page
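
The connection graph described above can be pictured as a mapping from each page to the pages it links to. The sketch below stores such a mapping for three invented pages and inverts it to obtain the pages that link to each page. The page names and function names are hypothetical, and counting incoming links is shown only as one simple use of link data, not as the technique the project is developing.

```python
from collections import defaultdict

# Hypothetical crawl output: each page mapped to the pages it links to.
FORWARD_LINKS = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}

def invert_links(forward):
    """Build the reverse index: for each page, the set of pages linking to it."""
    backward = defaultdict(set)
    for source, targets in forward.items():
        for target in targets:
            backward[target].add(source)
    return dict(backward)

def backlink_counts(backward):
    """Count how many distinct pages link to each page."""
    return {page: len(sources) for page, sources in backward.items()}

BACKLINKS = invert_links(FORWARD_LINKS)
COUNTS = backlink_counts(BACKLINKS)
print(sorted(COUNTS.items(), key=lambda item: item[1], reverse=True))
```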

ComMentor

Third-party annotations on web pages provide ways to share information, rate content, and keep notes.

"Content Ratings and Other Third-Party Value-Added Information: Defining an Enabling Platform"

Martin Röscheisen
Christian Mogensen
Terry Winograd

InterOp Protocol

The heart of the "InfoBus", this protocol describes access methods to search collections, acquire results, and find out about sources.

Steve Cousins
Prof. Hector Garcia-Molina
Scott Hassan
Andreas Paepcke

SCAM: The Stanford Copy Analysis Mechanism

Making a perfect digital copy of a copyrighted work is easy in a networked world. How can intellectual property rights holders be protected? One approach is to detect attempted distribution of illegal copies. Duplicate detection has other uses in information finding as well. An earlier, related project was known as COPS: The Copyright Protection Scheme.

"Building a Scalable and Accurate Copy Detection Mechanism"

Prof. Hector Garcia-Molina
Narayanan Shivakumar
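
For readers unfamiliar with copy detection, the sketch below shows one generic way to flag overlap between a candidate document and a registered work: compare their sets of word n-grams ("shingles") and report the fraction of the candidate's shingles found in the registered work. This is a stand-in technique chosen for brevity and is not claimed to be the method SCAM or COPS uses; the function names, the shingle length, and the sample texts are all assumptions.

```python
import re

def shingles(text, n=3):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def containment(candidate, registered, n=3):
    """Fraction of the candidate's shingles that also occur in the registered work."""
    cand = shingles(candidate, n)
    if not cand:
        return 0.0  # Too short to compare at this shingle length.
    return len(cand & shingles(registered, n)) / len(cand)

# Invented sample texts for illustration.
REGISTERED = "The quick brown fox jumps over the lazy dog."
EXCERPT = "quick brown fox jumps over"
UNRELATED = "completely different words appear here"

print(containment(EXCERPT, REGISTERED))
print(containment(UNRELATED, REGISTERED))
```

A score near 1 suggests the candidate is largely drawn from the registered work, while a score near 0 suggests little shared text. Any threshold for flagging a copy would have to be chosen empirically.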

InterBib

InterBib is a tool for maintaining bibliographic information. Capable of reading from and writing to many different formats, it acts as a unified, searchable repository of bibliographic records.

Information on InterBib

Andreas Paepcke