Information Finding Projects in the Stanford Digital Library

One of the major research thrusts of the Stanford Digital Library project is helping users to find information. We have initiated a number of projects in this area, most related to our over-arching theme of interoperability. We have looked at ways that search tools can be used across multiple sources that use different syntaxes or languages. We have also looked at tools to provide statistical or collaborative filtering to locate relevant articles.
FAB is an adaptive multi-agent information retrieval system which finds interesting pages on the web.

"An Adaptive Agent for Automated Web Browsing"

The Glossary-of-Servers Server (GlOSS) project is designed to locate the information sources most relevant to your query.

"Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies"

Query Translator
Databases have different query syntax and different capabilities, even for simple Boolean queries. Translation allows a single query to be mapped into the native format appropriate for each database.
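As a rough illustration of the idea, the sketch below maps one Boolean query onto several target syntaxes. The dialect names, operator spellings, and the nested-tuple query form are all assumptions made for this example; they are not the syntaxes or data structures used by the project's translator.

```python
# Hypothetical dialects: each maps a Boolean operator to the text that a
# (made-up) source expects between operands. A missing operator means the
# source lacks that capability.
DIALECTS = {
    "keyword": {"AND": " AND ", "OR": " OR "},
    "symbol": {"AND": " & ", "OR": " | "},
    "and_only": {"AND": " "},  # assumed source with no OR support
}


def translate(query, dialect):
    """Render a query into the chosen dialect.

    A query is either a search term (str) or a tuple (operator, operand, ...).
    """
    ops = DIALECTS[dialect]
    if isinstance(query, str):
        return query
    op, *operands = query
    if op not in ops:
        raise ValueError(f"{dialect} source does not support {op}")
    parts = []
    for operand in operands:
        text = translate(operand, dialect)
        if not isinstance(operand, str):
            text = f"({text})"  # parenthesize nested sub-queries
        parts.append(text)
    return ops[op].join(parts)


query = ("AND", "digital", ("OR", "library", "archive"))
print(translate(query, "keyword"))  # digital AND (library OR archive)
print(translate(query, "symbol"))   # digital & (library | archive)
```

A source that cannot express part of the query raises an error here; a fuller translator might instead send a broader query and filter the results afterwards.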

SenseMaker helps users iteratively reformulate their information needs through multi-dimensional organizing and active gathering of search results.

"SenseMaker: An Information-Exploration Interface Supporting the Contextual Evolution of a User's Interests"

Grassroots is groupware for information finding. It combines mail, news, and the web in a single environment with distribution lists.

"Grassroots: A System Providing a Uniform Framework for Communicating, Structuring, Sharing Information, and Organizing People"

The Stanford Digital Library Metadata Architecture
Services need to provide metadata. The metadata architecture provides a system organization for supplying these metadata in a uniform, scalable way.

"Metadata for Digital Libraries: Architecture and Design Rationale"

STARTS: Stanford Protocol Proposal for Internet Retrieval and Search
A set of informal standards negotiated among the major search vendors and users to facilitate interoperation.

BackRub is a web crawler designed to store the connection graph of the web. In other words, BackRub records which pages every web page links to. We are currently developing techniques that use this link data to improve web search engines and to understand the structure of the web.
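To make "connection graph" concrete, here is a minimal in-memory sketch that records outgoing links per page and derives the reverse (incoming) links. The class name, method names, and example URLs are hypothetical; BackRub's actual storage format is not described here.

```python
from collections import defaultdict


class LinkGraph:
    """Toy connection graph: which pages each page links to, and the reverse."""

    def __init__(self):
        self.out_links = defaultdict(set)  # page -> pages it links to
        self.in_links = defaultdict(set)   # page -> pages that link to it

    def add_page(self, url, links):
        """Record the links found on one crawled page."""
        self.out_links[url].update(links)
        for target in links:
            self.in_links[target].add(url)

    def links_from(self, url):
        return set(self.out_links.get(url, ()))

    def links_to(self, url):
        return set(self.in_links.get(url, ()))


graph = LinkGraph()
graph.add_page("http://example.com/a", ["http://example.com/b", "http://example.com/c"])
graph.add_page("http://example.com/b", ["http://example.com/c"])
print(len(graph.links_to("http://example.com/c")))  # 2
```

Keeping the reverse index alongside the forward links makes questions such as "which pages point here?" cheap to answer, which is one simple way link data could feed into search ranking.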

Third-Party Annotations on web pages provide ways to share information, rate content, and keep notes.

"Content Ratings and Other Third-Party Value-Added Information: Defining an Enabling Platform"

InterOp Protocol
The heart of the "InfoBus", this protocol describes access methods to search collections, acquire results, and find out about sources.

SCAM: The Stanford Copy Analysis Mechanism
Making a perfect digital copy of a copyrighted work is easy in a networked world. How can intellectual property rights holders be protected? One approach is to detect attempted distribution of illegal copies. Duplicate detection has other uses in information finding as well. An earlier, related project was known as COPS: The Copyright Protection Scheme.

"Building a Scalable and Accurate Copy Detection Mechanism"

InterBib is a tool for maintaining bibliographic information. Capable of reading from and writing to many different formats, it acts as a unified, searchable repository of bibliographic records.

Information on InterBib
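The read-many, write-many idea can be sketched as converting every input into one common record and rendering that record on output. The tagged input format, the field names, and the sample entry below are invented for this example; they are not InterBib's own formats or data.

```python
def from_tagged(text):
    """Parse 'tag: value' lines (a made-up plain format) into a record dict."""
    record = {}
    for line in text.splitlines():
        if ":" in line:
            tag, value = line.split(":", 1)  # split on the first colon only
            record[tag.strip().lower()] = value.strip()
    return record


def to_bibtex(record):
    """Render a record dict as a BibTeX @article entry."""
    fields = ",\n".join(
        f"  {name} = {{{record[name]}}}"
        for name in ("author", "title", "year")
        if name in record
    )
    return f"@article{{{record['key']},\n{fields}\n}}"


# Fictional sample entry, for illustration only.
sample = "key: smith97\nAuthor: A. Smith\nTitle: On Libraries: A Study\nYear: 1997"
print(to_bibtex(from_tagged(sample)))
```

With a shared record in the middle, supporting a new format needs only one reader and one writer rather than a converter for every pair of formats.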
