Digital Library Project
Stanford University
Annual Report. Feb 1, 1997
Reporting Period: Feb 1, 1996-Feb 1, 1997
Content:
Administrative matters
InfoBus Architecture and Testbed
Economics
User Interfaces
Searching
Agents
STARTS Internet Meta-Search Proposal
Miscellaneous Activities
References to Papers Produced in 1996
Other Publications

1. Administrative

Osaka Gas Information Systems, a fully owned subsidiary of Japan's Osaka Gas company, decided to send Yosuke Akamatsu to our Digital Library project as a visitor for one year. He is working on our economics thrust.

Many of us attended the University of Michigan DLI meeting in '96, and in December we hosted the meeting of DLI participants. We also spent time preparing for an additional pre-conference workshop of key people from the Digital Library Initiative and the leaders of the National Digital Library Federation (directors of the nation's largest research libraries). Approximately 55 people attended. The meeting built a stronger bridge and encouraged communication between the NDLF leadership and the NSF/DARPA/NASA Digital Library projects. The most productive discussions between these two groups centered around search and discovery methods, and metadata. A handful of NDLF people stayed for the meeting on the 16th and 17th.

Carl Lagoze and David Fielding of Cornell University, and Jim Davis of Xerox PARC joined us part time. They bring rich, relevant experience from their work on the NCSTRL project.

Several students spent their summer at surrounding companies, mostly continuing Digital Library research, though usually with their host companies' goals driving their direction. This has enriched the project, as the students bring new ideas and feedback to us.

Twenty-one working papers from the Stanford Digital Library Project have been made into technical notes or technical reports. This will give the DigLib papers more visibility and archive them in the library.

2. InfoBus Architecture and Testbed

We spent a considerable amount of effort on exploring the possibilities of Java for the InfoBus. Our interest lies particularly in the use of Java to distribute InfoBus access software. In this scheme, all access software is obtained through the Web as a Java applet. Once at the client site, the applet assumes the role of a CORBA ORB, the core part of the CORBA system which facilitates communication with other objects. This mode of deployment is nearing completion and will be the basis for our library user studies.

Work on the Zserver was completed. This is an InfoBus proxy which behaves towards clients like a Z39.50 server. But it does not itself contain any data. Instead, it translates incoming Z39.50 requests into InfoBus requests. These are forwarded through the InfoBus to any of the InfoBus-accessible sources. The resulting information is returned to the client as if it came from a Z39.50 service. The advantage of this is that standard Z39.50 clients can access the InfoBus without the use of our software. The Zserver completes our Z39.50 interoperability suite, complementing the University of Michigan Z39.50 client to InfoBus software reported on earlier.
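
The translate-and-forward behavior described above can be illustrated with a small Python sketch. It is only a toy model: the names ZRequest, InfoBusQuery and ZServerProxy are hypothetical, the real Z39.50 wire encoding is not modeled, and the "contains" expression syntax is an assumption made for the example rather than the actual InfoBus request format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-ins: neither the real Z39.50 wire format nor the
# real InfoBus interfaces are modeled here.
@dataclass
class ZRequest:
    database: str   # database name chosen by the Z39.50 client
    attribute: str  # e.g. "title" or "author"
    term: str       # search term

@dataclass
class InfoBusQuery:
    source: str
    expression: str

# An InfoBus-accessible source is modeled as a function from query to records.
Source = Callable[[InfoBusQuery], List[str]]

class ZServerProxy:
    """Looks like a Z39.50 server to clients but holds no data of its own."""

    def __init__(self, sources: Dict[str, Source]):
        self.sources = sources

    def translate(self, request: ZRequest) -> InfoBusQuery:
        # Rewrite the incoming request into the (assumed) InfoBus form.
        expression = f"{request.attribute} contains {request.term}"
        return InfoBusQuery(request.database, expression)

    def search(self, request: ZRequest) -> dict:
        query = self.translate(request)
        records = self.sources[query.source](query)  # forward to the source
        # Package the answer as if it came from a Z39.50 service.
        return {"database": request.database, "hits": len(records), "records": records}

def demo_source(query: InfoBusQuery) -> List[str]:
    # Toy in-memory source used only for this illustration.
    catalog = ["Digital Libraries", "Distributed Objects", "Digital Payment"]
    term = query.expression.split(" contains ", 1)[1].lower()
    return [title for title in catalog if term in title.lower()]

proxy = ZServerProxy({"demo": demo_source})
response = proxy.search(ZRequest("demo", "title", "digital"))
print(response)
```

The client only ever sees the ZRequest and the response dictionary, which is the point of the design: the data stays with the sources behind the proxy.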

The University of California, Santa Barbara was added as an InfoBus repository. We can search over the Alexandria collection and obtain map metadata and GIF thumbnails of maps. Similarly, Project Alexandria has demonstrated access to our other sources through the InfoBus.

We constructed a socket-based interface to the InfoBus which allows for using our DLIOP protocol through the Unix socket interface. CMU has used it successfully to access the InfoBus.

We developed a Web proxy generator. This is a facility which allows users to define the behavior of Web-based search services. Using this description, the facility automatically constructs an InfoBus proxy which can then provide access to the service through the InfoBus.
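
As a hedged illustration of generating a proxy from a description, the Python sketch below builds a search function from a small declarative record. The description fields (url_template, result_pattern), the example URL and the make_proxy helper are all assumptions made for this example; the actual description format used by the generator is not shown in this report. Page fetching is injected as a function so the sketch runs without network access.

```python
import re
from typing import Callable, Dict, List
from urllib.parse import quote_plus

def make_proxy(description: Dict[str, str], fetch: Callable[[str], str]):
    """Build a search function from a declarative service description."""
    pattern = re.compile(description["result_pattern"])

    def search(query: str) -> List[str]:
        # Fill the query into the service's URL and extract hits from the page.
        url = description["url_template"].format(query=quote_plus(query))
        return pattern.findall(fetch(url))

    return search

# Hypothetical description of a Web-based search service.
description = {
    "url_template": "http://search.example.org/find?q={query}",
    "result_pattern": r"<li>(.*?)</li>",
}

requested: List[str] = []

def fake_fetch(url: str) -> str:
    # Stand-in for an HTTP request; records the URL and returns a canned page.
    requested.append(url)
    return "<ul><li>First hit</li><li>Second hit</li></ul>"

search = make_proxy(description, fake_fetch)
hits = search("digital library")
print(hits)
```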

Work progressed on further integrating the SCAM service into the InfoBus architecture. When this work is finished, SCAM will be accessible via method calls. In addition, a Java interface is being developed.

The second version of InterBib was released. It now provides support for RTF and for bibliography search. The RTF support means that users may now submit MS Word and HTML documents with BibTeX or Refer bibliographies to InterBib, and have reference lists added to the documents. Version one only supported Framemaker. The search facility allows users to search over all the bibliographies that were previously submitted to InterBib, were parsed correctly, and were released for public access by the submitters. InterBib was also fixed to properly handle submissions originating from Macintosh computers. Previously, only PC and Unix were handled properly. InterBib is available at http://www-interbib.stanford.edu/~testbed/interbib.

Work progressed on allowing views to be added to objects. This facility helps when multiple services interact with document objects, and each facility would like to 'pretend' that the documents have different attributes and behavior. With document views, each module can attach a different view to the document objects it operates on. This is like dynamically adding a superclass to the object's class. We are also exploring additional uses of views, such as access control to documents, the dynamic addition of payment capabilities, etc.
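
The superclass analogy can be made concrete with a short Python sketch. This is an illustration of the analogy only, not the testbed's mechanism: the classes Document, PayableView and SummaryView and the helper attach_view are hypothetical names introduced for the example.

```python
class Document:
    def __init__(self, title, text):
        self.title = title
        self.text = text

class PayableView:
    """Hypothetical view adding payment attributes and behavior."""
    price_cents = 50

    def invoice(self):
        return f"{self.title}: {self.price_cents} cents"

class SummaryView:
    """Hypothetical view adding a crude summary operation."""

    def summary(self, words=3):
        return " ".join(self.text.split()[:words])

def attach_view(obj, view):
    """Give one object an extra base class without touching other instances."""
    base = type(obj)
    # Build a new class whose bases are the view and the object's current class,
    # then re-point this single instance at it.
    obj.__class__ = type(f"{base.__name__}With{view.__name__}", (view, base), {})
    return obj

doc_a = attach_view(Document("Report", "InfoBus proxies translate requests"), PayableView)
doc_b = Document("Memo", "Unchanged document")

print(doc_a.invoice())             # behavior supplied by the attached view
print(hasattr(doc_b, "invoice"))   # other documents are unaffected
```

Because the view is attached per object, two modules can hold the same kind of document yet see different attributes and behavior on the instances they operate on.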

We added a stand-alone collection service to the InfoBus. This allows InfoBus clients to store documents and other objects in DLIOP compliant collections with a variety of underlying storage managers. This facility is beginning to get used throughout the system.

The ability for search proxies to support subcollections was added. This allows convenient access to external services with multiple collection offerings. Example: Knight-Ridder's Dialog Information Service.

A new proxy to the Xerox PARC document summarization service was developed.

A new proxy for the NCSTRL collection was added to the InfoBus by our colleagues at Cornell.

A proxy was constructed which uses a converter from the New Zealand Digital Library to convert postscript to approximate text.

We constructed a DLITE component and proxy to TextBridge, a remotely accessible Xerox OCR service. The intent is for users to send a document image; the service then returns an OCRed copy. The technical aspect of this work is completed, and we are in negotiation with Xerox regarding release of the facility outside the Xerox firewall.

Work progressed on making the InfoBus testbed thread-safe. We have several proxies and parts of SenseMaker running with threads. More work is needed in this area.

3. Economics

Work began on InterPay II. It is intended to move beyond interoperability of payment to the notion of interoperability among 'shopping models'. Thus, our interoperability concerns for online financial activities are now expanding to include issues such as the sequence of component actions such as offers, invoice delivery, negotiation, verification, payment, document delivery, etc.

Our framework for distributed commerce transactions was presented at an international conference on distributed computing systems. Follow-up work has improved the basic architecture and algorithm, which has been proven sound and complete. Extensions for direct trust and deadlines are in progress.

In the SCAM work, we considered the case when documents are located in several distributed text databases. In this scenario, we studied several liberal and conservative techniques that helped us "prune" away databases that were very unlikely to have document copies. A new paper on dSCAM presents algorithms that compute "minimal" queries to retrieve potential copies of illegal documents in autonomous text databases.

Work on relationship management for access control (R-MANAGE) moved from design towards implementation. We have been augmenting the basic testbed infrastructure with rights management facilities in order to deal with access control issues such as licensing and copyrights. The basic approach is based on a contract framework, adding both persons ("epers") and relationships ("commpacts") as first-class citizens to the interface and architecture. An "e-person" (or "epers") is a persistent, access-controlled electronic representation of a person that provides a single point of reference for everything related to this person, including controlled ways to get hold of personal information, to request approvals, to leave behind notifications, to automatically set up certain standard relationships (such as accounts with content providers), etc. An epers is essentially a generalization of a Unix account. A "commpact" is a "communication pact" between e-persons that will be the representation of a legal contract in many cases, but it can also encapsulate less formal agreements, such as privacy-related ones. Commpacts are technically a set of rights and obligations. For example, the obligation to pay a certain sum for a subscription to a newsletter might be one such promise in a subscription commpact. Actions are then evaluated (authorized) with respect to the context of a previously established commpact. R-MANAGE integrates with InterPay II at this action level. Much of its functionality is available to users as part of a separate "relationship manager" task in the task interface.
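
To make the idea of evaluating actions against a commpact concrete, here is a minimal Python sketch. The class names EPerson and Commpact, the string-valued rights and obligations, and in particular the policy that rights take effect only once all obligations are fulfilled are assumptions made for illustration; they are not taken from the R-MANAGE design.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass(frozen=True)
class EPerson:
    name: str

@dataclass
class Commpact:
    parties: Set[EPerson]
    rights: Set[str]                                    # actions the pact permits
    obligations: Set[str] = field(default_factory=set)  # promises still outstanding

    def fulfil(self, obligation: str) -> None:
        self.obligations.discard(obligation)

    def authorizes(self, actor: EPerson, action: str) -> bool:
        # Assumed policy: only parties hold rights, and only once
        # every obligation in the pact has been met.
        return actor in self.parties and action in self.rights and not self.obligations

reader = EPerson("reader")
publisher = EPerson("publisher")
pact = Commpact({reader, publisher}, {"read-newsletter"}, {"pay-subscription"})

before = pact.authorizes(reader, "read-newsletter")  # payment still owed
pact.fulfil("pay-subscription")
after = pact.authorizes(reader, "read-newsletter")   # obligation met
print(before, after)
```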

The following economics-related interface components (with their corresponding backend proxies) are available in the DLITE interface:

Person objects, contract objects, and certificate objects
Searchable person "home provider" services for Stanford and Xerox PARC
A searchable contract forms provider for standard contract forms
A contract manager and an offer creation service
"File system" manager based on persistent collections
Miscellaneous proxies for searching persons (whois).
Certification services: sample proxies for certifying simple properties such as affiliation with Stanford (Stanford lookup proxy).

DL proxies now also have an "owner" with whom users can contract for usage terms and conditions.

Authentication: Based on the person representation and public-key credentials (RSA/MD5) issued by the home provider, a "network login" facility has been added to the testbed. Both the browser (Netscape via cookies) and the DLITE task viewers are thus able to convey who is using them, and testbed services can securely identify their users.

4. User Interfaces

In parallel to the Java experiments at the infrastructure level, the user interface thrust expanded its scope to explore how the easy delivery of InfoBus access software would impact user-level interaction with the InfoBus. We focused in particular on the possibility of collaboration among users, some of whom might be mobile. Our driving scenario includes a user on the road who needs to consult with a reference librarian at home. Challenges being addressed include striking a good balance between making screen interactions visible to all parties and respecting bandwidth and latency constraints.

SenseMaker has evolved in that it now uses a structured hierarchical attribute interlingua for computing choices for grouping results into categories. In the previous prototype, we were using a hierarchy of target sources instead.

We began to develop a conceptual model for information finding that can be thought of as a Recursive Extensible Active Card catalog for Heterogeneity (REACH). It explains the categorization activity enabled by SenseMaker as the task of creating virtual card catalogs.

We completed the initial design of a new front-end interface for SenseMaker. This design introduces the concept of the "hi-citer." A hi-citer describes an information object through a delineated sequence of attribute values with special highlighting properties. Given a set of hi-citers, highlighting an attribute value in one causes that same attribute to be highlighted in the other hi-citers in the set. Hi-citers allow for fast skimming and the quick comparison of "citations" in a heterogeneous environment. We plan to implement the new design in the coming quarter.
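
The linked-highlighting behavior can be sketched in a few lines of Python. This is only an illustration of the concept as described above, not the planned front-end: the HiCiter class, the highlight function and the asterisk rendering are hypothetical, and the second citation is invented sample data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class HiCiter:
    values: Dict[str, str]                        # attribute name -> value
    highlighted: Set[str] = field(default_factory=set)

    def render(self) -> str:
        # A delineated sequence of attribute values; highlighted ones are starred.
        parts = []
        for attr, value in self.values.items():
            parts.append(f"*{value}*" if attr in self.highlighted else value)
        return " | ".join(parts)

def highlight(citers: List[HiCiter], attribute: str) -> None:
    """Highlighting an attribute in one hi-citer highlights it in the whole set."""
    for citer in citers:
        if attribute in citer.values:
            citer.highlighted.add(attribute)

citers = [
    HiCiter({"author": "Paepcke", "title": "Searching is Not Enough", "year": "1996"}),
    HiCiter({"title": "Hypothetical Video Clip", "year": "1995"}),  # invented sample
]
highlight(citers, "year")
rendered = [citer.render() for citer in citers]
print("\n".join(rendered))
```

The two records have different attribute sets, which is the heterogeneous case: the shared attribute lines up visually even though the citations differ in shape.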

Threads were added to the current version of SenseMaker. This allows multiple users to access SenseMaker over the Web at once.

We added support for new proxies in SenseMaker. This broadens the types of sources with which SenseMaker can communicate. An interesting new example is Informedia, the CMU video search service. We are also working to allow for third-party bundling in SenseMaker.

Formal testing of our audio-based Web interface technology has been completed. Analysis of test results is ongoing.

We ran several user studies for our DLITE interface in which users were asked to complete a bibliographic task with different versions of DLITE. Several changes have been made to the system in response to these studies.

We added another interface component to DLITE for helping users compose fielded queries. This helps novice users who need fielded, but keyword-only, query entry. This work was undertaken in response to our user testing.

Subcollection support was added to our source constructor. This allows users to create interface components that represent subcollections of external services. Dropping queries into these will cause searches in those corresponding subcollections.

Our WebWriter and InterBib systems were incorporated into the testbed and into DLITE.

5. Searching

We constructed first versions of interfaces to CSQuest and CMU's Informedia project. CSQuest is a concept space over Inspec which was constructed at the University of Arizona (see University of Illinois's DL project for an explanation of this term). We are planning to use this facility for our interactive query development module.

The CMU proxy searches over Informedia's video text, title and abstract database, not the video itself. We are planning to move this first implementation to CMU's evolving RPC interface.

Our front-end query language was extended to enable more effective queries to UCSB's ADL map service. Specifically, our data types now include numeric values, and the attribute set has been extended to include various spatial aspects. We also extended the query translation mechanism to add missing features, such as negation and stemming.

We added a query translator for NCSTRL's Dienst server, which uses a STARTS-like query language.

A translation service to AltaVista (a Boolean search engine on the Web) was added to the testbed, and the query translation implementation was ported to SunOS and Solaris.

Our SenseMaker search interface progressed further this quarter. Recall that SenseMaker users "make sense" out of their result collections by looking at them through multiple views. Within a view, complexity is reduced through user-directed "merging" and "bundling" of results. Now, SenseMaker users can also contextually evolve the direction of the search process once they have made sense of the current collection of results. They can expand upon, limit, or replace the current collection of results. Examples of expand actions implemented this quarter include:

Query-by-example. Users can point to "bundles" of related results and ask for them to serve as examples of what is to be found.
Query refinement. Users can change their queries directly and can also accept suggestions as to new terms they might use in their query. These suggestions are obtained from the CSQuest proxy.

Two user studies were conducted to test the SenseMaker interface.

In our query translation project we conducted experiments for measuring the cost of our query translation approach. We have compared the selectivity of front-end and translated queries to understand the post-filtering cost. Specifically, the experiment was designed to measure selectivity degeneration with respect to the following translation schemes:

When a query consisting of proximity operators must be replaced by weaker operators such as AND.
When a query includes stopwords that must be removed from the query.
When a query uses the Equals operator (i.e. phrase search) which must be replaced by the Contains operator (i.e., keyword search).
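
The first scheme in the list above can be illustrated with a toy Python example. A proximity query is replaced by a weaker AND query at the source, and the original proximity condition is then applied as a post-filter; the fraction of retrieved documents that the post-filter discards is one plausible way to quantify the extra cost. The tiny corpus, the unordered word-distance definition of proximity and the extra_fraction measure are all assumptions made for the sketch, not the experimental setup used in the project.

```python
import re
from typing import List

def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_and(doc: str, a: str, b: str) -> bool:
    # Weaker operator: both words occur anywhere in the document.
    words = tokens(doc)
    return a in words and b in words

def matches_near(doc: str, a: str, b: str, distance: int) -> bool:
    # Proximity operator (assumed unordered): words at most `distance` positions apart.
    words = tokens(doc)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

corpus = [
    "digital library interoperability",
    "a digital archive held by the university library",
    "library budgets and digital rights over the coming decade",
    "payment protocols",
]

# The translated (weaker) query runs at the source ...
retrieved = [d for d in corpus if matches_and(d, "digital", "library")]
# ... and the original query is applied afterwards as a post-filter.
answer = [d for d in retrieved if matches_near(d, "digital", "library", 1)]

# Share of retrieved documents that the post-filter has to throw away.
extra_fraction = (len(retrieved) - len(answer)) / len(retrieved)
print(len(retrieved), len(answer), round(extra_fraction, 2))
```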

In support of several subprojects we designed a set of metadata components which will work on the InfoBus, and which will satisfy several of our metadata needs. In particular, we added support for multiple attribute models in our front-end query language. Users can use attrModel.attrName notation to specify the search attributes in queries. The client of the query translator can specify a default attribute model. To support multiple attribute models, the query translator internally keeps no knowledge of the various models (except a default one); it relies on attribute model proxies to provide attribute details of standard models such as Bib-1.
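
A minimal Python sketch of resolving the attrModel.attrName notation follows. The MODEL_PROXIES table stands in for the attribute model proxies, and its model names and attribute lists are illustrative placeholders only; the resolve function is a hypothetical helper, not the translator's actual interface.

```python
from typing import Dict, Set, Tuple

# Hypothetical stand-in for attribute model proxies: each model name maps to
# the attributes it defines. Real models such as Bib-1 are far larger.
MODEL_PROXIES: Dict[str, Set[str]] = {
    "bib1": {"title", "author"},
    "geo": {"latitude", "longitude"},
}

def resolve(attribute: str, default_model: str) -> Tuple[str, str]:
    """Split 'attrModel.attrName'; unqualified names use the default model."""
    model, sep, name = attribute.partition(".")
    if not sep:
        model, name = default_model, attribute
    if model not in MODEL_PROXIES:
        raise KeyError(f"unknown attribute model: {model}")
    if name not in MODEL_PROXIES[model]:
        raise KeyError(f"{model} has no attribute {name}")
    return model, name

print(resolve("geo.latitude", default_model="bib1"))
print(resolve("title", default_model="bib1"))
```

Keeping the attribute details in a separate lookup mirrors the design choice described above: the resolver itself knows nothing about any model beyond the name of the default one.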

We have designed a new algorithm for automatically classifying text documents into an existing topic hierarchy. That is, given a pre-existing hierarchy of documents, e.g., the Yahoo collection, our algorithm learns how to take new documents and insert them into their appropriate place in the hierarchy. The algorithm utilizes some of our previous work on feature selection for text documents.

We have also started a collaborative effort with researchers at MIT, whose goal is to improve the accuracy of ad hoc queries to a corpus. The idea is to combine latent semantic indexing with machine learning techniques. The former teases out some set of themes that are dominant in the corpus, while the latter determines how important the various themes are. For example, some themes may correspond to topics, while others may correspond only to stylistic differences.

6. Agents

The Fab adaptive information retrieval system has been running "live" since the end of March. Several user tests have been performed.

A proxy for Fab was added to the testbed, and an interface has been built to allow use of Fab from within our DLITE interface.

7. STARTS Proposal for Meta-Search Support

We have been active in the definition of a proposal to support metasearching on the Internet. The proposal addresses three problems encountered by services that search multiple, heterogeneous search engines to satisfy a given query: finding promising collections, submitting appropriate forms of the query to the corresponding engines, and merging result rankings.
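
The third problem, merging result rankings, arises because engines report scores on incompatible scales. The Python sketch below shows one naive approach (min-max normalization per engine, keeping the best score per document) purely to illustrate the problem. It is not the mechanism defined by the proposal, and the engine names, documents and scores are invented.

```python
from typing import Dict, List, Tuple

def normalize(results: Dict[str, float]) -> Dict[str, float]:
    """Rescale one engine's scores to [0, 1] (min-max normalization)."""
    if not results:
        return {}
    lo, hi = min(results.values()), max(results.values())
    span = hi - lo
    return {doc: ((score - lo) / span if span else 1.0) for doc, score in results.items()}

def merge(rankings: List[Dict[str, float]]) -> List[Tuple[str, float]]:
    """Merge several rankings, keeping each document's best normalized score."""
    merged: Dict[str, float] = {}
    for results in rankings:
        for doc, score in normalize(results).items():
            merged[doc] = max(merged.get(doc, 0.0), score)
    # Highest score first; ties broken by document name for determinism.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))

engine_a = {"doc1": 9.0, "doc2": 3.0, "doc3": 6.0}        # invented scores, scale 0-10
engine_b = {"doc4": 950.0, "doc2": 500.0, "doc5": 50.0}   # invented scores, scale 0-1000
merged = merge([engine_a, engine_b])
for doc, score in merged:
    print(doc, score)
```

Normalization of this kind discards whatever the raw scores meant at each engine, which is one reason a metasearcher benefits from engines exporting more information about how their rankings were produced.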

We held a one day workshop with several major search engine providers and consumers to reach agreement on a final draft. This draft is available at http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1996-0043.

The Z39.50 community is working on a Z39.50 profile based on STARTS. Our Cornell colleagues have just completed a reference implementation of the protocol (http://www2.cs.cornell.edu/Lagoze/starts/starts_reference.htm).

8. Miscellaneous Activities

Prof. Jerry Saltzer of MIT visited for one month, meeting individually with project team members and attending the seminars and weekly technical design meetings.

8.1 Visitors and Industry Contacts

Cornell
Hitachi on several occasions
Intuit
University of Karlsruhe
LRI, France
University of Massachusetts
Matsushita
Microsoft (Bill Gates)
MIT Media Lab
NEC
Osaka Gas
SUN
Tokyo University
Rebecca Lasher met with Tuija Sonkkila from Helsinki.
Rebecca Lasher met with several Korean visitors.
Rebecca Lasher and Martin Roscheisen met with Pam Samuelson from Berkeley Law School and School of Library and Information Science.
Paul Allen.
Phone interview with Bob Burkman, Editor of "Information Advisor", a technical newsletter.
NTT Data of Japan.
Rebecca Lasher and Vicky Reich met with Mark Stefik of Xerox PARC.
Vicky Reich met with Paivi Kytomaki, Director of a library in Finland.
Vicky Reich met with Kuriyama Masamitu, a reference librarian from Japan.
Daphne Koller and Vicky Reich met with visitors from Hitachi.
Vicky Reich met with visitors from Matsushita.
Marko Balabanovic: Gerry Andeen and John Eastling, Personal Discovery (will collaborate on psychographic profiling for web page recommendation)
Marko Balabanovic: Paul Francis, NTT Japan (investigating relationships to their Ingrid system)
Marko Balabanovic: Journalist from Frankfurt newspaper
Marko Balabanovic: Thomas Bayer, Daimler Benz Research (automatically distributing email inquiries)
Michelle Baldonado, Steve Cousins, Luis Gravano: Two visitors from India.
Michelle Baldonado: Paul Francis from NTT
Kevin Chang Talked with Carl Lagoze (NCSTRL, Cornell Univ.) about suggested changes to the Dienst query language to facilitate integration of NCSTRL into our testbed.
Talked with Doreen Cheng from Philips Palo Alto Research Lab on our work on query translation.
Steve Cousins Met with Roy Jones from the Stanford Business School to discuss DL issues.
Steve Cousins Met with Stanford Alumni to present the Digital Library project.
Hector Garcia-Molina Talked to John Sarborg Pedersen from Embassy of Denmark about emerging DigLib project in Denmark.
Hector Garcia-Molina, Andreas Paepcke, Terry Winograd, Rebecca Lasher: Steve Griffin of NSF
Steve Ketchpel Attended AAAI as member of program committee
Daphne Koller Met with Dr. Iwayama and Dr. Niwa from Hitachi.
Daphne Koller Met with Dr. Alon Levy from AT&T Research on the topic of architectures for intelligent information gathering.
Rebecca Lasher: Carole Alcock, librarian from Australia.
Rebecca Lasher: Carl Lagoze, from Cornell at Stanford.
Rebecca Lasher Met with five visitors from Denmark, who are guiding digital library development there. The group represented the Ministry of Research and Information Technology, Ministry of Culture, Ministry of Research, Ministry of Education, and the Royal Danish Embassy.
Rebecca Lasher: Andrew Odlyzko, from AT&T.
Vicky Reich Entertained visitors from Max Planck Institute.
Vicky Reich and Martin Roscheisen Invitational Workshop on Terms and Conditions, NY, Sept 22-24th. Organized by Jim Davis and Judith Klavans.
Terry Winograd, Steve Cousins, Scott Hassan: Visitors from Sony (included discussion and demo): Toshitada Doi, President of the D21 Laboratory and member of Sony's Board of Directors (formerly, Dr. Doi ran Sony's workstation operations worldwide); Masao Watari, Manager of the Speech Group, D21 Laboratory; Hiroaki Ogawa, Engineer of the Speech Group, D21 Laboratory; Mick Tanaka, Manager of Speech Recognition, Sony Research Labs (San Jose)
Terry Winograd: Tomoyuki Yoshida, National Institute of Bioscience and Human Technology; Watanabe Masayoshi, MITI; Kazushige Suzuki, Research Institute of Human Engineering for Quality Life; Masaki Taniguchi, Osaka National Research Institute

8.2 Public Presentations and Meetings Attended

We organized a one-day workshop where several major search engine providers and consumers discussed the STARTS proposal.

Michelle Baldonado, 38th Allerton Institute 1996 (Libraries, People, and Change: A Research Forum on Digital Libraries)
Steve Cousins, Demonstration for research library staff at Xerox PARC
Steve Cousins, Hypertext 96
Hector Garcia-Molina, Distinguished Lecture, University of Illinois at Chicago (Feb 2, 96)
Hector Garcia-Molina, Rebecca Lasher Meeting of DL Forum Working Group on Repositories and Interoperability, (March 11-12, 96)
Hector Garcia-Molina, Distinguished Lecture, ETH Zurich (April 22, 96)
Hector Garcia-Molina, Talk and visit to Hitachi in Tokyo, Japan (May 20-21, 1996)
Hector Garcia-Molina, Distinguished Lecture, Hong Kong University (May 23, 1996)
Hector Garcia-Molina, Andreas Paepcke, University of Michigan. DLI meeting
Kenichi Kamiya, World-Wide Web conference. Presented paper
Steve Ketchpel, 16th International Conference on Distributed Computing Systems (DCS'96). Presented paper.
Steve Ketchpel, CommerceNet meetings
Andreas Paepcke, DELOS meeting of ERCIM. Presented overview of Stanford DL project.
Vicky Reich, User evaluation meeting at UCLA
Vicky Reich Intellectual Property Symposium, UC Berkeley Law School
Vicky Reich San Jose School of Library and Information Science: Presented talk on Digital Libraries and Electronic Publishing
Vicky Reich Stanford Humanities Center. Presented talk on Digital Libraries and Electronic Publishing
Vicky Reich Attended Hackers Conference in Santa Rosa.
Vicky Reich Coalition for Networked Information meeting, San Francisco
Vicky Reich Organized half day visit for Kenneth Crews. Ken gave various talks on intellectual property and copyright in the digital age.
Martin Roscheisen, Beyond Privacy as Anonymity: Rights Management Technologies for Privacy in a Market of Personal Information. Lunchtime presentation at Computers, Freedom, and Privacy. Conference on Computer Freedom and Privacy, CFP96, Boston, March 28-30, 1996.
Martin Roscheisen, IEEE Symposium on Research in Security and Privacy
Narayanan Shivakumar, DL96. Presented paper (see Bibliography below)
Terry Winograd, Steve Cousins, Kenichi Kamiya, CHI 96. Session chair. Presented poster and short paper.
Group attendance at DLI Meeting at the University of Michigan.
Marko Balabanovic: "Adaptive Information Retrieval on the World-Wide Web", CSLI Interface Lab Tutorials, May 13 1996 (part of a series of talks for the CSLI industrial affiliates).
Luis Gravano: Invited talk, W3C Distributed Indexing/Searching Workshop (Cambridge, Massachusetts, May 28-29)
Steve Ketchpel: "Making Trust Explicit in Distributed Commerce Transactions". Presentation at International conference on Distributed Computing Systems
Steve Ketchpel: Presented "U-PAI: A Universal Payment Application Interface" at the Second USENIX Workshop on Electronic Commerce. November 20, 1996.
Rebecca Lasher, Andreas Paepcke, Scott Hassan: Carl Lagoze and Jim Davis of Cornell University
Rebecca Lasher: Presentation on digital libraries and the economics of electronic journals at the annual Special Libraries conference in Boston June 9, 1996
Rebecca Lasher: NCSTRL meeting in Washington D.C.
Rebecca Lasher: Met with nine court officials from Korea. The group were composed of judges and computer scientists who are working together to build a legal digital library.
Andreas Paepcke: Invited talk on digital libraries, object file systems, and databases in the context of the Web and distributed object technology. OMG/W3C workshop on distributed objects and mobile code in Boston.
Vicky Reich and Rebecca Lasher: Attended a full day class on copyright at UC Berkeley, May 4th.
N. Shivakumar: Phone interview with Phil Ross, Reporter, Forbes Magazine.
Terry Winograd: Talk and book signing, Stanford book store.
Terry Winograd: Presentation at Interval Research: "User models and ontologies".
Luis Gravano Attended a meeting of Z39.50 implementors to help launch a new Z39.50 STARTS-based profile.
Steve Ketchpel Gave talk at Rutgers University during workshop on Trust Management in Networks
Steve Ketchpel Talk at AT&T Research to their online information systems & Services group about U-PAI and Distributed Transactions.
Daphne Koller Attended the annual American Association for Artificial Intelligence conference in Portland, Oregon.
Daphne Koller Attended the annual Uncertainty in Artificial Intelligence conference in Portland, Oregon.
Rebecca Lasher and Vicky Reich Met with ACM publications staff as part of a librarian advisory committee on electronic publications.
Andreas Paepcke Presented our metadata architecture at the 2nd Delos workshop of ERCIM, a research consortium of the European Union.
Mehran Sahami Attended and presented at the annual Machine Learning conference in Bari, Italy.
Mehran Sahami Attended the annual American Association for Artificial Intelligence conference in Portland, Oregon.
Mehran Sahami Attended the annual Uncertainty in Artificial Intelligence conference in Portland, Oregon.
Mehran Sahami Attended and presented at the annual Knowledge Discovery in Databases conference in Portland, Oregon.
Terry Winograd Talk at Adobe, Mountain View, on the digital libraries project: Digital Libraries, Documents, and Services
Terry Winograd Organized and spoke at workshop on HCI Design, including talk on "Designing the Space of Interactions"
Terry Winograd Workshop at Stanford on copyright and new technology, organized by the Register of Copyrights, and Stanford Law School.
Terry Winograd Invited speaker at the Silicon Valley Chapter of the Association for Software Design

8.3 Local Events

NSF/ARPA/NASA site visit
Stanford Forum. A large gathering of industry representatives. Presented InfoBus technology and SenseMaker
Hosted DLI meeting in December
Hosted NDLP meeting in December

8.4 Regular Meetings/Seminars

Weekly Digital Library seminar
Executive committee meetings when required
Weekly technical design meetings

9. References for 1996

[1] Marko Balabanovic and Yoav Shoham. Combining Content-Based and Collaborative Recommendation. Communications of the ACM, 40(3), March, 1997.

[2] Marko Balabanovic. An Adaptive Web Page Recommendation Service. In Proceedings of the First International Conference on Autonomous Agents, February, 1997.

[3] Michelle Q Wang Baldonado and Steve B. Cousins. Addressing heterogeneity in the networked information environment. Review of Information Networking, to appear.

[4] Michelle Q Wang Baldonado and Terry Winograd. SenseMaker: An Information-Exploration Interface Supporting the Contextual Evolution of a User's Interests. In Proceedings of the Conference on Human Factors in Computing Systems, 1997.

[5] Michelle Baldonado, Chen-Chuan K. Chang, Luis Gravano, and Andreas Paepcke. The Stanford Digital Library Metadata Architecture. International Journal of Digital Libraries, 1(2), February, 1997. See also http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1996-0051.

[6] Michelle Baldonado, Chen-Chuan K. Chang, Luis Gravano, and Andreas Paepcke. Metadata for Digital Libraries: Architecture and Design Rationale. Submitted to DL97, 1997. At http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1997-0055.

[7] Edward Chang and Héctor García-Molina. Reducing Initial Latency in a Multimedia Storage System. In Third International Workshop of Multimedia Database Systems, 1996.

[8] Chen-Chuan K. Chang, Héctor García-Molina, and Andreas Paepcke. Boolean Query Mapping Across Heterogeneous Information Sources. IEEE Transactions on Knowledge and Data Engineering, 8(4):515-521, Aug, 1996.

[9] Edward Chang and Hector Garcia-Molina. Minimizing Memory Use In Video Servers. Number SIDL-WP-1996-0045. Stanford University, December, 1996.

[10] Chen-Chuan K. Chang and Hector Garcia-Molina. Evaluating the Cost of Boolean Query Mapping. Submitted to DL97, 1997. At http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1997-0053.

[11] Steve B. Cousins, Scott W. Hassan, Andreas Paepcke, and Terry Winograd. A Distributed Interface for the Digital Library. Number SIDL-WP-1996-0037. Stanford University, 1996. Accessible at http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1996-0037.

[12] Steve B. Cousins, Andreas Paepcke, Terry Winograd, Eric A. Bier, and Ken Pier. The Digital Library Integrated Task Environment (DLITE). 1997. Submitted to DL97. Accessible at http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1996-0049.

[13] Arturo Crespo and Eric A. Bier. WebWriter: A Browser-Based Editor for Constructing Web Applications. In Proceedings of the Sixth World-Wide Web Conference, 1996.

[14] Arturo Crespo, Bay-Wei Chang, and Eric A. Bier. Responsive Interaction for a Large Web Application: The Meteor Shower Architecture in the WebWriter II Editor. In Proceedings of the Seventh World-Wide Web Conference, 1997.

[15] Luis Gravano, Narayanan Shivakumar, and Hector Garcia-Molina. dSCAM: Finding Document Copies Across Multiple Databases. In Proceedings of the 4th International Conference on Parallel and Distributed Information Systems, 1996.

[16] Luis Gravano, Chen-Chuan K. Chang, Héctor García-Molina, and Andreas Paepcke. STARTS: Stanford Proposal for Internet Meta-Searching. In Proceedings of the International Conference on Management of Data, 1997.

[17] Kenichi Kamiya, Martin Röscheisen, and Terry Winograd. Grassroots: A System Providing a Uniform Framework for Communicating, Structuring, Sharing Information, and Organizing People. In Proceedings of the Sixth World-Wide Web Conference, 1996. Also published in part as a short paper for CHI'96 (conference companion).

[18] Steven Ketchpel and Héctor García-Molina. Making Trust Explicit in Distributed Commerce Transactions. In Proceedings of the International Conference on Distributed Computing Systems, 1996.

[19] Steven Ketchpel, Hector Garcia-Molina, Andreas Paepcke, Scott Hassan, and Steve Cousins. UPAI: A Universal Payment Application Interface. In USENIX 2nd e-commerce workshop, 1996.

[20] Steven P. Ketchpel, Hector Garcia-Molina, and Andreas Paepcke. Shopping Models: A Flexible Architecture for Information Commerce. Submitted to DL97, 1997. At http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1996-0052.

[21] Ron Kohavi and Mehran Sahami. Error-Based and Entropy-Based Discretization of Continuous Features. In Second International Conference on Knowledge Discovery in Databases, 1996. At ftp://starry.stanford.edu/pub/sahami/papers/kdd96-disc.ps.

[22] D. Koller and Y. Shoham. Information agents: A new challenge for AI. IEEE Expert, pp. 8-10, June, 1996.

[23] Daphne Koller and Mehran Sahami. Toward Optimal Feature Selection. 1996. Submitted for publication.

[24] D. Koller and M. Sahami. Hierarchically Classifying Documents Using Very Few Words. Submitted to ICML97, 1997.

[25] Andreas Paepcke, Steve B. Cousins, Héctor García-Molina, Scott W. Hassan, Steven K. Ketchpel, Martin Röscheisen, and Terry Winograd. Towards Interoperability in Digital Libraries: Overview and Selected Highlights of the Stanford Digital Library Project. IEEE Computer Magazine, May, 1996.

[26] Andreas Paepcke. Searching is Not Enough: What We Learned On-Site. D-Lib Magazine, May, 1996.

[27] Andreas Paepcke. Information Needs in Technical Work Settings and their Implications for the Design of Computer Tools. Computer Supported Cooperative Work: The Journal of Collaborative Computing, 5:63-92, 1996.

[28] Martin Röscheisen and Terry Winograd. A Communication Agreement Framework of Access/Action Control. In Proceedings of the 1996 IEEE Symposium on Research in Security and Privacy, 1996.

[29] M. Sahami, M. Hearst, and E. Saund. Applying the Multiple Cause Mixture Model to Text Categorization. In Proceedings of the Thirteenth International Conference on Machine Learning, pp. 435-443. Morgan Kaufmann, 1996. At ftp://starry.stanford.edu/pub/sahami/papers/ml96-mcmm.ps.

[30] Mehran Sahami. Learning Limited Dependence Bayesian Classifiers. In Second International Conference on Knowledge Discovery in Databases, 1996. At ftp://starry.stanford.edu/pub/sahami/papers/kdd96-learn-bn.ps.

[31] Narayanan Shivakumar and Héctor García-Molina. Building a Scalable and Accurate Copy Detection Mechanism. In Proceedings of the Third Annual Conference on the Theory and Practice of Digital Libraries, 1996.

[32] Tak Woon Yan, Matthew Jacobsen, Héctor García-Molina, and Umeshwar Dayal. From User Access Patterns to Dynamic Hypertext Linking. Submitted to the Fifth World-Wide Web Conference, 1996.

10. Other Publications

With the help of Xerox PARC, a new, updated videotape of the DLITE interface was produced.