Digital Library Project
Stanford University
Quarterly Report. November 1, 1995

1 Administrative
----------------

On Oct 23 we held a meeting with part of the Advisory Board of our Digital Library Project, where our progress and plans were discussed. Eugene Miya, our NASA funding contact, was present.

We hired a systems programmer, Tom Schirmer. He will start Nov 13.

2 Infobus Architecture and Testbed
----------------------------------

All testbed service proxies are now operating under the new distributed, asynchronous information exchange protocol we developed with University of Michigan and University of Illinois. Members of the technical team at U. Mich visited Stanford to work out the details. Members of the Stanford team visited U. of I. for the same purpose. The first remote method calls from U. Mich to the Stanford Dialog proxy have been sent and processed.

A meeting with UC Santa Barbara revealed common interests, so we plan to collaborate. Concrete plans for division of labor were drawn up.

The Oracle ConText summarization tool was added to the testbed as a service.

A new tool, InterBib, allows management of citation databases, translation of citations along several dimensions, the automatic construction of reference lists for publications and screen displays, and the dynamic importation of citations from online services.

3 Economics
-----------

We continued implementation of the InterPay architecture, focusing on First Virtual and DigiCash payment mechanisms. The system now supports the transfer of real funds through commercial payment mechanisms.

We defined a formal framework for evaluating the feasibility of distributed transactions in the face of mistrusting participants and developed algorithms for creating feasible execution sequences. The effectiveness of "indemnities" for distributed transactions was demonstrated. This research was submitted to the International Conference on Distributed Computing Systems '96.
We participated in CommerceNet payment systems working group meetings. A subset of our DL bibliography on electronic payment mechanisms was advertised to this group, resulting in accesses from 71 different hosts.

We have continued our work on copy detection of digital documents. The SCAM prototype was integrated into the testbed.

4 User Interfaces
-----------------

We developed two scenarios for driving our interface work. One is the construction and maintenance of computer science courses; the other is the task of keeping up with competitors and activities in large companies.

Using these scenarios, we prototyped an interface based on the concept of "Task Component Networks" (TCNs). It uses animation support and Netscape interface features to allow the user to interact with live services on the InfoBus, such as search services (WebCrawler, Lycos, Dialog, InfoSeek...), summarization (ConText), SCAM, and SenseMaker.

SenseMaker is a prototype tool that allows users to perform interactive analysis on results from heterogeneous search services. Currently, SenseMaker mediates between the user and five distinct sources (WebCrawler, Lycos, InfoSeek's Web database, Dialog 275, and Folio's Inspec database). It communicates with these services using the Stanford InterOp protocol and can be accessed via both Web and non-Web interfaces. The tool enables users to recursively generate dynamic, customized abstractions of results. We have also continued to work on the theoretical model of designators that underlies the tool.

A set of Web-related tools was developed.
The WebWriter Editor includes: a parser for HTML, based on lex and yacc, that supports HTML 2.0 and many of the 3.0 extensions implemented by Netscape; a user interface with forms-based dialogs for editing the parameters of 35 different classes of HTML objects, including tables and clickable maps; cut-and-paste; file saving and loading; a page-level preview mode; and a facility called "binders" that allows multipage applications to be loaded as a unit.

The WebWriter Interpreters include: a variable-substitution mechanism that allows the user to describe which parts of a page are to be filled in later by programs; and two "interpreters" that assemble a new page given an HTML page with marked output regions and the output of one or more Unix programs.

The WebWriter Development Environment is a set of C++ libraries that support: generating HTML objects, including user interface widgets; retrieving documents using the file, http, and ftp protocols; and saving and restoring the state of Web sessions by "pickling" and "unpickling" C++ objects, so that applications can have objects whose apparent lifetime spans many Web pages.

Several demo applications were built, including a file directory editor called Web Neptune, a chat program called Walkie-talkie, and a GIF displayer called Coins.

We also developed an automated working papers repository, which is accessible through the Web via the SIDL home page.

5 Searching
-----------

Our query translation work has progressed. We formalized the theoretical aspects of the query mapping and completed the basic theorems that support the minimality of the translation. A first prototype version of the query translator for Dialog and Stanford-Folio was implemented. This prototype implementation was integrated into the testbed.
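One way a Boolean query translator can handle a source that lacks an operator is to send a broader query the source can evaluate and then filter the returned candidates locally. The following is a minimal sketch of that idea only. Everything in it is an assumption made for illustration: the mini query language (Term, And, Or, Not), a hypothetical target service that cannot evaluate NOT, and the rendered syntax. It is not the project's translator and does not reproduce the Dialog or Folio query syntax. The rewritten query matches at least everything the original matches, but the sketch makes no attempt to prove that it is minimal.

```python
from dataclasses import dataclass
from typing import Optional, Set, Union


@dataclass(frozen=True)
class Term:
    word: str


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Not:
    arg: "Query"


Query = Union[Term, And, Or, Not]


def relax(q: Query) -> Optional[Query]:
    """Return a NOT-free query matching at least everything q matches.

    None stands for "match everything" (nothing useful can be sent).
    """
    if isinstance(q, Term):
        return q
    if isinstance(q, Not):
        return None  # hypothetical target has no NOT: over-approximate
    left, right = relax(q.left), relax(q.right)
    if isinstance(q, And):
        if left is None:
            return right
        if right is None:
            return left
        return And(left, right)
    # Or: if either branch matches everything, so does the disjunction.
    if left is None or right is None:
        return None
    return Or(left, right)


def matches(q: Query, words: Set[str]) -> bool:
    """Evaluate the full query locally against a document's word set."""
    if isinstance(q, Term):
        return q.word in words
    if isinstance(q, Not):
        return not matches(q.arg, words)
    if isinstance(q, And):
        return matches(q.left, words) and matches(q.right, words)
    return matches(q.left, words) or matches(q.right, words)


def render(q: Query) -> str:
    """Render a NOT-free query in a made-up infix syntax."""
    if isinstance(q, Term):
        return q.word
    if isinstance(q, Not):
        raise ValueError("hypothetical target cannot express NOT")
    op = "AND" if isinstance(q, And) else "OR"
    return f"({render(q.left)} {op} {render(q.right)})"


# Example: the target receives the relaxed query; the original query
# is then applied locally to remove the extra candidates.
query = And(Term("digital"), And(Term("library"), Not(Term("draft"))))
sent = relax(query)
docs = [{"digital", "library"}, {"digital", "library", "draft"}, {"digital"}]
candidates = [d for d in docs if matches(sent, d)]
results = [d for d in candidates if matches(query, d)]
```

In this example the relaxed query drops the NOT clause, so the second document comes back as a candidate and is removed by the local filter.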
6 Miscellaneous Activities
--------------------------

6.1 Interviews

- Several

6.2 Visitors and Industry Contacts

Patrick Valduriez, INRIA, Aug 4
Dave Kristoferson, Dialog, Aug 9
Constance McLindon, NSF, Aug 23
Don Haderlee, IBM, Aug 24
Peter Lockerman, Prof., U. Karlsruhe, Germany, Aug 25
Dan Kiscis, U. of Michigan, Sep 18
John Ellis, OpenMarket, Sep 25
Director of Technology, Vietnamese Ministry of Education, Oct 9
Hitachi visitors, Oct 13
Terry Smith and his colleagues, UC Santa Barbara, Oct 19
Robert Pettengill, Schlumberger Austin Research, Oct 24
Lourdes Feria, Mexico
Jeff Winters, Discover Magazine

6.3 Public Presentations and Meetings Attended

Jeff Mappes from Stanford's Information Technology Systems & Services led a session on how Stanford is handling security, particularly in Web space (Aug. 30).

Hector Garcia-Molina participated in a panel on "The Future of Digital Journals" at the VLDB Conference in Zurich, September 12, 1995.

Luis Gravano presented a paper at VLDB on generalizing GlOSS, September 12, 1995 (see bibliography below).

Tak Woon Yan gave a talk on copy removal in SIFT at VLDB, September 12, 1995.

Hector Garcia-Molina visited Santa Barbara on Oct 31, 1995, where he gave a talk and discussed how to interoperate.

Vicky Reich attended ASIS, Oct. 9-12.

Vicky Reich and Rebecca Lasher attended the California Academic Research Libraries Conference, Oct. 21.

The group visited Xerox PARC on Oct 23 to present results.

Andreas Paepcke and Scott Hassan visited U. of Illinois on Oct 26 and presented results.

Vicky Reich attended the Allerton Conference, Oct. 29-31.

6.4 Regular Meetings/Seminars

- Weekly digital library seminar
- Weekly executive committee meetings
- Weekly technical design meetings

7 Bibliography
--------------

These papers are also available on our home page: http://www-diglib.stanford.edu

Towards Interoperability in Digital Libraries: Overview and Selected Highlights of the Stanford Digital Library Project. Andreas Paepcke, Steve B.
Cousins, Hector Garcia-Molina, Scott W. Hassan, Steven K. Ketchpel, Martin Roscheisen, Terry Winograd. Submitted to IEEE Computer.

IITA Digital Libraries Workshop report. Clifford Lynch, Hector Garcia-Molina et al.

Making Trust Explicit in Distributed Commerce Transactions. Steven Ketchpel and Hector Garcia-Molina. Submitted for publication.

The SCAM Approach to Copy Detection in Digital Libraries. N. Shivakumar and H. Garcia-Molina. Diglib Magazine, November 1995.

Building a Scalable and Accurate Copy Detection Mechanism. N. Shivakumar and H. Garcia-Molina. Submitted for publication.

Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies. Luis Gravano and Hector Garcia-Molina. VLDB '95, September 1995.

Beyond Browsing: Shared Comments, SOAPs, Trails and On-line Communities. M. Roscheisen, C. Mogensen and T. Winograd. WWW '95, Darmstadt.

Techniques and Tools for Making Sense out of Heterogeneous Search Service Results. Michelle Q Wang Baldonado and Terry Winograd. Submitted to Digital Libraries '96.

Boolean Query Mapping Across Heterogeneous Information Sources. Kevin Chen-Chuan Chang, Hector Garcia-Molina, Andreas Paepcke. Submitted to IEEE Transactions on Knowledge and Data Engineering.

Other materials:

Video: We have prepared a 10-minute videotape on WebWriter, entitled "WebWriter: Interface Development on the World Wide Web," shot September 22, 1995.