Shivakumar, N. and Garcia-Molina, H. (1996) Building a Scalable and Accurate Copy Detection Mechanism. In: Proceedings of the 1st ACM International Conference on Digital Libraries (DL'96), March 1996, Bethesda, Maryland.
Abstract
Often, publishers are reluctant to offer valuable digital documents on the Internet for fear that they will be re-transmitted or copied widely. A Copy Detection Mechanism can help identify such copying. For example, publishers may register their documents with a copy detection server, and the server can then automatically check public sources such as UseNet articles and Web sites for potential illegal copies. The server can search for exact copies, and also for cases where significant portions of documents have been copied. In this paper we study, for the first time, the performance of various copy detection mechanisms, including the disk storage requirements, main memory requirements, and response times for registration and for querying. We also contrast performance with the accuracy of the mechanisms (how well they detect partial copies). The results are obtained using SCAM, an experimental server we have implemented, and a collection of 50,000 netnews articles.
Item Type: | Conference or Workshop Item (Paper) | |
---|---|---|
Uncontrolled Keywords: | SCAM, Copy detection, Plagiarism, Copyright | |
Subjects: | Computer Science > Digital Libraries | |
Projects: | Digital Libraries | |
Related URLs: | Project Homepage | http://www-diglib.stanford.edu/diglib/pub/ |
ID Code: | 180 | |
Deposited By: | Import Account | |
Deposited On: | 25 Feb 2000 16:00 | |
Last Modified: | 09 Dec 2008 09:36 | |