Shivakumar, N. and Garcia-Molina, H. (1995) SCAM: A Copy Detection Mechanism for Digital Documents. In: 2nd International Conference in Theory and Practice of Digital Libraries (DL 1995), June 11-13, 1995, Austin, Texas.
Abstract
Copy detection in Digital Libraries may provide the necessary guarantees for publishers and newsfeed services to offer valuable on-line data. We consider the case for a registration server that maintains registered documents against which new documents can be checked for overlap. In this paper we present a new scheme for detecting copies based on comparing the word frequency occurrences of the new document against those of registered documents. We also report on an experimental comparison between our proposed scheme and COPS [6], a detection scheme based on sentence overlap. The tests involve over a million comparisons of netnews articles and show that in general the new scheme performs better in detecting documents that have partial overlap.
Keywords: Copy Detection, Plagiarism, Registration Server, Databases
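The abstract describes checking a new document against registered documents by comparing word frequencies. A minimal sketch of that general idea follows. It is not the paper's method: the abstract does not give SCAM's similarity measure, so plain cosine similarity is used here as a stand-in, and the function names, the tokenization rule, and the `threshold` value of 0.8 are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word occurrences in a document (assumed tokenization)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(freq_a, freq_b):
    """Cosine between two word-frequency vectors; a stand-in, not SCAM's measure."""
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document overlaps with nothing
    return dot / (norm_a * norm_b)


def check_against_registry(new_doc, registry, threshold=0.8):
    """Return (doc_id, score) pairs for registered documents scoring at or
    above the (hypothetical) threshold, highest score first."""
    new_freq = word_frequencies(new_doc)
    hits = []
    for doc_id, text in registry.items():
        score = cosine_similarity(new_freq, word_frequencies(text))
        if score >= threshold:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Here `registry` is assumed to be a dictionary mapping document identifiers to full text. A real registration server would presumably store precomputed frequency data rather than re-tokenizing each registered document on every check.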
Item Type: | Conference or Workshop Item (Paper) | |
---|---|---|
Uncontrolled Keywords: | SCAM, Copy detection, Plagiarism, Copyright | |
Subjects: | Computer Science > Digital Libraries | |
Projects: | Digital Libraries | |
Related URLs: | Project Homepage | http://www-diglib.stanford.edu/diglib/pub/ |
ID Code: | 95 | |
Deposited By: | Import Account | |
Deposited On: | 25 Feb 2000 16:00 | |
Last Modified: | 05 Feb 2009 15:05 | |