The following is something I wrote for the RISKS forum that I thought others might be interested in. A recent discussion on the USACM (Public Policy Committee of the Association for Computing Machinery) mailing list triggered these thoughts.
It's obvious that the availability of so much information online makes plagiarism easier - no reader can know every source that might have been used without permission or attribution. On the flip side, tools like Google make it easier to find suspected instances - as an example, when I'm reviewing an article for a journal or conference, I frequently paste phrases that I suspect are stolen into Google, and have on numerous occasions found that they were in fact taken verbatim without attribution. [Hint to the plagiarist: if you're going to use someone else's words without attribution, make sure they fit with your writing style. A mismatch is particularly noticeable when the text was written by someone whose native language differs from your own - if your native language is English and you copy something written by a native Chinese speaker, it will be fairly obvious; the converse is also obviously true.]
For high school and college students, technology like TurnItIn is one way of finding plagiarism without teachers having to do extensive searching. Although I haven't personally seen the output, my understanding is that the student submits text which is automatically analyzed, and potential instances of plagiarism are noted in a message to the teacher. (If someone could provide a better explanation, I'd certainly appreciate it! I noticed that TurnItIn now puts emphasis on improving students' writing style, perhaps as a way to give students a feeling that they're getting something out of the deal.)
There are several problems with products of this sort:
(1) False positives. When my daughter was in high school, she noted several times that TurnItIn considered her a plagiarist because it was unable to distinguish between properly quoted/referenced text and unauthorized copying. Teachers who simply look at the overall "score" without reading the individual comments will tend to penalize those students who do the best job of citing background work! (I'm reasonably sure that TurnItIn is cautious enough not to deny that there are false positives, and to strongly encourage teachers and students to examine the results rather than simply taking them at face value.)
(2) Copyright infringement. TurnItIn keeps copies of student papers in its database, for matching against future papers. This seems reasonable at first blush - after all, selling term papers is an old tradition, dating back well before the Web (although today's students may not believe that)! However, by keeping submissions for matching, TurnItIn may be violating copyright, as a recent lawsuit claims (see "McLean Students Sue Anti-Cheating Service", Washington Post, March 29, 2007). Additionally, students have effectively no way to refuse having their papers added to the database, and are not compensated for their submissions.
So to bring this to RISKS, the issue is that we have competing risks: the risk of plagiarism, which TurnItIn and similar products combat, vs. the risks of unfair accusations of plagiarism and of copyright infringement - all of which are enabled by technology.
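
To make the false-positive problem in (1) concrete, here is a minimal sketch of a naive matcher. It is not TurnItIn's algorithm, which isn't described here; it simply assumes a checker that scores a submission by the fraction of its five-word sequences that also appear in a known source. The function names, the five-word window, and the sample texts (including the "RISKS forum" attribution) are all made up for illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(text, n=5):
    """Return the set of all n-word sequences in the text."""
    w = words(text)
    return {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}

def overlap_score(submission, source, n=5):
    """Fraction of the submission's n-word sequences that also occur in the source."""
    sub = shingles(submission, n)
    if not sub:
        return 0.0
    return len(sub & shingles(source, n)) / len(sub)

def strip_quotes(text):
    """Hypothetical mitigation: drop passages enclosed in double quotes."""
    return re.sub(r'"[^"]*"', " ", text)

# Made-up source text standing in for a document already in the database.
SOURCE = "Selling term papers is an old tradition dating back well before the Web."

# A student who quotes the source and attributes it properly.
CITED = ('As one commentator put it, "selling term papers is an old tradition '
         'dating back well before the Web" (RISKS forum).')

# A student who copies the same words with no quotation marks or attribution.
COPIED = ("As everyone knows, selling term papers is an old tradition "
          "dating back well before the Web, which is sad.")

# The naive score cannot tell the careful student from the copier.
naive_cited = overlap_score(CITED, SOURCE)
naive_copied = overlap_score(COPIED, SOURCE)

# Ignoring quoted passages clears the careful student but not the copier.
aware_cited = overlap_score(strip_quotes(CITED), SOURCE)
aware_copied = overlap_score(strip_quotes(COPIED), SOURCE)

print(f"naive:       cited={naive_cited:.2f} copied={naive_copied:.2f}")
print(f"quote-aware: cited={aware_cited:.2f} copied={aware_copied:.2f}")
```

Under the naive score, the student who cites properly looks almost as guilty as the one who copies. Skipping quoted passages fixes that case, but it is trivially gamed by wrapping copied text in quotation marks, so the score still needs a human to read the individual matches.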