Solution Corner - Science Editor

Dear Solution Corner:

Automated plagiarism-detection tools seem to be gaining a lot of traction in the STM publishing community. What should I know about their potential benefits and limitations?

Signed, Wondering in Walla Walla, Washington

Dear WWWW:

A number of excellent tools are emerging to assist editors in this regard. SC cannot endorse one over another, but we can help you to understand what you can expect from these tools. One of the most important things to know is that in practice these are similarity-detection tools; until further notice, human beings must ultimately determine what is actual plagiarism. So an important consideration for any organization that is considering one of these tools is whether it already has clear guidelines for authors on what constitutes plagiarism and clear practices for its staff or volunteer editors to follow when plagiarism is detected. An organization that adopts a similarity-checking tool before thinking through its policies and practices could find itself presented with a plethora of information and no clear path forward for dealing with it.

Most similarity-checking tools will give you a report that shows which text in a given manuscript is identical to text elsewhere. Depending on the tool and your settings, “elsewhere” can be as broad as “the Internet” or narrowed to include only material archived in specific databases (PubMed, for instance). Most tools will allow you to narrow your search by excluding some parts of the manuscript that you are checking (you might not want to include the reference list, for instance). The report will usually tell you what overall percentage of the manuscript matches other sources and then break down the matches to show you how much of the manuscript matches each particular source. The most sophisticated of the tools will make their reports available to you online and will include live links to the sources so that you can easily navigate between the manuscript that you are checking and the source of similarity.
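For readers who like to see the arithmetic, here is a minimal sketch of how such a report could be computed. It is not how any particular commercial tool works; it assumes a simple approach in which texts are compared by overlapping five-word sequences, and the function names (shingles, drop_reference_list, similarity_report) are hypothetical ones chosen for illustration.

```python
import re


def shingles(text, n=5):
    """Return the overlapping n-word sequences in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def drop_reference_list(text, heading="References"):
    """Cut the text at the last line that starts with the heading.

    Assumption: the reference list is the final section and begins with
    a line starting with `heading`. Text without that line is unchanged.
    """
    head, sep, _ = text.rpartition("\n" + heading)
    return head if sep else text


def similarity_report(manuscript, sources, n=5):
    """Compare a manuscript against a dict of {source_name: source_text}.

    Returns (overall_percent, {source_name: percent}). Each percent is the
    share of the manuscript's n-word sequences that also appear in the
    source (or, for the overall figure, in any source).
    """
    ms = shingles(manuscript, n)
    if not ms:
        return 0.0, {name: 0.0 for name in sources}
    source_sets = {name: set(shingles(text, n)) for name, text in sources.items()}
    per_source = {
        name: 100.0 * sum(s in found for s in ms) / len(ms)
        for name, found in source_sets.items()
    }
    anywhere = set().union(*source_sets.values())
    overall = 100.0 * sum(s in anywhere for s in ms) / len(ms)
    return overall, per_source
```

Calling similarity_report(drop_reference_list(text), sources) would mimic the common setting of leaving the reference list out of the check. Real tools add much more, such as tolerance for small wording changes and links back to each source.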

Most of the tools allow you to choose when to do a similarity check and on which manuscripts: You may choose to submit all your manuscripts or just a portion of them, and you may choose to do similarity checks only on new manuscripts, only on manuscripts that are ready for publication, or some combination. Organizations that run similarity reports on a large number of manuscripts may find it useful to spend a few months observing the overall similarity scores for their manuscripts and settling on a comfortable “threshold” score below which it is not usually worthwhile for a human to take a further look.
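One way to turn those months of observation into a threshold is sketched below. It assumes that your staff has kept a simple log pairing each manuscript's overall similarity score with whether a human reviewer found anything that needed follow-up; the function name pick_threshold and the max_miss_rate parameter are hypothetical, and the rule itself is only one reasonable choice among many.

```python
def pick_threshold(history, max_miss_rate=0.0):
    """Suggest a similarity-score threshold from logged review outcomes.

    history: list of (score_percent, needed_follow_up) pairs, where
             needed_follow_up is True if a human found a real concern.
    max_miss_rate: largest acceptable fraction of below-threshold
             manuscripts that turned out to need follow-up (an assumption
             each organization would set for itself).

    Returns the highest observed score T such that, among manuscripts
    scoring below T, the fraction needing follow-up is at most
    max_miss_rate. Returns None if no observed score qualifies.
    """
    best = None
    for t in sorted({score for score, _ in history}):
        below = [flag for score, flag in history if score < t]
        if not below:
            continue  # nothing scored lower, so there is no evidence yet
        if sum(below) / len(below) <= max_miss_rate:
            best = t
    return best
```

For example, with a log of (5, False), (8, False), (12, False), (18, True), (25, False), (40, True), the default setting suggests 18, because no manuscript scoring below 18 needed follow-up. A suggestion like this is a starting point for discussion, not a substitute for editorial judgment.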

An additional thing to keep in mind is that similarity-checking tools often reveal self-plagiarism, whereby an author reuses portions of papers that he or she published previously without formally citing them. The extent to which that is a violation of ethics depends on the scientific community in question, but in many circles it is considered to be as serious as plagiarizing the work of others.

In summary, WWWW, the advent of these powerful tools is a double-edged sword, like so many other things in life. On the one hand, they constitute a breakthrough that should make it easier to maintain and enforce a high standard of publishing ethics as part of the peer-review process. On the other hand, the results that they provide often raise questions and concerns that do not have clear-cut answers. Ultimately, even the best of the automated tools are truly effective in practice only when editors are adequately trained in the nuances of the results.

KENNETH F HEIDEMAN is director of publications, American Meteorological Society, Boston, Massachusetts.