Hypothes.is: Open Annotation + Science

This year, we celebrate the 350th anniversary of the first scientific journal, the Proceedings of the Royal Society of London.* Dissemination of knowledge is fundamental to science, yet despite the increasing power and pervasiveness of information technology in all fields, the research article—the primary means of scientific communication—has remained virtually unchanged (FORCE11 Manifesto, www.force11.org/white_paper).

Now, however, online tools allow researchers around the world to rapidly distribute articles or other digital research objects, which are then transformed into interactive forums for discussion and for the linking of knowledge.

Powerful new platforms such as Mendeley, Zotero, and ResearchGate allow researchers to, for example, discuss and share scientific papers or comment on books. These notes and comments are conversations among researchers who are separated in space and time. Such annotations create knowledge layers that can enhance value and link content across documents. Consider how scribbles in the margins of historic texts are prized for yielding insight into the minds of authors or readers of earlier times.

For the researcher, annotations are an important way to organize field notes. They are also used to share thoughts privately or in small groups of collaborators. However, until now, such sharing has occurred exclusively within the participants’ own specific platforms.

A new approach now in development builds on standards work under way at the W3C, the 20-year-old international body that manages open standards for the web. Imagine that researchers reading an article anywhere online or via an app on a tablet could engage in conversation, public or private, regardless of website or platform. Imagine that this social layer is distinct and separate from the document and that it is based on an open standard, so that anyone could create software to read or write contributions. Further imagine that this discussion could take advantage of the precision of annotations with powerful semantic tagging and copyediting features.

Hypothes.is is developing software to enable that vision. As a nonprofit organization, we believe that, like the web, this new annotation layer should be unencumbered by private interests that will kill its chances of being widely useful. An early prototype released in October 2014 allows users to select content within any web page or PDF and annotate it in conjunction with other users. Unlike traditional comments on web pages, annotations are placed into context (e.g., on a snippet of text or an image), not in an endless scroll at the bottom, where the target of the comment is likely to be unclear. In addition, unlike most existing annotation paradigms, these are designed specifically for sharing via the web and will, when complete, use the Web Annotation standard developed by the W3C (www.w3.org/annotation/). A browser plug-in reveals public conversations as a layer on a particular page. With those layers, comments can be turned on and off as the reader chooses. Alternatively, individuals and (soon) groups can annotate for their own purposes and choose not to share their discussions. Essentially, Hypothes.is allows multiple users to take notes or have discussions—all online—without the need to download, print, or import or export content into a particular environment.
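To make the idea of an annotation "placed into context" concrete, here is a minimal sketch in Python. It assumes the general shape of the W3C Web Annotation Data Model (an annotation with a body and a target, where the target carries a text-quote selector). The helper names make_annotation and locate, the example URL, and the sample text are illustrative assumptions, not part of the Hypothes.is software, and the fields Hypothes.is actually stores may differ.

```python
import json


def make_annotation(page_url, quoted_text, comment, prefix="", suffix=""):
    """Build a dict shaped like a Web Annotation anchored to a text snippet.

    The prefix and suffix are optional context that helps disambiguate
    a quote that appears more than once on the page.
    """
    selector = {"type": "TextQuoteSelector", "exact": quoted_text}
    if prefix:
        selector["prefix"] = prefix
    if suffix:
        selector["suffix"] = suffix
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {"source": page_url, "selector": selector},
    }


def locate(annotation, document_text):
    """Return (start, end) offsets of the quoted snippet, or None if absent.

    This is a naive exact-match search; real clients typically need
    fuzzy matching to survive small changes to the page.
    """
    sel = annotation["target"]["selector"]
    prefix = sel.get("prefix", "")
    needle = prefix + sel["exact"] + sel.get("suffix", "")
    idx = document_text.find(needle)
    if idx == -1:
        return None
    start = idx + len(prefix)
    return (start, start + len(sel["exact"]))


if __name__ == "__main__":
    # Hypothetical page text and URL, for illustration only.
    text = "The sample was heated to 90 C. The sample was then cooled."
    ann = make_annotation(
        "https://example.org/paper", "The sample", "Which sample?", prefix="C. "
    )
    print(json.dumps(ann, indent=2))
    print(locate(ann, text))
```

Because the annotation stores only a pointer to the page and a description of the selected text, it can live in a separate layer from the document itself, which is the separation the article describes.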

We think that annotations can play a role not only as a postpublication layer but during the entire cycle of knowledge production, including research, writing, revising, and peer review. Last year, together with the American Geophysical Union (AGU), eLife, and arXiv, we secured a grant from the Sloan Foundation to work toward bringing annotation to scholarly peer review. In-line annotation would allow reviewers and authors to interact directly in particular parts of an article in a threaded discussion format while preserving anonymity. Depending on the journal’s model, selected discussion threads could be made available with the published article to help readers to understand nuances behind key passages. In January of this year, eJournalPress, the review platform used by AGU and eLife, previewed an integrated version using Hypothes.is and brought to life an annotated review workflow. After cycles of feedback and subsequent development, full implementation is slated for late 2015.

With arXiv, the preprint service run by Cornell University, rather than formal peer review, the focus is on community discussion, which has its own unique set of challenges. What are the social tensions between the desire to ask public questions or offer critiques and the risk that the author may be on the review committee for your next grant proposal? One solution may be to focus on enabling smaller groups and journal clubs so that comments are limited to these circles. Another may be to provide more powerful tools to bloggers who are already engaged in discussion in forums away from the article itself. Whatever the circumstances, our objectives stem from an exploratory, community-driven approach in which we are experimenting with practical suggestions that can serve multiple communities.

Annotations can themselves be a form of scholarship. Funding from the Helmsley Foundation will allow us to integrate our tools with ORCID and Research Resource Identifiers (RRIDs). Through ORCID, each annotation can be tied to a unique author ID, which will allow annotations to be counted and recognized as scholarly contributions that (at the researcher’s discretion) will form a discoverable part of a researcher’s profile. Through collaboration with the Neuroscience Information Framework (neuinfo.org) and the FORCE11 Resource Identification Initiative (www.force11.org/Resource_identification_initiative), annotations will be tied to unique IDs that are linked to particular research resources, such as antibodies, genetically modified animals, software tools, and databases. Through this annotation framework, researchers will be able to share information on which studies these reagents have been used in and so alert others if a problem is noted with a particular research resource. Currently, the only way to disseminate this information widely is word of mouth or inclusion in a published article with the hope that a researcher reads it before using the reagent or tool in question again.
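As a rough illustration of how annotations tied to author IDs and resource IDs could be queried, the Python sketch below groups annotations by resource identifier and pulls out problem reports. The record layout (the keys creator, source, rrid, and problem), the function names, and all identifier values are hypothetical placeholders chosen for this example. They do not describe the actual Hypothes.is, ORCID, or RRID data formats.

```python
from collections import defaultdict

# Hypothetical annotation records; every identifier below is a placeholder.
SAMPLE_ANNOTATIONS = [
    {
        "creator": "https://orcid.org/0000-0000-0000-0001",
        "source": "https://example.org/study-1",
        "rrid": "RRID:AB_0000001",
        "problem": False,
    },
    {
        "creator": "https://orcid.org/0000-0000-0000-0002",
        "source": "https://example.org/study-2",
        "rrid": "RRID:AB_0000001",
        "problem": True,
    },
    {
        "creator": "https://orcid.org/0000-0000-0000-0001",
        "source": "https://example.org/study-3",
        "rrid": "RRID:SCR_0000002",
        "problem": False,
    },
]


def index_by_resource(annotations):
    """Map each resource ID to the sorted list of studies that mention it."""
    by_rrid = defaultdict(set)
    for ann in annotations:
        by_rrid[ann["rrid"]].add(ann["source"])
    return {rrid: sorted(sources) for rrid, sources in by_rrid.items()}


def problem_reports(annotations, rrid):
    """Return the annotations that flag a problem with the given resource."""
    return [a for a in annotations if a["rrid"] == rrid and a.get("problem")]


def contributions_by_author(annotations):
    """Count annotations per author ID, so they can be credited."""
    counts = defaultdict(int)
    for ann in annotations:
        counts[ann["creator"]] += 1
    return dict(counts)


if __name__ == "__main__":
    print(index_by_resource(SAMPLE_ANNOTATIONS))
    print(problem_reports(SAMPLE_ANNOTATIONS, "RRID:AB_0000001"))
    print(contributions_by_author(SAMPLE_ANNOTATIONS))
```

A researcher about to use a reagent could look up its identifier, see which studies used it, and check whether anyone has flagged a problem, rather than relying on word of mouth.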

With the current spotlight on reproducibility problems and biases toward publishing only favorable results in science, annotations can quickly warn about quality issues, suggest modifications to experimental techniques to achieve better results, or simply provide helpful background information for unfamiliar topics. Adverse findings can be quickly communicated without the effort of writing a full paper. Making small or one-off trials and bench experiments more visible, along with their small amounts of data and statistical results, can suggest fruitful avenues to those who are better trained or who have more resources to deepen an investigation.

Annotation is already ubiquitous among scholars, from research through publication and beyond, and is carried out in diverse, mostly proprietary systems that until now have existed within their own frameworks and silos. Moving toward an open, interoperable standard for annotation can unlock fundamentally new capabilities. Discovering what those are and how they can benefit researchers and communities will be an evolutionary process.

January–March 2015

* In 1905, the publication was divided into two journals: Proceedings A and Proceedings B.

MARYANN MARTONE is the director of the Biosciences Division and DAN WHALEY is the CEO of Hypothes.is.