What Editors Need to Know about CrossRef in 2014: Service Offerings Benefit Many Parties in the Scholarly Communication Process

In 2014, CrossRef celebrates its 15th anniversary. Begun in 1999 to create a consistent reference-linking infrastructure in online scholarly literature, CrossRef has developed into an association of academic and scholarly publishers offering a variety of services to participating organizations with the aim of improving scholarly communication.

When most people think about CrossRef, they think about the Digital Object Identifier (DOI), an International Organization for Standardization standard for creating consistent URLs. CrossRef is the largest DOI registration agency, and initiatives surrounding the DOI and associated article metadata are still at the core of what CrossRef does. However, as CrossRef has grown as an organization, it has diversified to offer a number of services to respond to its members’ needs. This article will describe several of those services, from such new initiatives as the FundRef funding-information service and text-mining and data-mining tools to such established services as CrossMark update identification, CrossCheck duplicate screening, and reference linking, which are showing healthy growth.

FundRef: Measuring the Outcomes of Research Funding

FundRef went live in May 2013 and provides a standard way to report funding sources for published scholarly research. Why is it needed? The lack of standardization in funding-body names and metadata has made analyzing or mining research-funding information from scholarly publications difficult. Different publishers display funding information in different locations in text fields, such as acknowledgments sections and footnotes. Funder names are not standardized: They may be abbreviations or be acknowledged at different levels in the organizational hierarchy. Those practices mean that funding bodies cannot easily track the output of their expenditures, publishers cannot easily identify the major funders of the research that they publish, research institutions cannot easily identify major funders of their employees’ scholarly output, and transparency to the public about public funding and its results is lacking.

CrossRef maintains a standard taxonomy called the FundRef Registry, which is a master list of more than 4000 standard funder names from all over the world. The taxonomy was donated by Elsevier and is freely available to anyone via the Creative Commons Public Domain (CC0) license. Scholarly publishers can incorporate the registry into their submission systems. Publishers ask authors, at submission, to choose the name or names of the funding bodies from this master list and to submit accompanying grant numbers. CrossRef has also made tools available for publishers to tag backfile content with FundRef information retroactively. Publishers’ production systems store this funding information so that publishers can now submit standard FundRef metadata with the bibliographic metadata that they already send to CrossRef to assign DOIs. They may also add FundRef data after the initial bibliographic metadata have been submitted.
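
To illustrate why a shared registry helps, the following minimal Python sketch maps free-text funder names onto standard names. The two registry entries, their variant lists, and the `funder-0001` style identifiers are hypothetical placeholders rather than actual FundRef Registry data, and the matching rule is an assumption made for illustration only.

```python
import re

# Hypothetical registry excerpt: standard funder name -> placeholder id and
# known variants. None of this is real FundRef Registry data.
REGISTRY = {
    "National Institutes of Health": {
        "id": "funder-0001",
        "variants": ["NIH", "U.S. National Institutes of Health"],
    },
    "Wellcome Trust": {
        "id": "funder-0002",
        "variants": ["The Wellcome Trust"],
    },
}


def _key(name):
    """Lowercase, drop punctuation, and collapse spaces so trivial variations match."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def build_lookup(registry):
    """Map every normalized standard name and variant to its standard name."""
    lookup = {}
    for standard, entry in registry.items():
        for name in [standard, *entry["variants"]]:
            lookup[_key(name)] = standard
    return lookup


LOOKUP = build_lookup(REGISTRY)


def standardize(free_text_name):
    """Return (standard name, placeholder id), or None when nothing matches."""
    standard = LOOKUP.get(_key(free_text_name))
    if standard is None:
        return None
    return standard, REGISTRY[standard]["id"]
```

In a submission system, an unmatched name (a `None` result here) would be the cue to ask the author to choose from the master list rather than to accept free text.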

Once the FundRef metadata are in the CrossRef database, they are searchable through CrossRef’s search interfaces, via an Application Programming Interface (API), or in third-party tools that incorporate CrossRef metadata. Publishers, funders, and other interested parties can query by funder name or grant number to discover the resulting publications. They can also look up a piece of content by using other metadata (such as author, title, or CrossRef DOI) and find out the funding sources. FundRef Search (http://search.crossref.org/fundref) is CrossRef’s free Web tool for looking up funding bodies and finding papers that have resulted from their grants. Publishers are also able to display FundRef information in a standard way. For publishers that are participating in the CrossMark service, FundRef data will automatically appear in the Publication Record tab of the CrossMark dialog box.
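
The two directions of lookup described above (from funder or grant number to publications, and from a publication to its funders) can be sketched with a small in-memory index. This is not CrossRef's API: the record layout, field names, grant numbers, and DOIs below are hypothetical and serve only to show the shape of the queries.

```python
from collections import defaultdict

# Hypothetical deposited records; the DOIs and grant numbers are placeholders.
RECORDS = [
    {
        "doi": "10.5555/example.1",
        "title": "Paper A",
        "funding": [
            {"funder": "National Institutes of Health", "grants": ["R01-0001"]},
        ],
    },
    {
        "doi": "10.5555/example.2",
        "title": "Paper B",
        "funding": [
            {"funder": "Wellcome Trust", "grants": ["WT-42"]},
            {"funder": "National Institutes of Health", "grants": []},
        ],
    },
]


def build_indexes(records):
    """Index records by funder name, by grant number, and by DOI."""
    by_funder = defaultdict(list)
    by_grant = defaultdict(list)
    by_doi = {}
    for rec in records:
        by_doi[rec["doi"]] = rec
        for award in rec["funding"]:
            by_funder[award["funder"]].append(rec["doi"])
            for grant in award["grants"]:
                by_grant[grant].append(rec["doi"])
    return by_funder, by_grant, by_doi


BY_FUNDER, BY_GRANT, BY_DOI = build_indexes(RECORDS)


def publications_for_funder(name):
    """DOIs of publications that acknowledge the given standard funder name."""
    return list(BY_FUNDER.get(name, []))


def publications_for_grant(grant_number):
    """DOIs of publications that cite the given grant number."""
    return list(BY_GRANT.get(grant_number, []))


def funders_for_doi(doi):
    """Funder names recorded for a publication, or an empty list if unknown."""
    rec = BY_DOI.get(doi)
    return [award["funder"] for award in rec["funding"]] if rec else []
```

The lookups only work because every record uses the same standard funder name, which is the point of asking authors to choose from the registry at submission.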

At the time of writing, 29 publishers have signed up for FundRef, including BioMed Central, IEEE, Hindawi Publishing Corporation, Oxford University Press (OUP), SciELO, and Wiley. A full list can be found on the CrossRef Web site: www.crossref.org/fundref/index.html. More than 47,500 CrossRef DOIs with FundRef metadata are available, and the number is growing rapidly. FundRef has garnered a good deal of attention with funding bodies and publishers alike, and CrossRef has committed to working with the funding, publishing, and library communities to make the data useful and widely available.

CrossRef Metadata Search

With the collection of bibliographic metadata through CrossRef DOI deposits, funding information through FundRef, and other publication record information through CrossMark, the CrossRef database is a growing source of useful metadata. As mentioned, FundRef Search lets researchers, agencies, publishers, and the general public look up publications by funder. Another useful, free public search tool is CrossRef Metadata Search (search.crossref.org), which allows anyone to search for any publication metadata stored in CrossRef’s database. It is a simple way to search for a particular CrossRef DOI or ShortDOI (shortDOI.org; see footnote 1) or for articles in a particular journal via the journal’s ISSN, and it also shows funder information (if available) and links to any patents that cite a particular CrossRef DOI. CrossRef’s metadata database represents more than 64 million CrossRef DOI records from journal articles, conference papers, books and book chapters, data sets, and components of articles, such as tables and figures. CrossRef Metadata Search supports searching by standard bibliographic metadata (author, title, and publication). It also allows users to search by Open Researcher and Contributor Identifier (ORCID). It provides a way to refine searches on the basis of publication year and other criteria. CrossRef Metadata Search also allows users to generate formatted citations from search results. More technical users can output CrossRef Metadata Search results in ContextObjects in Spans for import into Zotero and other document-management tools. A free API is also available so that users can integrate results into their own applications. Basic OpenSearch support is available so that CrossRef Metadata Search can be added to a browser search bar.
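
A single search box that accepts DOIs, ShortDOIs, ISSNs, ORCIDs, and ordinary bibliographic text has to decide which kind of query it has received. The sketch below shows one way that decision could be made. The regular expressions are simplified assumptions about the shape of each identifier, not CrossRef's actual rules, and the function name `classify_query` is hypothetical.

```python
import re

# Simplified, assumed identifier shapes; order matters because an ORCID would
# otherwise not be distinguished from other hyphenated digit strings.
PATTERNS = [
    ("orcid", re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")),
    ("issn", re.compile(r"^\d{4}-\d{3}[\dX]$")),
    ("shortdoi", re.compile(r"^10/[a-z0-9]+$", re.IGNORECASE)),
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
]

# Common ways a DOI is written with a resolver or scheme prefix.
PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def classify_query(text):
    """Return (kind, cleaned query); kind falls back to 'bibliographic'."""
    query = text.strip()
    for prefix in PREFIXES:
        if query.lower().startswith(prefix):
            query = query[len(prefix):]
            break
    for kind, pattern in PATTERNS:
        if pattern.match(query):
            return kind, query
    return "bibliographic", query
```

For example, `classify_query("doi:10.5555/12345678")` is treated as a DOI lookup, whereas a phrase such as an article title falls through to an ordinary bibliographic search.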

CrossRef Support for Text-Mining and Data-Mining Research

Another service coming in 2014 supports researchers and publishers in text mining and data mining. CrossRef has been running a pilot (called CrossRef Prospect) to simplify the technical and legal interactions between researchers and publishers to facilitate the growing interest in text mining and data mining of scholarly content. CrossRef will provide two complementary tools. First and most important, a common API will be available to direct researchers to the full text (the version appropriate for mining) of content, identified by CrossRef DOIs, among publisher sites. CrossRef is not providing the discovery tools but rather a directory of where the minable content lives on participating publishers’ sites. Second, publishers whose standard licenses do not allow text mining and data mining can make use of a license registry, a central library of terms and conditions. Participating CrossRef Member publishers can upload supplementary “click-through” agreements for researchers to agree to before proceeding to mine content. Together, those tools will allow researchers to harvest content for text-mining and data-mining analysis easily by using a standard API throughout all publishers’ content. CrossRef does not provide access controls for this content; for researchers to take advantage of the tools, they must already have access, whether through subscription or through open access by the publisher. The tools build on well-defined Web standards and best practices, such as the DOI and content negotiation.

1 The shortDOI Service creates shortened DOI names, of the form 10/abcde, as aliases for existing DOI names, which are often long strings.
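
The two tools can be pictured from the researcher's side as two small steps: ask the DOI resolver for a machine-readable representation through content negotiation, then mine only the links whose license terms have been accepted. The Python sketch below builds the request without sending it. The resolver address, the media type used in the usage note, and the record fields (`license`, `full_text_links`) are assumptions for illustration and do not describe the finished service.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the public DOI resolver is the negotiation endpoint.
RESOLVER = "https://doi.org/"


def negotiation_request(doi, media_type):
    """Build (but do not send) a request asking the resolver for media_type."""
    url = RESOLVER + quote(doi, safe="/")
    return Request(url, headers={"Accept": media_type})


def minable_links(record, accepted_licenses):
    """Return full-text URLs only when the record's license has been accepted.

    `record` is a hypothetical metadata dictionary; a record with no license
    entry is treated as not cleared for mining.
    """
    if record.get("license") not in accepted_licenses:
        return []
    return [link["url"] for link in record.get("full_text_links", [])]
```

A call such as `negotiation_request("10.5555/12345678", "application/vnd.citationstyles.csl+json")` yields a request whose `Accept` header carries the desired format; the DOI shown is a placeholder. Access control stays with the publisher, so a returned link is usable only by a researcher who already has access to the content.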

Growing Number of CrossMark Participants Allows Researchers to Identify Changes and Get Valuable Publication-Record Information

CrossRef has also seen growth and development in its existing services. For example, more than 25 CrossRef member publishers now participate in the CrossMark update identification service. Participants include the BMJ Journals, Cambridge University Press, Elsevier, F1000 Research, The Royal Society, and Wiley. A complete list of participating publishers is available at crossref.org/crossmark/AboutParticipatingPubs.htm. More than 300,000 CrossRef records include CrossMark metadata; over 3,000 indicate that content has been updated since publication. Before CrossMark, researchers had no way to tell when important changes had occurred in an article or other scholarly document that they may have downloaded months earlier. Now, by simply clicking a single, recognizable logo, any reader can have access to a status update from the PDF or the HTML version of the article. Clicking on the CrossMark logo on a scholarly document launches a pop-up box that provides status information, for example, that the document is up to date or that it has a correction, update, retraction, or other change that could affect the interpretation or crediting of the work. The CrossMark Status tab also provides a permanent link, via the CrossRef DOI, to the publisher-maintained version of the content and to an update, if one exists. Another important function that the CrossMark service provides is displaying additional (optional) publication-record information in a standard way. In addition to the Status tab, a CrossMark pop-up dialog box may also contain a Record tab. In addition to displaying FundRef information as previously mentioned, publication-record information can include publication dates, links to supplementary data, ORCIDs, or rights information. CrossRef member publishers participating in CrossMark have added more than a million items of additional metadata. CrossRef calls those pieces of non-bibliographic metadata assertions.
All the information is available through CrossRef’s free APIs, and it is also available to third-party recipients of CrossRef metadata so that they may display CrossMark updates and information to their users. Utopia Docs, Inera eXtyles, and Microsoft Academic Search have already integrated CrossMark into their products, and CrossRef has plans to incorporate CrossMark metadata into its own tools, including CrossRef Metadata Search.
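
For third parties that want to display CrossMark information, the logic of the Status tab can be sketched as a record that carries a list of updates plus optional assertions. The class and field names below are hypothetical and do not reflect CrossRef's metadata schema; the DOIs in the usage note are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    kind: str  # for example "correction" or "retraction"
    date: str  # ISO date (YYYY-MM-DD), so string order equals date order
    doi: str   # DOI of the notice that describes the update


@dataclass
class CrossMarkRecord:
    doi: str
    updates: list = field(default_factory=list)
    # Non-bibliographic "assertions", e.g. funder names or rights information.
    assertions: dict = field(default_factory=dict)


def status_message(record):
    """Summarize whether the document is current or has been updated."""
    if not record.updates:
        return "This document is current."
    latest = max(record.updates, key=lambda update: update.date)
    return f"Latest update: {latest.kind} issued on {latest.date} (see {latest.doi})."
```

A record with no updates reports that the document is current; once a publisher deposits, say, a correction and later a retraction, the message points the reader to the most recent notice.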

CrossCheck: Helping Publishers to Detect Manuscript Similarity to Published Works

The CrossCheck duplicate-detection service, powered by iThenticate, is in its sixth year of operation and has more than 500 member publishers. Recent adopters include the American Chemical Society, the Institution of Engineering and Technology, and the Royal Society of Chemistry. Use is growing; member publishers uploaded more than 100,000 documents to iThenticate for checking in each of the months of August, October, and November 2013. CrossRef expects use to continue to grow as new members integrate the service into their peer-review processes and manuscript-tracking systems improve their workflow integrations for CrossCheck. Regular CrossCheck users have benefited from a number of recent improvements in the iThenticate document-screening system. The major change has been the release of the Document Viewer, which presents documents uploaded to the system in their original format. It helps users to interpret the reports by allowing them to see clearly the section where the matched text sits in the document and thus to establish the context of the match. Other new features include section exclusion (the ability to exclude materials and methods and abstracts from the reports), small-match exclusion, and a file-size increase from 20 MB to 40 MB, which allows users to upload larger files to be checked. For more information on those features, see www.ithenticate.com/products/whats-new.
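
The effect of section exclusion and small-match exclusion on a similarity report can be shown with a toy comparison. iThenticate's matching method is not described here, so the sketch swaps in a simple, well-known technique: overlap of word n-grams (shingles). Matches shorter than `min_match_words` cannot count, and named sections can be left out. All function and parameter names are hypothetical.

```python
import re


def words(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(tokens, size):
    """All runs of `size` consecutive words, as a set of tuples."""
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def similarity(manuscript_sections, source_text, excluded_sections=(), min_match_words=5):
    """Fraction of the manuscript's word runs that also occur in the source.

    manuscript_sections maps section names to text. Sections named in
    excluded_sections are skipped (section exclusion), and only runs of at
    least min_match_words words can match (small-match exclusion).
    """
    manuscript = set()
    for name, text in manuscript_sections.items():
        if name not in excluded_sections:
            manuscript |= shingles(words(text), min_match_words)
    if not manuscript:
        return 0.0
    source = shingles(words(source_text), min_match_words)
    return len(manuscript & source) / len(manuscript)
```

Excluding a section that legitimately repeats standard wording, such as an abstract or a methods section, lowers the score, which is why the exclusions make reports easier to interpret.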

Expanding Membership, Impact, and Constituencies

CrossRef membership is growing at a record pace, with 4777 participating publishers and societies in 76 countries, 2038 participating libraries, and many affiliates. CrossRef remains the largest registration agency for DOIs, with 64,459,767 CrossRef DOIs—not just for articles from 33,000 journals but for additional content types, such as 7 million books, book chapters, and reference entries and more than a million data sets. In fact, CrossRef DOIs for data sets and book content are the fastest-growing types. Other scholarly document types with CrossRef DOIs include conference proceedings, papers, reports, theses, and components, such as figures and tables. Our members have benefited from nearly 85 million DOI resolutions (end-user clicks) in November 2013. CrossRef staff actively engage in the industry by attending and presenting at conferences (such as CSE annual meetings) around the world—Antarctica, Asia, Europe, and North and South America. More than 150 people attended the CrossRef annual meeting in Cambridge, Massachusetts, in November, and several hundred more viewed a live stream of the event. CrossRef continues to grow and innovate to benefit ever-expanding constituencies in the scholarly research community. Yes, CrossRef provides services to its scholarly publishing members that drive traffic to their sites, increase the discoverability of their content, and help them to improve its quality by identifying possible cases of plagiarism and by alerting readers to important changes. But there are many other beneficiaries. CrossRef’s search tools serve researchers and the public, and FundRef benefits funding organizations and institutions. Text-mining and data-mining services reduce transaction costs for both researchers and publishers. CrossRef continues to engage with a wider group of stakeholders than ever before, increasing discoverability, convenience, and evaluation criteria and providing quality tools for scholarly publications.

Spring 2014 Sample Correspondence