You can use that citation or adapt it to your citation style of choice. Increasingly, reference managers such as Zotero and EndNote let you import citation data from data repositories and produce correctly formatted citations.
Citing Your Own Data
When citing your own data, you will typically include references in two locations. First, the citation will appear in a “data availability statement,” which is placed on the article’s abstract page on the journal web site or in the first footnote.
Second, the data should also be cited formally in the bibliography of the article. While data availability statements are fairly standard in published papers, including the data in the bibliography is not. Doing so is important, however, as it helps your article and data to stay linked together. We strongly encourage this practice.
In both cases, remember to include the DOI.
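To make the role of the DOI concrete, here is a minimal Python sketch of how a data citation might be assembled. The class name, its fields, and the output layout are assumptions made for illustration, not a format prescribed by any journal or repository, and the example values (including the DOI `10.1234/EXAMPLE`) are placeholders. Where a repository or journal supplies a recommended citation, use that instead.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the parts of a data citation."""
    authors: List[str]
    year: int
    title: str
    repository: str
    doi: str
    version: Optional[str] = None

    def doi_url(self) -> str:
        # Accept a bare DOI, a "doi:" prefix, or a full doi.org URL,
        # and always return the resolvable https form.
        bare = self.doi.strip()
        for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
            if bare.lower().startswith(prefix):
                bare = bare[len(prefix):]
                break
        return f"https://doi.org/{bare}"

    def format(self) -> str:
        # Illustrative layout only; adapt it to your citation style.
        names = "; ".join(self.authors)
        version = f" Version {self.version}." if self.version else ""
        return (
            f'{names}. {self.year}. "{self.title}."{version} '
            f"{self.repository}. {self.doi_url()}"
        )


# Placeholder values, not a real dataset.
citation = DatasetCitation(
    authors=["Doe, Jane", "Roe, Richard"],
    year=2020,
    title="Replication Data for: An Example Study",
    repository="Example Data Repository",
    doi="doi:10.1234/EXAMPLE",
    version="1.0",
)
print(citation.format())
```

The same formatted string can be reused in both the data availability statement and the bibliography entry, which keeps the two references consistent.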
Citing Other Researchers’ Data
When you use other researchers’ data in your written work, cite them as you would with any other scholarly contribution on which you draw. If the data are available from a data repository, always refer to the repository copy in your citation: repositories are set up to ensure that the data will still be available, and findable using the citation information they provide, many years ahead.
Some large data projects, such as the World Values Survey, self-archive, meaning that the project itself holds and disseminates the data rather than publishing them through a data repository. In almost all cases, such projects recommend a citation format, which you should follow, taking particular care to cite the exact version of the data you used in your work.
Citing data gives credit to researchers who collect and share data. Doing so also enhances the findability of data, thus making your work more transparent. In some instances, additional steps beyond citing data may be warranted. For instance, if your work draws on (and cites) a large-scale data project and you only use a subset of the data produced by the project, you should describe how and why you extracted the particular subsets of the data that you used in your study.
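One way to make the description of a subset easy to write is to record the selection criteria at the moment you extract the data. The following Python sketch assumes the data are held as a list of dictionaries; the function name, the fields of the note, and the sample rows are all hypothetical and serve only to illustrate the idea.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def extract_subset(
    rows: List[Row],
    keep_if: Callable[[Row], bool],
    criteria: str,
    source_version: str,
) -> Tuple[List[Row], Dict[str, Any]]:
    """Filter rows and return the subset plus a note describing the extraction."""
    subset = [row for row in rows if keep_if(row)]
    note = {
        "source_version": source_version,  # the exact version you cite
        "criteria": criteria,              # human-readable selection rule
        "rows_in_source": len(rows),
        "rows_kept": len(subset),
    }
    return subset, note


# Made-up rows standing in for a large-scale data project.
survey = [
    {"country": "A", "year": 2010, "trust": 3},
    {"country": "B", "year": 2010, "trust": 2},
    {"country": "A", "year": 2015, "trust": 4},
]

subset, note = extract_subset(
    survey,
    keep_if=lambda row: row["year"] >= 2015,
    criteria="year >= 2015",
    source_version="v1.0 (placeholder)",
)
print(note)
```

The note can be saved alongside your analysis files and paraphrased in the text, so that readers can see how the subset was drawn and why.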
Depending on how central the data you reused were to your scholarship, you might even consider offering the scholar who generated them co-authorship on your publication. This practice is currently more prevalent in the natural sciences, but may be worth considering in the social sciences as disciplines begin to place greater value on data generation. An interesting alternative that has appeared in the medical field is to list “data authors” (Bierer, Crosas, and Pierce 2017) on publications, signaling that those scholars contributed to data generation but did not collaborate on the publication (and do not necessarily concur with its conclusions).
Either of these alternatives raises the profile of data and those who create them, emphasizing the contributions they make to knowledge generation.
- Find two recent research articles in your field that rely on publicly available data (shared either by the authors themselves or by others). Do they include data availability statements? Are the data cited in the bibliography? If the answer to either question is no, what would a data availability statement or bibliography entry have looked like?
- Solution: The purpose of the first part of this exercise is to get you used to the idea of looking for data availability statements as part of your standard research practices. Where data are available, take a look: you can learn a lot from studying other researchers’ data. Where data are not available, ask yourself why not. For the citation and the data availability statement, check whether your solution has the key elements identified above.