Guidelines for Citing Data

Data citation is an invaluable tool of scholarly work. It ensures that the authors of datasets receive attribution for their work, and it allows readers to locate, access, and reuse the data for their own research or for replication.

When citing data, you should include the following information:

  • Author: Name(s) of each individual or organizational entity responsible for the creation of the dataset.
  • Date of Publication: Year the dataset was published or disseminated.
  • Title: Complete title of the dataset, including the edition or version number, if applicable.
  • Publisher and/or Distributor: Organizational entity that makes the dataset available by archiving, producing, publishing, and/or distributing the dataset.
  • Electronic Location or Identifier: Web address or unique, persistent, global identifier used to locate the dataset (such as a DOI). Append the date retrieved if the title and locator are not specific to the exact instance of the data you used.
  • Version: Versioning or timeslice information should be supplied for any updated or dynamic dataset.

These are the minimum elements required for dataset identification and retrieval. Fewer or additional elements may be requested by publication guidelines or style manuals. Be sure to include as many elements as needed to precisely identify the dataset you have used. Additional guidance can be found in FORCE11's Joint Declaration of Data Citation Principles and the FAIR Guiding Principles (Findability, Accessibility, Interoperability and Reuse of digital assets).
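
As a rough illustration, the elements listed above can be assembled into a single citation string. The sketch below is only one possible arrangement and does not follow any particular style manual. The class name, the field names, the ordering and punctuation, and every value in the example (author, title, archive, and DOI) are hypothetical placeholders rather than a real dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Holds the minimum citation elements described in this guide."""
    authors: List[str]               # individuals or organizations
    year: int                        # year published or disseminated
    title: str                       # complete title of the dataset
    publisher: str                   # publisher and/or distributor
    locator: str                     # web address or persistent identifier
    version: Optional[str] = None    # version or timeslice, if any
    retrieved: Optional[str] = None  # date retrieved, if needed


def format_citation(c: DatasetCitation) -> str:
    """Join the elements in an assumed order; adapt to your style manual."""
    title = c.title if c.version is None else f"{c.title} ({c.version})"
    parts = [
        f"{'; '.join(c.authors)} ({c.year}).",
        f"{title}.",
        f"{c.publisher}.",
        c.locator,
    ]
    # Add the retrieval date only when the locator alone does not
    # pin down the exact instance of the data that was used.
    if c.retrieved is not None:
        parts.append(f"(retrieved {c.retrieved})")
    return " ".join(parts)


# Hypothetical placeholder values, not a real dataset.
example = DatasetCitation(
    authors=["Doe, J.", "Example Research Group"],
    year=2020,
    title="Example Survey Dataset",
    publisher="Example Data Archive",
    locator="https://doi.org/10.0000/example",
    version="Version 2",
)
print(format_citation(example))
```

Keeping each element in its own field makes it easier to rearrange the output when a journal or style manual asks for a different order or for fewer elements.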

Data archives may provide guidelines on how to cite their data. For example, the Harvard Dataverse Network, ICPSR, and the Roper Center include a standard citation in each study record.

For additional information on how to cite data, see the following resources: