Creating a Data Management Plan (DMP)

In this lesson you will learn

  • How to create a data management plan
  • How to use tools like the DMP Tool to create a data management plan
  • How to assess a data management plan
  • How to treat your data management plan as a living document

Initial questions

  • Have you had to write a data management plan for a grant application or some other purpose?
  • Have you used any tools to help you write such a plan?
  • Have you wondered how your plan would be judged?

What is a data management plan?

So far we have talked about data management and the importance of planning. A Data Management Plan (DMP) is a document that specifies how you will manage your data for the whole duration of a project. The plan summarizes the decisions you made while planning your data management. Broadly, the plan should cover the following topics:

  1. The types of data you expect to collect,
  2. How those data will be documented and organized,
  3. How the data will be stored and kept secure, and
  4. How the data will be shared (or why not) and stored for the long term.

We provide more detailed instructions below.

Why write a data management plan?

In many cases you will write your data management plan because a funding agency requires it – in the US, for example, the National Science Foundation (NSF) has required data management plans since 2011. Different funders have slightly different requirements for DMPs, but the NSF’s requirement of a 2-page document is fairly typical. An important goal of your data management plan, then, is to convey to the funder that you have thought carefully about data management.

We would encourage you, though, to compose a data management plan even if you are not required to do so by a funder. Having your plans for your data written out helps keep you organized and on track as you conduct your research project. Written plans can be especially useful if you need to coordinate the activities of a research team, even one that consists of just a couple of research assistants (RAs). Having your data management plans written out also allows you to communicate those plans to others. At QDR, for example, looking at researchers’ DMPs allows us to quickly understand their goals and needs, helping us to offer them the support they need.

From the field: The NSF and other agencies are starting to look at data management plans more and more closely. In 2012, barely adequate DMPs were routinely accepted. That is no longer the case. Take the DMP portion of your grant seriously.

Tools for writing a data management plan

You can find a lot of resources for writing data management plans and we have listed some at the end of this lesson. Here we would like you to take a close look at two resources.

At QDR, we have provided a data management checklist, specifically designed to help qualitative researchers make sure their data management plan covers all important topics. For every topic we provide examples, where possible from qualitative research. We also include tips based on our work advising researchers on their data management plans. You can use the checklist either proactively, as a guide to writing your DMP, or, as the name implies, to check that an existing data management plan covers everything it should.

Still, you may find it difficult to start writing such a plan. A useful tool for this is the DMP Tool, produced by the California Digital Library. The DMP Tool provides a large number of templates, based on funder requirements, and helps you to structure your DMP according to those templates, guiding you through the process with questions. When you are done, it provides you with a neatly formatted data management plan. The language of the DMP Tool can be a bit abstract, so it may be useful to combine it with QDR’s checklist.

Exercise

Using the DMP Tool

  1. Create an account for the DMP Tool (if possible through your university) and write a very brief data management plan. Your plan does not need to be very detailed (for now) – the goal is to get a sense of the tool.
  • Solution
    1. Go to https://dmptool.org/, click on “Get started,” and select your institution on the next screen. Then click “Create New DMP” and use a template. We suggest the NSF’s Social, Behavioral, and Economic Sciences template as a good default, but feel free to pick any funder that seems relevant.

Some Key Elements of a DMP

You should rely on tools such as our checklist or the DMP Tool for a comprehensive list. Here we simply highlight some particularly important or frequently misunderstood issues.

  • Personnel: As a young researcher, you may not think of yourself as leading a “team,” but chances are you will be using research assistants and potentially other contractors like transcribers, translators, or even stringers before you publish your research. You should consider what parts of your data these team members can access, how you provide (and restrict) access for them, and how you will train them to manage your data responsibly.
  • Data types: Generally, storing your data in standard file formats is ideal. Where you rely on proprietary, specialized formats, you are at the mercy of the company supporting these formats. Using proprietary formats is not always avoidable (e.g., most qualitative data analysis software like NVivo or ATLAS.ti stores data in a proprietary format), but do not rely on them unnecessarily. Moreover, consider including alternative storage and dissemination formats for such data.
  • Data volume: DMP guidelines often ask you to specify the volume of data. Do not feel like you need to be overly precise here. The purpose of this is for you to think of adequate storage facilities, so the important thing is to be in the right ballpark: you want to know if you need to store 1 GB or 1 TB of data. It doesn’t matter whether it’s 1 GB or 2 GB.
  • Metadata: We will talk more about metadata in module 2, but a useful way to address questions about metadata is to talk to the repository you want to use to share your data. Repositories commonly follow metadata standards and can tell you what those are.
  • Data sharing: Even if you absolutely cannot share your data (e.g., you interviewed people who would be acutely at risk should the interviews become public), you should state the rationale clearly and specifically in your DMP. Funders want to know that you are not just defaulting to not sharing any data.
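
To make the “right ballpark” advice on data volume concrete, here is a minimal sketch of a back-of-the-envelope estimate for audio recordings. All of the figures (60 interviews, 1.5 hours each, a 128 kbps recording bitrate) and both function names are hypothetical placeholders, not recommendations; substitute your own project's numbers.

```python
# Ballpark storage estimate for a DMP. All figures are hypothetical
# placeholders; substitute your own project's numbers.


def estimate_audio_gb(n_interviews, avg_hours, kbps=128):
    """Approximate size in GB of compressed audio recordings."""
    seconds = n_interviews * avg_hours * 3600
    total_bytes = seconds * kbps * 1000 / 8  # kilobits per second -> bytes
    return total_bytes / 1e9


def order_of_magnitude(gb):
    """Reduce an estimate to the coarse category that matters for storage."""
    if gb < 1:
        return "under 1 GB"
    if gb < 1000:
        return "gigabytes (under 1 TB)"
    return "terabytes"


audio_gb = estimate_audio_gb(n_interviews=60, avg_hours=1.5)
print(f"{audio_gb:.1f} GB -> {order_of_magnitude(audio_gb)}")
```

Under these assumptions the estimate comes to roughly 5 GB. For the DMP, the coarse category is what matters: doubling the number of interviews would not change which kind of storage you need.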

Assessing DMPs

How do you know that your DMP is good enough? When working with researchers, we have often found that assessing other researchers’ DMPs is a good way to think about this. We use a rubric developed by a group of US librarians to assess DMPs submitted to NSF for their compliance with NSF criteria for such plans (Whitmire et al., 2015).

Exercise

Assessing DMPs

  1. Find the rubric by Whitmire et al. here and a scoresheet for the rubric here. Use this rubric to evaluate the data management plan by Fisher and Nading (2016) here. Alternatively, when working in a group or with a partner, you can exchange DMPs and assess each other.
  2. As you work through the rubric, try to not just think about what you would improve about the data management plan, but also where the rubric itself might fall short.
  • Solution
    1. You can look at a rubric for the Fisher and Nading data management plan that we filled out here. Don’t expect to have the exact same assessment throughout, but look again at areas where you disagree with our assessment.
    2. The most important weakness of the rubric is that it assesses the mere presence of a topic, not the quality of the response. For example, “I will e-mail my data to interested researchers” would earn a full score on question 4.2, but is clearly not a satisfactory answer. Moreover, reflecting the NSF’s priorities for DMPs, the rubric focuses heavily on sharing and dissemination but makes little mention of earlier, crucial issues such as back-up or personnel.

DMPs as Living Documents

Your data management plan should, ideally, anticipate eventualities, but none of us can predict everything that will happen in a multi-year research project. This is why there is a movement to treat DMPs as “living documents.” The idea is that a DMP evolves together with your project. For example, as you add team members or collect additional data, you record these changes in the DMP. You should either save a new, date-stamped copy every time you change your DMP or use a version control system such as Git. Used this way, data management plans become a valuable piece of documentation for your project and a crucial point of reference for you. Especially if you are engaged in a complex research project, we strongly recommend that you consider this approach.
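
If you prefer date-stamped copies over version control, the routine can be automated. The following is a minimal sketch, assuming your DMP lives in a single file; the function name `save_dated_copy`, the file name `dmp.md`, and the folder name `dmp_versions` are hypothetical and only serve as illustration.

```python
import shutil
from datetime import date
from pathlib import Path


def save_dated_copy(dmp_path, archive_dir, today=None):
    """Copy the DMP into archive_dir with the date added to the file name."""
    dmp_path = Path(dmp_path)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # ISO dates (YYYY-MM-DD) sort chronologically in a file listing.
    stamp = (today or date.today()).isoformat()
    target = archive_dir / f"{dmp_path.stem}_{stamp}{dmp_path.suffix}"
    shutil.copy2(dmp_path, target)  # copy2 also preserves file timestamps
    return target
```

Calling `save_dated_copy("dmp.md", "dmp_versions")` after each revision would leave a trail of files such as `dmp_2017-06-01.md`. Note that a second copy made on the same day overwrites the first, which is one reason a version control system may serve you better if you revise often.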

Looking ahead, information specialists are working not just on making data management plans living but also active. For example, if you mention a data repository as part of your planning, the repository would automatically be notified of this. The data management plan would also send you automated reminders on key dates specified in the plan. Lastly, active DMPs would also let funders check easily that you have complied with the promises made in your DMP (and thus your grant application). You can read more about this vision in a recent paper by Simms and Jones (2017).
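
To give a rough sense of what an “active” plan could look like, here is a minimal sketch in which the dated commitments of a DMP are stored as structured data, so that a program can list the ones coming due. This is not how the DMP Tool or any existing system works; the `Milestone` structure, the `upcoming` function, and the plan entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Milestone:
    """One dated commitment from a DMP (hypothetical structure)."""
    description: str
    due: date


def upcoming(milestones, today, within_days=30):
    """Return milestones due between today and today + within_days, soonest first."""
    cutoff = today + timedelta(days=within_days)
    due_soon = [m for m in milestones if today <= m.due <= cutoff]
    return sorted(due_soon, key=lambda m: m.due)


# Hypothetical plan entries, for illustration only.
plan = [
    Milestone("Deposit interview transcripts with repository", date(2018, 9, 1)),
    Milestone("Review access permissions for research assistants", date(2018, 3, 15)),
    Milestone("Submit final data to funder-approved archive", date(2019, 6, 30)),
]

for m in upcoming(plan, today=date(2018, 3, 1)):
    print(f"{m.due.isoformat()}: {m.description}")
```

With the dates above, only the access review falls within the next 30 days. The point of the sketch is simply that once commitments are recorded as data rather than prose, reminders and compliance checks become straightforward to automate.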

Resources

California Digital Library. 2017. DMP Tool. https://dmptool.org/

Fisher, Josh, and Alex Nading. 2016. “A Political Ecology of Value: A Cohort-Based Ethnography of the Environmental Turn in Nicaraguan Urban Social Policy.” Research Ideas and Outcomes 2 (May):e8720. https://doi.org/10.3897/rio.2.e8720.

Qualitative Data Repository. 2017. Data Management Checklist. https://qdr.syr.edu/drupal_data/public/QDR%20-%20Data%20Management%20Checklist.pdf

Simms, Stephanie Renee, and Sarah Jones. 2017. “Next-Generation Data Management Plans: Global, Machine-Actionable, FAIR.” International Journal of Digital Curation 12 (1):36–45. https://doi.org/10.2218/ijdc.v12i1.513.

Whitmire, Amanda, Jake Carlson, Brian Westra, Patricia Hswe, and Susan Parham. 2015. “The DART Project: Using Data Management Plans as a Research Tool.” Open Science Framework. https://doi.org/10.17605/osf.io/kh2y6.