Sharing Data – Considerations, Benefits, and Challenges

In this lesson you will learn

  • Some of the key questions you’ll need to ask yourself as you consider sharing your research data
  • Some benefits and some challenges of sharing research data
  • How different organizations in the academic ecosystem influence when, where, and how you share your research data

Initial questions

  • Have you ever shared any of your data?
  • Have you ever used data shared by other researchers? If so, was it easy or difficult to access the data? How easy was it to use the data?

What Does “Sharing Data” Mean?

Carefully managing your data (discussed in Modules 1 and 2 of this course) makes them more valuable for your own research, which is the main reason to do it. However, effective data management also makes it easier to share your data with other scholars, and makes it more likely that those scholars will be able to interpret and understand them.

As you probably know, the notion of sharing qualitative data is a relatively new one in the United States. While the idea has more resonance in some European countries, and in the U.K. in particular, among U.S. social scientists there has been a very weak tradition of sharing qualitative data. Today, though, things are beginning to change. Different stakeholders in the research lifecycle (funders and journals, for instance) are calling for the sharing of more data, and new technologies to facilitate data sharing are being developed. If you join the vanguard of scholars who are sharing their qualitative data, there will be lots of decisions that you’ll need to make when doing so.

  • How much of your data will you share? Sharing data is not an all-or-nothing procedure. Different aspects of your data may be under different types and degrees of constraint. As you’ll learn in the next two sections of this lesson, it may be easier to share data that were not generated through human participants research, or that are not under copyright.
  • How and where will you share your data? There are several options, and major advantages to sharing your data in an institutionalized venue, as we discuss more below.
  • When will you share your data? Will you share them while the research project with which they are associated is ongoing, or only afterwards? Might you share the data when you submit an article for review, or only just before publication? Organizations that funded your research or journals where you wish to publish it may have rules that you’ll need to follow.
  • With whom will you share your data? With anyone who wishes to access them? Or is there a justifiable reason to limit access to your data in some way?

These are all choices that you should begin thinking about even before you start to collect data. While there are rarely “right” and “wrong” answers, well-considered decisions will allow you to contribute the most to your research community, and to the production of knowledge.

No matter how you answer these questions, sharing your data only makes sense if the data will be meaningful to other scholars. Sharing data requires preparing the data and creating documentation such that the data can be interpreted and used by others. Since you will have effectively managed your data – such that the “future you” will be able to understand them – you are already many steps ahead in this process. Nonetheless, you may have to prepare slightly more detailed documentation, obtain some additional permissions, or perhaps de-identify the data (if they were derived from interactions with human participants; we discuss this process in the next lesson).

Reasons to Share Data

A fundamental question missing from the list above is why: why would you share your research data? There are many good reasons to do so.

First, let’s consider your point of view, as the researcher who generated the data. Sharing your data:

  • Helps you comply with external mandates:
    • Scholarly associations may have recommendations or guidelines about data sharing. In the U.S., this is the case for many social science associations, including the American Sociological Association (ASA), the American Political Science Association (APSA), and the American Psychological Association (APA).
    • The organization(s) that funded your work may require that you share the fruits of your research with the broader scholarly community; indeed, you may have to describe your plans in a Data Management Plan that you submit with your funding application.
    • Journals may require that you make your published work transparent by sharing the data that undergird it.
      • For instance, more than 25 political science journal editors have signed “JETS” – the Journal Editors Transparency Statement – supporting the development of transparency requirements.
  • Provides for long-term safe storage for your data (if you deposit them in a repository).
  • Helps you demonstrate the rigor and power of your analysis.
  • Increases the visibility of your work and potentially your citation rates, enhancing your reputation.
  • Enables you to make new connections to and perhaps collaborate with other scholars.

If you carried out interviews or surveys or otherwise generated data through interacting with people, another set of reasons for sharing your data relates to those individuals. Your “human participants” can also benefit from the sharing of information they provide to you. Sharing human participants data:

  • Allows a broader population to learn from and capitalize on the important effort that interview respondents invested in answering your questions.
  • Can help to minimize any negative effects that data collection may have on the subject population.
    • For instance, if the topic of a particular set of interviews is very sensitive or traumatic, it can be difficult for interview respondents to answer questions. If you share your data, that community doesn’t need to repeatedly relive their difficult experience as they are questioned about it by different researchers.
  • Allows your respondents’ voices to be amplified. Some populations are hard to reach and others have not had a voice for other reasons. Sharing the information that they shared with you allows their views and perspectives to be heard.
  • Allows a scholar who plans to interview members of the same population that you did to learn about their respondents in advance of interacting with them, making those interactions more informed and potentially more enjoyable for the respondents.

Finally, sharing data benefits the broader scholarly community. Sharing data:

  • Creates a public good. Sharing data allows them to accumulate, and allows others to analyze them.
  • Is economically efficient: doing so helps to avoid costly duplication of data collection, and makes optimal use of the funding that supported the data collection.
  • Enhances the instruction of research methods. Students often learn best by “doing,” and what better way to engage them than by using data from real research to teach them (as is already the norm in teaching quantitative methods)? Shared data can also be used to create simple, stylized data sets especially designed for teaching purposes.
  • Helps scholars who lack the resources to collect data themselves obtain data they can use to carry out their research.
  • Facilitates the formation of research networks and partnerships and the building of broader epistemic communities.
  • Allows others to evaluate, validate as trustworthy, and replicate your work, maximizing accountability and knowledge generation.
  • May be ethically required: if what you’ve discovered sheds significant light on important dynamics – which it surely does – it may be unethical not to share your data.

Reasons Not to Share Data

Given qualitative scholars’ traditional hesitance to share their research data, there must be some very good reasons not to do so. We offer some reasons here. Can you think of others?

  • First use: given how difficult generating qualitative data can be, you might be concerned about sharing your data before you have had a chance to publish work based on them.
  • Epistemological concerns: you might see data as a product of the reflexive relationship between you and the people you involved in your study, such that someone who was not privy to the inter-subjective event that produced the data does not have the background knowledge and tacit understandings necessary to interpret them.
  • Inappropriate use: you might fear that your data will be used for the wrong purposes.
  • Lack of interest: you might think that your data aren’t of interest to, or couldn’t be useful to, anyone else.
  • Language concerns: you might question the value of sharing if you anticipate that the language of your textual or audio data makes it unlikely they will be understood by relevant populations of researchers.
  • Resource constraints: you might feel you just don’t have the time or money to invest in preparing your data for sharing.
  • Lack of incentives: you may feel that your discipline doesn’t really value the sharing of qualitative research data and that there aren’t any professional rewards for doing so.
  • Human participants concerns:
    • You may worry that asking respondents for consent to share their data will cause them to decline to participate in your study or will affect what they convey.
    • You may have promised your respondents that you would not share the information they provided to you with anyone (and may have even promised to destroy that information at the end of your study).
    • You may have collected the data under complete assurances of confidentiality.
    • Even if you didn’t promise your respondents anything with regard to data sharing, your data may seem too sensitive for others to see.
    • You may doubt that your data can be fully de-identified, either because of the form in which you collected them (e.g., video) or because some combination of the collected data (or of the collected data and other publicly available data) is sure to reveal the identities of human participants.
    • It may seem pointless to share your de-identified data because de-identification caused so much information to be lost, significantly compromising the analytic value of the data.
      • You will learn more about sharing human participants data in the next lesson.
  • Legal constraints and proprietary obligations:
    • The archives in which you collected documents may have rules about their dissemination.
    • Information may have been provided under a non-disclosure agreement (NDA).
    • Information may be classified.
    • Your data may be under copyright (the exclusive legal right, held by the originator, to use and distribute certain works of authorship), which may prevent you from sharing them.
    • You may have purchased your data under a licensing agreement that stipulated that they cannot be made public.

These concerns and challenges are important to consider and address. Many of the stickiest challenges relate to human participants and legal constraints, and to copyright in particular. The next two lessons in this module offer guidance on addressing those concerns. Overall, however, it’s important to keep in mind that most general guidelines on sharing data, such as those offered by various social science academic associations (e.g., ASA 2018, APSA 2012, APA 2016), take into account that you may not be able to share all of your data. They simply require that you explain why you cannot share whatever data you withhold. Funding organizations and journals in various disciplines have likewise adopted these caveats.

Exercise

Creating a Data Sharing Policy

  1. Two of the institutional actors that have been carefully considering all of the questions above are funders and publishers. They need to take on these issues because funders must develop policies for applicants and grantees to follow, and publishers must do the same for authors. Given the considerations, benefits, and challenges above, how should key institutional actors design policies to encourage responsible data sharing? Pretend you are the editor-in-chief of a journal of your choice. Develop a draft policy that you will translate into author guidelines for providing the qualitative data that underpin articles published in your journal. Your policy should address the following issues:
    • Which data should be shared, and what else needs to be shared in addition to the data?
    • When in the publication process should data (and materials) be shared?
    • Where should data (and materials) be shared?
    • What established exceptions should there be?
    • Who should judge whether a scholar’s situation fits within those exceptions – or whether the alternative situation for which a scholar seeks an exception should be deemed exceptional?
    • How will the policy be enforced?
  • Solution
    1. There are no right or wrong answers to these questions! We encourage you to evaluate your answers along the following lines: Are your answers coherent – do they fit together without contradicting each other? How easy will they be to translate into clear guidelines for authors? What kind of behavior are the different aspects of your policy, and your policy as a whole, encouraging on the part of authors and reviewers? Of the various involved actors (authors, reviewers, other institutions), whom have you empowered – granted decision-making authority? Have you addressed possible trade-offs between promoting data sharing and deterring authors from submitting their best work to your journal? What potential sources of conflict does your policy introduce?