NIH Data Management & Sharing Policy

The National Institutes of Health (NIH) announced a new data management & sharing (DMS) policy to foster good data stewardship. This page includes information and resources to help UNMC researchers prepare, create, and submit a DMS Plan with their NIH applications.

Since January 25, 2023, the policy requires that all application types (new, resubmission, renewal, and revision) that generate Scientific Data, regardless of funding level, include a detailed plan for how the data will be managed and shared during the entire funding period.

NOTE: The new DMS Policy does not apply to funded NIH projects that do not generate scientific data, such as applications for Training (Ts), Fellowships (Fs), Construction (C06), Resources (Gs), etc.

Overview: What's New?

The National Institutes of Health (NIH) has issued a new Data Management and Sharing Policy (DMS Policy) starting January 25, 2023.

The new policy requires a Data Management and Sharing Plan (DMS Plan) for ALL NIH-funded projects that generate scientific data. Previously, the NIH only required a data sharing plan for applications requesting $500,000 or more in direct costs in any single year. This policy places proper data management and reusability of data at the center of research practices so that we can all advance scientific findings and support the integrity of those findings. This policy helps researchers use best practices in data management and sharing to facilitate the shift to open science and open data.

The NIH has also created a website dedicated to Scientific Data Sharing, but these pages on the SPA website aim to provide UNMC-specific information for our researchers while also summarizing the information from the NIH. Below, we list upcoming and recorded webinars (given by UNMC, NIH, or others) and provide links to other resources.

Types of Data

There are two “large bucket” categories of data that most researchers work with on a regular basis: quantitative and qualitative data.

In the biomedical sciences, quantitative data is used to provide measurements, calculate change over time, and, more generally, gather raw data. This raw data can then be used as the basis of statistical analyses.

Qualitative data is often thought of as social sciences data because many researchers in the social sciences use surveys and oral responses—in other words, natural language—as the basis of analyses. However, researchers in the sciences often use these same techniques when describing a particular set of data or when mapping data geographically.

Both types of data are used in the sciences, and both can be used as the basis for primary data and secondary data.


Quantitative Data

A variable that can be counted, measured, and given a numerical value is considered a type of quantitative data. Quantitative variables can answer the “how” questions: “how many,” “how much,” or “how often.”

Many researchers will also call quantitative data “numerical,” because of its capacity to measure and thus bridge empirical observation with mathematical expression. Because of the relationship between observation and mathematical expression, a researcher uses statistical analyses in experiments to find significant differences that can be replicated using similar methods.

There are two main types of quantitative or numerical data: discrete and continuous.

Discrete data is usually defined as a type of data that can be counted. These data cannot be made more precise, so they take integer values that cannot be subdivided. A classic example of discrete data is the number of children in a family: you cannot have 1.3 or 4.2 children. Another example is the number of doctor visits a person has in a year.

Continuous data can be divided into smaller parts using decimal points. Continuous data, when graphed, create a distribution of values on a continuum. A classic example of continuous data is a person’s height.

Both discrete and continuous quantitative data use measures of central tendency (mean, median, mode) and dispersion (standard deviation, standard error, interquartile range) to summarize results. Which measure a researcher chooses depends on the type of data against which a hypothesis is tested.
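As a rough illustration of the measures named above, the following Python sketch computes each of them with the standard library. The height values are hypothetical and exist only for demonstration; the function name summarize is likewise an assumption, not part of any NIH or UNMC tool.

```python
import math
import statistics

# Hypothetical sample: adult heights in centimeters (continuous data).
heights_cm = [158.2, 162.5, 165.0, 165.0, 170.3, 174.8, 181.1]


def summarize(values):
    """Return common measures of central tendency and dispersion."""
    # quantiles(n=4) returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    sd = statistics.stdev(values)  # sample standard deviation
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "standard_deviation": sd,
        "standard_error": sd / math.sqrt(len(values)),
        "interquartile_range": q3 - q1,
    }


summary = summarize(heights_cm)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The same function also accepts discrete counts (for example, doctor visits per year), although the median or mode is often more informative than the mean for small integer counts.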


Qualitative Data

Qualitative data is defined as data whose variables are grouped into verbal categories rather than numbers. Many people confuse qualitative research with qualitative data. Qualitative research is a method of studying society by collecting data from first-hand observations, interviews, or questionnaires, using unstructured or semi-structured techniques. Data is qualitative when the variables in a data set are verbal rather than numerical.

Qualitative data is also called “categorical” data, or data that can be placed into organized categories.

There are two main types of qualitative or categorical data: nominal and ordinal.

Nominal data variables have two or more categories that have “names” and no inherent order to them. For example, gender is a nominal category (female, nonbinary, male). When a variable has only two possible categories, it is called binary or dichotomous data; for example, whether someone has a driver’s license (yes/no).

Ordinal data can be placed in categories with a clear order or hierarchy. For example, education level has a clear hierarchy (“high school,” “Bachelor’s,” “Master’s,” “PhD”).

When analyzing qualitative data, a researcher will use a frequency distribution, presented as a pie chart (nominal data) or as a column or bar chart (nominal or ordinal data).
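A frequency distribution is simply a count of how often each category occurs. The Python sketch below shows one way to build such a table for nominal and ordinal variables. The survey responses are hypothetical, and the names frequency_table and EDUCATION_ORDER are assumptions made for this illustration.

```python
from collections import Counter

# Hypothetical survey responses; the categories follow the examples in the text.
license_responses = ["yes", "no", "yes", "yes", "no"]  # nominal (binary)
education_responses = [
    "Bachelor's", "high school", "PhD", "Bachelor's",
    "Master's", "high school", "Bachelor's",
]  # ordinal

# Ordinal categories carry an order, so it is stated explicitly.
EDUCATION_ORDER = ["high school", "Bachelor's", "Master's", "PhD"]


def frequency_table(responses, order=None):
    """Count each category and return (category, count, share) rows."""
    counts = Counter(responses)
    # Nominal data has no inherent order, so fall back to alphabetical.
    categories = order if order is not None else sorted(counts)
    total = len(responses)
    return [(cat, counts[cat], counts[cat] / total) for cat in categories]


license_table = frequency_table(license_responses)
education_table = frequency_table(education_responses, EDUCATION_ORDER)

for cat, count, share in education_table:
    print(f"{cat:<12} {count}  {share:.0%}")
```

The rows of either table could then be passed to a charting library to draw the pie, column, or bar chart.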


Primary Data

Primary and secondary data have less to do with the variables used in data analyses and more to do with who generates the data that a researcher uses for analyses.

Primary data is data generated by the researcher for the researcher’s own primary use. It is collected in the moment and used in current experiments. At a future time, this primary data may become secondary data when it is uploaded to a repository for use by others. Because it is up to the researcher or the researcher’s team to collect the data, the process takes time and is very involved.

Primary data is largely available in its raw form; that is, it has not been processed or refined. Because it has not been altered by processing, it is generally considered more accurate and reliable.


Secondary Data

Secondary data is usually defined as data that someone else has collected. This can come from large healthcare organizations, the government, or other large organizations. It can be used after the fact of collection. Thus, it is data that has already been used in earlier experiments.

Researchers can find such data in internal healthcare systems, in data repositories (either specific to one’s field of research or more generalist), or as part of a publication.

Choosing a Repository

What is a data repository? 

A data repository is a type “of sustainable information infrastructure which produces long-term storage and access to research data” (re3data.org). A data repository provides long-term storage and searchability of data used in scientific research.

Why use a data repository?

As of January 25, 2023, the NIH mandates a data management and sharing plan for all grant applications that generate scientific data. Beyond satisfying the NIH’s DMS Policy mandate, a data repository ensures accessibility and encourages reuse of data beyond the life of a grant or a single research project.

How to choose a data repository?

Choosing a data repository can depend on the research type, the grant type, or the data type. There are two main types of data repositories: "discipline-specific" and "generalist" repositories.

Discipline-specific repositories should be given primary consideration, since they will allow for optimal discovery and reuse. The NIH has compiled a list of scientific data repositories for making data available, which is organized by discipline. The NIH DMS Policy does not endorse or require the use of a data repository affiliated with the NIH.

If no discipline-specific repository exists, it is appropriate to choose a generalist repository.

Discipline-Specific Repositories: 

You can find a searchable table of NIH-supported, discipline-specific data repositories here:
https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/repositories-for-sharing-scientific-data

You can find a registry of research data repositories here:
https://www.re3data.org/

Generalist Repositories:

UNMC recommends the use of several generalist repositories.

More generalist repositories may be recommended by UNMC later, or you can choose another one that suits your needs.

Writing Your Plan (DMPTool)

 

Budgeting for DMS

Each application for research funding should include a budget for data management and sharing. This budget should be included as part of your overall budget for the project on either:

  • the R & R Detailed Budget Form under “F. Other Direct Costs” or
  • the PHS 398 Modular Budget Form under “Additional Narrative Justification.”

Please see the NIH’s Budgeting for Data Management and Sharing webpage for links to the forms, information on allowable costs, and how the budget will be assessed:
https://sharing.nih.gov/data-management-and-sharing-policy/planning-and-budgeting-for-data-management-and-sharing/budgeting-for-data-management-sharing#after

UNMC Budget Worksheet

The following budget worksheet has been created to help UNMC researchers estimate data management and sharing costs. For each broad category of the data management and sharing lifecycle, the worksheet shows how many hours to project for each activity based on the size of the grant request and the justification for including it in a budget.

Because not all researchers will need to budget for every item on this list, a cost calculator is forthcoming. This cost calculator will help researchers customize a data management and sharing budget for individual project proposals.

Please note that all activity hours are estimates; we underscore that a researcher’s individual project may require more or fewer hours over the lifespan of the project. Also note that a research project may or may not contain data management for every item on this list. For instance, not every project will need image management, in which case a researcher not using image data would be expected to omit that item from that project’s overall budget.
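The arithmetic behind such an estimate can be sketched in a few lines of Python. This is only an illustration and is not the forthcoming UNMC cost calculator: the activity names, hours, and hourly rate below are hypothetical placeholders and should be replaced with values from the worksheet and your own project.

```python
# Hypothetical activities and hours: replace with values from the worksheet.
ESTIMATED_HOURS = {
    "data curation and documentation": 40,
    "image management": 20,
    "repository deposit": 10,
}
HOURLY_RATE_USD = 50.0  # assumed personnel rate, not a UNMC figure


def estimate_dms_budget(hours_by_activity, hourly_rate, omit=()):
    """Return per-activity costs and their total, skipping omitted items."""
    line_items = {
        activity: hours * hourly_rate
        for activity, hours in hours_by_activity.items()
        if activity not in omit
    }
    return line_items, sum(line_items.values())


# A project without image data omits that item, as described above.
line_items, total = estimate_dms_budget(
    ESTIMATED_HOURS, HOURLY_RATE_USD, omit={"image management"}
)
for activity, cost in line_items.items():
    print(f"{activity}: ${cost:,.2f}")
print(f"Total: ${total:,.2f}")
```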

Download the Budget Worksheet (PDF)

Submitting Your DMS Plan

 

Webinars & Resources

 

Frequently Asked Questions

General questions and answers about the new DMS Policy that may not be covered on other pages, as well as questions pertaining to UNMC-specific elements and procedures, are included below. We will continue to gather answers to new questions and update this page. Review the section above titled Writing Your Plan (DMPTool) for specific questions about DMPTool.

Many additional questions specific to the NIH policy have been compiled and disseminated by the NIH and can be found on their DMSP FAQ page.

Where can I get help at UNMC for the NIH Data Management and Sharing Policy?

Several groups on campus will play a role in assuring UNMC and its researchers are ready to meet these new policy changes. Please direct any questions you have to researchdata@unmc.edu.

Alternatively, set up a consultation with UNMC's Data Services Librarian using the bookings page through the McGoogan Health Sciences Library.

Who is checking my data management and sharing plan?

The UNMC Sponsored Programs Office will check for the presence of a plan within your application. They will not review your plan’s quality or confirm that all plan parts are present for your type of research. For a thorough review of your plan, please contact researchdata@unmc.edu.

Am I expected to share all data generated during my research?

No. Under the DMS Policy, researchers are expected to maximize the appropriate sharing of scientific data, which is defined as data commonly accepted in the scientific community as being of sufficient quality to validate and replicate the research findings.

How does the DMS Policy fit in with other NIH data sharing policies and requirements (e.g., individual NIH Institute/Center or Office (ICO) funding policies, the NIH Genomic Data Sharing (GDS) Policy, the NIH Policy on Dissemination of NIH-Funded Clinical Trial Information)?

The DMS Policy establishes the foundation for NIH’s data management and sharing expectations, which NIH ICOs and programs may build upon to meet their programmatic needs (e.g., designated repositories, specific data collection standards). Current NIH policies specific to certain types of research (e.g., clinical trials, research generating large-scale genomic data) continue to apply and complement the goals of the new DMS Policy.

If researchers are reusing existing, shared data to generate new datasets, are they expected to reshare the primary data they incorporated into their new analysis? Are the derived data generated considered scientific data and expected to be shared?

The DMS Policy applies to research that results in the generation of scientific data. Scientific data can result from secondary research, but researchers are not expected to share the existing, shared primary data used to conduct the secondary research. Researchers are, however, expected to maximize appropriate sharing of any new, derived data generated as a result of their research.

Does the DMS Policy apply to social and behavioral scientific research? Can qualitative data be “scientific data”?

Yes, NIH-supported social and behavioral scientific research that results in the generation of scientific data is subject to the DMS Policy. Qualitative data may constitute scientific data if it meets the definition in the DMS Policy.

What steps does the DMS Policy take to ensure institutions and researchers protect research participants?

Award recipients must comply with any applicable laws, regulations, statutes, guidance, or institutional policies related to research with human participants and that protect participants’ privacy.

Does the DMS Policy expect that research informed consent obtained from research participants must allow for broad sharing and the future use of data (either with or without identifiable private information)?

No. Informed consent for participation in research remains the cornerstone of trust between researchers and research participants and thus the DMS Policy does not dictate how this process is achieved. Rather, researchers’ intention for scientific data management and sharing, as proactively described in Plans, is strongly encouraged to be part of the informed consent process. The DMS Policy does not expect that informed consent given by participants will be obtained in any particular way.

How will noncompliance with the NIH DMS Policy be handled?

NIH will monitor compliance with Plans over the course of the funding period during regular reporting intervals (e.g., at the time of annual Research Performance Progress Reports (RPPRs)). Noncompliance with Plans may result in the NIH ICO adding special Terms and Conditions of Award or terminating the award. If award recipients are not compliant with Plans at the end of the award, noncompliance may be factored into future funding decisions.

What is a data or metadata standard?

The National Center for Data Services describes metadata as “information that describes, explains, locates, classifies, contextualizes, or documents an information resource.” In the context of data management, metadata allows you to track the provenance, or original source, of a dataset and helps you to track which version of the data you are analyzing. Describing data in a machine-readable format allows you to search for data in a repository.
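To make “machine-readable” concrete, here is a minimal Python sketch of a metadata record serialized as JSON. The field names and values are hypothetical and do not follow any particular metadata standard; a real project would use the standard required by its chosen repository.

```python
import json

# Hypothetical field names and values, for illustration only.
metadata = {
    "title": "Example dataset title",
    "creator": "Example Researcher",
    "version": "1.1",
    "derived_from": "Example source dataset, version 1.0",  # provenance
    "date_created": "2023-01-25",
    "keywords": ["example", "placeholder"],
}

# Serializing to JSON lets software parse and index the record.
record = json.dumps(metadata, indent=2, sort_keys=True)
print(record)


def matches(record_text, keyword):
    """Return True if the keyword appears in the record's keyword list."""
    return keyword in json.loads(record_text)["keywords"]
```

Because the record is structured, a repository's search tools can filter on fields such as version or keywords rather than scanning free text.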

How will data management plans be assessed?

The evaluation of DMS Plans will be conducted by the agency, with input from the Contracting Officer’s Representative (COR) and other NIH subject matter experts as part of the proposal evaluation process.

Are projects establishing repositories or creating data infrastructure subject to the DMS Policy (i.e., establishing a data coordinating center with no research question proposed)?

No. Projects that only develop or support infrastructure resources (e.g., repository or knowledgebase establishment) and do not generate findings or scientific data are not subject to the DMS Policy. However, NIH recommends that the infrastructure developed with NIH resources comport with the desired characteristics for repositories (see “Selecting a Repository for Data Resulting from NIH-Supported Research”).

How should we handle situations where there are proprietary considerations about confidential data or intellectual property?

The NIH covered this briefly as part of a webinar recording. For more information specific to your situation, we recommend you reach out to researchdata@unmc.edu.

How granular does the stored data have to be? Most of the time data are reduced from original capture to make it more manageable. Should it be original data or reduced data? 

You should be storing all data, both raw and processed. The data management and sharing plan will ask where you plan to store data 1) during the lifetime of the project and 2) after the grant has ended, so you will need a plan for storage both during and beyond the life of the project.

Storage is only the first component. Secondly, you’ll need to think about preservation of the data. Where will this data live after the project is over? This is where data repositories, and finding an appropriate data repository in the grant application phase, are of the utmost importance.

Thirdly, you’ll be asked about your data sharing plan. Do you plan to share the raw and processed data, or just the processed data? It is up to you to decide what “manageable” means for your project. That said, the policy is about making your data replicable and reusable by other researchers. If the data that you usually share is reduced data, can another researcher reuse that data and replicate the results adequately? If so, then you are sharing the data adequately. If not, then you may need to think about sharing both the raw and reduced data.

How are we addressing patent implications? My reading of the policy is that when the grant ends, data needs to be available.  

If there are patent implications for an invention, we recommend reaching out to UNeMed at the point at which you are writing your data management and sharing plan. In terms of general intellectual property, PIs own rights in data resulting from sponsored projects. Sponsored projects are not works for hire, and thus the sponsor does not own the data.

Data sharing is essential for expedited translation of research results into knowledge, products, and procedures to improve human health. Sponsors generally endorse the sharing of final research data. One exception is personal health information.

Have agencies other than the NIH also mandated data management/sharing plans, and will these work via the same DMP tool we now have? 

Yes. Almost every grant-funding agency in the country, both federal and private, is either in the process of developing a data management and sharing policy or has one in place. All grant-funding agencies have templates uploaded into DMPTool. Simply choose the appropriate funding agency when creating a data management and sharing plan in DMPTool and use the subsequent template.

How long should data be shared beyond the term of the NIH-funded grant? Can this be budgeted into the cost of Data Management and Sharing? 

Data should be shared for at least 3-5 years after the award period. However, most repositories share data in perpetuity. Data repositories do not charge for ongoing storage of your data. Once the data is uploaded to the repository, a repository will not ask for further monetary assistance. Should you find a data repository that is asking for more funding after upload of data, please reach out to researchdata@unmc.edu. 

Do K99/R00 grants require a plan? 

No. At this time, the NIH is not requiring training grants to include a data management and sharing plan, because the DMSP is only required for the collection of scientific data. However, the NIH has made it clear that this exception may change in the near future. 

Are there repositories for qualitative (narrative transcripts) data? 

Absolutely. There are discipline-specific repositories (found on re3data.org) and generalist repositories (listed in the above section titled Choosing a Repository). You can also reach out to researchdata@unmc.edu for a consultation on qualitative data repositories. 

Contact Information

Several groups on campus will play a role in assuring UNMC and its researchers are ready to meet these new policy changes. Please direct any questions you have to researchdata@unmc.edu.

Consultations

Set up a consultation with UNMC's Data Services Librarian using the bookings page through the McGoogan Health Sciences Library.