Data Management and Sharing Plan Development

NIEHS Data Management and Sharing Plan Checklist

A Data Management and Sharing (DMS) Plan is a plan describing the data management, preservation, and sharing of scientific data and accompanying metadata. NIH has developed guidance for recommended elements of a Data Management and Sharing Plan. Applications subject to NIH’s Genomic Data Sharing (GDS) Policy should also address GDS-specific considerations within the elements of a DMS Plan. DMS Plans are recommended to be two pages or less in length. NIH has developed an optional DMS Plan format template that aligns with the recommended elements of a DMS Plan. Important: Do not include hypertext (e.g., hyperlinks and URLs) in the DMS Plan attachment.

NIEHS encourages data management and sharing practices to be consistent with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. The NIEHS Scientific Data Resources webpage lists resources relevant to the development of Data Management and Sharing Plans.

The plan should address each of the following elements:

Element 1: Data Type

NIH defines Scientific Data as the recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications. Scientific data do not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens.

Metadata are data that provide additional information intended to make scientific data interpretable and reusable (e.g., date, independent sample and variable construction and description, methodology, data provenance, data transformations, any intermediate or descriptive observational variables).

Description of the scientific data to be generated and shared throughout the grant.

  • In general terms, describe the types and amount of scientific data to be generated and/or used in the research project (e.g., RNA-seq, targeted LC-MS, and epidemiological survey data of research participants). Descriptions should indicate the data type, level of aggregation (e.g., individual, summarized), and/or the degree of data processing that will occur.
  • Describe which scientific data from the project will be preserved and shared, and provide the rationale for the decision. Researchers should decide which data to preserve and share based on NIH goals to maximize data sharing, while accounting for ethical, legal, and technical factors that may impede sharing.
  • Briefly describe the metadata and any other documentation (e.g., study protocols, data collection instruments, data dictionaries) that will be made accessible to allow interpretation of the scientific data.
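
As an illustration of the kind of documentation mentioned in the last bullet, a data dictionary can be as simple as a table that names each variable and records its description, type, and unit. The sketch below is hypothetical: the variable names, units, and the helper `undocumented_variables` are assumptions made for the example and are not prescribed by NIH or NIEHS.

```python
# Hypothetical data dictionary; variable names and units are illustrative only.
DATA_DICTIONARY = {
    "participant_id": {
        "description": "De-identified participant code",
        "type": "string",
    },
    "sample_date": {
        "description": "Date of sample collection (YYYY-MM-DD)",
        "type": "date",
    },
    "pm25_ug_m3": {
        "description": "Personal PM2.5 exposure, 24-hour mean",
        "type": "float",
        "unit": "micrograms per cubic meter",
    },
}


def undocumented_variables(record, dictionary):
    """Return the sorted names of fields in a record that the dictionary does not describe."""
    return sorted(name for name in record if name not in dictionary)


# A record containing a field ("batch") that has no dictionary entry.
example_record = {"participant_id": "P001", "pm25_ug_m3": 12.4, "batch": 3}
missing = undocumented_variables(example_record, DATA_DICTIONARY)
```

A check like this, run before deposit, is one way to confirm that every shared variable is documented.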

For Data Subject to GDS Policy

Data types expected to be shared under the GDS Policy should be described in this element. Note that the GDS Policy expects certain types of data to be shared that may not be covered by the DMS Policy’s definition of “scientific data”. For more information on the data types to be shared under the GDS Policy, consult Data Submission and Release Expectations.

Element 2: Related Tools, Software and/or Code

Information on related tools, software, and/or code.

  • Indicate whether specialized tools or software are needed to access, manipulate, or reuse shared scientific data. If applicable, list the name(s) of needed tools/software and specify how the tools can be accessed (e.g., open source and freely available, generally available for a fee in the marketplace, or available only from the research team).

Element 3: Standards

Description of standards to be applied to the scientific data and associated metadata.

Data standards are documented agreements on representation, format, definition, structuring, tagging, transmission, manipulation, use, and management of data. Resources such as FAIRsharing and The Digital Curation Centre provide information on available data and metadata standards. The use of Common Data Elements (see NIH Common Data Elements (CDE) Repository), standard data collection tools (see NIH Disaster Research Response (DR2) Resources Portal, PhenX Toolkit (consensus measures for Phenotypes and eXposures)), and existing ontologies (see Environmental Health Language Collaborative) is highly encouraged. See the EHS Ontology Resource Catalog for a compilation of organizations, ontologies/terminologies, and tools useful for harmonizing environmental health research.

NIH encourages researchers to select the repository that is most appropriate for their data type and discipline. The Environmental Health Sciences Domain Specific Data Repositories Dashboard provides information on domain-specific repositories that align with NIEHS-supported science areas. The NIH BioMedical Informatics Coordinating Committee (BMIC) Data Sharing Repositories website lists additional NIH-supported data sharing resources. If no appropriate discipline or data-type specific repository is available, researchers should consider other options, including generalist repositories, institutional repositories, or cloud-based data repositories. For additional details, see the NIEHS Scientific Data Resources webpage.

  • Indicate data and metadata standards to be applied in the research project. Standards may include data formats, consensus measures, common vocabularies, and other documentation. While many scientific fields have developed and adopted common data standards, others have not. In such cases, the plan may indicate that no consensus data standards exist. While NIEHS does not generally require specific standards, we are seeking to increase adoption of standards that facilitate data integration and harmonization. You are encouraged to contact NIEHS if you would like help in determining if standards exist and which standards are appropriate.

Element 4: Data Preservation, Access, and Associated Timelines

Plans for data preservation.

  • Provide names of the repositories where scientific data and metadata arising from the project will be archived. NIEHS encourages the use of established data repositories that meet desirable characteristics. You are encouraged to contact NIEHS if you are unable to identify a suitable data repository or have questions on selection of a repository.
  • Describe how data will be findable and identifiable. Indicate how persistent identifiers (PIDs), such as Digital Object Identifiers (DOIs), Open Researcher and Contributor (ORCID) IDs, and Research Organization Registry (ROR) IDs, will be assigned to identify data, people, organizations, or other entities. Indicate whether the data and/or metadata will be indexed in a searchable resource. If use of PIDs is not possible, indicate why they cannot be used.
  • Describe the anticipated timeframes for preserving and sharing scientific data. Specify when the scientific data will be made available to other users (i.e., no later than time of an associated publication or end of the performance period, whichever comes first) and for how long the scientific data will be made available. NIEHS encourages researchers to share scientific data as soon as possible and to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.
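
The timing rule in the last bullet (no later than the time of an associated publication or the end of the performance period, whichever comes first) can be sketched as a small calculation. This is an illustration only: the function name `latest_sharing_date`, its parameters, and the example dates are hypothetical and are not part of the NIH or NIEHS guidance.

```python
from datetime import date
from typing import Optional


def latest_sharing_date(performance_period_end: date,
                        publication_date: Optional[date] = None) -> date:
    """Return the latest date by which scientific data should be shared.

    The result is the earlier of the associated publication date and the end
    of the performance period. If no publication is planned yet, the end of
    the performance period applies.
    """
    if publication_date is None:
        return performance_period_end
    return min(publication_date, performance_period_end)


# Hypothetical dates: a paper published before the award ends.
deadline = latest_sharing_date(
    performance_period_end=date(2027, 6, 30),
    publication_date=date(2026, 3, 15),
)
print(deadline)  # 2026-03-15
```

The returned date is an upper bound; NIEHS encourages sharing as soon as possible.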

For Data Subject to GDS Policy

  • For human genomic data:
  • For non-human genomic data:
    • Investigators may submit data to any widely used repository.
    • Non-human genomic data are expected to be shared as soon as possible, but no later than the time of an associated publication or the end of the performance period, whichever comes first.

Element 5: Access, Distribution, or Reuse Considerations

Description of factors affecting access, distribution, or reuse of scientific data.

  • Describe with whom the data will be shared and under what conditions. Indicate if access to scientific data will be controlled (i.e., made available only after approval) and describe the mechanism for data access requests. NIEHS expects that researchers maximize the appropriate sharing of scientific data generated, consistent with privacy, security, informed consent, and proprietary issues.
  • If applicable, provide a rationale for why access, distribution, or reuse of data will be restricted. In cases where data access is controlled, there is still considerable value to the community in having free access to summary and aggregate data. Indicate if access to summary and aggregate data will be restricted.
  • Describe any applicable factors affecting access, distribution, or reuse of scientific data. Include information related to:
    • Informed consent (e.g., disease-specific limitations, particular communities’ concerns).
    • Privacy and confidentiality protections (i.e., de-identification, Certificates of Confidentiality, and other protective measures) consistent with applicable federal, Tribal, state, and local laws, regulations, and policies.
    • Restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or by existing or anticipated agreements (e.g., with third-party funders, with partners, with Health Insurance Portability and Accountability Act (HIPAA) covered entities that provide Protected Health Information under a data use agreement, or through licensing limitations attached to materials needed to conduct the research), or any other consideration that may limit the extent of data sharing.
    • Data sharing agreements, licenses, and/or any other considerations that may limit the extent of data sharing or reuse.

Expectations for Human Genomic Data Subject to the GDS Policy

  • Informed Consent Expectations:
    • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected AFTER the effective date of the GDS Policy (January 25, 2015):
      • NIH expects that informed consent for future research use and broad data sharing will have been obtained. This expectation applies to de-identified cell lines or clinical specimens regardless of whether the data meet technical and/or legal definitions of de-identified (i.e., the research does not meet the definition of “human subjects research” under the Common Rule).
    • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected BEFORE the effective date of the GDS Policy:
      • There may or may not have been consent for research use and broad data sharing. NIH will accept data derived from de-identified cell lines or clinical specimens lacking consent for research use that were created or collected before the effective date of this Policy.
  • Institutional Certifications and Data Sharing Limitation Expectations:
    • DMS Plans should address limitations on sharing by anticipating sharing according to the criteria of the Institutional Certification.
    • In cases where it is anticipated that Institutional Certification criteria cannot be met (i.e., data cannot be shared as expected by the GDS Policy), investigators should state the Institutional Certification criteria in their DMS Plan, explain why the criteria cannot be met, and indicate what data, if any, can be shared and how to enable sharing to the maximal extent possible (for example, sharing data in a summary format). In some instances, the funding NIH ICO may need to determine whether to grant an exception to the data submission expectation under the GDS Policy.
  • Genomic Summary Results:
    • Investigators conducting research subject to the GDS Policy should indicate in their DMS Plan if a study should be designated as “sensitive” for the purposes of access to Genomic Summary Results (GSR), as described in NOT-OD-19-023.

Element 6: Oversight of Data Management and Sharing

Investigators may request funds toward data management and sharing in the budget and budget justification sections of their applications. To learn more, visit the NIH webpage on Budgeting for Data Management and Sharing.

Plans for oversight of data management and sharing.

  • Identify the individual(s) (e.g., titles, roles) who will be responsible for executing the various components of data management over the course of the research program.
  • Describe how compliance with the data management and sharing plan will be monitored and managed.