NIEHS Data Management and Sharing Plan Checklist

A Data Management and Sharing (DMS) Plan is a plan describing the data management, preservation, and sharing of scientific data and accompanying metadata. NIH has developed guidance for recommended elements of a Data Management and Sharing Plan. Applications subject to NIH’s Genomic Data Sharing (GDS) Policy should also address GDS-specific considerations within the elements of a DMS Plan. DMS Plans are recommended to be two pages or less in length. NIH has developed an optional DMS Plan format template that aligns with the recommended elements of a DMS Plan. Important: Do not include hypertext (e.g., hyperlinks and URLs) in the DMS Plan attachment.

NIEHS encourages data management and sharing practices to be consistent with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. The NIEHS Scientific Data Resources webpage lists resources relevant to the development of Data Management and Sharing Plans.

The plan should address each of the following elements:

Element 1: Data Type

Description of the scientific data to be generated and shared throughout the grant.

  • In general terms, describe the types and amount of scientific data to be generated and/or used in the research project (e.g., RNA-seq, targeted LC-MS, and epidemiological survey data of research participants). Descriptions should indicate the data type, level of aggregation (e.g., individual, summarized), and/or the degree of data processing that will occur.
  • Describe which scientific data from the project will be preserved and shared, and provide the rationale for the decision. Researchers should decide which data to preserve and share based on NIH goals to maximize data sharing, while accounting for ethical, legal, and technical factors that may impede sharing.
  • Briefly describe the metadata and any other documentation (e.g., study protocols, data collection instruments, data dictionaries) that will be made accessible to allow interpretation of the scientific data.

For Data Subject to GDS Policy

Data types expected to be shared under the GDS Policy should be described in this element. Note that the GDS Policy expects certain types of data to be shared that may not be covered by the DMS Policy’s definition of “scientific data”. For more information on the data types to be shared under the GDS Policy, consult Data Submission and Release Expectations.

Element 2: Related Tools, Software and/or Code

Information on related tools, software, and/or code.

  • Indicate whether specialized tools or software are needed to access, manipulate, or reuse shared scientific data. If applicable, list the name(s) of the needed tools/software and specify how they can be accessed (open source and freely available, generally available for a fee in the marketplace, or available only from the research team).

Element 3: Standards

Description of standards to be applied to the scientific data and associated metadata.

  • Indicate data and metadata standards to be applied in the research project. Standards may include data formats, consensus measures, common vocabularies, and other documentation. While many scientific fields have developed and adopted common data standards, others have not. In such cases, the plan may indicate that no consensus data standards exist. While NIEHS does not generally require specific standards, we are seeking to increase adoption of standards that facilitate data integration and harmonization. You are encouraged to contact NIEHS if you would like help in determining if standards exist and which standards are appropriate.

Element 4: Data Preservation, Access, and Associated Timelines

Plans for data preservation.

  • Provide names of the repositories where scientific data and metadata arising from the project will be archived. NIEHS encourages the use of established data repositories that meet desirable characteristics. You are encouraged to contact NIEHS if you are unable to identify a suitable data repository or have questions on selection of a repository.
  • Describe how data will be findable and identifiable. Indicate how persistent identifiers (PIDs), such as Digital Object Identifiers (DOIs), Open Researcher and Contributor (ORCID) IDs, and Research Organization Registry (ROR) IDs, will be assigned to identify data, people, organizations, or other entities. Indicate whether the data and/or metadata will be indexed in a searchable resource. If use of PIDs is not possible, indicate why they cannot be used.
  • Describe the anticipated timeframes for preserving and sharing scientific data. Specify when the scientific data will be made available to other users (i.e., no later than time of an associated publication or end of the performance period, whichever comes first) and for how long the scientific data will be made available. NIEHS encourages researchers to share scientific data as soon as possible and to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.

For Data Subject to GDS Policy

  • For human genomic data:
  • For non-human genomic data:
    • Investigators may submit data to any widely used repository.
    • Non-human genomic data is expected to be shared as soon as possible, but no later than the time of an associated publication, or end of the performance period, whichever is first.

Element 5: Access, Distribution, or Reuse Considerations

Description of factors affecting access, distribution, or reuse of scientific data.

  • Describe with whom the data will be shared and under what conditions. Indicate whether access to scientific data will be controlled (i.e., made available only after approval) and, if so, describe the mechanism for requesting access. NIEHS expects researchers to maximize the appropriate sharing of scientific data generated, consistent with privacy, security, informed consent, and proprietary issues.
  • If applicable, provide a rationale for why access, distribution, or reuse of data will be restricted. In cases where data access is controlled, there is still considerable value to the community to freely access summary and aggregate data. Indicate if access to summary and aggregate data will be restricted.
  • Describe any applicable factors affecting access, distribution, or reuse of scientific data. Include information related to:
    • Informed consent (e.g., disease-specific limitations, particular communities’ concerns).
    • Privacy and confidentiality protections (i.e., de-identification, Certificates of Confidentiality, and other protective measures) consistent with applicable federal, Tribal, state, and local laws, regulations, and policies.
    • Restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or by existing or anticipated agreements (e.g., with third-party funders, with partners, with Health Insurance Portability and Accountability Act (HIPAA) covered entities that provide Protected Health Information under a data use agreement, or through licensing limitations attached to materials needed to conduct the research), or any other consideration that may limit the extent of data sharing.
    • Data sharing agreements, licenses, and/or any other considerations that may limit the extent of data sharing or reuse.

Expectations for Human Genomic Data Subject to the GDS Policy

  • Informed Consent Expectations:
    • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected AFTER the effective date of the GDS Policy (January 25, 2015):
      • NIH expects that informed consent for future research use and broad data sharing will have been obtained. This expectation applies to de-identified cell lines or clinical specimens regardless of whether the data meet technical and/or legal definitions of de-identified (i.e., the research does not meet the definition of “human subjects research” under the Common Rule).
    • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected BEFORE the effective date of the GDS Policy:
      • There may or may not have been consent for research use and broad data sharing. NIH will accept data derived from de-identified cell lines or clinical specimens lacking consent for research use that were created or collected before the effective date of this Policy.
  • Institutional Certifications and Data Sharing Limitation Expectations:
    • DMS Plans should address limitations on sharing by anticipating sharing according to the criteria of the Institutional Certification.
    • In cases where it is anticipated that Institutional Certification criteria cannot be met (i.e., data cannot be shared as expected by the GDS Policy), investigators should state the Institutional Certification criteria in their DMS Plan, explain why the criteria cannot be met, and indicate what data, if any, can be shared and how sharing will be enabled to the maximal extent possible (for example, sharing data in a summary format). In some instances, the funding NIH ICO may need to determine whether to grant an exception to the data submission expectation under the GDS Policy.
  • Genomic Summary Results:
    • Investigators conducting research subject to the GDS Policy should indicate in their DMS Plan if a study should be designated as “sensitive” for the purposes of access to Genomic Summary Results (GSR), as described in NOT-OD-19-023.

Element 6: Oversight of Data Management and Sharing

Plans for oversight of data management and sharing.

  • Identify the individual(s) (e.g., titles, roles) who will be responsible for executing the various components of data management over the course of the research program.
  • Describe how compliance with the data management and sharing plan will be monitored and managed.