Data Science Initiatives

Big Data to Knowledge (BD2K)

Big Data to Knowledge (BD2K) is a trans-NIH initiative established to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge, and to maximize community engagement. The BD2K initiative addresses four major aims that, in combination, are meant to enhance the utility of biomedical Big Data to:

  • Facilitate broad use of biomedical digital assets by making them discoverable, accessible, and citable.
  • Conduct research and develop the methods, software, and tools needed to analyze biomedical Big Data.
  • Enhance training in the development and use of methods and tools necessary for biomedical Big Data science.
  • Support a data ecosystem that accelerates discovery as part of a digital enterprise. 

The Office of Data Science represents NIEHS interests in several BD2K initiatives, including the Sustainability Working Group and the Data Discovery Index Coordination Consortium.

Data Science Training

The goal of this training series is to increase researchers' skills in data analysis, visualization, machine learning, and graph analytics.

Past training topics include:

  • Databases and Data Systems
  • Environmental Health Science Datasets
  • Analysis Methods and Tools using Python
  • Data Products – Overview
  • Graph Analytics
  • Introduction to R
  • Machine Learning using R
  • Introduction to Git and GitHub

Data Science Seminar Series

This seminar series features thought leaders from research universities and the biomedical industry.

Data Management at NIEHS Core Labs

The NIEHS Core Labs use a number of paper-based applications and rely heavily on human labor for data management. As part of a pilot program, ODS is implementing iRODS technology to help automate parts of the Core Labs workflow. iRODS uses machine-actionable rules to automate and expedite data processes. The implementation will use the iRODS permissions system and metadata catalogue to add tags to the files produced by the workflow in a controlled and, where possible, automatic way. Searching the metadata catalogue by these tags will enable scientific queries that are currently too labor intensive to perform, and will make it easier to generate budget, resource, and regulatory reports.
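
The rule-driven tagging and tag-based search described above can be illustrated with a minimal plain-Python sketch. It does not use iRODS or its API, and it is not the ODS implementation. The class names, rule functions, tag keys, and file paths are all hypothetical and chosen only to show the idea: rules run automatically when a file is registered, and later queries filter on the resulting tags.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FileRecord:
    """One catalogued file and its metadata tags (hypothetical structure)."""
    path: str
    tags: Dict[str, str] = field(default_factory=dict)


# A rule inspects a record and returns the tags it wants to add (possibly none).
Rule = Callable[[FileRecord], Dict[str, str]]


def tag_by_extension(record: FileRecord) -> Dict[str, str]:
    """Assumed convention: the file extension indicates the data type."""
    if record.path.endswith(".fastq"):
        return {"data_type": "sequencing"}
    if record.path.endswith(".csv"):
        return {"data_type": "tabular"}
    return {}


def tag_by_core_lab(record: FileRecord) -> Dict[str, str]:
    """Assumed convention: the top-level directory names the producing lab."""
    parts = record.path.strip("/").split("/")
    return {"core_lab": parts[0]} if len(parts) > 1 else {}


class Catalogue:
    """In-memory stand-in for a metadata catalogue."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = list(rules)
        self.records: List[FileRecord] = []

    def ingest(self, path: str) -> FileRecord:
        """Register a file and apply every rule to tag it automatically."""
        record = FileRecord(path)
        for rule in self.rules:
            record.tags.update(rule(record))
        self.records.append(record)
        return record

    def search(self, **criteria: str) -> List[str]:
        """Return paths of files whose tags match all given key/value pairs."""
        return [
            r.path
            for r in self.records
            if all(r.tags.get(k) == v for k, v in criteria.items())
        ]


# Hypothetical paths, for illustration only.
catalogue = Catalogue([tag_by_extension, tag_by_core_lab])
catalogue.ingest("genomics/run01/sample_a.fastq")
catalogue.ingest("genomics/run01/summary.csv")
catalogue.ingest("imaging/plate7/counts.csv")

tabular_genomics = catalogue.search(core_lab="genomics", data_type="tabular")
```

Because tags are attached at ingest time, a query such as the last line needs no manual bookkeeping. A report over all files from one lab would be the same call with a single criterion.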

Past Projects

NIEHS - NCATS - UNC DREAM Toxicogenetics Group

The objective of this challenge was to better understand how a person's genetics can influence their cytotoxic response to widely used chemicals. The challenge ran in 2013 and was led and organized by scientists from Sage Bionetworks, DREAM, the University of North Carolina, NIEHS, and the National Center for Advancing Translational Sciences.

Environmental Health Science Common Language

The Office of Data Science is leading the NIEHS effort to establish standards for metadata and keyword tags and to work toward implementing a common environmental health science vocabulary. Such standards are needed to enable data sharing, integration, and analysis of environmental data, and are crucial for advancing discovery in environmental health research. NIEHS will play a lead role in engaging the environmental health science community to implement an environmental health science semantic ecosystem.

Join the community of researchers, professors, and ontologists working together on the EHS Common Language standards by subscribing to our email distribution list.