Data Science Initiatives
Big Data to Knowledge (BD2K)
Big Data to Knowledge (BD2K) is a trans-NIH initiative established to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge, and to maximize community engagement. The BD2K initiative addresses four major aims that, in combination, are meant to enhance the utility of biomedical Big Data:
- Facilitate broad use of biomedical digital assets by making them discoverable, accessible, and citable.
- Conduct research and develop the methods, software, and tools needed to analyze biomedical Big Data.
- Enhance training in the development and use of methods and tools necessary for biomedical Big Data science.
- Support a data ecosystem that accelerates discovery as part of a digital enterprise.
The Office of Data Science represents NIEHS interests in several BD2K initiatives, including the Sustainability Working Group and the Data Discovery Index Coordination Consortium.
Check out these resources for the latest BD2K developments:
Data Science Training
The goal of this training series is to increase researchers' skills in data analysis, visualization, machine learning, and graph analytics.
Past training topics include:
- Databases and Data Systems
- Environmental Health Science Datasets
- Analysis Methods and Tools using Python
- Data Products – Overview
- Graph Analytics
- Introduction to R
- Machine Learning using R
- Introduction to Git and GitHub
Data Science Seminar Series
This seminar series features thought leaders from research universities and the biomedical industry.
Data Management at NIEHS Core Labs
The NIEHS Core Labs use a number of paper-based applications and rely heavily on manual labor for data management. As part of a pilot program, the Office of Data Science (ODS) is implementing iRODS technology to help automate aspects of the Core Labs workflow. iRODS uses machine-actionable rules to automate and expedite data processes. The implementation will use the system’s permissions system and metadata catalogue to enable controlled and, where possible, automatic tagging of the files produced by the workflow. Searching the metadata catalogue by tag will enable scientific queries that are currently too labor-intensive to perform, and will simplify the generation of budget, resource, and regulatory reports.
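To illustrate the idea of rule-driven tagging and tag-based search, here is a minimal Python sketch. It is a toy in-memory model, not the iRODS API and not the Core Labs implementation: the names `TagRule` and `Catalogue`, the file paths, the tag keys, and the user name are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class TagRule:
    """Hypothetical rule: if a file path matches `pattern`, attach `tags`."""
    pattern: str
    tags: dict


@dataclass
class Catalogue:
    """Toy stand-in for a metadata catalogue with a simple permission check."""
    rules: list
    writers: set
    records: dict = field(default_factory=dict)

    def register(self, user, path):
        # Controlled addition: only users with write permission may add files.
        if user not in self.writers:
            raise PermissionError(f"{user} may not register {path}")
        # Automatic tagging: apply every rule whose pattern matches the path.
        tags = {}
        for rule in self.rules:
            if fnmatchcase(path, rule.pattern):
                tags.update(rule.tags)
        self.records[path] = tags
        return tags

    def search(self, **wanted):
        # Return the paths whose tags contain every requested key/value pair.
        return sorted(
            path
            for path, tags in self.records.items()
            if all(tags.get(key) == value for key, value in wanted.items())
        )


# Example rules and files (all values are made up for this sketch).
rules = [
    TagRule("*/sequencing/*", {"core": "sequencing"}),
    TagRule("*/imaging/*", {"core": "imaging"}),
    TagRule("*.fastq", {"format": "fastq"}),
]
catalogue = Catalogue(rules=rules, writers={"core_tech"})
catalogue.register("core_tech", "/cores/sequencing/run1.fastq")
catalogue.register("core_tech", "/cores/imaging/slide7.tiff")

sequencing_files = catalogue.search(core="sequencing")
print(sequencing_files)
```

In this sketch, tags are attached when a file is registered, so later queries and reports only need to filter on the tags rather than inspect each file by hand.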
NIEHS - NCATS - UNC DREAM Toxicogenetics Group
The objective of this challenge was to better understand how a person's genetics can influence the cytotoxic response to widely used chemicals. The challenge was led and organized in 2013 by scientists from Sage Bionetworks, DREAM, the University of North Carolina, NIEHS, and the National Center for Advancing Translational Sciences.
Environmental Health Science Common Language
The Office of Data Science is leading the NIEHS effort to establish standards for metadata and keyword tags, and to work toward implementation of a common environmental health science vocabulary. Such standards are needed to enable data sharing, integration, and analysis of environmental data, and are crucial for advancing discovery in environmental health research. NIEHS will play a lead role in engaging the environmental health science community to implement an environmental health science semantic ecosystem.
Join the community of researchers, professors, and ontologists working together on the EHS Common Language standards by subscribing to our email distribution list.