Training Resources

  • Bridge2AI-TRAINING
    Recorded lectures on multiple aspects of Artificial Intelligence (AI) and Machine Learning (ML) education presented by the NIH Bridge Center.
  • Intro to Statistics: Making Decisions Based on Data
    This course will cover visualization, probability, regression and other topics that will help you learn the basic methods of understanding data with statistics.
  • Introduction to Data Science
    Tour the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modeling (e.g., linear and non-linear regression).
  • Introductory Machine Learning
    ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects.
  • Johns Hopkins Reproducible Research MOOC on YouTube
    This course focuses on literate statistical analysis tools, which let you publish a data analysis as a single document that others can easily execute to obtain the same results.
  • Learn the Command Line
    The Command Line is a vital tool that lets you run programs, write scripts, automate tasks, and combine simple commands right on your computer. Learn how to use the Command Line, a tool most developers use every day, to work with data.
  • Mining Massive Datasets
    This class teaches algorithms for extracting models and other information from very large amounts of data. The emphasis is on techniques that are efficient and that scale well.
  • National Human Genome Research Institute YouTube Channel
  • NIH Bioinformatics at NIAID Training Resources
  • NIH Data Science Training Portal
    This website is for data science courses on the NIH campus. Here, you can discover and register for upcoming short courses, suggest topics you want to learn about, and request courses.
  • Tackling the Challenges of Big Data - MIT Professional Education
    This course will survey state-of-the-art topics in Big Data, looking at data collection, data storage and processing, extracting structured data from unstructured data, systems issues, analytics, visualization, and a range of applications.
  • The Data Scientist's Toolbox
    The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.
  • The inTelligence And Machine lEarning (TAME)
    Toolkit for Introductory Data Science, Chemical-Biological Analyses, Predictive Modeling, and Database Mining for Environmental Health Research.
  • The Johns Hopkins Data Science Specialization
    This specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results.

Environmental Health Databases

NIEHS provides the following databases and data portals as resources to scientists.

Environmental Health Language Collaborative

The Office of Data Science is coordinating the Environmental Health Language Collaborative. The Collaborative is a new initiative to advance community development and application of a harmonized language for describing Environmental Health Science (EHS) research. The Collaborative is part of the NIEHS effort to establish standards for EHS data and metadata that are crucial for enabling efficient data sharing, integration, and analysis of environmental data, and for advancing discovery in environmental health research. Learn more about this initiative.