By David Richards
In the past year, NIEHS launched a new initiative to enhance the value of environmental health science data and research. The Environmental Health Language Collaborative (EHLC) spurs the development and adoption of new approaches to describing and categorizing environmental health sciences research.
“The Environmental Health Language Collaborative seeks to catalyze knowledge discovery through community-driven approaches to develop and implement a harmonized environmental science language,” said Stephanie Holmgren, M.S.L.S., program manager in the NIEHS Office of Data Science.
Environmental health science is a multidisciplinary field whose researchers bring different perspectives and generate diverse data types. As a result, each discipline has its own vocabulary, and researchers may use different words to explain similar concepts. For example, scientists may use soot, soot particles, or black carbon to describe the same exposure. In addition, researchers may use the same term with a different meaning depending on the context of use.
Holmgren provided the example of a drought. To a hydrologist, meteorologist, or farmer, drought can be measured as the amount of subsurface water, rainfall, or crop yield, respectively. Describing those data in a common language can therefore make it easier for a researcher to find, reuse, or reanalyze multiple sources of drought-related data.
According to Holmgren, when researchers use a common language, they also may be able to combine varied data sets related to environmental health. They can then utilize advanced statistical and machine learning approaches to gain insights or make predictions with the larger data sets.
“The technological advances generating big data and more high-throughput data has led to a volume of data that can be unwieldy to integrate across studies if you are not talking the same language,” said Holmgren. “This initiative enables researchers to combine datasets from different research studies to answer large-scale complex research questions.”
Ways to Get Involved
If you would like to join the community of researchers, ontologists, informaticists, systems developers, and others working together on environmental health common language approaches, please sign up for the EHLC email distribution list. If you have questions about EHLC, please contact Stephanie Holmgren, Office of Data Science.
Launched in 2021, EHLC has been building its community, defining its purpose and goals, and identifying research questions that can benefit from harmonized language approaches, such as ontologies and semantic solutions, which address the categorization and meaning of terms and vocabularies. In June and July 2021, the EHLC hosted and recorded two webinars: the first to build the foundation for the EHLC community, and the second to establish a starting point for understanding ontologies.
In September 2021, the collaborative hosted a two-day workshop to reach a consensus on the scope of the initiative and to discuss specific use cases – scientific questions that would benefit most from the development and adoption of harmonized language approaches. “Numerous ontologies, minimal information standards, and vocabularies already exist for specific types of research,” said Holmgren. “However, there are still gaps with respect to environmental health research. The goal of the collaborative is to identify where those gaps are, and then for the community to build the language solutions to address those gaps.”
Holmgren emphasized that these semantic solutions will focus more on harmonization than standardization. Standardization requires a community to agree upon and use a uniform term going forward. Harmonization instead compares variables from different studies and aligns them to a harmonized term, which then allows the data to be pooled for analysis.
Holmgren provided an example where researchers have used a variety of beverage terms, such as cola, pop, soda, and soft drink. To integrate the data from studies using those different terms, each is matched to the harmonized term of carbonated beverage.
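The beverage example can be sketched in a few lines of Python. This is only an illustration of the general idea of aligning study-specific terms to a harmonized term before pooling, not a description of any EHLC tool. The mapping table, the function names, the record fields, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup table: study-specific term -> harmonized term.
HARMONIZED_TERMS = {
    "cola": "carbonated beverage",
    "pop": "carbonated beverage",
    "soda": "carbonated beverage",
    "soft drink": "carbonated beverage",
}


def harmonize(term):
    """Return the harmonized term for a study term, or None if unmapped."""
    return HARMONIZED_TERMS.get(term.strip().lower())


def pool_records(studies):
    """Group records from several studies under their harmonized terms.

    Each record keeps its original term and gains the name of its source
    study, so nothing is lost by pooling. Records whose terms have no
    mapping are returned separately instead of being silently dropped.
    """
    pooled = defaultdict(list)
    unmapped = []
    for study_name, records in studies.items():
        for record in records:
            tagged = {**record, "study": study_name}
            harmonized = harmonize(record["term"])
            if harmonized is None:
                unmapped.append(tagged)
            else:
                pooled[harmonized].append(tagged)
    return dict(pooled), unmapped


# Made-up records, for illustration only.
studies = {
    "study_a": [{"term": "Soda", "servings_per_week": 3}],
    "study_b": [
        {"term": "pop", "servings_per_week": 5},
        {"term": "tea", "servings_per_week": 7},
    ],
}

pooled, unmapped = pool_records(studies)
for record in pooled["carbonated beverage"]:
    print(record["study"], record["term"], record["servings_per_week"])
```

Because each study keeps its own wording and only the lookup table changes, new terms can be added to the mapping without asking earlier studies to relabel their data.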
Looking ahead in 2022, the EHLC aims to continue expanding community involvement and making progress on developing deliverables for the existing use cases.
EHLC is one of a few NIEHS initiatives to enhance capacity in data science. The Office of Data Science also oversees the NIEHS commitment to the FAIR (findable, accessible, interoperable, and reusable) principles, which were adopted in the NIEHS 2018-2023 strategic plan. The FAIR principles aim to improve knowledge discovery and innovation through integration and reuse of data.
NIEHS has also created the Office of Environmental Science Cyberinfrastructure and the Office of Scientific Computing within the past few years to further expand data science and harmonization efforts.
National Institutes of Health (NIH) Data Science Initiatives
Final NIH Policy for Data Management and Sharing
NIH finalized a policy to facilitate the management and sharing of scientific data generated from NIH-funded or conducted research. The Policy for Data Management and Sharing establishes requirements to improve good data management practices and to maximize opportunities for sharing data. The policy will go into effect in January 2023.
Strategic Plan for Data Science
NIH released the Strategic Plan for Data Science in June 2018 to outline pathways to improve existing data management of NIH-funded biomedical data. The strategic plan addresses storing, managing, standardizing, and publishing biomedical research and data. NIH is implementing the plan and seeks community input to assist refinement.
Implications for Climate Change and Human Health Data
Global warming, climate variability, climate emergency, and climate crisis are terms that have been used to describe climate change.
Harmonizing language within climate change and human health research can promote greater reuse and reanalysis of diverse data types and discipline perspectives, leading to improved modelling and predictions of the health impacts of climate change.
The NIEHS Climate Change and Human Health program is constructing a glossary that aims to create a shared language for intersectoral collaboration across a diverse audience of researchers, policy makers, academics, and practitioners. Terms that will be included in the glossary are commonly referenced in climate change and human health literature and the news media.
“Establishing a common understanding of the terminology can promote effective communication and help foster new ideas and action to tackle the impacts of climate change on public health in novel ways,” said Trisha Castranio, program manager for Global Environmental Health at NIEHS. “Our efforts also include the development of a resource portal with tools, methodologies, and models that builds on this idea and will encourage transdisciplinary engagement and fresh and imaginative thinking.”