What scientific questions would benefit most from the development and adoption of a harmonized language standard? Below are five draft use cases put forward for community input.
A working group of environmental health researchers and NIEHS program officers developed an initial set of five general use cases, along with sub use case examples, as starting points for community discussion. Several of the use cases require advances not only in standardized vocabularies but also in statistical and modeling approaches, which presents opportunities to engage with those communities. Although the use cases overlap and some consolidation is possible, they are provided in near-original form to avoid errors or oversimplifications that consolidation might introduce.
For each of these use cases, a small group of invited participants will draft a use case package that includes a clearer definition of the use case research question, available datasets and ontologies/terminologies that can be used for developing solutions, existing gaps that need to be addressed, and other non-language-related challenges that should be noted. A template for the use case package is available (27KB). These use case packages will be the focus of discussion at the ‘Develop Solutions’ track of the virtual public workshop in September and will also provide initial content for a use case repository.
Use Case #1: What Data Exists for a Given Chemical/Endpoint/Exposure Scenario?
- What studies measuring endocrine systems perturbation are available in this database?
- What chemicals are chemically similar to compound X, and is there any 2-year cancer bioassay data available for these compounds?
- What animal data exists that provides conclusions on endpoint X given different terms used to describe endpoint X?
- What other data are available for chemical X when it is found in a formulation?
- What assays were "Active" for this chemical (where "Active" may have different meanings across assays)?
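The last question above, where "Active" may mean different things in different assays, can be illustrated with a small sketch. This is only an assumption about how such harmonization might look, not a method proposed by the working group: the assay names (`assay_a`, `assay_b`, `assay_c`), their raw call vocabularies, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class AssayResult:
    assay: str      # hypothetical assay identifier
    chemical: str   # chemical name or identifier
    raw_call: str   # activity call exactly as reported by the assay


# Hypothetical per-assay rules: each assay reports "activity" in its own
# vocabulary, so each gets its own rule for what counts as active.
ACTIVITY_RULES: Dict[str, Callable[[str], bool]] = {
    "assay_a": lambda call: call.strip().lower() == "active",
    "assay_b": lambda call: call.strip().lower() in {"positive", "agonist"},
    "assay_c": lambda call: call.strip() == "1",
}


def is_active(result: AssayResult) -> Optional[bool]:
    """Return True/False under the assay's own rule, or None if no rule is known."""
    rule = ACTIVITY_RULES.get(result.assay)
    if rule is None:
        return None  # unknown assay: do not guess
    return rule(result.raw_call)


def active_assays(results: List[AssayResult], chemical: str) -> List[str]:
    """List, in sorted order, the assays in which the chemical was active."""
    return sorted(
        {r.assay for r in results if r.chemical == chemical and is_active(r) is True}
    )
```

Keeping the rule next to the assay identifier makes each interpretation of "Active" explicit and reviewable, and results from assays without a rule are reported as unknown rather than silently treated as inactive.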
Use Case #2: How Best to Combine Data From Multiple Independent Studies?
- Combine individual-level data from multiple independent studies (heterogeneous study designs and data collection protocols) to understand (with increased statistical power) how exposures X+Y impact health outcome Z.
- How can we describe model organism toxicological assays/data in a way that’s interoperable and reusable to better understand the phenotypic/epigenomic/transcriptomic impact of exposures X+Y across species A+B?
- Integrate and compare data across labs to corroborate results from toxicological assessments and increase confidence in them.
- Given conclusive changes in endpoints in response to one or more exposures, what other data sources exist on the same exposures and endpoints that can confirm or contradict the findings, including for similar endpoints in different species?
- Given natural-language mentions of concepts from scientific studies, which ontology or ontologies do these mentions map to, so that terminology can be normalized across hundreds to thousands of studies?
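As a rough illustration of the term-normalization question above, the sketch below maps free-text mentions to term identifiers through a synonym lookup. It assumes a curated synonym table exists; the identifiers (`EX:0001`, `EX:0002`), the synonym lists, and the function names are made up for the example and do not come from any real ontology.

```python
import re
from typing import Dict, List, Optional

# Made-up term identifiers and synonyms, for illustration only.
SYNONYMS: Dict[str, List[str]] = {
    "EX:0001": ["liver weight", "hepatic weight", "liver mass"],
    "EX:0002": ["body weight", "bodyweight", "body mass"],
}


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_index(synonyms: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the synonym table into a normalized-label -> term-id lookup."""
    index: Dict[str, str] = {}
    for term_id, labels in synonyms.items():
        for label in labels:
            index[normalize(label)] = term_id
    return index


def map_mention(mention: str, index: Dict[str, str]) -> Optional[str]:
    """Return the term id for a mention, or None if it is not in the table."""
    return index.get(normalize(mention))
```

Exact lookup after light normalization is the simplest possible approach; mentions that return `None` are the ones that would need curation or a more capable matching method.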
Use Case #3: Given Measures of Biological Responses to One or More Exposures, What Are the Biological Processes That Might Be Related to the Observed Changes?
- Given conclusive changes in endpoints in response to one or more exposures, what are the biological processes that might lead to the observed changes?
- How can we use a knowledge graph to fill in the adverse outcome or adverse exposure pathways based on the start or end of the pathway?
- What other modes of action/adverse outcome pathways does this assay hit?
- What assays target this mode of action or key event?
- Given an association between exposure and outcome found in an epidemiological study, find the in vivo and in vitro studies that lend support to the association and that suggest involved bioprocesses, including associations that are dependent on developmental windows.
- Given signatures of biological responses to exposures from multiple modalities (e.g., gene expression, pathology), can we link these signatures to known biological phenotypes and processes to characterize response signatures and to identify gaps in characterizations?
- Link a set of available assays (e.g., in PubChem) to known biological processes and phenotypes to better characterize chemical exposures.
Use Case #4: What Are the Biomarkers, Phenotypes, and/or Outcomes That Can Be Measured and Used As an Indicator of Exposure?
- What biomarkers can be used to examine exposure to a given chemical?
- Can we identify biomarkers for different classes of exposures (e.g., exposures to metals/metalloids in soil via dust inhalation, exposure to common pesticides via well water) contextualized by delivery route?
- Given conclusive changes in endpoints in response to one or more exposures, what other data sources exist on the same exposures and endpoints that can confirm or contradict the findings, including for similar endpoints in different species?
Use Case #5: What Do My Unique Exposure Conditions Based on Where I Live and Work (e.g., Geographical Location, Occupation, Regulations, Hobbies) Indicate About Potential Risks to My Health?
- What is my biggest exposure risk based on my geographical location?
- What am I exposed to in my line of work? How might this impact my health?
- For what components of X industrial emission do we need more information on health outcomes?
- What levels of exposure to X will decrease the risk of adverse health outcomes?
- What are the health and economic benefits from regulations or policies that reduce exposure to X?
- What are my biggest exposure risks based on my work-life conditions, especially where I live and work (occupation, geography, hobbies)? Which route of exposure is most relevant to my specific conditions?
- How does response to exposure change based on susceptibility (e.g., genetic background, disease status, socioeconomic status (SES), differences between signatures of exposures, and differences in risk)?