Sudipto Banerjee, Ph.D.
October 30, 2019
Sudipto Banerjee, Ph.D. can model how pollution moves across entire continents, how trees grow in different climates, and how exposures affect human health.
To make these models and predictions, he uses Bayesian statistics, which quantifies uncertainty by combining information from observed data with prior or expert estimates.
Banerjee, professor and chair of the Department of Biostatistics at the University of California, Los Angeles Fielding School of Public Health, then uses these models to explain environmental and human health outcomes.
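To make the idea of combining observed data with prior or expert estimates concrete, here is a minimal sketch of a Bayesian update in Python. It is only an illustration under simple assumptions (a normal prior, normally distributed measurements with known noise), not a description of Banerjee's models. The function name and all numbers are hypothetical.

```python
# Illustrative sketch only: the function name and every number below are
# hypothetical and are not taken from Banerjee's work.

def normal_posterior(prior_mean, prior_sd, observations, obs_sd):
    """Combine a normal prior with normal observations of known noise level.

    Returns the posterior mean and standard deviation of the unknown quantity.
    """
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2   # confidence in the expert estimate
    data_precision = n / obs_sd ** 2        # confidence contributed by the data
    post_precision = prior_precision + data_precision
    sample_mean = sum(observations) / n
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Hypothetical expert estimate of a pollutant level: 12 units, give or take 4.
# Hypothetical measurements, each with noise standard deviation 2.
post_mean, post_sd = normal_posterior(12.0, 4.0, [15.0, 17.0, 16.0, 16.0], 2.0)
print(f"posterior mean = {post_mean:.2f}, posterior sd = {post_sd:.2f}")
```

In this sketch the posterior mean falls between the expert estimate and the average of the measurements, and the posterior standard deviation is smaller than either source of information alone would give. That shrinking uncertainty is what the paragraph above means by quantifying uncertainty.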
“Statistics is often the junction where a lot of significant disciplines in the sciences meet,” Banerjee said. “A modern statistician must be a good collaborative scientist.”
Banerjee’s collaborations include work in public and environmental health, pollution, disease mapping, forestry, and ecology. Banerjee is part of the National Institute of Environmental Health Sciences (NIEHS) Gulf Long-Term Follow-Up Study (GuLF STUDY), which explores potential health effects on community members and workers following the 2010 Deepwater Horizon oil rig explosion.
He also leads NIEHS grants on the methodology and application of Bayesian statistics to understand human health outcomes.
Big Data Connects Environment and Health
According to Banerjee, big data presents challenges to statisticians due to the size and the complexities of the units being measured.
Traditional statistical methods are often inadequate for modeling complex dependencies. Banerjee believes Bayesian models draw strength from information across a variety of sources, making them a more powerful inferential tool that leads to better prediction of outcomes.
With Bayesian models, Banerjee has predicted how trees will grow, given variations in climate over time. In another study, he used a massive air quality dataset to improve predictions of particulate matter levels across Europe over long periods of time.
Banerjee is also interested in connections between the environment and human health. In a 2015 study, he created a model that could predict asthma-related hospitalization rates in any California county on any day of the year.
Currently, Banerjee is working to show how environmental and climate-related factors negatively affect human health. To do this, he uses big data from personal fitness trackers to document how people move around in their environment. Combining fitness tracker data with health outcomes and climate data for the area, he creates an unbiased prediction of how someone living in a certain location might be affected by their environment.
Because health outcome data are aggregated over regions and climate data are presented at a much finer resolution, Banerjee develops ways to integrate these data in his modeling approaches.
“We are exploring how to align data sets of different scales to create a meaningful model in which we can understand the impact of climate and environment on health,” said Banerjee.
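As a rough illustration of what aligning data sets of different scales can involve, the Python sketch below averages fine-resolution climate values up to the regions at which health outcomes are reported. Plain averaging is an assumption made here for simplicity. The article does not describe Banerjee's actual alignment method, and the cell names, regions, and values are all hypothetical.

```python
# Illustrative sketch only: simple averaging stands in for a real alignment
# method, and every name and value below is hypothetical.
from collections import defaultdict


def aggregate_to_regions(cell_values, cell_region):
    """Average fine-resolution cell values within each region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, value in cell_values.items():
        region = cell_region[cell]
        totals[region] += value
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


# Hypothetical fine-resolution climate data: one temperature per grid cell.
cell_temperature = {"c1": 30.0, "c2": 32.0, "c3": 25.0, "c4": 27.0, "c5": 26.0}
# Hypothetical lookup from each grid cell to the region that contains it.
cell_region = {"c1": "A", "c2": "A", "c3": "B", "c4": "B", "c5": "B"}
# Hypothetical health outcome rates, available only per region.
regional_rate = {"A": 4.1, "B": 2.7}

regional_temperature = aggregate_to_regions(cell_temperature, cell_region)
# Both data sets now share the same regional scale and can be paired.
aligned = {r: (regional_temperature[r], regional_rate[r]) for r in regional_rate}
print(aligned)
```

Averaging discards the variation within each region, which is one reason more careful statistical approaches to this problem are an active area of research.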
Data to Uncover Hazards to Industrial Workers
In his other NIEHS-funded project, Banerjee uses Bayesian statistics to create exposure models that improve the efficiency of workplace risk assessments, which evaluate potential hazards associated with workplace tasks.
Banerjee and his research partner, Gurumurthy Ramachandran, Ph.D., at Johns Hopkins Bloomberg School of Public Health, are simulating workplace scenarios to show how contaminants are generated and ventilated. Data from these experiments will be combined with data from actual workplace environments to better understand inhalation exposures.
This exposure model can be used to evaluate proposed workplace installations or operations. Banerjee also plans to create an open-access statistical software package so other researchers can model workplace exposures.
A Big Award for Work With Big Data
In July 2019, the Committee of Presidents of Statistical Societies presented Banerjee with the George W. Snedecor Award, which is given every other year to a researcher conducting groundbreaking fundamental work on Bayesian modeling using big datasets.
Banerjee remembers learning from a textbook Snedecor coauthored. “It was one of the most influential textbooks on statistical methods,” Banerjee said. “It is really a stupendous honor for me to be receiving an award associated with his name.”