Julia Brody, Ph.D.
Silent Spring Institute
A new study from NIEHS grantees reported that traditional methods to make environmental health data anonymous and protect the privacy of study participants may not be enough to prevent re-identification. According to the authors, privacy risks have been investigated for genetic and medical records, but rarely for environmental data.
The researchers reviewed 12 prominent environmental health studies, examining the types of data collected and the availability of outside datasets that overlapped with the study data. They reported that every study included at least two of five overlapping data types: geographic location, medical data, occupation, housing characteristics, and genetic data. The authors explained that overlapping datasets could be linked, making participants vulnerable to re-identification.
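To illustrate the kind of linkage the authors describe, here is a minimal sketch in Python. It is not the method used in the study. The field names, the records, and the `link_records` helper are all hypothetical and invented for illustration. The idea is that a study record with names removed can still be matched to a named outside record when both share the same combination of overlapping attributes.

```python
from collections import defaultdict

# Hypothetical overlapping fields present in both datasets (assumed for illustration).
QUASI_IDENTIFIERS = ("zip_code", "occupation", "year_built")


def link_records(study_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Return {study_id: name} for study rows that match exactly one public row."""
    # Index the outside dataset by its combination of overlapping attributes.
    index = defaultdict(list)
    for row in public_rows:
        index[tuple(row[k] for k in keys)].append(row["name"])

    matches = {}
    for row in study_rows:
        candidates = index.get(tuple(row[k] for k in keys), [])
        # A single candidate means the "anonymous" record points to one person.
        if len(candidates) == 1:
            matches[row["study_id"]] = candidates[0]
    return matches


# Made-up study data with direct identifiers removed.
study_rows = [
    {"study_id": "P01", "zip_code": "00001", "occupation": "teacher", "year_built": 1950},
    {"study_id": "P02", "zip_code": "00002", "occupation": "nurse", "year_built": 1980},
]

# Made-up outside dataset that includes names.
public_rows = [
    {"name": "Resident A", "zip_code": "00001", "occupation": "teacher", "year_built": 1950},
    {"name": "Resident B", "zip_code": "00002", "occupation": "nurse", "year_built": 1980},
    {"name": "Resident C", "zip_code": "00002", "occupation": "nurse", "year_built": 1980},
]

reidentified = link_records(study_rows, public_rows)
print(reidentified)
```

In this toy example, participant P01 matches a single named record, while P02 matches two and stays ambiguous. Each additional overlapping data type tends to shrink the pool of candidates, which is why studies that collect several such types may carry more risk.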
Using data from the Household Exposure Study and the Green Housing Study, the team analyzed whether environmental measurements themselves could increase privacy risk. They analyzed raw measurements of chemicals in household air and dust and found that a participant’s region of residence could be inferred from the raw data with 80-98% accuracy.
Although sharing data has many benefits, the use of multiple data types provides more opportunities for research data to be matched with commercial or public databases, which increases participants' vulnerability to re-identification. According to the authors, these findings reinforce the need for scientists to develop more explicit informed consent documents and to identify the types of data that should be excluded from public sharing.
Citation: Boronow KE, Perovich LJ, Sweeney L, Yoo JS, Rudel RA, Brown P, Brody JG. 2020. Privacy risks of sharing data from environmental health studies. Environ Health Perspect 128(1):17008.