Broadly Applicable Statistical Methods
Biostatistics & Computational Biology Branch
Examples of ongoing projects include:
Order-restricted inference: Methods of analysis can often be made more powerful by exploiting the underlying structure implied by the scientific question and the study design. For example, a cancer biologist may be interested in understanding changes in gene expression over “ordered conditions,” such as increasing doses of a chemical, increasing durations of exposure, or successive tumor stages. Such orderings translate into inequality constraints on the parameters of interest. In other instances the inequality constraints arise naturally on a unit circle rather than in p-dimensional Euclidean space. For instance, cell-cycle experiments based on temporal patterns of gene expression can estimate, among other things, the phase angle associated with peak expression of each cell-cycle gene. These phase angles can be represented as points on a unit circle. Based on available literature and known biological functions of cell-cycle genes, one may expect the phase angles to follow a certain directional ordering. Members of the Branch are developing analytic methods that exploit such ordering. Nonparametric methods for analyzing ordered multivariate data are also being developed.
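As a rough illustration of how an ordering can be exploited in the Euclidean case, the sketch below fits a nondecreasing sequence to a set of dose-group means using the classical pool-adjacent-violators algorithm (isotonic regression). It is a textbook technique chosen for illustration, not necessarily one of the methods the Branch is developing, and it does not address the circular (phase-angle) setting. The function name `pava` and the example values are hypothetical.

```python
def pava(y, w=None):
    """Weighted least-squares nondecreasing fit (pool-adjacent-violators).

    y: observed values listed in the assumed order (e.g. mean response per dose).
    w: optional positive weights (e.g. group sizes); defaults to equal weights.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards for as long as adjacent blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the pooled blocks back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Made-up dose-group means that are expected to be nondecreasing in dose.
dose_means = [1.0, 3.0, 2.0, 4.0, 3.5]
fitted = pava(dose_means)
```

Adjacent groups that violate the assumed order are pooled and replaced by their weighted average, which is how the constraint lets neighboring groups borrow strength from one another.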
High-dimensional data analysis: Methods are also being developed for analyzing high-dimensional data, such as those arising in genomic studies (e.g., gene expression, CpG methylation) and toxicology. For example, toxicologists studying the effects of a toxicant on an animal’s genome conduct dose-response microarray studies that compare dose groups in terms of the expression of thousands of genes, resulting in a large number of statistical tests. Quantitative high-throughput screening (qHTS) assays are being developed by toxicologists and pharmacologists to screen thousands of compounds efficiently, informatively, and inexpensively. Analyzing the resulting datasets presents numerous challenges because the analysis relies on nonlinear statistical models, such as the Hill model, and the asymptotic p-values obtained from such analyses are not necessarily reliable. Members of the Branch are developing methods for analyzing such complex data.
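For readers unfamiliar with the Hill model, the sketch below evaluates one common four-parameter form of the curve and scores candidate parameters by residual sum of squares. The parameterization, the names (`hill`, `rss`, `grid_fit`, `ac50`, `slope`), and the coarse grid search are assumptions made for illustration; the grid search is a simple stand-in for a proper nonlinear least-squares fit and is not the Branch's method. The data are synthetic.

```python
def hill(dose, bottom, top, ac50, slope):
    """Four-parameter Hill curve; the response is halfway between bottom and top at dose == ac50."""
    if dose <= 0:
        return bottom
    d = dose ** slope
    return bottom + (top - bottom) * d / (ac50 ** slope + d)


def rss(doses, responses, bottom, top, ac50, slope):
    """Residual sum of squares of the Hill curve against observed responses."""
    return sum((r - hill(d, bottom, top, ac50, slope)) ** 2
               for d, r in zip(doses, responses))


def grid_fit(doses, responses, bottom, top, ac50_grid, slope_grid):
    """Return (rss, ac50, slope) for the best candidate on a coarse grid.

    bottom and top are held fixed here purely to keep the sketch short.
    """
    return min((rss(doses, responses, bottom, top, a, s), a, s)
               for a in ac50_grid for s in slope_grid)


# Synthetic, noise-free responses generated from known parameters.
doses = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
responses = [hill(d, 0.0, 100.0, 3.0, 1.5) for d in doses]
best_rss, best_ac50, best_slope = grid_fit(
    doses, responses, 0.0, 100.0,
    ac50_grid=[0.3, 1.0, 3.0, 10.0],
    slope_grid=[0.5, 1.0, 1.5, 2.0],
)
```

Because the curve is nonlinear in `ac50` and `slope`, the fitting criterion can be flat or have several local minima when the tested doses do not cover both plateaus, which is one reason inference based on large-sample approximations can be unreliable in this setting.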