Shyamal D. Peddada, Ph.D.
Biostatistics Branch
Principal Investigator
Tel (919) 541-1122
Fax (919) 541-4311
peddada@niehs.nih.gov
Curriculum Vitae (111KB)
P.O. Box 12233
Mail Drop A3-03
Research Triangle Park, North Carolina 27709
Research Summary
The research program, headed by Shyamal Peddada, Ph.D., has two major components, namely, collaborative and methodological research. The statistical methods developed in this program have broad applications, including the following examples:
- Cell-cycle/circadian clock gene expression studies
- Dose-response and time-course studies
- High dimensional data including high throughput screening (HTS) assays
- Standard two-year rodent cancer bioassay
Collaborative research: This research program includes a wide range of collaborations with epidemiologists, pathologists, toxicologists, microbiologists and others. Examples include a study of the growth of fibroids in premenopausal women (Dr. Peddada is the P.I. on this project), epigenetics, and the analysis of toxicogenomic data to understand the molecular characteristics of chemically induced tumors. In recent years there has been considerable interest in understanding the human internal microbial environment (e.g. the NIH Human Microbiome Project, http://www.hmpdacc.org/) and its impact on human health. Gut microbes play an important role during the first days after birth. For example, they are involved in the development of oral tolerance (i.e. the ability of the immune system to recognize substances consumed orally and to weaken or suppress the immune response to them), maturation of the immune system, regulation of intestinal angiogenesis and stress responses. As society becomes more hygienic and as the rates of cesarean delivery and antibiotic use increase, the natural colonization of the infant gut by microflora may be disrupted. This may in turn increase the risk of immune-related diseases in childhood. Dr. Peddada is collaborating with Dr. Merete Eggesbo (Norwegian Institute of Public Health) to understand how the composition of gut microflora evolves during the infancy of healthy children and how it relates to various health outcomes later in childhood.
Methodological research: The methodological research component of this program is largely motivated by applications in environmental health. Methods developed in this program exploit the underlying structure offered by the scientific question and the study design. Researchers are often interested in drawing inferences on unknown population parameters (or probability distributions) when the parameters (or probability distributions) are constrained by inequalities. For example, a cancer biologist may be interested in understanding changes in gene expression over “ordered conditions” such as different doses of a chemical, different durations of exposure, or tumor stages. In some instances the inequality constraints arise naturally on a unit circle instead of in p-dimensional Euclidean space. For instance, cell-cycle experiments are routinely conducted to determine, among other things, the phase angle associated with each cell-cycle gene, so the parameter space is described by points on a unit circle. Based on the available literature and the known biological functions of cell-cycle genes, one may expect an (isotonic) order among the phase angles around the unit circle. In this research program, Dr. Peddada is developing methods for analyzing data that exploit such inequalities or orderings. Nonparametric methods for analyzing ordered multivariate data are also being developed. When the assumed order holds, the resulting methods are often more powerful and efficient than standard methods that ignore it.
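As a small illustration of estimation under an inequality constraint, the sketch below fits nondecreasing means to group averages using the classical pool-adjacent-violators algorithm (PAVA). It is a textbook building block for a simple order in Euclidean space, not the circular or multivariate methods developed in this program; the function name `pava` and the example numbers are hypothetical.

```python
def pava(values, weights=None):
    """Weighted least-squares fit that is nondecreasing in index order.

    Textbook pool-adjacent-violators algorithm for the simple order
    mu_1 <= mu_2 <= ... <= mu_k (illustrative only).
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool the last two blocks while they violate the assumed order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the pooled blocks back to one fitted value per input.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical mean expression at four increasing doses; doses 2 and 3
# violate the assumed order and are pooled to their average.
print(pava([1.0, 3.0, 2.0, 4.0]))
```

Pooling is what makes the constrained estimate differ from the raw group means: adjacent groups that contradict the assumed order borrow strength from one another.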
Methods are also being developed in this program for analyzing high dimensional data, such as those arising in genomic studies (e.g. gene expression, CpG methylation) and toxicology. For example, toxicologists studying the effects of a toxin on an animal’s genome conduct dose-response microarray studies to compare dose groups in terms of the expression of thousands of genes. This requires a large number of simultaneous hypothesis tests, a problem known as multiple testing. Quantitative high throughput screening (qHTS) assays are becoming popular among toxicologists and pharmacologists for screening thousands of compounds inexpensively. Analysis of such assays presents numerous challenges because it relies on nonlinear statistical models such as the Hill model. The asymptotic p-values obtained from such analyses are not necessarily accurate for small tail probabilities, which are the ones that matter for multiple testing, and resampling methods are computationally very expensive to implement. In this program, Dr. Peddada is developing methods for analyzing such complex data.
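For readers unfamiliar with the Hill model, the sketch below evaluates one common four-parameter form of the curve. The parameterization and the names `bottom`, `top`, `ac50` and `slope` are assumptions made for illustration; the source does not state which form is used in this program, and fitting the curve to data is not shown.

```python
def hill_response(conc, bottom, top, ac50, slope):
    """Four-parameter Hill curve (one common parameterization, assumed here).

    conc   : concentration, >= 0
    bottom : response at zero concentration
    top    : response as concentration grows without bound
    ac50   : concentration giving the half-maximal response, > 0
    slope  : Hill coefficient controlling steepness, > 0
    """
    scaled = conc ** slope
    return bottom + (top - bottom) * scaled / (ac50 ** slope + scaled)


# Hypothetical compound: response rises from 0 to 100 with AC50 = 2.0.
for conc in (0.0, 0.5, 2.0, 8.0):
    print(conc, hill_response(conc, bottom=0.0, top=100.0, ac50=2.0, slope=1.5))
```

The response is nonlinear in `ac50` and `slope`, which is why estimation for each compound requires iterative nonlinear fitting.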
Software
The following software was developed in this research program and is freely available to download.
- Circular FSA (Programmed by Prof. Miguel Fernandez and Ms. Sandra Barragán, University of Valladolid, Spain): For a given set of estimates of angular parameters, this software can be used for testing whether the corresponding angular parameters satisfy a pre-specified order around a unit circle.
References:
- Fernandez M, Rueda C, Peddada SD* (2012). Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40, 2823-32. Epub 2011/12/03. doi: 10.1093/nar/gkr1077.
- Rueda C, Fernandez M, Peddada SD* (2009). Estimation of parameters subject to order restriction on a circle with application to estimation of phase angles of cell-cycle genes. J. Amer. Statist. Assoc., 104, 338-347.
- R code for fitting Random Periods Model (Programmed by Mr. Sai Cheemalapati, High School Summer Intern): For given periodic time-course data (e.g. expression of a cell-cycle gene or a circadian clock gene), this program fits the nonlinear random periods model (RPM) and estimates all parameters of the model, namely, the intercept, slope, amplitude, phase, period and attenuation. It also provides the information matrix associated with the estimates.
Reference:
- Liu D, Umbach D, Peddada SD, Li L, Crockett P, Weinberg C (2004). A Random-Periods Model for Expression of Cell-Cycle Genes. Proceedings of the National Academy of Sciences, 101, No. 19, 7240-7245.
- ORIOGEN 4.01 - Order Restricted Inference for Ordered Gene Expression and Multiple Pairwise Comparisons: This software is designed for comparing two or more experimental groups. Two options are available within the software. The first is used for analyzing ordered experimental conditions (e.g. time, dose, tumor stages). Under this option, the software can handle independent samples as well as dependent samples (e.g. repeated measurements). The residual bootstrap methodology used in this software is robust to any underlying dependence structure, and the method controls the false discovery rate (FDR) at the desired level. The second option is suitable for pairwise comparisons and is not limited to ordered experimental conditions. Thus, for any given design, the second option allows one to make all desired pairwise comparisons among the experimental groups. In addition, it allows one to make directional inferences (such as identifying up- or down-regulated genes). The method controls the overall mixed directional false discovery rate (mdFDR).
References:
- Guo W, Peddada SD* (2008). Adaptive Choice of the Number of Bootstrap Samples in Large Scale Multiple Testing. Statistical Applications in Genetics and Molecular Biology, 7 (1), Art. 13.
- Peddada SD*, Harris S, Zajd J, Harvey E (2005). ORIOGEN: Order Restricted Inference for Ordered Gene Expression data. Bioinformatics, 21, 3933-3934.
- Peddada SD*, Lobenhofer L, Li L, Afshari C, Weinberg C, Umbach D (2003). Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference. Bioinformatics, 19, 834-841.
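To give a feel for the parameters reported by the random periods model code listed above, the sketch below averages a cosine over a spread of periods and adds a linear trend. It assumes periods vary lognormally across cells and treats `attenuation` as the log-scale spread; this parameterization, the function name and the example values are illustrative assumptions and may differ from the published model and from the R code.

```python
import math


def random_periods_curve(t, intercept, slope, amplitude, phase, period,
                         attenuation, n_grid=201):
    """Mean expression at time t when cells cycle with different periods.

    Assumed form (illustrative): linear trend plus a cosine averaged over
    periods period * exp(attenuation * z), with z standard normal.
    """
    # Evenly spaced z values on [-5, 5] with normalized normal weights.
    zs = [-5.0 + 10.0 * i / (n_grid - 1) for i in range(n_grid)]
    ws = [math.exp(-0.5 * z * z) for z in zs]
    total = sum(ws)
    oscillation = sum(
        w * math.cos(2.0 * math.pi * t / (period * math.exp(attenuation * z)) + phase)
        for z, w in zip(zs, ws)
    ) / total
    return intercept + slope * t + amplitude * oscillation


# Hypothetical gene: period 10 time units, peak at t = 0.
# With attenuation = 0 the peak recurs at full height; with spread it fades.
print(random_periods_curve(30.0, 1.0, 0.0, 2.0, 0.0, 10.0, 0.0))
print(random_periods_curve(30.0, 1.0, 0.0, 2.0, 0.0, 10.0, 0.2))
```

Under this assumed form, cells that start in synchrony drift apart over time, so later peaks in the averaged signal are lower even though each individual cell oscillates with constant amplitude.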
Selected Publications
- Davidov O, Peddada SD (2013). The linear stochastic order and directed inference for multivariate ordered distributions. Annals of Statistics, 41(1), 1-40. [Abstract (http://www.ncbi.nlm.nih.gov/pubmed/23543786?dopt=Abstract) ]
- Lim C, Sen PK, Peddada SD (2012). Robust analysis of high throughput screening assays. Technometrics, in press.
- Mandal S, Sen PK and Peddada SD (2012). A hierarchical functional data analytic approach for analyzing physiologically based pharmacokinetic models. Environmetrics, in press.
- Fernandez M, Rueda C, Peddada SD (2012). Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40, 2823-32, doi: 10.1093/nar/gkr1077. [Abstract (http://www.ncbi.nlm.nih.gov/pubmed/22135306) ]
- Davidov O, Peddada SD (2011). Order restricted inference for multivariate binary data with application to toxicology. J. Amer. Statist. Assoc., 106, 1394-1404.
- Guo W, Sarkar SK, Peddada SD (2010). Controlling False Discoveries in Multidimensional Directional Decisions, with Applications to Gene Expression Data on Ordered Categories. Biometrics, 66, 485-492. [Abstract (http://www.ncbi.nlm.nih.gov/pubmed/19645703) ]
- Rueda C, Fernandez M, Peddada SD (2009). Estimation of Parameters Subject to Order Restriction on a Circle with Application to Estimation of Phase Angles of Cell-Cycle Genes. J. Amer. Statist. Assoc., 104, 338-347.
- Peddada SD, Dinse G and Kissling G (2007). Incorporating Historical Control Data When Comparing Tumor Incidence Rates. J. Amer. Statist. Assoc., 102, 1212-1220.
- Peddada SD, Dunson DB, Tan X (2005). Estimation of order-restricted means from correlated data. Biometrika, 92, 703-715.
- Liu D, Umbach D, Peddada SD, Li L, Crockett P and Weinberg C. (2004). A Random-Periods Model for Expression of Cell-Cycle Genes. Proc. of National Academy of Sci., 101, No. 19, 7240-7245.
- Peddada SD, Lobenhofer L, Li L, Afshari C, Weinberg C and Umbach D. (2003). Gene Selection and Clustering for Time-course and Dose-response Microarray Experiments using Order-restricted Inference. Bioinformatics, 19, 834-841. [Abstract (http://www.ncbi.nlm.nih.gov/pubmed/12724293) ]
- Peddada SD and Chang TC (1996). Bootstrap Confidence Region Estimation of the Motion of Rigid Bodies. J. Amer. Statist. Assoc., 91, 231-241.
- Hwang JTG and Peddada SD (1994). Confidence Interval Estimation Subject to Order Restrictions. Annals of Statistics, 22, 67-93.
