Shyamal D. Peddada, Ph.D.

Biostatistics Branch

Principal Investigator
Tel (919) 541-1122
Fax (919) 541-4311
peddada@niehs.nih.gov
P.O. Box 12233
Mail Drop A3-03
Research Triangle Park, North Carolina 27709

Research Summary

The research program, headed by Shyamal Peddada, Ph.D., has two major components, namely, collaborative and methodological research. The statistical methods developed in this program have broad applications, including the following examples:

  • Cell-cycle/circadian clock gene expression studies
  • Dose-response and time-course studies
  • High dimensional data including high throughput screening (HTS) assays
  • Standard two-year rodent cancer bioassay

Collaborative research: This research program includes a wide range of collaborations with epidemiologists, pathologists, toxicologists, microbiologists and others. Examples include a study of the growth of fibroids in pre-menopausal women (Dr. Peddada is the P.I. on this project), epigenetics, and the analysis of toxicogenomic data to understand the molecular characteristics of chemically induced tumors. In recent years there has been considerable interest in understanding the human internal microbial environment [e.g. NIH Human Microbiome Project http://www.hmpdacc.org/] and its impact on human health. Gut microbes play an important role during the early days after birth. For example, they are involved in the development of oral tolerance (i.e. the ability of the immune system to recognize substances consumed orally and weaken or suppress the immune response to them), maturation of the immune system, and regulation of intestinal angiogenesis and stress responses. As society becomes more hygienic and as the rates of cesarean delivery and antibiotic use increase, there is a potential for disruption of the natural colonization of the infant gut by microflora. This may result in an increased risk of immune-related diseases in childhood. Dr. Peddada is collaborating with Dr. Merete Eggesbo (Norwegian Institute of Public Health) to understand the evolving composition of gut microflora during the infancy of a healthy child and how it relates to various health outcomes later in childhood.

Methodological research: The methodological research component of this program is largely motivated by applications in environmental health. Methods developed in this program exploit the underlying structure offered by the scientific question and the study design. Researchers are often interested in drawing inferences on unknown population parameters (or probability distributions) when the parameters (or probability distributions) are constrained by inequalities. For example, a cancer biologist may be interested in understanding changes in gene expression over “ordered conditions” such as different doses of a chemical, different durations of exposure, or tumor stages. In some instances the inequality constraints arise naturally on a unit circle instead of in p-dimensional Euclidean space. For instance, cell-cycle experiments are routinely conducted to determine, among other things, the phase angle associated with each cell-cycle gene. In this case the parameter space is described by points on a unit circle. Based on the available literature and the known biological functions of cell-cycle genes, one may expect an (isotropic) order among the phase angles around the unit circle. In this research program, Dr. Peddada is developing methods for analyzing data that exploit such inequalities or orderings. Nonparametric methods for analyzing ordered multivariate data are also being developed. Because they use the known constraints, the resulting methods are often more powerful and efficient than standard methods that ignore them.
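
As a minimal illustration of how an order constraint can be exploited, the sketch below applies the classical pool-adjacent-violators algorithm to make a set of dose-group means non-decreasing. This is a textbook technique shown for orientation only, not the methodology developed in this program; the function name pava and the example values are hypothetical.

```python
def pava(values, weights=None):
    """Weighted least-squares fit of a non-decreasing sequence to `values`."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool adjacent blocks while they violate the order constraint.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the pooled blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical mean responses at four increasing doses.
dose_means = [1.0, 3.0, 2.0, 4.0]
print(pava(dose_means))  # [1.0, 2.5, 2.5, 4.0]
```

The second and third groups violate the assumed order, so they are pooled to their average, while groups that already respect the order are left unchanged.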

Methods are also being developed in this program for analyzing high dimensional data, such as those arising in genomic studies (e.g. gene expression, CpG methylation) and toxicology. For example, toxicologists interested in studying the effects of a toxin on an animal’s genome conduct dose-response microarray studies to compare dose groups in terms of the expression of thousands of genes, which requires a large number of simultaneous hypothesis tests, a problem known as multiple testing. Quantitative high throughput screening (qHTS) assays are becoming popular among toxicologists and pharmacologists for screening thousands of compounds inexpensively. Analysis of such assays presents numerous challenges because it relies on nonlinear statistical models such as the Hill model. The asymptotic p-values obtained from such analyses are not necessarily accurate for small tail probabilities (which are important for multiple testing), and resampling methods are computationally very expensive to implement. In this program, Dr. Peddada is developing methods for analyzing such complex data.
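
For readers unfamiliar with the Hill model mentioned above, the sketch below evaluates one common four-parameter form of the curve. The parameterization and the names bottom, top, ec50 and slope are assumptions chosen for illustration; the text does not specify which form is used in this program.

```python
def hill(conc, bottom, top, ec50, slope):
    """Response at concentration `conc` under a four-parameter Hill curve.

    Assumed form: bottom + (top - bottom) / (1 + (ec50 / conc) ** slope),
    with ec50 > 0 and slope > 0.
    """
    if conc <= 0:
        # Limit of the curve as the concentration goes to zero (slope > 0).
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)


# Hypothetical compound: response rises from 0 to 100, half-maximal at 10.
print(hill(10.0, 0.0, 100.0, 10.0, 2.0))  # 50.0
```

The response is nonlinear in ec50 and slope, which is why fitting such curves requires iterative estimation and why the accuracy of asymptotic p-values is a concern.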

Software

The following software was developed in this research program and is freely available to download.

  • R Code for Estimating the Global Relative Order of Peak Expression Satisfied by a Set of Oscillatory Genes (28KB)
    (Programmed by Ms. Sandra Barragán, University of Valladolid, Spain):
    For a given collection of oscillatory genes (e.g. cell-cycle genes or circadian clock genes) with phase angles estimated from multiple experiments, this software estimates the relative order of peak expression among the genes. It contains two functions written in R: Aggregation of Circular Orders (ACO), which is based on a solution to the traveling salesman problem, and the Circular Local Minimization (CLM) algorithm, which is used to smooth the solution obtained from ACO. To run these programs the user should first download the companion R package isocir from CRAN: http://cran.r-project.org/web/packages/isocir/index.html

  • Circular FSA
    (Programmed by Prof. Miguel Fernandez and Ms. Sandra Barragán, University of Valladolid, Spain):
    For a given set of estimates of angular parameters, this software can be used for testing whether the corresponding angular parameters satisfy a pre-specified order around a unit circle.
    References:
    • Fernandez M, Rueda C, Peddada SD* (2012) Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40,2823-32. Epub 2011/12/03. doi: 10.1093/nar/gkr1077.
    • Rueda C, Fernandez M, Peddada SD* (2009). Estimation of parameters subject to order restriction on a circle with application to estimation of phase angles of cell-cycle genes. J. Amer. Statist. Assoc., 104, 338-347.

  • R code for fitting Random Periods Model
    (Programmed by Mr. Sai Cheemalapati, High School Summer Intern):
    For given periodic time-course data (e.g. expression of a cell-cycle gene or a circadian clock gene), this program fits the nonlinear random periods model (RPM) and estimates all parameters of the model, namely, the intercept, slope, amplitude, phase, period and attenuation. It also provides the information matrix associated with the estimates.
    Reference:
    • Liu D, Umbach D, Peddada SD, Li L, Crockett P, Weinberg C (2004). A Random-Periods Model for Expression of Cell-Cycle Genes. Proceedings of the National Academy of Sciences, 101, No. 19, 7240-7245.

  • ORIOGEN 4.01 - Order Restricted Inference for Ordered Gene Expression and Multiple Pairwise Comparisons:
    This software is designed for comparing two or more experimental groups. Two options are available within the software. The first is used for analyzing ordered experimental conditions (e.g. time, dose, tumor stages, etc.). Under this option, the software can handle the independent sample case as well as the dependent sample case (e.g. repeated measurements). The residual bootstrap methodology used in this software is robust to any underlying dependence structure. The method controls the false discovery rate (FDR) at the desired level. The second option is suitable for pairwise comparisons and is not limited to ordered experimental conditions. Thus, for any given design, the second option allows one to make all desired pairwise comparisons among the experimental groups. In addition, it allows one to make directional inferences (such as identifying up- or down-regulated genes). The method controls the overall mixed directional false discovery rate (mdFDR).
    References:
    • Guo W, Peddada SD* (2008). Adaptive Choice of the Number of Bootstrap Samples in Large Scale Multiple Testing. Statistical Applications in Genetics and Molecular Biology, 7 (1), Art. 13.
    • Peddada SD*, Harris S, Zajd J, Harvey E (2005). ORIOGEN: Order Restricted Inference for Ordered Gene Expression data. Bioinformatics, 21, 3933-3934.
    • Peddada SD*, Lobenhofer L, Li L, Afshari C, Weinberg C, Umbach D (2003). Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference. Bioinformatics, 19, 834-841.

  • Analysis of Compositional Microbiome (ANCOM) data (10KB)
    This software is designed for analyzing microbiome data. The methodology compares the total abundance of taxa between two populations.
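
The Circular FSA entry above concerns whether angular parameters satisfy a pre-specified order around the unit circle. As a rough illustration of what such an order means (this is a plain deterministic check, not the statistical test implemented in that software), the sketch below verifies that a set of phase angles is met in a given sequence when travelling counter-clockwise from the first gene. The function name and the gene names are hypothetical.

```python
import math


def follows_circular_order(angles, order):
    """Return True if the genes in `order` are met in that sequence when
    moving counter-clockwise around the circle from the first gene.

    `angles` maps gene name -> phase angle in radians.
    """
    two_pi = 2.0 * math.pi
    start = angles[order[0]]
    # Counter-clockwise offset of each gene from the starting gene, in [0, 2*pi).
    offsets = [(angles[gene] - start) % two_pi for gene in order]
    return all(a <= b for a, b in zip(offsets, offsets[1:]))


# Hypothetical phase angles (radians); geneA peaks just before the 0 / 2*pi wrap.
phases = {"geneA": 5.8, "geneB": 0.4, "geneC": 2.1}
print(follows_circular_order(phases, ["geneA", "geneB", "geneC"]))  # True
print(follows_circular_order(phases, ["geneA", "geneC", "geneB"]))  # False
```

Measuring offsets from the first gene handles the wrap-around at 2π, which is the feature that distinguishes a circular order from an ordinary order on the real line.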



Selected Publications

  1. Joubert BR, Håberg SE, Bell DA, Nilsen RM, Vollset SE, Midttun O, Ueland PM, Wu MC, Nystad W, Peddada SD, London SJ (2014). Maternal smoking and DNA methylation in newborns: in utero effect or epigenetic inheritance? Cancer Epidemiology, Biomarkers & Prevention 23(6):1007-1017.
  2. Davidov O, Peddada S (2013). Testing for the Multivariate Stochastic Order among Ordered Experimental Groups with Application to Dose-Response Studies. Biometrics 69(4):982-990.
  3. White RA, Bjørnholt J, Baird DD, Midtvedt T, Harris JR, Pagano M, Hide W, Rudi K, Moen B, Iszatt N, Peddada SD, Eggesbø M (2013). Novel Developmental Analyses Identify Longitudinal Patterns of Early Gut Microbiota that Affect Infant Growth. PLoS Computational Biology 9(5):e1003042.
  4. Davidov O, Peddada SD (2013). The linear stochastic order and directed inference for multivariate ordered distributions. Annals of Statistics 41(1):1-40.
  5. Lim C, Sen PK, Peddada SD (2013). Robust Analysis of High Throughput Screening (HTS) Assay Data. Technometrics 55(2):150-160.
  6. Fernandez M, Rueda C, Peddada SD (2012). Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40, 2823-32, doi: 10.1093/nar/gkr1077.
  7. Davidov O, Peddada SD (2011). Order restricted inference for multivariate binary data with application to toxicology. J. Amer. Statist. Assoc., 106, 1394-1404.
  8. Guo W, Sarkar SK, Peddada SD (2010). Controlling False Discoveries in Multidimensional Directional Decisions, with Applications to Gene Expression Data on Ordered Categories. Biometrics, 66, 485-492.
  9. Peddada SD, Dinse G, Kissling G (2007). Incorporating Historical Control Data When Comparing Tumor Incidence Rates. J. Amer. Statist. Assoc., 102, 1212-1220.
  10. Peddada SD, Lobenhofer L, Li L, Afshari C, Weinberg C, Umbach D (2003). Gene Selection and Clustering for Time-course and Dose-response Microarray Experiments using Order-restricted Inference. Bioinformatics, 19, 834-841.
  11. Hwang JTG and Peddada SD (1994). Confidence Interval Estimation Subject to Order Restrictions. Annals of Statistics, 22, 67-93.
