Shyamal D. Peddada, Ph.D.

Biostatistics & Computational Biology Branch

Acting Branch Chief, Biostatistics & Computational Biology Branch and Principal Investigator
Tel (919) 541-1122
Fax (919) 541-4311
P.O. Box 12233
Mail Drop A3-03
Research Triangle Park, NC 27709

Research Summary

Shyamal Peddada, Ph.D., is the Acting Branch Chief and Principal Investigator of the Biostatistics and Computational Biology Branch and holds a secondary appointment in the NIEHS Epidemiology Branch.

We develop broadly applicable statistical methods motivated by applications in the environmental sciences. Constraints arise naturally in many scientific problems. For example, in dose-response studies the mean response is often expected to be monotonic. Similarly, the mean expression of differentially expressed genes may not vary arbitrarily across ordered conditions (e.g., stages of cancer) but may instead follow a systematic pattern over those conditions. These patterns can be represented nonparametrically using mathematical inequalities. In some cases, the parameters (and the data) may be constrained to a pyramid (or “simplex”), as with microbiome data, or to a circle, as with circadian clock or cell-cycle gene expression data. Ignoring such additional information may not only result in inefficiency (i.e., loss of power or larger sample size requirements) but, more importantly, may lead to misinterpretation of the scientific data. We develop statistical methods that exploit the underlying constraints on the biological parameters; consequently, our methods are both more efficient and provide better interpretation of the data. Our group also develops methods suitable for analyzing high-dimensional data from genomics (e.g., microarray, RNA-Seq), quantitative high-throughput screening (qHTS) assays, and microbiome studies. Software developed by the group is freely accessible and is described below.
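As an illustration of the monotone dose-response constraint mentioned above, the following is a minimal sketch of the classical pool-adjacent-violators algorithm, which projects a sequence of group means onto the set of nondecreasing sequences. It is a generic textbook procedure shown for orientation only, not the group's software; the function name `pava` and the example numbers are hypothetical.

```python
def pava(means, weights=None):
    """Weighted least-squares fit of a nondecreasing sequence to `means`.

    Hypothetical helper for illustration: `means` are observed group means
    at increasing dose levels, `weights` are typically the group sizes.
    """
    if weights is None:
        weights = [1.0] * len(means)
    blocks = []  # each block holds [pooled value, total weight, number of groups]
    for m, w in zip(means, weights):
        blocks.append([float(m), float(w), 1])
        # Pool neighbouring blocks while they violate the monotone order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / total, total, n1 + n2])
    fitted = []
    for value, _, count in blocks:
        fitted.extend([value] * count)
    return fitted


if __name__ == "__main__":
    # Made-up dose-group means with one order violation (3.0 followed by 2.0).
    print(pava([1.0, 3.0, 2.0, 4.0]))
```

With the made-up means above, the two violating groups are pooled to their average, giving `[1.0, 2.5, 2.5, 4.0]`. Estimates of this kind respect the expected ordering, which is the source of the efficiency gain described in the paragraph.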


The following software was developed in this research program and is freely available to download.

  • Analysis of Composition of Microbiomes (ANCOM) - II: Comparison of relative abundances in the presence of structural zeros
    (R-code developed by Dr. Abhishek Kaul, Biostatistics and Computational Biology Branch, NIEHS).
    The original ANCOM software (Mandal et al., 2015) was designed to compare the mean abundance (at the ecosystem level) of individual taxa in two or more populations. This R package (ANCOM-II) is designed to compare the mean relative abundance (at the ecosystem level) of individual taxa in two or more populations. Thus, users interested in comparing mean taxa abundance between two or more populations at the ecosystem level should use ANCOM (described in the next bullet), whereas users interested in comparing mean relative abundance at the ecosystem level may use ANCOM-II.

    ANCOM-II is designed to handle repeated measurements, model covariates, test for trends over experimental groups (e.g., time, dose, or other ordered conditions), and deal with zero counts without assigning arbitrary pseudo counts. It is based on a residual bootstrap methodology and hence allows for heteroscedasticity (i.e., unequal variances among groups) and makes minimal distributional assumptions.

  • Analysis of Compositional Microbiomes (ANCOM) data
    (R-code developed by Dr. Siddhartha Mandal, Norwegian Institute of Public Health and the Shiny app developed by Dr. Casey Jelsema, Biostatistics and Computational Biology Branch, NIEHS)
    This R package is designed for comparing the abundance of individual taxa in two populations using log-ratios of abundance. The software is based on the ANCOM methodology developed in Mandal et al. (2015).
    • Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, and Peddada SD (2015). Analysis of composition of microbiomes: a novel method for studying microbial composition. Microbial Ecology in Health and Disease, 26, 1 – 7.

  • R Code for Estimating the Global Relative Order of Peak Expression Satisfied by a Set of Oscillatory Genes (28KB)
    (Programmed by Ms. Sandra Barragán, University of Valladolid, Spain):
    For a given collection of oscillatory genes (e.g., cell-cycle genes or circadian clock genes) with phase angles estimated from multiple experiments, this software estimates the relative order of peak expression among the genes. It contains two functions written in R: Aggregation of Circular Orders (ACO), which is based on a solution to the traveling salesman problem, and the Circular Local Minimization (CLM) algorithm, which is used to smooth the solution obtained from ACO. To run these programs, the user should first download the companion R package isocir from CRAN.

  • Circular FSA
    (Programmed by Prof. Miguel Fernandez and Ms. Sandra Barragán, University of Valladolid, Spain):
    For a given set of estimates of angular parameters, this software can be used for testing whether the corresponding angular parameters satisfy a pre-specified order around a unit circle.
    • Fernandez M, Rueda C, Peddada SD* (2012). Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40, 2823-32. Epub 2011/12/03. doi: 10.1093/nar/gkr1077.
    • Rueda C, Fernandez M, Peddada SD* (2009). Estimation of parameters subject to order restriction on a circle with application to estimation of phase angles of cell-cycle genes. J. Amer. Statist. Assoc., 104, 338-347.

  • R code for fitting Random Periods Model
    (Programmed by Mr. Sai Cheemalapati, High School Summer Intern):
    For given periodic time-course data (e.g., expression of a cell-cycle gene or a circadian clock gene), this program fits the nonlinear random periods model (RPM) and estimates all parameters of the model, namely the intercept, slope, amplitude, phase, period, and attenuation. It also provides the information matrix associated with the estimates.
    • Liu D, Umbach D, Peddada SD, Li L, Crockett P, Weinberg C (2004). A Random-Periods Model for Expression of Cell-Cycle Genes. Proceedings of National Academy of Sciences, 101, No. 19, 7240-7245.

  • ORIOGEN 4.01 - Order Restricted Inference for Ordered Gene Expression and Multiple Pairwise Comparisons:
    This software is designed for comparing two or more experimental groups. Two options are available within the software. The first is used for analyzing ordered experimental conditions (e.g., time, dose, tumor stages). Under this option, the software can handle independent samples as well as dependent samples (e.g., repeated measurements). The residual bootstrap methodology used in this software is robust to any underlying dependence structure, and the method controls the false discovery rate (FDR) at the desired level. The second option is suitable for pairwise comparisons and is not limited to ordered experimental conditions. Thus, for any given design, the second option allows one to make all desired pairwise comparisons among the experimental groups. In addition, it allows one to make directional inferences (such as identifying up- or down-regulated genes). The method controls the overall mixed directional false discovery rate (mdFDR).
    • Guo W, Peddada SD* (2008). Adaptive Choice of the Number of Bootstrap Samples in Large Scale Multiple Testing. Statistical Applications in Genetics and Molecular Biology, 7 (1), Art. 13.
    • Peddada SD*, Harris S, Zajd J, Harvey E (2005). ORIOGEN: Order Restricted Inference for Ordered Gene Expression data. Bioinformatics, 21, 3933-3934.
    • Peddada SD*, Lobenhofer L, Li L, Afshari C, Weinberg C, Umbach D (2003). Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference. Bioinformatics, 19, 834-841.

  • Constrained Linear Mixed Effects (CLME) for analyzing mixed and fixed models under inequality constraints.
    (Programmed by Dr. Casey M. Jelsema, Research Fellow, Biostatistics Branch, NIEHS):
    In many applications, such as dose-response studies or time-course experiments, researchers are interested in testing for specific inequality constraints or patterns among the means of experimental groups. This R package is designed to test for such inequality patterns using a robust residual bootstrap methodology that does not require the data to be normally distributed. Furthermore, the software can handle situations in which covariates and/or random effects are present. Thus, for example, the package can be used for repeated measurement designs with covariates. The package comes with a user-friendly graphical interface, so no programming is necessary to run it. The user only needs to specify the input data source and select options from the interface.
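Several of the packages above rely on a residual bootstrap to test for ordered patterns among group means. The sketch below shows the general idea under simplifying assumptions (independent samples, no covariates or random effects, a simple-order alternative). It is not the algorithm implemented in CLME, ORIOGEN, or ANCOM-II; the function names, the test statistic, and the example data are all hypothetical choices made for illustration.

```python
import random


def isotonic_fit(means, weights):
    """Pool-adjacent-violators fit of nondecreasing values to `means`."""
    blocks = []  # [pooled value, total weight, number of groups]
    for m, w in zip(means, weights):
        blocks.append([float(m), float(w), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / total, total, n1 + n2])
    fitted = []
    for value, _, count in blocks:
        fitted.extend([value] * count)
    return fitted


def trend_statistic(groups):
    """Weighted squared distance between the monotone fit and the grand mean."""
    sizes = [len(g) for g in groups]
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / sum(sizes)
    fit = isotonic_fit(means, sizes)
    return sum(n * (f - grand) ** 2 for n, f in zip(sizes, fit))


def residual_bootstrap_pvalue(groups, n_boot=500, seed=0):
    """Bootstrap p-value for H0: equal means vs. a nondecreasing trend.

    Residuals are resampled within each group, so unequal group variances
    are allowed (an assumption of this sketch, mirroring the text's remark
    on heteroscedasticity).
    """
    rng = random.Random(seed)
    observed = trend_statistic(groups)
    grand = sum(sum(g) for g in groups) / sum(len(g) for g in groups)
    residuals = [[y - sum(g) / len(g) for y in g] for g in groups]
    exceed = 0
    for _ in range(n_boot):
        # Rebuild data under the null: common mean plus resampled residuals.
        boot = [[grand + rng.choice(res) for _r in res] for res in residuals]
        if trend_statistic(boot) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Made-up responses at three increasing dose levels.
    data = [[0.1, -0.2, 0.0, 0.1], [1.0, 1.2, 0.9, 1.1], [2.0, 2.1, 1.9, 2.2]]
    print(residual_bootstrap_pvalue(data))
```

Because the null distribution is built from the data's own residuals rather than from a normal model, a procedure of this kind makes few distributional assumptions, which is the property the package descriptions emphasize.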

Selected Publications

  1. Harvey JB, Hong HH, Bhusari S, Ton TV, Wang Y, Foley JF, Peddada SD, Hooth M, DeVito M, Nyska A, Pandiri AR, Hoenerhoff MJ. F344/NTac Rats Chronically Exposed to Bromodichloroacetic Acid Develop Mammary Adenocarcinomas With Mixed Luminal/Basal Phenotype and Tgfβ Dysregulation. Veterinary pathology 2016 53(1):170-181. [Abstract]
  2. Rueda C, Fernandez MA, Barragan S, Mardia KV, Peddada SD (2016). Circular Piecewise Regression with an Application to Cell-cycle Biology. Biometrics (In press).
  3. Jelsema C, Peddada SD (2016). An R Package for Linear Mixed Effects Models under Inequality Constraints. Journal of Statistical Software (In press).
  4. Reber SO, Siebler PH, Donner NC, Morton JT, Smith DG, Kopelman JM, Lowe KR, Campbell K, Fox JH, Hassell JE, Greenwood BN, Jansch C, Lechner A, Uschold-Schmidt N, Füchsl AM, Langgartner D, Walker FR, Hale MW, Perez GL, Treuren WV, González A, Halweg-Edwards AL, Fleshner M, Raison CL, Rook GA, Peddada SD, Knight R, Lowry CA (2016). Immunization with a heat-killed preparation of the environmental bacterium Mycobacterium vaccae promotes stress resilience in mice. Proc. National Acad. Sci. (In press).
  5. Joubert BR, den Dekker HT, Felix JF, Bohlin J, Ligthart S, Beckett E, Tiemeier H, van Meurs JB, Uitterlinden AG, Hofman A, Håberg SE, Reese SE, Peters MJ, Andreassen BK, Steegers EAP, Nilsen RM, Vollset SE, Midttun O, Ueland PM, Franco OH, Dehghan A, de Jongste JC, Wu MC, Wang T, Peddada SD, Jaddoe VWV, Nystad W, Duijts L, London SJ (2016). Maternal plasma folate impacts differential DNA methylation in an epigenome-wide meta-analysis of newborns. Nature Communications, 7:10577. doi: 10.1038/ncomms10577. [Abstract]
  6. Grandhi A, Guo W, Peddada SD. A multiple testing procedure for multi-dimensional pairwise comparisons with application to gene expression studies. BMC bioinformatics 2016 (17)104. [Full Text]
  7. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD (2015). Analysis of composition of microbiomes: a novel method for studying microbial composition. Microbial Ecology in Health and Disease, 26, 1 – 7. [Abstract]
  8. Davidov O, Peddada S. (2013). Testing for the Multivariate Stochastic Order among Ordered Experimental Groups with Application to Dose-Response Studies. Biometrics 69(4):982-990. [Abstract]
  9. White RA, Bjørnholt J, Baird DD, Midtvedt T, Harris JR, Pagano M, Hide W, Rudi K, Moen B, Iszatt N, Peddada SD, Eggesbø M (2013). Novel Developmental Analyses Identify Longitudinal Patterns of Early Gut Microbiota that Affect Infant Growth. PLoS Computational Biology 9(5):e1003042. [Abstract]
  10. Davidov O, Peddada SD (2013). The linear stochastic order and directed inference for multivariate ordered distributions. Annals of Statistics 41(1):1-40.  [Abstract]
  11. Lim C, Sen PK, Peddada SD. (2013). Robust Analysis of High Throughput Screening (HTS) Assay Data. Technometrics 55(2):150-160. [Abstract]
  12. Fernandez M, Rueda C, Peddada SD (2012). Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40, 2823-32. doi: 10.1093/nar/gkr1077. [Abstract]
  13. Davidov O, Peddada SD (2011). Order restricted inference for multivariate binary data with application to toxicology. J. Amer. Statist. Assoc., 106, 1394-1404.
  14. Guo W, Sarkar SK, Peddada, SD (2010). Controlling False Discoveries in Multidimensional Directional Decisions, with Applications to Gene Expression Data on Ordered Categories. Biometrics, 66, 485 - 492. [Abstract]
  15. Peddada, SD, Dinse, G and Kissling, G (2007). Incorporating Historical Control Data When Comparing Tumor Incidence Rates. J. Amer. Statist. Assoc., 102, 1212-1220.
  16. Peddada SD, Lobenhofer L, Li L, Afshari C, Weinberg C and Umbach D. (2003). Gene Selection and Clustering for Time-course and Dose-response Microarray Experiments using Order-restricted Inference. Bioinformatics, 19, 834-841. [Abstract]
  17. Hwang JTG and Peddada SD (1994). Confidence Interval Estimation Subject to Order Restrictions. Annals of Statistics, 22, 67-93.
