Leping Li, Ph.D.

Biostatistics & Computational Biology Branch

Leping Li, Ph.D.
Principal Investigator
Tel (919) 541-5168
Fax (919) 541-4311
li3@niehs.nih.gov
P.O. Box 12233
Mail Drop A3-03
Research Triangle Park, NC 27709

Research Summary

Leping Li, Ph.D., and his staff are developing and implementing methods for detecting and discovering functional elements, such as cis-regulatory motifs, in a set of sequences using Markov models and Expectation Maximization (EM) methods. Specifically, they developed an efficient sequence alignment algorithm for identifying conserved segments between two divergent sequences, e.g., promoter sequences. Li's group also worked on methods that improve the quality of motif models and on a motif identification tool that controls the false discovery rate.

Li recently developed the GADEM software, which can be applied to large-scale sequence data for unbiased motif discovery. Currently, his group is developing a method that identifies the motifs of a transcription factor and of its co-regulatory factors in ChIP-seq datasets, as well as computational and statistical methods for identifying genomic loci that are differentially enriched in sequence read counts in ChIP-seq and mRNA-seq data.

  • Methods for identifying the motifs of a transcription factor and of its co-regulatory factors
  • De novo motif discovery and identification
  • Methods for identifying enriched genomic loci in ChIP-seq and mRNA-seq data
  • Accurate anchoring alignment of divergent sequences
  • A method for gene set enrichment analysis for continuous non-monotone relationships
  • A genetic algorithm/k-nearest neighbor (GA/KNN) method for microarray and proteomics data analysis

The source code and documentation for GA/KNN and GAPWM may be downloaded from the Biostatistics and Computational Biology Branch Resources page. More information on GA/KNN appears on Li’s Studies page.

Software

  • coMotif
    A three-component mixture framework to model the joint distribution of two motifs as well as the situation where some sequences contain only one or none of the motifs.
  • EpiCenter
    An analysis tool for genome-wide mRNA-seq or ChIP-seq data that detects differentially expressed genes or identifies changes in epigenetic modifications.
  • fdrMotif
    Determines the number of binding sites in each sequence of a probability model by performing statistical tests.
  • GA/KNN
    Selects the most discriminative variables for sample classification and may be used for analysis of microarray gene expression data, proteomic data or other high-dimensional data.
  • GADEM
    An unbiased de novo motif discovery tool implementing an expectation-maximization (EM) algorithm.
  • Genetic Algorithm Method for Optimizing a Position Weight Matrix
    Implements a simple method to improve a poorly estimated position weight matrix using chromatin immunoprecipitation data.
  • T-KDE
    Identifies the locations of constitutive binding sites. T-KDE, which combines a binary range tree with a kernel density estimator, is applied to ChIP-seq data from multiple cell lines.
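
As a rough illustration of the kernel density estimation idea mentioned for T-KDE, the sketch below pools hypothetical binding-site positions from several cell lines and reports where the estimated density peaks. It is not the T-KDE implementation: the function names, the Gaussian kernel, the bandwidth, and the sample positions are all assumptions made for illustration, and the binary range tree component is omitted entirely.

```python
import math


def gaussian_kde(positions, bandwidth):
    """Return a function that estimates the density at x from 1-D positions."""
    n = len(positions)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions
        )

    return density


def find_modes(positions, bandwidth, step=1):
    """Scan an integer grid and return grid points that are local density maxima."""
    density = gaussian_kde(positions, bandwidth)
    grid = list(range(int(min(positions)), int(max(positions)) + 1, step))
    values = [density(x) for x in grid]
    modes = []
    for i, value in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i < len(values) - 1 else float("-inf")
        # Strict comparison on the left avoids reporting flat zero-density runs.
        if value > left and value >= right:
            modes.append(grid[i])
    return modes


# Hypothetical peak positions (bp) pooled from three cell lines.
pooled = [100, 102, 104, 500, 502, 504]
print(find_modes(pooled, bandwidth=5))
```

Positions that cluster across cell lines produce a single density mode, which is one simple way to think about a candidate constitutive site. The bandwidth controls how close sites must be to merge into one mode.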

 

Selected Publications

  1. Wells, M.L., Washington, O.L., Hicks, S.N., Nobile, C.J., Hartooni, N., Wilson, G.M., Zucconi, B.E., Huang, W., Li, L., Fargo, D.C., Blackshear, P.J. Post-transcriptional regulation of transcript abundance by a conserved member of the tristetraprolin family in Candida albicans. Mol. Microbiol., 2015, 95(6):1036-1053.
  2. Choi, Y.-J., Lai, W.S., Fedic, R., Stumpo, D.J., Huang, W., Li, L., Perera, L., Brewer, B.Y., Wilson, G.M., Mason, J.M., Blackshear, P.J. The Drosophila Tis11 protein and its effects on mRNA expression in flies. J. Biol. Chem., 2014, 289(51):35042-60.
  3. Niu L, Huang W, Umbach DM, Li L. IUTA: a tool for effectively detecting differential isoform usage from RNA-Seq data. BMC Genomics, 2014, 15:862.
  4. Zhang, X., Li, B., Ma, L., Li, L., Zheng, D., Li, W., Chu, M., Mailman, R.B., Archer, T.K., Wang, Y. Transcriptional repression by specific SWI/SNF components affects pluripotency of human embryonic stem cells. Stem Cell Reports, 2014, 3(3):460-474.
  5. Hewitt, S.C., Li, L., Grimm, S.A., Winuthayanon, W., Hamilton, K.J., Pockette, B., Rubel, C.A., Pedersen, L.C., Fargo, D., Lanz, R.B., DeMayo, F.J., Schutz, G., Korach, K.S. Novel DNA motif binding activity observed in vivo with an estrogen receptor alpha mutant mouse. Mol. Endocrinol., 2014, 28(6):899-911.
  6. Li, Y., Umbach, D.M., Li, L. T-KDE: A method for analyzing genome-wide protein binding patterns from ChIP-seq data. BMC Genomics, 2014, 15:27.
  7. Li, Y., Hamilton, K.J., Lai, A.Y., Burns, K.A., Li, L., Wade, P.A., Korach, K.S. Diethylstilbestrol (DES)-stimulated hormonal toxicity is mediated by ERalpha alteration of target gene methylation patterns and epigenetic modifiers (DNMT3A, MBD2, and HDAC2) in the mouse seminal vesicle. Environ. Health Perspect., 2014, 122(3):262-8.
  8. Madenspacher, J., Azzam, K., Gowdy, K., Malcolm, K., Nick, J., Aloor, D. J., Draper, D., Guardiola, J., Shatz, M., Menendez, D., Lowe, J., Lu, J., Bushel, P., Li, L., Merrick, A., Resnick, M.A. and Fessler, M. p53 integrates host defense and cell fate during bacterial pneumonia. J. Experimental Medicine, 2013, 891-904.
  9. Tennant, B., Robertson, A.G., Kramer, M., Li, L., Zhang, X., Beach, M., Thiessen, N., Chiu, R., Mungall, K., Whiting, C., Sabatini, P., Kim, A., Gottardo, R., Marra, M., Lynn, F., Jones, S.J.M., Hoodless, P.A., Hoffman, B.G. Identification and analysis of pancreatic islet enhancers. Diabetologia, 2013, 56(3):542-552.
  10. Li Y, Huang W, Niu L, Umbach DM, Covo S, Li L. Characterization of constitutive CTCF/cohesin loci: a possible role in establishing topological domains in mammalian genomes. BMC Genomics, 2013, 14:553.
  11. Huang W, Loganantharaj R, Schroeder B, Fargo D, Li L. PAVIS: a tool for Peak Annotation and Visualization. Bioinformatics, 2013, 29(23):3097-9.
