Pierre R. Bushel, Ph.D.
Pierre R. Bushel, Ph.D., is involved in and oversees the following research projects within the Biostatistics Branch:
Data Integration and Analysis
Various computational and bioinformatics approaches are investigated to integrate genomic and genetic data with Gene Ontology annotation, biological information about samples or specimens, and biomedical literature. A Modk-prototypes application for clustering gene expression data together with phenotypic measurements has been developed and published as an approach for systems biology data analysis; simulated annealing is used to optimize the Modk-prototypes clustering. An approach called theme extraction of microarray profiles based on ontology-related annotation (TEMPORA) is in development with Elo Leung at the George Mason University Bioinformatics program. Its purpose is to generate biological themes from clusters of genes and to mine the biomedical literature using the genes and biological processes that constitute the themes. In collaboration with Jean-Charles Lamirel at INRIA, the National Institute for Research in Computer Science and Control in France, a microarray intelligent knowledge EXtraction (AREX) approach for multi-genomics and biological data meta-mining is in development.
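The idea of clustering expression and phenotypic measurements together, with simulated annealing as the optimizer, can be illustrated with a minimal sketch. It assumes a generic k-prototypes-style cost (squared Euclidean distance on numeric expression values plus a weight `gamma` times the number of mismatched categorical phenotype labels) and a simple geometric cooling schedule. It is not the published Modk-prototypes algorithm, and every function name, parameter and default below is hypothetical.

```python
import math
import random

def mixed_distance(a, b, gamma):
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches."""
    numeric = sum((x - y) ** 2 for x, y in zip(a[0], b[0]))
    mismatches = sum(1 for x, y in zip(a[1], b[1]) if x != y)
    return numeric + gamma * mismatches

def prototype(members):
    """Cluster prototype: mean of numeric features, mode of categorical features."""
    n = len(members)
    n_numeric = len(members[0][0])
    numeric = tuple(sum(m[0][j] for m in members) / n for j in range(n_numeric))
    columns = zip(*(m[1] for m in members))
    categorical = tuple(max(sorted(set(col)), key=col.count) for col in columns)
    return (numeric, categorical)

def total_cost(samples, labels, k, gamma):
    """Sum of mixed distances from every sample to its cluster prototype."""
    cost = 0.0
    for c in range(k):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        if members:
            p = prototype(members)
            cost += sum(mixed_distance(s, p, gamma) for s in members)
    return cost

def anneal(samples, k, gamma=1.0, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Search cluster assignments by simulated annealing (assumed schedule)."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in samples]
    cost = total_cost(samples, labels, k, gamma)
    best_labels, best_cost = labels[:], cost
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(samples))
        old = labels[i]
        labels[i] = rng.randrange(k)  # propose moving one sample
        new_cost = total_cost(samples, labels, k, gamma)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old  # reject the move
        t = max(t * cooling, 1e-9)
    return best_labels, best_cost

# Each sample: (numeric expression values, categorical phenotype labels).
samples = [
    ((0.0, 0.1), ("low",)), ((0.1, 0.0), ("low",)), ((0.05, 0.05), ("low",)),
    ((5.0, 5.1), ("high",)), ((5.1, 5.0), ("high",)), ((5.05, 5.05), ("high",)),
]
labels, cost = anneal(samples, k=2)
```

Accepting some cost-increasing moves while the temperature is high lets the search leave poor local optima that a purely greedy reassignment could get stuck in. The weight `gamma` controls how strongly the phenotype labels influence the grouping relative to the expression values.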
Extended Array-based Data Analysis and Genomic Sciences
The following work is in progress: analytical strategies for high-dimensional data, including investigation of data properties and the structure of variance components, and efficient methodologies for reducing dimensionality in order to reveal the information embedded in experiments. In addition to extending existing informatics approaches to other array-based technologies (e.g., ChIP-chip, CGH arrays and whole-genome CpG island studies), the incorporation of other genomic information (DNA or amino acid) and the integration of a series of strategies to address newly defined hypotheses are also being explored. Side interests include general forms of statistical genetics analysis, such as the analysis of genotyping data (e.g., SNP or haplotype data) and other marker data, which facilitates inference from association studies and quantitative trait loci (QTL) detection.
Classification and Prediction of Biological Samples
Several classification methodologies are being developed and used to classify biological samples and predict their class. Examples include an error-weighted ANOVA model with the k-nearest neighbors classifier, a multi-kernel classifier, Expression Predictor (ExP), and Extracting Patterns and Identifying co-Expressed Genes (EPIG).
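As a rough illustration of pairing an ANOVA-based gene ranking with a k-nearest neighbors classifier, the sketch below scores each gene with an ordinary one-way ANOVA F statistic, keeps the top-ranked genes, and classifies a new sample by majority vote among its nearest training samples. It assumes a plain, unweighted F statistic rather than the error-weighted model mentioned above, and the function names and toy data are hypothetical.

```python
from collections import Counter

def f_statistic(values, labels):
    """One-way ANOVA F statistic for a single gene across labeled samples."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))

def select_genes(profiles, labels, top):
    """Return indices of the `top` genes with the largest F statistic."""
    n_genes = len(profiles[0])
    scores = [f_statistic([p[g] for p in profiles], labels) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:top]

def knn_predict(train, labels, query, genes, k=3):
    """Majority vote among the k training samples closest on the selected genes."""
    ranked = sorted(
        (sum((t[g] - query[g]) ** 2 for g in genes), lab)
        for t, lab in zip(train, labels)
    )
    votes = Counter(lab for _, lab in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data: rows are samples, columns are genes (values are made up).
train = [
    [1.0, 5.0, 2.0], [1.2, 3.0, 2.1], [0.9, 4.0, 1.9],
    [3.0, 4.5, 2.0], [3.1, 3.5, 2.2], [2.9, 5.5, 1.8],
]
labels = ["control"] * 3 + ["treated"] * 3
genes = select_genes(train, labels, top=1)
prediction = knn_predict(train, labels, [2.8, 3.0, 2.0], genes)
```

Ranking genes before classification keeps uninformative genes from dominating the distance calculation, which matters when there are far more genes than samples.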
Pattern Analysis for Pathway Reconstruction
Computational approaches and statistical models are explored to identify patterns within high-dimensional data. The process entails a series of analytical steps applied to the data (pre-processing, normalization, feature extraction, etc.) to extract salient patterns that are coupled to a phenotype or biological end-point. Ways to interpret the results and visualize the data are also being investigated. In addition, group members work on aggregating genomics data, biological information and Gene Ontology annotation to reconstruct more informative biological/gene networks.
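The sequence of steps described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: log2 transformation as the pre-processing step, per-sample median centering as the normalization, and an absolute Pearson correlation threshold against a numeric phenotype as the feature extraction. The group's actual methods are not specified here, and all names, thresholds and data values below are hypothetical.

```python
import math
import statistics

def log2_transform(matrix):
    """Pre-processing: log2-transform raw intensities (rows = genes, columns = samples)."""
    return [[math.log2(v) for v in row] for row in matrix]

def median_center(matrix):
    """Normalization: subtract each sample's (column's) median from its values."""
    n_samples = len(matrix[0])
    medians = [statistics.median(row[j] for row in matrix) for j in range(n_samples)]
    return [[v - medians[j] for j, v in enumerate(row)] for row in matrix]

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def phenotype_linked_genes(raw, phenotype, gene_ids, threshold=0.9):
    """Feature extraction: keep genes whose profile tracks the phenotype."""
    normalized = median_center(log2_transform(raw))
    return [
        gene for gene, row in zip(gene_ids, normalized)
        if abs(pearson(row, phenotype)) >= threshold
    ]

# Made-up intensities for five genes across four samples, plus a numeric end-point.
raw = [
    [100, 200, 400, 800],
    [500, 500, 500, 500],
    [800, 400, 200, 100],
    [300, 310, 290, 305],
    [250, 240, 260, 255],
]
gene_ids = ["geneA", "geneB", "geneC", "geneD", "geneE"]
phenotype = [1.0, 2.0, 3.0, 4.0]
selected = phenotype_linked_genes(raw, phenotype, gene_ids)
```

Keeping each stage as a separate function makes it straightforward to swap in a different normalization or selection criterion without touching the rest of the pipeline.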
Research and Development of Informatics Utilities
Database design and development of process applications are underway to manage and disseminate genomic data and experimental information. In addition, Bushel and staff are involved in the development of an informatics resource to facilitate the merging of environmentally responsive genes with expression data and other biological information.
Microarray & Genome Informatics
The Microarray & Genome Informatics (MGI) service provides bioinformatics, microarray data analysis, computational biology and statistical genetics/genomics support, as well as software, database and application development expertise, to all institute scientists at no cost. The MGI staff includes Pierre R. Bushel, Ph.D., Director; Jeff Chou, Ph.D., Scientific Programmer; Jianying Li, Bioinformatician; and Jonathan Miller, Systems Administrator.
MGI offers the following support services:
- Microarray data analysis and statistical modeling
- Extracting Patterns and Identifying co-Expressed Genes (EPIG)
- Systematic Variation Normalization (SVN)
- Mixed linear modeling
- Expression Predictor (ExP)
- Phase-shift Analysis of Gene Expression (PAGE)
- Modk-prototypes clustering
- Standard clustering, data analysis and visualization methods
- Genome and gene expression database and web application development and administration
- MicroArray Project System (MAPS) and ArrayDB customized databases
- Rosetta Resolver gene expression enterprise database
- Gene expression data submission to public repositories (GEO, ArrayExpress and CEBS)
- Genome scanning
- Transcription factor and pathway analysis
- Computational algorithm and analytical methodology implementation and programming
- Bioinformatics, genomic and microarray analysis consultation