Identifying cis-elements by an EM Algorithm Coupled with False Discovery Rate Control

fdrMotif is an iterative algorithm that alternates between updating the position weight matrix (PWM) and significance testing. It starts with an initial PWM and a set of input sequences (e.g., from ChIP experiments) and generates many sets of background (null) sequences under the probability model of the input sequences. At each model estimation step, fdrMotif determines the number of binding sites in each sequence by performing statistical tests. The FDR in the original dataset is controlled by monitoring the proportion of background subsequences that are declared binding sites. The PWM is updated using an EM algorithm that iterates between two steps (the E-step and the M-step) until convergence. In the E-step, fdrMotif normalizes the sum of the probabilities over all positions in a sequence to the number of binding sites found in that sequence.
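The ideas above can be illustrated with a minimal Python sketch. It is not the fdrMotif source code. The function names (window_scores, fdr_threshold, e_step_weights, m_step), the log-likelihood-ratio score, the empirical FDR estimate, and the pseudocount are all assumptions made for illustration, and the statistical tests used by fdrMotif may differ.

```python
import numpy as np

BASES = "ACGT"  # column order assumed for the PWM

def window_scores(seq, pwm, bg):
    """Log-likelihood ratio (PWM vs. background) of every width-w window."""
    w = pwm.shape[0]
    idx = np.array([BASES.index(c) for c in seq])
    llr = np.log(pwm) - np.log(bg)  # shape (w, 4)
    return np.array([llr[np.arange(w), idx[i:i + w]].sum()
                     for i in range(len(seq) - w + 1)])

def fdr_threshold(real_scores, null_scores, n_null_sets, alpha):
    """Lowest cutoff whose estimated FDR is at most alpha (inf if none).

    Assumed estimate: (average number of null windows scoring >= t per
    null set) divided by (number of real windows scoring >= t).
    """
    real = np.asarray(real_scores, dtype=float)
    null = np.asarray(null_scores, dtype=float)
    best = np.inf
    for t in np.unique(real)[::-1]:  # candidate cutoffs, high to low
        n_real = np.sum(real >= t)
        n_null = np.sum(null >= t) / n_null_sets
        if n_null / n_real <= alpha:
            best = t
    return best

def e_step_weights(scores, n_sites):
    """Site weights for one sequence, rescaled to sum to n_sites."""
    if n_sites == 0:
        return np.zeros(len(scores))
    lik = np.exp(scores - np.max(scores))  # subtract max for stability
    return n_sites * lik / lik.sum()

def m_step(seqs, weights, w, pseudocount=0.5):
    """Re-estimate the PWM from weighted base counts."""
    counts = np.full((w, 4), pseudocount)
    for seq, wt in zip(seqs, weights):
        idx = np.array([BASES.index(c) for c in seq])
        for i, z in enumerate(wt):
            counts[np.arange(w), idx[i:i + w]] += z
    return counts / counts.sum(axis=1, keepdims=True)
```

In this sketch, the number of binding sites in a sequence would be the count of its windows scoring at or above the cutoff returned by fdr_threshold, and that count is what e_step_weights takes as n_sites. Note that the simple rescaling can give a single position a weight above 1 when n_sites is greater than 1; a fuller implementation would need to handle that case.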

This program was developed by Leping Li, Yu Liang and Robert L Bass at the National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709.

License

This work is made available under the GPL v2.

Download

Download the source code for the distribution of fdrMotif along with usage documentation and examples (9 MB).

Building

In the main directory of the distribution, type

  • ./configure
  • make
  • make install

By default, the configure program will direct the executable files to /usr/local/bin which, in most cases, requires the user to "su" to root prior to the "make install" step. The target directory for the executable file can be overridden by specifying the --prefix option during the configure phase. For example,

./configure --prefix=/home/fdrMotif_user

will direct the executables into the /home/fdrMotif_user/bin directory.

The configure application accepts several arguments to tailor the build and installation process. Please see the INSTALL file contained in the root directory of the distribution for further details.

The source code and package were developed using Windows and tested on Linux (Fedora). Although the intent was to make the code portable to most U*IX variants, you may encounter minor build issues on other platforms. Feedback regarding any difficulties you may experience will be very helpful in improving the distribution package.

Contact

Leping Li, Ph.D.
Principal Investigator
Tel 984-287-3836
Fax 919-541-4311