INSTALLATION

	To install EpiCenter, simply decompress the EpiCenter package into your installation
	location.
	1) LINUX OR MACOS X
		If you downloaded the package epicenter_<OS>.tar.gz to the folder "/home/downloads",
		you can issue the following commands in a terminal window to install the program
		to the folder "/home/<yourname>/install_dir":

		cd /home/<yourname>/install_dir
		tar xfz /home/downloads/epicenter_<OS>.tar.gz
		cd epicenterDIR

		If needed, run ./install.sh to copy the main executable binary files into
		"/home/<yourname>/bin".
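
		If you ran install.sh, the bin folder under your home directory must be on your
		PATH for the programs to be found by name. The following is a minimal sketch for
		checking this. It assumes that folder is $HOME/bin on your system; adjust the
		path if it is not.

```shell
# Assumption: install.sh copied the binaries into $HOME/bin.
bin_dir="$HOME/bin"

# Check whether bin_dir is one of the colon-separated PATH entries.
case ":$PATH:" in
  *":$bin_dir:"*) on_path=yes ;;
  *)              on_path=no ;;
esac

if [ "$on_path" = yes ]; then
  echo "$bin_dir is on PATH"
else
  echo "$bin_dir is not on PATH; add this line to your shell startup file:"
  echo "export PATH=\"$bin_dir:\$PATH\""
fi
```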

	2) WINDOWS 
		The Windows packages are ZIP archives. To install, simply extract the package
		to the desired installation location. After extraction, open the extracted
		folder "epicenterDIR". Then double-click the Windows batch file "launch_epicenter.bat" to
		launch the Windows command-line terminal, where you can run the EpiCenter programs.

	NOTE: Please install and use the 64-bit version of the program if you have a 64-bit operating system.
	      The 32-bit packages are for 32-bit operating systems, and you may not be able to use them
	      to analyze very large datasets if the analysis requires more than 4GB of memory.
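
	On Linux or Mac OS X, one way to check which package applies is to ask the system
	for its word size. This is only a sketch: getconf LONG_BIT reports the size of the
	C "long" type in the current environment, which is 64 on a typical 64-bit system.

```shell
# getconf LONG_BIT prints the word size (32 or 64) of the current environment.
bits=$(getconf LONG_BIT)
echo "This system is ${bits}-bit: download the ${bits}-bit EpiCenter package."
```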

USER MANUAL
	epicenter_manual.pdf is the user manual. It explains program usage, parameters,
	and related data formats in detail.

GENOME ANNOTATION FILES
	The folder "gene_ann_files" contains gene annotation files for four species, including human and mouse.

USAGES
	IMPORTANT: there are two main executable binary files: EpiCenter and EpiCenter_maq_short.   
	Please use EpiCenter for the MAQ alignment format with the maximum read length of 128bp, and
	use EpiCenter_maq_short for the MAQ alignment format with the maximum read length of 64bp. 

	-----------------------------------------------------------------------------------------------------------------------------------
	Usage (TYPE 0-3,31,32): EpiCenter -t type [options] -i file_type aln_sample1 aln_sample2
	Usage (TYPE 4):         EpiCenter -t 4 [options] -i file_type aln_sample
	Usage (TYPE 5):         EpiCenter -t 5 -f locfile [options] -i file_type aln_sample1 [aln_sample2 ...]
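
	The three usage forms above can be sketched as concrete command lines. All file
	names below are hypothetical placeholders, and the script only prints the commands
	unless RUN is set to an empty value, so it is safe to try before EpiCenter is
	installed. Only options listed in this file are used.

```shell
# Dry run by default: each command is printed, not executed.
# Run with RUN= (empty) in the environment to execute the commands for real.
RUN=${RUN-echo}

# Hypothetical file names; replace them with your own data.
sample1=sample1.bam
sample2=sample2.bam
genes=genes.txt        # position file for -f (see the manual for its format)

# TYPE 1: full-dynamic window scan comparing two samples,
# with the alignment quality cutoff given explicitly.
$RUN EpiCenter -t 1 -s 10 -i bam "$sample1" "$sample2"

# TYPE 4: read coverage analysis of a single sample.
$RUN EpiCenter -t 4 -i bam "$sample1"

# TYPE 5: read count matrix for two samples over the regions in $genes.
$RUN EpiCenter -t 5 -f "$genes" -i bam "$sample1" "$sample2"
```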
	
	GENERAL PARAMETERS
	
	 -t --type:      analysis type requested
	                 0 = whole genome scan using a semi-dynamic window
	                 1 = whole genome scan using a full-dynamic window
	                 2 = whole genome scan using a fixed-size window
	                 3 = analysis of fixed-length flanking regions of selected genomic point locations (e.g. transcription start sites)
	                31 = analysis of selected regions of variable lengths (e.g. genes)
	                32 = analysis of mRNA-seq data mapped to cDNA sequence references
	                 4 = whole genome read coverage analysis of a single sample using a fixed-size window scan
	                 5 = conversion of multiple mRNA/ChIP-seq data files into one read count data matrix
	                     with rows corresponding to genes and columns corresponding to samples/replicates.
	 -i --fformat:   input format of read alignment data files
	                 eland = ELAND export format [default]
	                 maq   = MAQ format 
	                 bam   = BAM format
	                 sam   = SAM format
	 -s --qual:      cutoff of alignment quality scores [default: 10]
	 -o --outdir:    directory for output files [default: current working directory]
	
	PARAMETERS FOR NOISE READ RATE (TYPE 0-4) 
	
	 -1 --nrate1:    noise rate of read tags for the first sample
	 -2 --nrate2:    noise rate of read tags for the second sample
	                       NOTE: a noise rate is specified as the number of tags per 1kb genomic region.
	                       If only nrate1 is specified, then nrate2 is estimated by
	                       nrate2=nrate1*#mapped_tag_2/#mapped_tag_1, and vice versa.
	 -g --gsize:     genome size [default: estimate from data]
	                       NOTE: default noise rate = #mapped_tags/genome size
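
	The two noise rate formulas above can be worked through with a few numbers. All
	values below are hypothetical and chosen only to make the arithmetic easy to follow.
	Scaling the default rate to tags per 1kb is an assumption made here for
	illustration; the program's internal units are not stated in this file.

```shell
# Hypothetical values, for illustration only.
nrate1=0.5          # noise rate of sample 1 in tags per 1kb (option -1)
mapped1=10000000    # number of mapped tags in sample 1
mapped2=15000000    # number of mapped tags in sample 2
gsize=3000000000    # genome size in bp (option -g)

# nrate2 = nrate1 * #mapped_tag_2 / #mapped_tag_1
nrate2=$(awk -v r="$nrate1" -v a="$mapped1" -v b="$mapped2" \
  'BEGIN { printf "%.3f", r * b / a }')
echo "nrate2 = $nrate2 tags per 1kb"

# Default noise rate = #mapped_tags / genome size.
# Assumption: multiplied by 1000 here to express it in tags per 1kb.
default1=$(awk -v m="$mapped1" -v g="$gsize" \
  'BEGIN { printf "%.4f", 1000 * m / g }')
echo "default rate for sample 1 = $default1 tags per 1kb"
```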
	
	PARAMETERS FOR TYPE 0-3, 31, 32 ANALYSIS
	
	 -q --fdr:       cutoff of false discovery rate (FDR) [default: 5%]
	 -r --ratioexpect: expected rate ratio of read coverage between two samples
	 -e --std:       the standard deviation of log2ratio null distribution
	 -p --pcutoff:   cutoff of p-value of 'unchanged' regions for estimating null distribution [default: 0.01]
	 -B --bonfer:    indicator for adjusting the pcutoff above with Bonferroni correction [default: NO]
	                     Note: Bonferroni correction may lead to overestimating the variation of the null distribution,
	                     and hence to underestimating the number of significant genes
	 -D --depth:     indicator for using peak coverage depth for the significance test [default: use read counts]
	 -U --noLenNorm: indicator for not normalizing read count over region length for the rate ratio test
	 -P --fwr:       indicator for printing the test decision (0=accept, 1=reject) of three family-wise
	                     type 1 error control methods [default: no print out]
	 -L --sortMaxP:  indicator for sorting result report by p_mx [default: by p_rr]
	 -H --rep:       indicator for printing analysis log to the result report file [default: STDOUT]
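
	For reference, the standard Bonferroni correction divides a p-value cutoff by the
	number of tests performed. The sketch below shows that arithmetic for the -B option.
	It assumes the program applies this standard form to the -p cutoff, and the number
	of tested regions is hypothetical.

```shell
pcutoff=0.01     # default value of -p
ntests=20000     # hypothetical number of tested regions

# Standard Bonferroni adjustment: cutoff divided by the number of tests.
adjusted=$(awk -v p="$pcutoff" -v n="$ntests" 'BEGIN { printf "%.1e", p / n }')
echo "Bonferroni-adjusted cutoff: $adjusted"
```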
	
	PARAMETERS FOR TYPE 0-2, 4 ANALYSIS
	
	 -w --wsize:     the window size of a fixed-window scan or the max gap distance allowed
	                     between neighboring tags in a dynamic window [default: 500bp]
	
	PARAMETERS FOR WIG FILES (TYPE 0, 31)
	
	 -W --wigout:    indicator for outputting WIG files [default: no]
	 -X --nozip:     indicator for uncompressed plain text WIG files [default: gzipped WIG]
	
	PARAMETERS FOR DEPTH OF READ COVERAGE FILES (TYPE 32)
	
	 -W --wigout:    indicator for generating the depth of read coverage file [default: no]
	
	PARAMETERS FOR ESTIMATING THE EXPECTED RATE RATIO (TYPE 0)
	
	 -c --ratiochoice: choice of ratio estimate
	                     0 = ratio of total mapped tag counts (the "tagRatio" method)
	                     1 = mean, 2 = median, 3 = least-squares regression
	 -a --minQ:      min quantile of data selected for rate ratio estimate [default: 0.85]
	 -b --maxQ:      max quantile of data selected for rate ratio estimate [default: 0.95]
	
	PARAMETERS FOR TYPE 3, 31, 32, 5 ANALYSIS
	
	 -f --locfile:   the input file with genomic positions of regions or genes
	 -u --ulen:      upstream region length [default: 500]
	 -d --dlen:      downstream region length [default: 500]
	
	PARAMETERS FOR TYPE 3, 31, 32 ANALYSIS
	
	 -R --autoR:     indicator for using EpiCenter's 'parsimony' method to estimate the expected read ratio
	
	PARAMETERS FOR WHOLE GENOME FULL-DYNAMIC WINDOW SCAN (TYPE 1)
	
	 -h --htest:     type of hypothesis test: b=two-sided [default], l=less, g=greater
	 -T --twoway:    indicator for making an additional dynamic window scan based on the second data file 
	
	
	PARAMETERS FOR ARRAY DATA CONVERSION (TYPE 5)
	
	 -l --label:     the label for the top-left cell in the array matrix data [default: epicenter]
	 -N --nonorm:    indicator for not adjusting read counts by total number of reads [default: yes]
	
	----------------------------------------------------------------------------------------------------------------------------------

CONTACT AUTHOR
	Weichun Huang
	whduke@gmail.com
	
