Tools for Building Data Files

Preparing EagleView data files can be the difficult part if you do not know how to program. Don't worry, though: EagleView comes with three assistant tools to help you create READS and EGL data files. Here they are:

Creating READS and EGL from a FASTA file

EagleIndexFasta is an assistant program for EagleView. It extracts read information from a multiple-line FASTA file for all reads in the ACE file. There are two output files: 1) {PREFIX}.egl and 2) {PREFIX}.reads, where {PREFIX} is the basename of the ACE file. {PREFIX}.reads contains the information for all reads, and {PREFIX}.egl is the location index of {PREFIX}.reads. Both files are optional input files for EagleView. EagleIndexFasta works for both 454 and Illumina reads.

Usage:

Usage: EagleIndexFasta.exe [options] aceFile readFile readType
..........................PARAMETERS..............................
aceFile  -- an genome assembly file in ACE format
readFile -- a read information file in FASTA format
readType -- the type of reads (1=454, 2=Illumina/Solexa,AB's SoLID)
...........................OPTIONS................................
 -h      -- Print usage information and exit
 -v      -- Print version information and exit
..................................................................
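
For scripted runs, the command line above can be assembled programmatically. The Python sketch below is only an illustration: the helper name build_fasta_index_command, the READ_TYPE_CODES table and the file names are hypothetical, and it assumes EagleIndexFasta.exe can be found on the PATH. Only the argument order and the readType codes (1 and 2) are taken from the usage text.

```python
from typing import List

# readType codes as listed in the usage text:
#   1 = 454, 2 = Illumina/Solexa or SOLiD.
# The dictionary keys are hypothetical labels chosen for this sketch.
READ_TYPE_CODES = {"454": 1, "illumina": 2, "solexa": 2, "solid": 2}


def build_fasta_index_command(ace_file: str, read_file: str, platform: str,
                              exe: str = "EagleIndexFasta.exe") -> List[str]:
    """Return the argument list in the order: exe aceFile readFile readType."""
    try:
        read_type = READ_TYPE_CODES[platform.lower()]
    except KeyError:
        raise ValueError(f"unknown platform: {platform!r}") from None
    return [exe, ace_file, read_file, str(read_type)]


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own ACE and FASTA files.
    cmd = build_fasta_index_command("assembly.ace", "reads.fasta", "illumina")
    print(" ".join(cmd))
    # To actually run the tool (assuming it is installed and on the PATH):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems when file names contain spaces.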

A read record in the input FASTA file should be in one of the following two formats:

1) without flow signal information

>read_id	read_type
base_quality_line

2) with flow signal information

>read_id	read_type
base_quality_line
signal_index_line
flow_signal_line

Individual values in each line are delimited by tabs. The indexes in signal_index_line are zero-based. For example:

>B1_1_135_384_892	2
30	30	30	30	30	30	30	30	...
2	7	11	12	18	23	26	31	...
544	286	9477	909	356	3089	415	5557	...
>B1_1_135_407_899	2
30	30	30	25	30	30	30	30	...
2	6	11	12	16	22	27	31	...
-310	190	8879	129	2329	-655	8317	-35	...
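
Before handing such a file to EagleIndexFasta, it can help to check that every record has the expected shape. The following Python sketch reads records in the two formats described above. It is not part of EagleView: the names ReadRecord and parse_read_records are hypothetical, and it assumes that all values are integers, as in the example.

```python
from typing import Iterator, List, NamedTuple, Optional


class ReadRecord(NamedTuple):
    read_id: str
    read_type: int
    base_quality: List[int]
    signal_index: Optional[List[int]]  # zero-based indexes, None if absent
    flow_signal: Optional[List[int]]   # None if absent


def _ints(line: str) -> List[int]:
    """Split one tab-delimited line into integers (assumes integer values)."""
    return [int(value) for value in line.strip().split("\t")]


def _build(header: str, lines: List[str]) -> ReadRecord:
    """Turn a '>' header line plus its 1 or 3 value lines into a ReadRecord."""
    if len(lines) not in (1, 3):
        raise ValueError(f"{header}: expected 1 or 3 value lines, got {len(lines)}")
    read_id, read_type = header[1:].strip().split("\t")
    if len(lines) == 1:
        # Format 1: base qualities only, no flow signal information.
        return ReadRecord(read_id, int(read_type), _ints(lines[0]), None, None)
    # Format 2: base qualities, signal indexes, flow signals.
    return ReadRecord(read_id, int(read_type),
                      _ints(lines[0]), _ints(lines[1]), _ints(lines[2]))


def parse_read_records(text: str) -> Iterator[ReadRecord]:
    """Yield one ReadRecord per '>' header found in the text."""
    header: Optional[str] = None
    lines: List[str] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield _build(header, lines)
            header, lines = line, []
        elif header is None:
            raise ValueError("value line found before the first '>' header")
        else:
            lines.append(line)
    if header is not None:
        yield _build(header, lines)


if __name__ == "__main__":
    # Shortened, illustrative records (four values per line), not real data.
    sample = (">B1_1_135_384_892\t2\n30\t30\t30\t30\n2\t7\t11\t12\n"
              "544\t286\t9477\t909\n>B1_1_135_407_899\t2\n30\t30\t30\t25\n")
    for record in parse_read_records(sample):
        print(record.read_id, len(record.base_quality), record.flow_signal is not None)
```

A record with two value lines, or with any other count than one or three, is rejected with a ValueError, which catches the most common mistake of omitting either the index line or the signal line.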

Creating READS and EGL from a 454 SFF file

EagleIndexSff is the assistant tool for this task. It extracts read information from an SFF file for all reads in the input ACE file. There are two output files: {PREFIX}.egl and {PREFIX}.reads, where {PREFIX} is the basename of the input ACE file. {PREFIX}.reads contains the information for all reads, and {PREFIX}.egl is the location index of {PREFIX}.reads. Both files are optional input files for EagleView.

Usage:

Usage: EagleIndexSff.exe [options] aceFile sffFile
..........................PARAMETERS..............................
   aceFile  -- an genome assembly file in ACE format
   sffFile  -- a binary 454 SFF file

............................OPTIONS...............................
   -h       -- Print usage information and exit
   -v       -- Print version information and exit
   -f       -- Output flowgram indexes and values, default: NO
..................................................................
 

Creating READS and EGL from multiple SFF files

EagleIndexSffM is the assistant tool for this task. It extracts read information from one or more SFF files for all reads in the ACE file. There are two output files: {PREFIX}.egl and {PREFIX}.reads, where {PREFIX} is the basename of the input ACE file. {PREFIX}.reads contains the information for all reads, and {PREFIX}.egl is the location index of {PREFIX}.reads. Both files are optional input files for EagleView.

Usage:

Usage: EagleIndexSff.exe [options] aceFile sffFile
..........................PARAMETERS..............................
   aceFile  -- an genome assembly file in ACE format
   sffFile  -- a binary 454 SFF file
............................OPTIONS...............................
   -h       -- Print usage information and exit
   -v       -- Print version information and exit
   -f       -- Output flowgram indexes and values, default: NO
..................................................................