ART
Set of Simulation Tools
ART is a set of simulation tools for generating synthetic next-generation sequencing reads. ART simulates sequencing reads by mimicking the real sequencing process with empirical error models or quality profiles summarized from large recalibrated sequencing data sets. ART can also simulate reads using a user-supplied read error model or quality profile. ART supports simulation of single-end, paired-end, and mate-pair reads for three major commercial next-generation sequencing platforms: Illumina's Solexa, Roche's 454, and Applied Biosystems' SOLiD. ART can be used to test or benchmark a variety of methods and tools for next-generation sequencing data analysis, including read alignment, de novo assembly, and SNP and structural variation discovery. ART was used as a primary tool for the simulation study of the 1000 Genomes Project. ART is implemented in C++ with optimized algorithms and is highly efficient in read simulation. ART outputs reads in the FASTQ format and alignments in the ALN format. ART can also generate alignments in the SAM format or the UCSC BED format.
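Because the simulated reads are written in FASTQ, a quick sanity check on an output file is to iterate over its records. The following is a minimal Python sketch, not part of ART; it assumes plain four-line FASTQ records (header, sequence, "+" line, quality) with no wrapped sequence lines, and the function name read_fastq is hypothetical.

```python
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream.

    Assumption: each record occupies exactly four lines (no line wrapping).
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in: " + header.strip())
        yield header[1:].strip(), seq, qual
```

For example, `sum(1 for _ in read_fastq(open(path)))` counts the reads in a file, where `path` is whichever FASTQ file the simulator produced.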
Availability
ART is freely available to the public. Binary packages of ART are available for three major operating systems: Linux, Macintosh, and Windows. ART is also available as platform-independent C++ source packages. Each package includes programs, documentation, and usage examples.
Citation
Accessory Tools and Read Quality Profiles for ART
- ART 454 format converter (1KB) for converting the native format of 454 paired-end reads to the standard format
- ART profiler 454 (2KB) for generating 454 read quality profiles
- ART profiler illumina (5KB) for generating Illumina read quality profiles
- Illumina MiSeq 250bp paired-read quality profile (289KB)
ART-BananaPancakes-04-02-2013 (the latest version)
This release mainly fixed a bug related to the "noALN" option of the ART Illumina simulator and added a built-in MiSeq 250bp read profile.
Binary Packages
64-bit system
- Linux: art-all-Linux64-bin_BP.tar.gz (5MB)
- MacOS X: art-all-MacOS64-bin_BP.tar.gz (2MB)
- Windows: art-all-Win64-bin_BP.zip (5MB)
32-bit system
- Linux: art-all-Linux32-bin_BP.tar.gz (4MB)
- MacOS X: art-all-MacOS32-bin_BP.tar.gz (2MB)
- Windows: art-all-Win32-bin_BP.zip (2MB)
Source Packages
- ART Illumina read simulator art_Illumina_src-1.5.1.tar.gz (4MB)
- ART 454 read simulator art_454_src-2.1.8.tar.gz (3MB) (no updates)
- ART SOLiD read simulator art_SOLiD_src-1.0.1.tar.gz (2MB) (no updates)
ART-GrapeWine-08-15-2012
This release mainly fixed bugs in the ART 454 simulator and added the following new features and functions:
- support GS FLX Titanium platform
- provide new built-in 454 read profiles for both GS FLX and GS FLX Titanium platforms
- add a new tool 454_readprofile_art that allows users to generate their own read profiles from new 454 read data
- add an option to change the default flow cycle number
- change to not output ALN files by default
- change the output of DNA sequences from lower case to upper case
- switch automatically to the default indel error profile when the user's own profile does not provide one
Binary Packages
64-bit system
- Linux: art-all-Linux64-bin_GW.tar.gz (6MB)
- MacOS X: art-all-MacOS64-bin_GW.tar.gz (2MB)
- Windows: art-all-Win64-bin_GW.zip (5MB)
32-bit system
- Linux: art-all-Linux32-bin_GW.tar.gz (3MB)
- MacOS X: art-all-MacOS32-bin_GW.tar.gz (2MB)
- Windows: art-all-Win32-bin_GW.zip (1MB)
Source Packages
- ART Illumina read simulator art_Illumina_src-1.5.0.tar.gz (no updates) (5MB)
- ART 454 read simulator art_454_src-2.1.8.tar.gz (3MB)
- ART SOLiD read simulator art_SOLiD_src-1.0.1.tar.gz (2MB)
ART-PeachPie-05-16-2012
Binary Packages
64-bit system
- Linux: art-all-Linux64-bin_PP.tar.gz (4MB)
- MacOS X: art-all-MacOS64-bin_PP.tar.gz (1MB)
- Windows: art-all-Win64-bin_PP.zip (4MB)
32-bit system
- Linux: art-all-Linux32-bin_PP.tar.gz (3MB)
- MacOS X: art-all-MacOS32-bin_PP.tar.gz (1MB)
- Windows: art-all-Win32-bin_PP.zip (1MB)
Source Packages
- ART Illumina read simulator art_Illumina_src-1.5.0.tar.gz (5MB)
- ART 454 read simulator art_454_src-1.6.8.tar.gz (no updates) (2MB)
- ART SOLiD read simulator art_SOLiD_src-1.0.1.tar.gz (no updates) (2MB)
ART-CoconutCoffee-01-10-2012
Binary Packages
64-bit system
- Linux: art-all-Linux64-bin_CC.tar.gz (3MB)
- MacOS X: art-all-MacOS64-bin_CC.tar.gz (1MB)
- Windows: art-all-Win64-bin_CC.zip (4MB)
32-bit system
- Linux: art-all-Linux32-bin_CC.tar.gz (3MB)
- MacOS X: art-all-MacOS32-bin_CC.tar.gz (1MB)
- Windows: art-all-Win32-bin_CC.zip (1MB)
Source Packages
- ART Illumina read simulator art_Illumina_src-1.3.6.tar.gz (5MB)
- ART 454 read simulator art_454_src-1.6.8.tar.gz (no updates) (2MB)
- ART SOLiD read simulator art_SOLiD_src-1.0.1.tar.gz (no updates) (2MB)
ART-CranberryJuice-11-23-2011
Binary Packages
64-bit system
- Linux: art-all-Linux64-bin_CJ.tar.gz (3MB)
- MacOS X: art-all-MacOS64-bin_CJ.tar.gz (1MB)
- Windows: art-all-Win64-bin_CJ.tar.gz.zip (4MB)
32-bit system
- Linux: art-all-Linux32-bin_CJ.tar.gz (2MB)
- MacOS X: art-all-MacOS32-bin_CJ.tar.gz (1MB)
- Windows: art-all-Win32-bin_CJ.tar.gz.zip (1MB)
Source Packages
- ART 454 read simulator art_454_src-1.6.8.tar.gz (2MB)
- ART Illumina read simulator art_Illumina_src-1.3.1.tar.gz (4MB)
- ART SOLiD read simulator art_SOLiD_src-1.0.1.tar.gz (2MB)
ART-ApplePie-04-21-2011
Binary Packages
64-bit system
- Linux: art-all-Linux64-bin_AP.tar.gz (3MB)
- MacOS X: art-all-MacOS64-bin_AP.tar.gz (1MB)
- Windows: art-all-Win64-bin_AP.tar.gz.zip (4MB)
32-bit system
- Linux: art-all-Linux32-bin_AP.tar.gz (2MB)
- MacOS X: art-all-MacOS32-bin_AP.tar.gz (1MB)
- Windows: art-all-Win32-bin_AP.tar.gz.zip (1MB)
Source Packages
- ART 454 read simulator art_454_src-1.2.5.tar.gz (877KB)
- ART Illumina read simulator art_Illumina_src-1.1.5.tar.gz (2MB)
- ART SOLiD read simulator art_SOLiD_src-0.9.1.tar.gz (1023KB)
Installation
Compilation and installation from a source package
Compiling ART from source requires the GNU Scientific Library (GSL), which can be downloaded freely from the GSL Software Site. To compile under a Linux/Unix-like operating system, first download and unpack the desired source package, then enter the top-level directory of the unpacked package and issue the following commands:
./configure
make
make install
Installation from a binary package
To install, simply unpack the binary package into your installation directory. The executable programs are art_454, art_illumina, and art_SOLiD for the 454, Illumina, and SOLiD platforms, respectively. Under Linux or MacOS, use the following command to unpack a *.tar.gz binary package:
tar xfz art_*.tar.gz
The ART binary package for Windows is a ZIP archive. Right-click the ZIP file and choose "extract" from the context menu to unpack it.
Usage
Basic ART usage and examples are given below. Please refer to the README file in each distribution package for further examples and detailed documentation.
454 read simulation
- Single-end reads
art_454 [ -s ] [ -p read_profile ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE>
Example:
art_454 seq_reference.fa ./outdir/dat_single_end 20
- Paired-end reads
art_454 [ -s ] [ -p read_profile ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE> <MEAN_FRAG_LEN> <STD_DE>
Example:
art_454 seq_reference.fa ./outdir/dat_paired_end 20 500 20
Illumina read simulation
- Single-end reads
art_illumina [options] -i <INPUT_SEQ_FILE> -l <READ_LEN> -f <FOLD_COVERAGE> -o <OUTPUT_FILE_PREFIX>
Example:
art_illumina -sam -i seq_reference.fa -l 50 -f 10 -o ./outdir/dat_single_end
- Paired-end reads
art_illumina [options] -i <INPUT_SEQ_FILE> -l <READ_LEN> -f <FOLD_COVERAGE> -o <OUTPUT_FILE_PREFIX> -m <MEAN_FRAG_LEN> -s <STD_DE>
Example:
art_illumina -p -sam -i seq_reference.fa -l 50 -f 20 -m 200 -s 10 -o ./outdir/dat_paired_end
- Mate-pair reads
art_illumina [options] -i <INPUT_SEQ_FILE> -l <READ_LEN> -f <FOLD_COVERAGE> -o <OUTPUT_FILE_PREFIX> -m <MEAN_FRAG_LEN> -s <STD_DE>
Example:
art_illumina -mp -sam -i seq_reference.fa -l 50 -f 20 -m 2050 -s 50 -o ./outdir/dat_paired_end
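When many simulations are run from a script, it can help to assemble the art_illumina command line programmatically. The Python sketch below is only an illustration built from the options shown in the examples above (-p, -mp, -sam, -i, -l, -f, -m, -s, -o); the helper name art_illumina_args and its parameters are hypothetical, and it does not cover other options described in the README.

```python
from typing import List, Optional


def art_illumina_args(ref: str, read_len: int, fold_cov: int, out_prefix: str,
                      mode: str = "single",
                      mean_frag_len: Optional[int] = None,
                      std_dev: Optional[int] = None,
                      sam: bool = True) -> List[str]:
    """Build an argument list mirroring the art_illumina examples above.

    mode is "single", "paired" (adds -p), or "mate" (adds -mp).
    """
    if mode not in ("single", "paired", "mate"):
        raise ValueError("mode must be 'single', 'paired', or 'mate'")
    args = ["art_illumina"]
    if mode == "paired":
        args.append("-p")
    elif mode == "mate":
        args.append("-mp")
    if sam:
        args.append("-sam")  # also write alignments in SAM format
    args += ["-i", ref, "-l", str(read_len), "-f", str(fold_cov)]
    if mode != "single":
        # Paired-end and mate-pair runs need the fragment size distribution.
        if mean_frag_len is None or std_dev is None:
            raise ValueError("mean_frag_len and std_dev are required for paired reads")
        args += ["-m", str(mean_frag_len), "-s", str(std_dev)]
    args += ["-o", out_prefix]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)`, provided art_illumina is on the PATH and the output directory already exists.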
SOLiD read simulation
- Single-end reads
art_SOLiD [ -s ] [ -p read_profile ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <READ_LEN> <FOLD_COVERAGE>
Example:
art_SOLiD -s seq_reference.fa ./outdir/dat_single_end 32 10
- Paired-end reads
art_SOLiD [ -s ] [ -p read_profile ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <READ_LEN> <FOLD_COVERAGE> <MEAN_FRAG_LEN> <STD_DE>
Example:
art_SOLiD seq_reference.fa ./outdir/dat_paired_end 25 10 500 20
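All three simulators take a FOLD_COVERAGE argument. As a rough guide to output size, fold coverage is conventionally the total number of sequenced bases divided by the reference length, so the expected number of fixed-length reads is about coverage × reference length / read length. The Python sketch below applies that standard definition; it is an estimate only, the function name is hypothetical, and ART's own rounding and per-sequence handling may differ. It does not apply directly to 454 reads, whose lengths vary.

```python
import math


def estimated_read_count(ref_len: int, read_len: int, fold_cov: float) -> int:
    """Estimate how many fixed-length reads give the requested fold coverage.

    Uses the conventional definition: coverage = reads * read_len / ref_len.
    """
    if ref_len <= 0 or read_len <= 0 or fold_cov < 0:
        raise ValueError("lengths must be positive and coverage non-negative")
    return math.ceil(fold_cov * ref_len / read_len)
```

For a 1,000,000 bp reference, 50 bp reads, and 10-fold coverage, this gives 200,000 reads; in a paired-end run that corresponds to roughly half as many read pairs.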
Contact
Weichun Huang, Ph.D. (http://www.niehs.nih.gov/research/atniehs/labs/bb/staff/huang/index.cfm)
Staff Scientist
Tel (919) 541-4943
Fax (919) 541-4311
huang6@niehs.nih.gov
