The original data are from
http://www.fruitfly.org/seq_tools/datasets/Human/GENIE_96/splicesets/

The data partitions for 6-fold cross-validation are based on Martin Reese's
partitions, whose gene list is given in the file "combined_GB.sets"
(http://www.fruitfly.org/seq_tools/datasets/Human/GENIE_96/combined_GB.sets).

The training and testing datasets for the 6-fold cross-validation are in
6 sub-folders (cv0 - cv5). Each sub-folder contains the following 3 files:

1) DonorTrT.mf  -> Training (True sites)
2) DonorTrF.fa  -> Training (False sites)
3) DonorTest.fa -> Testing

***Note***
Training and testing datasets in the "cv0" sub-folder were directly converted
from the original data at
http://www.fruitfly.org/seq_tools/datasets/Human/GENIE_96/splicesets/

Prepared by Weichun Huang, weichun.huang@duke.edu
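
The folder layout described above can be walked with a short script. The following is only a minimal sketch, not part of the original dataset tooling: the names `fold_paths`, `read_fasta`, and the `root` argument are hypothetical, and the parser assumes the files are FASTA-style text (a ">" header line followed by sequence lines). That is suggested by the ".fa" extension but is not stated in this README, and the format of "DonorTrT.mf" in particular is an assumption.

```python
from pathlib import Path

# Fold sub-folders and per-fold file names, as listed in this README.
FOLDS = [f"cv{i}" for i in range(6)]
FILES = {
    "train_true": "DonorTrT.mf",   # training, true sites
    "train_false": "DonorTrF.fa",  # training, false sites
    "test": "DonorTest.fa",        # testing
}


def fold_paths(root):
    """Map each fold name (cv0..cv5) to the paths of its three files."""
    root = Path(root)
    return {
        fold: {key: root / fold / name for key, name in FILES.items()}
        for fold in FOLDS
    }


def read_fasta(path):
    """Parse a FASTA-style file into a list of (header, sequence) pairs.

    Assumption: records start with a '>' header line; sequence lines that
    follow are concatenated until the next header.
    """
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, `fold_paths("path/to/data")["cv0"]["test"]` would give the path of the cv0 testing file, which could then be passed to `read_fasta` if the file does turn out to be FASTA-formatted.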