Introduction

The read correction package is a short-read correction tool and part of SOAPdenovo. It is specially designed to correct Illumina GA short reads. The package includes 4 programs:

1. KmerFreq (a kmer frequency counter)
2. Corrector (the program that performs the correction)
3. merge_pair.pl (extracts pairs from two files that contain read1 and read2 separately)
4. merge_pair_list.pl (extracts pairs from a list of two files)

System Requirement

This correction tool is aimed at large plant and animal genomes, although it also works well on bacterial and fungal genomes. It runs on a 64-bit Linux system with a minimum of 18G of physical memory. If the seed length is set to 17, KmerFreq needs 16G of memory. If the seed length is 17, the thread number is 4, and the size of each read file is 10M, Corrector needs 24G of memory.

Command Line Options

1. Get it started

Pipeline for reads correction:

    KmerFreq -> Corrector -> merge_pair_list.pl

The simplest script to run these programs is:

    KmerFreq -i input_file_list -o output_file_name_prefix
    Corrector -i input_file_list -r input_kmer_table
    perl merge_pair_list.pl input_file_list_corr

Note:
    input_file_list       a list file that includes all the files to be corrected
    input_kmer_table      the kmer table output by KmerFreq
    input_file_list_corr  a list of the files output by Corrector

2. Options:

Options for KmerFreq
    -i [string]  input file list
    -o [string]  output filename prefix
    -q [int]     quality cutoff [default 5]
    -s [int]     seed length [default 17]
    -n [int]     output kmer index along with frequency? 0: no, 1: yes [default 0]
    -f [int]     file format: 1: fq, 2: fa [default 1]
    -h/-?        help

Options for Corrector
    -i [string]  input read file list
    -r [string]  input kmer frequency file name
    -n [int]     kmer frequency along with index? 0: no, 1: yes [default 0]
    -k [int]     start of kmer frequency cutoff [default 5]
    -e [int]     end of kmer frequency cutoff [default 5]
    -d [int]     maximum error bases allowed [default 2]
    -s [int]     seed length [default 17]
    -t [int]     thread number [default 4]
    -f [int]     file format: 1: fq, 2: fa [default 1]
    -h/-?        help

3. Output files

These files are output as KmerFreq's results:
    *.stat
    *.freq

These files are output as Corrector's results:
    *.corr

These files are output as merge_pair_list.pl's results:
    *.pair    each file contains reads in pairs
    *.single
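
The split between *.pair and *.single output can be pictured with a minimal Python sketch. It is not the code of merge_pair_list.pl. It rests on two assumptions that this README does not state: that mates share a read name ending in /1 and /2, and that a read found in only one of the two files belongs in the single output. The names mate_key and split_pairs are hypothetical.

```python
def mate_key(read_id):
    """Return the read name without a trailing /1 or /2 (assumed naming scheme)."""
    if read_id.endswith("/1") or read_id.endswith("/2"):
        return read_id[:-2]
    return read_id


def split_pairs(reads1, reads2):
    """Split two lists of (read_id, sequence) into paired and single reads.

    Assumption: a read whose mate is absent from the other list is a single.
    """
    # Index the read2 records by their shared name for constant-time lookup.
    by_key2 = {mate_key(rid): (rid, seq) for rid, seq in reads2}
    pairs, singles = [], []
    matched = set()

    # Walk read1 in file order; keep the pair if read2 has the same name.
    for rid, seq in reads1:
        key = mate_key(rid)
        if key in by_key2:
            pairs.append(((rid, seq), by_key2[key]))
            matched.add(key)
        else:
            singles.append((rid, seq))

    # Any read2 record that was never matched is also a single.
    for rid, seq in reads2:
        if mate_key(rid) not in matched:
            singles.append((rid, seq))

    return pairs, singles


# Toy data: r2 is missing from read2 and r3 is missing from read1.
reads1 = [("r1/1", "ACGT"), ("r2/1", "GGCA"), ("r4/1", "TTAG")]
reads2 = [("r1/2", "TGCA"), ("r3/2", "CCAT"), ("r4/2", "CTAA")]
pairs, singles = split_pairs(reads1, reads2)
```

In this toy input, r1 and r4 would go to the pair output, while r2/1 and r3/2 would go to the single output. Real read files may name mates differently, so the matching rule in mate_key would need to be adapted.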