README
________________________________________________________________________

These Perl scripts are associated with the manuscript "A streamlined
method for detecting structural variants in cancer genomes by short read
paired-end sequencing". Brief descriptions, usage information and module
requirements are given below.
________________________________________________________________________

PE.pl

PE.pl simulates paired-end sequencing data and writes FASTQ files in the
Illumina format. It selects random positions in the user-provided
genome, normalized for different chromosome lengths. User-defined
parameters include the number of read pairs, the read length, the mean
insert size and its standard deviation.

Requirements:
Math::Random

Usage:
perl PE.pl
________________________________________________________________________

links.pl

links.pl takes the SVDetect (http://svdetect.sourceforge.net) "linking"
output file (*.links) as input, searches for "imperfect duplicates"
(reads with a 1-2 bp offset in coordinates) and writes a modified *.links
file with the imperfect duplicates removed.

The script requires three arguments:
(1) the read length
(2) the input filename
(3) the output filename
________________________________________________________________________

Preprocess_rm0to22.pl
Preprocess_rm_normal.pl

Preprocess_rm0to22.pl and Preprocess_rm_normal.pl modify BWA SAM files.
Preprocess_rm0to22.pl removes every read pair in which one or both reads
have a mapping quality of 0-22. Preprocess_rm_normal.pl removes
concordant read pairs.

Usage:
Preprocess_rm_normal.pl
An output file with the extension .rm_normal.sam is produced.

Preprocess_rm0to22.pl
An output file with the extension .rm0to22.sam is produced.
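________________________________________________________________________

Illustrative sketch: sampling read-pair positions

The description of PE.pl above can be illustrated with a short Python
sketch. It is not the code of PE.pl. It assumes that "normalized for
different chromosome lengths" means chromosomes are chosen in proportion
to their length, that insert sizes follow a normal distribution, and
that the insert size is the full fragment length including both reads.
The function name, its parameters and the returned tuple layout are
hypothetical.

```python
import random


def simulate_pair_positions(chrom_lengths, n_pairs, read_len,
                            mean_insert, sd_insert, seed=None):
    """Return (chrom, read1_start, read2_start, insert) tuples, 1-based.

    Assumption of this sketch: a chromosome is picked with probability
    proportional to its length, so every genome position is roughly
    equally likely to start a fragment.
    """
    rng = random.Random(seed)
    names = list(chrom_lengths)
    weights = [chrom_lengths[name] for name in names]

    pairs = []
    while len(pairs) < n_pairs:
        chrom = rng.choices(names, weights=weights, k=1)[0]
        insert = int(round(rng.gauss(mean_insert, sd_insert)))
        # Redraw fragments shorter than one read or longer than the
        # chosen chromosome.
        if insert < read_len or insert > chrom_lengths[chrom]:
            continue
        start = rng.randint(1, chrom_lengths[chrom] - insert + 1)
        read2_start = start + insert - read_len
        pairs.append((chrom, start, read2_start, insert))
    return pairs


# Toy genome with two chromosomes (made-up lengths).
toy_genome = {"chr1": 10000, "chr2": 5000}
pairs = simulate_pair_positions(toy_genome, n_pairs=100, read_len=50,
                                mean_insert=300, sd_insert=30, seed=1)
for pair in pairs[:3]:
    print(pair)
```

The redraw loop does not terminate if no chromosome can hold a valid
fragment, so real parameters need to be checked before use.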
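________________________________________________________________________

Illustrative sketch: mapping quality filter

The filter described for Preprocess_rm0to22.pl can be sketched in a few
lines of Python. This is not the code of the Perl script. It assumes
that both reads of a pair share the same read name and sit on adjacent
lines of the SAM file, and it relies on the SAM convention that header
lines start with "@" and that the mapping quality is the fifth
tab-separated field. The function name and the toy records are
hypothetical.

```python
from itertools import groupby

MIN_MAPQ = 23  # mapping qualities 0-22 are treated as too low


def filter_low_mapq_pairs(sam_lines, min_mapq=MIN_MAPQ):
    """Return header lines plus all pairs in which every read has
    a mapping quality of at least min_mapq."""
    kept = []
    records = []
    for raw in sam_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            kept.append(line)  # header lines pass through unchanged
        else:
            records.append(line.split("\t"))

    # Group consecutive records by read name (field 1 of a SAM record).
    for _, group in groupby(records, key=lambda fields: fields[0]):
        reads = list(group)
        # Field 5 (index 4) is MAPQ; drop the pair if any read is low.
        if all(int(fields[4]) >= min_mapq for fields in reads):
            kept.extend("\t".join(fields) for fields in reads)
    return kept


# Toy SAM content: pairB has one read with MAPQ 22 and is removed.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr2\tLN:1000",
    "pairA\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "pairA\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "pairB\t99\tchr1\t400\t60\t50M\t=\t600\t250\t*\t*",
    "pairB\t147\tchr1\t600\t22\t50M\t=\t400\t-250\t*\t*",
    "pairC\t97\tchr1\t700\t23\t50M\tchr2\t50\t0\t*\t*",
    "pairC\t145\tchr2\t50\t37\t50M\tchr1\t700\t0\t*\t*",
]
filtered = filter_low_mapq_pairs(example)
for line in filtered:
    print(line)
```

Grouping by adjacent read names keeps the sketch short; a file sorted by
coordinate would need the reads collected by name first.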