Chromosomes, simulated read data and error profiles for the paper:
"Iterative error correction of long sequencing reads maximizes accuracy and improves contig assembly"
Katrin Sameith, Juliana Roscito and Michael Hiller

# Description of the files:

# Error profiles for paired-end 300 bp MiSeq reads:
errorProfile_300bpR1.txt
errorProfile_300bpR2.txt

# Chicken chr14 both haplotypes
chicken.chr14.tar.gz

# Chicken simulated 30X read data
chicken.simulatedReads30X.tar.gz

# Human chr11 both haplotypes
human.chr11.tar.gz

# Human simulated 30X read data
human.simulatedReads30X.tar.gz

# Human simulated 60X read data
human.simulatedReads60X.tar.gz

# Human simulated 100X read data
human.simulatedReads100X.tar.gz

# Lizard chr4 both haplotypes
lizard.chr4.tar.gz

# Lizard simulated 30X read data
lizard.simulatedReads30X.tar.gz

The simulated read tar.gz files contain 8 files for each dataset, e.g.
galGal4.chr14.noN.sim300bp.30X.hapA.R1.aln
galGal4.chr14.noN.sim300bp.30X.hapA.R1.fq
galGal4.chr14.noN.sim300bp.30X.hapA.R2.aln
galGal4.chr14.noN.sim300bp.30X.hapA.R2.fq
galGal4.chr14.noN.sim300bp.30X.hapB.R1.aln
galGal4.chr14.noN.sim300bp.30X.hapB.R1.fq
galGal4.chr14.noN.sim300bp.30X.hapB.R2.aln
galGal4.chr14.noN.sim300bp.30X.hapB.R2.fq

*hapA are reads sampled from the hapA haplotype (hapA.fasta)
*hapB are reads sampled from the hapB haplotype (hapB.fasta)
*fq are the fastq paired-end reads with errors
*aln contain the genomic start position of the read, the true sequence of the read and the sequence after introducing errors.

Note that read IDs (e.g. chr14-758083/1) are not unique: each ID occurs twice, once in the read file of each haplotype.
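
Because the same read ID appears in both the hapA and the hapB read files, combining the two files into one FASTQ would produce duplicate IDs. The following is a minimal sketch of one way to avoid this, by prefixing each ID with its haplotype before merging. It assumes plain four-line FASTQ records; the function names (tag_fastq_ids, merge_haplotypes) and the "hapA_"/"hapB_" prefix are illustrative and not part of the dataset or the paper's pipeline.

```python
from typing import Iterable, Iterator, List


def tag_fastq_ids(lines: Iterable[str], tag: str) -> Iterator[str]:
    """Yield FASTQ lines with `tag` prefixed to every read ID.

    Assumption: each record is exactly four lines (header, sequence,
    '+' separator, qualities). Blank lines are skipped.
    """
    record_line = 0  # position within the current 4-line record
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if record_line == 0:
            if not line.startswith("@"):
                raise ValueError(f"expected FASTQ header, got {line!r}")
            # "@chr14-758083/1" becomes "@hapA_chr14-758083/1";
            # the /1 or /2 mate suffix is left untouched.
            yield f"@{tag}_{line[1:]}"
        else:
            yield line
        record_line = (record_line + 1) % 4


def merge_haplotypes(hap_a: Iterable[str], hap_b: Iterable[str]) -> List[str]:
    """Concatenate hapA and hapB reads with haplotype-tagged, unique IDs."""
    merged = list(tag_fastq_ids(hap_a, "hapA"))
    merged.extend(tag_fastq_ids(hap_b, "hapB"))
    return merged


if __name__ == "__main__":
    # Toy records for illustration only; not taken from the dataset.
    hap_a = ["@chr14-758083/1\n", "ACGT\n", "+\n", "IIII\n"]
    hap_b = ["@chr14-758083/1\n", "ACGA\n", "+\n", "IIII\n"]
    for out_line in merge_haplotypes(hap_a, hap_b):
        print(out_line)
```

In practice the two iterables would be open file handles for the hapA and hapB .fq files of the same mate (R1 with R1, R2 with R2), so that the mates stay paired after tagging.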