A set of 33315 filtered_fastq files, containing sequence reads that have passed the DCC fastq QC process, is now available on the FTP site.
The input to the DCC QC pipeline is the full set of fastq files retrieved from ERA as of 4 September 2009, including reads generated by all three pilots and the main project.
Summary statistics of the current set of files:
Total read count: 143,297,692,374
Total base count: 6,248,070,765,643
Total filtered read count: 130,945,326,161
Total filtered base count: 5,715,866,786,951
Percentage of good reads: 91.38%
Percentage of good bases: 91.48%
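
The two percentages above follow directly from the four totals. As a minimal sketch of that arithmetic (the variable and function names are illustrative only and are not part of the DCC QC pipeline):

```python
# Totals copied from the summary statistics above.
total_reads = 143_297_692_374
total_bases = 6_248_070_765_643
filtered_reads = 130_945_326_161
filtered_bases = 5_715_866_786_951


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100 * part / whole, 2)


# Hypothetical names, used here only to illustrate the calculation.
good_reads_pct = percent(filtered_reads, total_reads)
good_bases_pct = percent(filtered_bases, total_bases)

print(f"Percentage of good reads: {good_reads_pct:.2f}%")
print(f"Percentage of good bases: {good_bases_pct:.2f}%")
```

Rounded to two decimal places, the results match the reported 91.38% and 91.48%.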
Additional information: the Changelog lists the new files, and the QC criteria are described in README.sequence_data.