Variant calls from 1000 Genomes Project data called against GRCh38

2018-12-17 00:00:00 +0000

An integrated and phased biallelic SNV call set, generated from alignments of the 1000 Genomes phase three low-coverage and exome sequence data, is available on our FTP site. The variants were called directly against GRCh38. The data set combines call sets generated using GATK, FreeBayes and BCFtools, with subsequent imputation and phasing carried out using Beagle and SHAPEIT2. A recent poster describing the methods used to generate this data is available and a data note is in preparation. We are also in the process of submitting the data to EVA/dbSNP.

The files include per-chromosome files with genotypes for all samples, a genome-wide sites file and genotype files for each of the supporting call sets. The main files contain only unrelated individuals, with details for related individuals available in a separate set of files.
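
For readers unfamiliar with phased genotype data, the following is a minimal Python sketch of how one record from a per-chromosome genotype file might be read. It assumes the files are standard tab-delimited VCF with a `GT` entry in the FORMAT column and no missing genotypes; the example record, its `chr1` contig name and the function names are hypothetical and are not taken from the release itself.

```python
# Hypothetical VCF data line with three samples (not taken from the release).
EXAMPLE_RECORD = "chr1\t10000\t.\tA\tG\t.\tPASS\t.\tGT\t0|0\t0|1\t1|1"


def parse_phased_biallelic(line):
    """Parse one VCF data line into (chrom, pos, ref, alt, genotypes).

    Raises ValueError if the record is not a biallelic SNV or if any
    genotype is not phased (that is, does not use the '|' separator).
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    # A biallelic SNV has one single-base REF and one single-base ALT allele.
    if "," in alt or len(ref) != 1 or len(alt) != 1:
        raise ValueError("not a biallelic SNV")
    # Locate the GT entry within the colon-separated FORMAT column.
    gt_index = fields[8].split(":").index("GT")
    genotypes = []
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        if "|" not in gt:
            raise ValueError("genotype is not phased: " + gt)
        first, second = gt.split("|")
        genotypes.append((int(first), int(second)))
    return chrom, int(pos), ref, alt, genotypes


def alt_allele_frequency(genotypes):
    """Fraction of haplotypes carrying the ALT allele (coded as 1)."""
    total_haplotypes = 2 * len(genotypes)
    alt_count = sum(first + second for first, second in genotypes)
    return alt_count / total_haplotypes
```

In this sketch each genotype is returned as a pair of haplotype alleles, so the order within a pair is meaningful because the calls are phased. For routine work, an established VCF library or BCFtools is likely a better choice than hand-written parsing.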

Data files are available at: http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20181203_biallelic_SNV/

We will work toward releasing a pre-print of the data note as soon as possible but, in the meantime, please contact info@1000genomes.org with any questions.