Can I get cell lines for IGSR samples?

Answer:

In almost all cases, cell lines are available. For most samples, cell lines are held at the Coriell Cell Repository. Samples from HGDP are available at CEPH.

Coriell hold detailed information on the 1000 Genomes Project populations and sample collection process.

Individual sample pages in the data portal include links to the repository where the cell line is available.

The main exception to the above is around 400 samples in the Gambian Genome Variation Project (GGVP), for which cell lines are not available.

Can I get phenotype, gender and family relationship information for the individuals?

Answer:

Because 1000 Genomes Project data is freely available, no phenotype information was collected for any of the samples. All donors were over 18 and declared themselves to be healthy at the time of collection. We do provide a sample spreadsheet and a pedigree file, which contain ethnicity and gender for 1000 Genomes samples.
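As a rough illustration of how a pedigree file can be read for gender and family relationships, here is a minimal Python sketch. It assumes a tab-separated file with a header row and the column names shown, and it assumes the common PED convention of 1 for male, 2 for female and 0 for a parent who is not in the file. The family and sample IDs are invented for the example; check the actual pedigree file for its real layout before relying on this.

```python
import csv
import io

# Hypothetical excerpt: the IDs and column names are illustrative assumptions,
# not copied from the real pedigree file.
PEDIGREE_TEXT = (
    "Family ID\tIndividual ID\tPaternal ID\tMaternal ID\tGender\n"
    "FAM1\tS001\t0\t0\t1\n"
    "FAM1\tS002\t0\t0\t2\n"
    "FAM1\tS003\tS001\tS002\t2\n"
)

# Common PED convention, assumed here: 1 = male, 2 = female.
GENDER_CODES = {"1": "male", "2": "female"}


def read_pedigree(text):
    """Return {individual_id: {"gender", "father", "mother"}} from PED-like text."""
    records = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        records[row["Individual ID"]] = {
            "gender": GENDER_CODES.get(row["Gender"], "unknown"),
            # "0" conventionally marks a parent who is not in the file.
            "father": row["Paternal ID"] if row["Paternal ID"] != "0" else None,
            "mother": row["Maternal ID"] if row["Maternal ID"] != "0" else None,
        }
    return records


pedigree = read_pedigree(PEDIGREE_TEXT)

# Individuals with at least one listed parent are children in a family group.
children = [sid for sid, rec in pedigree.items() if rec["father"] or rec["mother"]]
print(children)
```

To use this on a downloaded file, pass its contents to read_pedigree and adjust the column names to match the header of that file.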

Can I volunteer to be part of the 1000 Genomes Project?

Answer:

The 1000 Genomes Project is not accepting volunteers to be sequenced. For more information about how samples were recruited, please see the About page.

Another large-scale resequencing project that does still have rounds of recruitment is the Personal Genome Project.

How many individuals have been sequenced in IGSR projects and how were they selected?

Answer:

IGSR holds data from 4973 individuals, some of whom are related to one another.

Is there gene expression and/or functional annotation available for the samples?

Answer:

Functional annotation

As part of our phase 1 analysis, we annotated our phase 1 variants against both coding and non-coding annotation, from GENCODE and the ENCODE project respectively.

This functional annotation can be found in our phase 1 analysis results directory. We present both the annotation we compared the variants to and VCF files which contain the functional consequences for each variant.
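For readers who want to inspect per-variant annotation in a VCF file, the following Python sketch shows one way to split the INFO column of a single data line into key/value pairs. The INFO column is the eighth tab-separated field and holds semicolon-separated entries. The example line and the CSQ_EXAMPLE key are invented placeholders; the real annotation keys are defined in the header of each VCF file and should be looked up there.

```python
def parse_info(info_field):
    """Split a VCF INFO column into a dict; bare flags map to True."""
    info = {}
    if info_field == ".":  # "." means the INFO column is empty
        return info
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Entries without "=" are flags, so record them as True.
        info[key] = value if sep else True
    return info


# Hypothetical data line: the position, alleles and the CSQ_EXAMPLE key
# are illustrative assumptions, not taken from a real file.
VCF_LINE = "1\t12345\t.\tA\tG\t100\tPASS\tAC=2;AN=10;CSQ_EXAMPLE=missense;VALIDATED"

columns = VCF_LINE.split("\t")
info = parse_info(columns[7])  # INFO is the eighth column
print(info)
```

For whole files, a dedicated VCF library is usually a better choice than hand parsing; this sketch is only meant to show where the annotation sits within each line.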

Gene expression

The most important existing expression datasets involving 1000 Genomes individuals are probably the following:

Pre-publication RNA-sequencing data from the Geuvadis project is available:

http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples.html
http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-2/samples.html

http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-197

http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-198
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-264

http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-19480

References

  1. Montgomery SB, Sammeth M, Gutierrez-Arcelus M, Lach RP, Ingle C, Nisbett J, Guigo R, Dermitzakis ET. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature. 2010 Apr 1;464(7289):773-7. Epub 2010 Mar 10.
  2. Stranger BE, Montgomery SB, Dimas AS, Parts L, Stegle O, Ingle CE, Sekowska M, Davey Smith G, Evans D, Gutierrez-Arcelus M, Price A, Raj T, Nisbett J, Nica AC, Beazley C, Durbin R, Deloukas P, Dermitzakis ET. Patterns of cis regulatory variation in diverse human populations. PLoS Genetics, in press.
  3. Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, Veyrieras JB, Stephens M, Gilad Y, Pritchard JK. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010 Apr 1;464(7289):768-72. Epub 2010 Mar 10.

What do the population codes mean?

Answer:

Each three-letter population code represents a different population. For example, CEU means Northern Europeans from Utah and TSI means Tuscans from Italy. A summary of all these codes is available both in a README on the FTP site and in the Data Portal.
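If you are working with sample metadata in a script, a simple lookup table is one way to turn codes into readable names. The Python sketch below contains only the two codes mentioned above; the function name is our own, and the full list of codes should be taken from the readme or the Data Portal rather than from this example.

```python
# Only the two codes named in this answer; the full list is in the readme
# on the ftp site and in the Data Portal.
POPULATION_CODES = {
    "CEU": "Northern Europeans from Utah",
    "TSI": "Tuscans from Italy",
}


def describe_population(code):
    """Return the description for a three-letter code, ignoring case and spaces."""
    return POPULATION_CODES.get(code.strip().upper(), "unknown population code")


print(describe_population("CEU"))
print(describe_population(" tsi "))
```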

Which populations are part of your study?

Answer:

You can see details of the different populations used in different parts of IGSR in the populations section of the Data Portal. These can be viewed as markers on a map or in tables with details of data collections or technologies.
