In almost all cases, cell lines are available. For most samples, cell lines are held at the Coriell Cell Repository. Samples from HGDP are available at CEPH.
Coriell holds detailed information on the 1000 Genomes Project populations and the sample collection process.
Individual sample pages in the data portal include links to the repository where the cell line is available.
The main exception to the above is around 400 samples in the Gambian Genome Variation Project (GGVP), for which cell lines are not available.
Because the 1000 Genomes Project data is freely available, no phenotype information was collected for any of the samples. All donors were over 18 and declared themselves to be healthy at the time of collection. We do provide a sample spreadsheet and a pedigree file, which contain ethnicity and gender for 1000 Genomes samples.
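As an illustration of how ethnicity and gender could be summarised from a pedigree-style table, here is a minimal Python sketch. The column names, sample identifiers and rows are hypothetical and are not copied from the project's actual pedigree file; the gender coding (1 for male, 2 for female) is also assumed, following the common PED convention. Check the real file's header before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt: column names and rows are illustrative assumptions,
# not taken from the project's actual pedigree file.
SAMPLE_PED = (
    "Family ID\tIndividual ID\tPaternal ID\tMaternal ID\tGender\tPopulation\n"
    "FAM1\tS0001\t0\t0\t1\tCEU\n"
    "FAM1\tS0002\t0\t0\t2\tCEU\n"
    "FAM1\tS0003\tS0001\tS0002\t2\tCEU\n"
    "FAM2\tS0004\t0\t0\t1\tTSI\n"
)


def count_by_population_and_gender(handle):
    """Count samples per (population, gender) pair in a tab-delimited table."""
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        counts[(row["Population"], row["Gender"])] += 1
    return counts


counts = count_by_population_and_gender(io.StringIO(SAMPLE_PED))
for (population, gender), n in sorted(counts.items()):
    print(population, gender, n)
```

To use this on a downloaded file, pass an open file handle in place of the in-memory string and adjust the column names to match the file's header.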
The 1000 Genomes Project is not accepting volunteers to be sequenced. For more information about how samples were recruited, please see the About page.
Another large-scale resequencing project that does still have rounds of recruitment is the Personal Genome Project.
IGSR holds data from 4973 individuals, some of whom are related.
As part of our phase 1 analysis, we functionally annotated the phase 1 variants against coding annotation from GENCODE and non-coding annotation from the ENCODE project.
This functional annotation can be found in our phase 1 analysis results directory. The directory contains both the annotation the variants were compared to and VCF files that give the functional consequences of each variant.
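VCF files carry per-variant annotation in the INFO column, which is a semicolon-separated list of key=value pairs and bare flags. The sketch below shows one way to read that column in Python. The data line and the key name CSQ_EXAMPLE are invented for illustration; the keys used in the phase 1 annotation files are described in each file's own header and should be taken from there.

```python
def parse_info(info_field):
    """Split a VCF INFO column into a dict; flags without '=' map to True."""
    info = {}
    if info_field == ".":  # "." means the INFO column is empty
        return info
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


# Hypothetical data line: the CSQ_EXAMPLE key and all values are invented.
VCF_LINE = "1\t12345\trs0000\tA\tG\t100\tPASS\tAA=A;CSQ_EXAMPLE=missense_variant;VALIDATED"

fields = VCF_LINE.split("\t")
info = parse_info(fields[7])  # INFO is the eighth column of a VCF data line
print(info.get("CSQ_EXAMPLE"))
```

For real analyses, an established VCF library is likely a better choice than hand parsing, since it also interprets the header definitions for each INFO key.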
The most important existing expression datasets involving 1000 Genomes individuals are probably the following:
Pre-publication RNA-sequencing data from the Geuvadis project is available at:
http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples.html
http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-2/samples.html
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-197
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-198
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-264
http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-19480
Each three-letter population code represents a different population. For example, CEU means Northern Europeans from Utah and TSI means Tuscans from Italy. There is a summary of all these codes both in a readme on the FTP site and in the Data Portal.
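A lookup of this kind can be sketched in Python as a simple mapping. Only the two codes named above are included; the function name and the fallback message are illustrative choices, and the full list of codes should be taken from the readme or the Data Portal.

```python
# Only the two codes mentioned in the text; extend from the official summary.
POPULATION_CODES = {
    "CEU": "Northern Europeans from Utah",
    "TSI": "Tuscans from Italy",
}


def describe_population(code):
    """Return the description for a three-letter population code."""
    normalized = code.strip().upper()
    return POPULATION_CODES.get(
        normalized, "unknown population code: " + normalized
    )


print(describe_population("CEU"))
print(describe_population("tsi"))
```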
You can see details of the different populations used in different parts of IGSR in the populations section of the Data Portal. These can be viewed as markers on a map or in tables with details of data collections or technologies.