Tuesday, April 19, 2011

detecting population structure by PCA

(1) several papers:

A genealogical interpretation of principal components analysis 

http://www.ncbi.nlm.nih.gov/sites/entrez/19834557?dopt=Abstract&holding=f1000,f1000m,isrctn

Genome-wide patterns of population structure and admixture in West Africans and African Americans

http://www.ncbi.nlm.nih.gov/sites/entrez/20080753?dopt=Abstract&holding=f1000,f1000m,isrctn

Analysis of population structure: a unifying framework and novel methods based on sparse factor analysis

http://www.ncbi.nlm.nih.gov/sites/entrez/20862358?dopt=Abstract&holding=f1000,f1000m,isrctn
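
As a rough illustration of what these papers build on, here is a minimal Python sketch of PCA applied to a genotype matrix. It is not taken from any of the papers above. The data are simulated toy genotypes (allele counts 0/1/2 for two populations with assumed allele frequencies of 0.1 and 0.9), and the function names `simulate_genotypes`, `center_columns` and `first_pc_scores` are hypothetical. It uses only the standard library and finds the first principal component by power iteration; a real analysis would rely on a dedicated package.

```python
import math
import random


def simulate_genotypes(n_per_pop=5, n_snps=50, freqs=(0.1, 0.9), seed=0):
    """Toy data: one row per individual, one column per SNP (0/1/2 allele counts).

    Individuals from the first population come first. The allele frequencies
    are illustrative assumptions, not values from any real dataset.
    """
    rng = random.Random(seed)
    rows = []
    for p in freqs:
        for _ in range(n_per_pop):
            # A diploid genotype is the sum of two Bernoulli(p) allele draws.
            rows.append([(rng.random() < p) + (rng.random() < p)
                         for _ in range(n_snps)])
    return rows


def center_columns(genotypes):
    """Subtract each SNP's mean allele count from its column."""
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def first_pc_scores(genotypes, iterations=200):
    """Return each individual's score on the first principal component.

    Power iteration is run on the individual-by-individual matrix X X^T,
    where X is the column-centered genotype matrix.
    """
    x = center_columns(genotypes)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    # Deterministic, non-constant start vector (a constant vector lies in
    # the null space of the Gram matrix after centering).
    v = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            # No variation at all in the data: every score is zero.
            return [0.0] * n
        v = [c / norm for c in w]
    # Leading eigenvalue of X X^T; scores are sqrt(eigenvalue) * eigenvector.
    gv = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, gv))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * c for c in v]


if __name__ == "__main__":
    for index, score in enumerate(first_pc_scores(simulate_genotypes())):
        population = "A" if index < 5 else "B"
        print(population, round(score, 3))
```

With populations this strongly differentiated, the first component should place the two groups on opposite sides of zero. The sign of a principal component is arbitrary, so only the separation between groups is meaningful, not which group gets the positive scores.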


(2) adegenet


- a review of applications of multivariate analyses to genetic marker data:


Jombart T, Pontier D, Dufour AB (2009) Genetic markers in the playground of multivariate analysis. Heredity 102: 330-341. doi:10.1038/hdy.2008.130


- the paper presenting the spatial principal component analysis (sPCA, function spca), global and local tests (global.rtest and local.rtest):
Jombart T, Devillard S, Dufour AB, Pontier D (2008) Revealing cryptic spatial patterns in genetic variability by a new multivariate method. Heredity 101: 92-103. doi:10.1038/hdy.2008.34


- the paper presenting the SeqTrack algorithm (seqTrack), and simulations of genealogies of haplotypes (haploGen):
 

Jombart T, Eggo RM, Dodd PJ, Balloux F (2010) Reconstructing disease outbreaks from genetic data: a graph approach. Heredity. doi:10.1038/hdy.2010.78

- the paper introducing the Discriminant Analysis of Principal Components (DAPC, functions find.clusters and dapc): 

Jombart T, Devillard S, Balloux F (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics 11:94. doi:10.1186/1471-2156-11-94

 

3 comments:

  1. Actually, I would like to run a PCA to infer the population structure of my mapping populations, but I don't know whether PCs are more practical than the Q matrix from STRUCTURE. Do you have any suggestions?

  2. PCA is more practical than the approach implemented in STRUCTURE, both in computational efficiency and in the accuracy of genetic assignment. See the last paper listed in this post.

    Moreover, the new method (DAPC) can give you many more results than STRUCTURE, as demonstrated in that paper.

  3. Thanks for the clarification. I'll try DAPC.
