detecting population structure by PCA
(1) several papers:
A genealogical interpretation of principal components analysis
http://www.ncbi.nlm.nih.gov/sites/entrez/19834557?dopt=Abstract&holding=f1000,f1000m,isrctn
Genome-wide patterns of population structure and admixture in West Africans and African Americans
http://www.ncbi.nlm.nih.gov/sites/entrez/20080753?dopt=Abstract&holding=f1000,f1000m,isrctn
Analysis of population structure: a unifying framework and novel methods based on sparse factor analysis
http://www.ncbi.nlm.nih.gov/sites/entrez/20862358?dopt=Abstract&holding=f1000,f1000m,isrctn
(2) adegenet
- a review of applications of multivariate analyses to genetic markers data:
Jombart T, Pontier D, Dufour AB (2009) Genetic markers in the playground of multivariate analysis. Heredity 102: 330-341. doi:10.1038/hdy.2008.130
- the paper presenting the spatial principal component analysis (sPCA, function spca), global and local tests (global.rtest and local.rtest):
Jombart T, Devillard S, Dufour AB, Pontier D (2008) Revealing cryptic spatial patterns in genetic variability by a new multivariate method. Heredity 101: 92-103. doi:10.1038/hdy.2008.34
- the paper presenting the SeqTrack algorithm (seqTrack), and simulations of genealogies of haplotypes (haploGen):
Jombart T, Eggo RM, Dodd PJ, Balloux F (2010) Reconstructing disease outbreaks from genetic data: a graph approach. Heredity. doi:10.1038/hdy.2010.78
- the paper introducing the Discriminant Analysis of Principal Components (DAPC, functions find.clusters and dapc):
Jombart T, Devillard S, Balloux F (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics 11:94. doi:10.1186/1471-2156-11-94
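For readers who want to see the basic idea behind PCA-based structure detection, here is a minimal sketch in plain Python. It is not the method of any paper listed above, nor the adegenet implementation. The toy genotype matrix and the helper names (`center_columns`, `first_pc_scores`) are made up for illustration. It centers each marker, builds the individual-by-individual Gram matrix, and extracts the first principal component by power iteration.

```python
import math

# Hypothetical toy data: rows are individuals, columns are SNPs,
# entries are counts (0, 1 or 2) of one allele.
# The first three individuals are meant to come from one population,
# the last three from another.
genotypes = [
    [0, 0, 1, 2, 2, 0],
    [0, 1, 0, 2, 2, 1],
    [0, 0, 0, 2, 1, 0],
    [2, 2, 1, 0, 0, 1],
    [2, 1, 2, 0, 0, 0],
    [2, 2, 2, 0, 1, 1],
]


def center_columns(matrix):
    """Subtract each marker's mean so every column sums to zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def first_pc_scores(matrix, iterations=200):
    """Return each individual's score on the first principal component."""
    x = center_columns(matrix)
    n = len(x)
    # Gram matrix: inner products between centered individuals.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    # Power iteration from an arbitrary non-constant start vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        vec = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    # Rayleigh quotient gives the leading eigenvalue of the Gram matrix.
    eigenvalue = sum(v * sum(g * w for g, w in zip(row, vec))
                     for v, row in zip(vec, gram))
    # Scale the unit eigenvector by the singular value to get PC scores.
    return [v * math.sqrt(eigenvalue) for v in vec]


scores = first_pc_scores(genotypes)
for label, score in zip("AAABBB", scores):
    print(label, round(score, 3))
```

In this toy example the two groups end up on opposite sides of zero on the first axis. Which group gets the positive sign is arbitrary. Real data would need many more markers, handling of missing genotypes, and usually several components rather than one.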
Actually, I would like to run a PCA to infer the population structure of my mapping populations, but I don't know whether PCs are more practical than the Q matrix from STRUCTURE. Do you have any suggestions?
PCA is more practical than the approach implemented in STRUCTURE, because of its computational efficiency and its accuracy of genetic assignment. You could refer to the last paper listed in this post.
Moreover, the new method (DAPC) can give you many more results than STRUCTURE, as demonstrated in the paper.
Thanks for the clarification, I'll try DAPC.