1. gdsfmt and SNPRelate - Please follow this link to view the tutorial.
gdsfmt and SNPRelate are high-performance computing R packages for multi-core symmetric multiprocessing computer architectures. They are used to accelerate two key computations in GWAS: principal component analysis (PCA) and relatedness analysis using identity-by-descent (IBD) measures. The kernels of our algorithms are written in C/C++ and have been highly optimized. Benchmarks show that the uniprocessor implementations of PCA and IBD are roughly 8 to 50 times faster than the implementations provided by the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and that the speedup reaches 30- to 300-fold when eight cores are used. SNPRelate can analyze tens of thousands of samples with millions of SNPs.
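The two computations named above can be sketched in a few lines of R. This is a minimal illustration, not the tutorial's own code: it assumes SNPRelate and gdsfmt are installed from Bioconductor, it uses the small example GDS file shipped with SNPRelate, and the thread count and the MAF and missing-rate thresholds are arbitrary illustrative choices.

```r
library(gdsfmt)
library(SNPRelate)

# Open the example genotype file bundled with SNPRelate (an assumption made
# for illustration; substitute the path to your own GDS file).
genofile <- snpgdsOpen(snpgdsExampleFileName())

# Principal component analysis, spread over two threads.
pca <- snpgdsPCA(genofile, num.thread = 2)

# Proportion of variance explained by the first few components, in percent.
head(round(pca$varprop * 100, 2))

# Relatedness analysis: method-of-moments IBD estimates (k0, k1) for every
# pair of samples. The thresholds below are illustrative only.
ibd <- snpgdsIBDMoM(genofile, maf = 0.05, missing.rate = 0.05,
                    num.thread = 2)

# Flatten the pairwise matrices into one row per sample pair.
ibd_pairs <- snpgdsIBDSelection(ibd)
head(ibd_pairs)

snpgdsClose(genofile)
```

Raising `num.thread` is how the multi-core speedup described above would be requested; the best value depends on the machine.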
subjects based on genetic marker data from single-nucleotide polymorphisms (SNPs). The package is able to accommodate SNPs in linkage disequilibrium (LD), without the need to thin the markers so that they are approximately independent in the population. Sample pairs are identified by superposing their estimated identity-by-descent (IBD) coefficients on plots of IBD coefficients for pairs of simulated subjects from one of several common close relationships. The methods are particularly relevant to candidate-gene association studies, in which dependent SNPs cluster in a relatively small number of genes spread throughout the genome. The accommodation of LD allows the use of all available genetic data, a desirable property when working with a modest number of dependent SNPs within candidate genes.