Tuesday, June 7, 2011

libsequence and the software it supports

http://molpopgen.org/software/lseqsoftware.html

Software dealing with sequence analysis:
analysis - C++ software for evolutionary genetic analysis. This package also requires the GNU Scientific Library (GSL) to be installed. Many Linux distros provide GSL packages, and OS X users can install it using either the fink or darwinports projects, according to their preference. (I prefer darwinports, for what that's worth.) However, to make life easier on yourself, I recommend that OS X users install the GSL directly from the source code available at the GSL homepage. The reason for this is that I have not modified the build system to be able to deal with lib directories other than /usr/local/lib and /usr/lib. The GSL is used to calculate chi-squared probabilities for the program MKtest. If you're not aware of it, the GSL is a C library for numeric computation, essentially a modern version of "Numerical Recipes in C".
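To make the chi-squared step concrete, here is a minimal Python sketch of a Pearson chi-squared test on a 2x2 McDonald-Kreitman table. It is not the MKtest code and does not use the GSL; the function name mk_chi_squared and the counts in the usage line are made up for illustration, and the p-value relies on the fact that a chi-squared variable with 1 degree of freedom has upper tail probability erfc(sqrt(x/2)).

```python
import math


def mk_chi_squared(fixed_syn, fixed_nonsyn, poly_syn, poly_nonsyn):
    """Pearson chi-squared test (1 d.f., no continuity correction) on a 2x2 table.

    Rows are fixed differences and polymorphisms; columns are synonymous
    and nonsynonymous changes. Returns (chi-squared statistic, p-value).
    Assumes every row and column total is non-zero.
    """
    table = [[fixed_syn, fixed_nonsyn], [poly_syn, poly_nonsyn]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Illustrative counts only.
    print(mk_chi_squared(17, 7, 42, 2))
```

With small counts, an exact test is usually preferred over the chi-squared approximation.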
There are manpages for several of the programs in the analysis package (these may be out of date; up-to-date versions will be installed with the packages themselves):
  1. compute, a "mini-DNAsp" for the Unix command-line
  2. gestimator, Ka/Ks by Comeron's method
  3. kimura80, to calculate divergence using Kimura's (1980) method
  4. polydNdS, to analyze silent and replacement polymorphism
  5. MKtest, to perform McDonald and Kreitman tests
  6. rsq, to summarize linkage disequilibrium in data
  7. descPoly, a program to output a qualitative summary of features of sequence polymorphism data
  8. sharedPoly, a program to calculate number of shared polymorphisms between 2 partitions of an alignment
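As an illustration of the kind of calculation behind item 3, the sketch below computes Kimura's (1980) two-parameter distance, K = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of sites that differ by a transition and by a transversion. This is a hypothetical stand-alone function, not the kimura80 program; skipping any site that contains a non-ACGT character is an assumption made here.

```python
import math


def kimura80_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    Raises ValueError if no comparable sites exist, or (from math.log or
    math.sqrt) if the sequences are too divergent for the formula.
    """
    purines = {"A", "G"}
    compared = transitions = transversions = 0

    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in purines) == (b in purines):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    print(kimura80_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```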
sequtils - software for sequence manipulation.
Manpages are available online for the following programs in the sequtils package (these may be out of date; up-to-date versions will be installed with the packages themselves):
  1. clustalwtofasta
  2. revcom
  3. toLDhat
  4. trimallgaps
Software dealing with analysis of coalescent simulation:
msstats Reads in data from Hudson's coalescent simulation program ms and calculates several common summary statistics. The output is a tab-delimited list of statistics, with a header line so that the file can be easily processed in R.
example usage: ms 50 10000 -t 20 | msstats
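The same header-plus-tab-delimited layout is also easy to read outside R. The Python sketch below averages every numeric column of such a file. The column names in the sample (rep, S, tajd) are placeholders rather than a claim about what msstats actually prints, and column_means is a hypothetical helper.

```python
import csv
import io
import math
import statistics


def column_means(handle):
    """Return {column name: mean} for a tab-delimited table with a header line.

    Values that are missing, non-numeric, or NaN are ignored.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fieldnames = reader.fieldnames or []
    columns = {name: [] for name in fieldnames}

    for row in reader:
        for name in fieldnames:
            try:
                value = float(row.get(name))
            except (TypeError, ValueError):
                continue
            if not math.isnan(value):
                columns[name].append(value)

    return {name: statistics.mean(vals) for name, vals in columns.items() if vals}


if __name__ == "__main__":
    # Made-up sample standing in for real output.
    sample = "rep\tS\ttajd\n0\t10\t-0.5\n1\t14\t0.5\n"
    print(column_means(io.StringIO(sample)))
```

In practice the output would first be saved, for example with ms 50 10000 -t 20 | msstats > stats.txt (stats.txt being an arbitrary file name), and the open file passed to column_means.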
msff Applies a frequency filter to the output of Dick Hudson's coalescent simulation program ms. Using the -m flag, it filters on the minor allele frequency. Use -d to filter on the derived allele frequency. The filtered data are printed to stdout. The frequency filter removes sites where the relevant frequency is less than or equal to the input value. Frequencies are input as decimals on the interval [0,1]. For example, to calculate LD-related statistics using my msld package, but filtering out sites where the minor allele frequency is less than or equal to 10% in the sample:
ms 10 10000 -t 20 -r 20 1000 | msff -m 0.10 | msld > out.
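The filtering rule described above can be sketched in a few lines of Python. This is an illustration of the rule (drop a site when the relevant frequency is less than or equal to the threshold), not the msff source; filter_sites is a hypothetical function, and it assumes haplotypes are given as equal-length strings of 0 (ancestral) and 1 (derived), as in ms output.

```python
def filter_sites(haplotypes, threshold, use_minor=True):
    """Remove sites whose allele frequency is <= threshold.

    haplotypes: list of equal-length strings of '0' and '1'.
    use_minor=True filters on the minor allele frequency,
    use_minor=False on the derived ('1') allele frequency.
    Returns the haplotypes restricted to the retained sites.
    """
    if not haplotypes:
        return []

    n = len(haplotypes)
    keep = []
    for site in range(len(haplotypes[0])):
        derived = sum(h[site] == "1" for h in haplotypes) / n
        freq = min(derived, 1 - derived) if use_minor else derived
        if freq > threshold:  # sites with freq <= threshold are dropped
            keep.append(site)

    return ["".join(h[i] for i in keep) for h in haplotypes]


if __name__ == "__main__":
    sample = ["111", "011", "001", "000"]
    print(filter_sites(sample, 0.25))                   # minor allele filter
    print(filter_sites(sample, 0.25, use_minor=False))  # derived allele filter
```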
rhothetapost Estimate mutation and recombination rates from multilocus polymorphism data. Described in Haddrill et al. (2005) and Thornton and Andolfatto (2006). Documentation is here
omega Calculates Kim and Nielsen's (2004, Genetics 167:1513) "omega_max" statistic, which was explored in Jensen et al. (2007, Genetics 176:2371-2379). Please read the source code for documentation. Both Kim and Nielsen and Jensen et al. should be cited if this code is used: the first for the statistic, the latter for the implementation.
