Monday, July 23, 2012

Reconstructing the origin and spread of horse domestication in the Eurasian steppe

http://www.pnas.org/content/109/21/8202.full


Model Fitting.

 
We fitted our model within an approximate Bayesian computation (ABC) framework, using the ABC-GLM algorithm implemented in the ABCtoolbox software (28). We used six summary statistics to describe our dataset. After assigning the sampled populations to three groups (west, central, and east; Fig. S2), we computed the average within-population heterozygosity within each group (three estimates) and the average between-group heterozygosity for each pair of groups (west vs. central, west vs. east, and central vs. east; three estimates).
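The six summary statistics described above can be sketched as follows. This is a minimal illustration for a single locus, not the authors' code. It assumes the textbook definitions: within-population heterozygosity as 1 minus the sum of squared allele frequencies, and between-population heterozygosity as the probability that one allele drawn from each population differs. The function names and the toy allele data are hypothetical.

```python
from collections import Counter
from itertools import product


def allele_freqs(alleles):
    """Return {allele: frequency} for a list of allele calls at one locus."""
    counts = Counter(alleles)
    n = len(alleles)
    return {a: c / n for a, c in counts.items()}


def within_het(alleles):
    """Probability that two alleles drawn (with replacement) from one population differ."""
    return 1.0 - sum(p * p for p in allele_freqs(alleles).values())


def between_het(alleles_a, alleles_b):
    """Probability that one allele from population A differs from one from population B."""
    fa, fb = allele_freqs(alleles_a), allele_freqs(alleles_b)
    return 1.0 - sum(p * fb.get(a, 0.0) for a, p in fa.items())


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def summary_statistics(groups):
    """groups maps 'west', 'central', 'east' to a list of populations.

    Each population is a list of allele calls at a single locus.
    Returns three within-group averages followed by three between-group averages.
    """
    names = ["west", "central", "east"]
    pairs = [("west", "central"), ("west", "east"), ("central", "east")]
    within = [mean(within_het(pop) for pop in groups[g]) for g in names]
    between = [
        mean(between_het(a, b) for a, b in product(groups[g1], groups[g2]))
        for g1, g2 in pairs
    ]
    return within + between


# Toy data (made up for illustration): one population per group.
groups = {
    "west": [[1, 1, 2, 2]],
    "central": [[1, 2, 3, 3]],
    "east": [[3, 3, 3, 3]],
}
stats = summary_statistics(groups)
print(stats)  # [0.5, 0.625, 0.0, 0.75, 1.0, 0.5]
```

With real microsatellite data the same calculation would be averaged over loci, and a sample-size correction may be appropriate; neither is shown here.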
We started by randomly sampling 55 million combinations of parameter values within the following ranges: t ∈ [500, 15,000] generations (corresponding to 6–180 kya), m ∈ [10⁻⁶, 10⁻³], cK ∈ [1, 10⁴], r ∈ [0.005, 1], K ∈ [4,000, 10⁵], K0 ∈ [1, 10⁵], md ∈ [10⁻⁴, 0.5], Kd ∈ [500, 10⁴], cdKd ∈ [1, 10³], q ∈ [0, 1], and cdKd0 ∈ [1, 10³]. Whereas t and q were sampled from uniform distributions of their untransformed values, all other parameters were sampled from uniform distributions of their log-transformed values. For each parameter value combination, we then generated our six summary statistics for each of the 12 scenarios, combining all possible origins for wild horses and domestication events in our model.
In these calculations we took the mutation rate (μ) to be 1.5 × 10⁻⁴ per generation (the average of mutation rate estimates for two microsatellite markers, AHT4 and HTG10) (29). Because horse domestication occurred relatively recently, the only parameters that are considerably affected by the mutation rate are the time of the initial expansion of E. ferus in Eurasia (t) and the ancestral population size of E. ferus (K0).
For each scenario, we ran ABC-GLM on the accepted parameter combinations (on the basis of the 0.1 percentile of Euclidean distances between simulated and observed summary statistics) to estimate posterior distributions of the model parameters and the likelihood of the summary statistics as estimated from the genetic data (Table S2). We then used Bayes factors, given by the ratio of estimated likelihoods for each pair of scenarios, for model comparison. Because both Bayes factors and posterior distributions for a deme spacing of 100 km (Figs. 2 and 3) were in close agreement with those obtained for deme spacings of 50 km and 200 km (Figs. S3–S5), we refer only to the former in the main text.
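The prior sampling and the rejection step can be sketched as follows. This is a rough illustration under stated assumptions, not the ABCtoolbox interface and not the authors' code. The names PRIORS, draw_parameters, accept_closest, and generations_to_kya are hypothetical. The 12-year generation time is inferred from the quoted correspondence of 500–15,000 generations to 6–180 kya. The spatial forward simulation and the GLM post-adjustment are not shown.

```python
import math
import random

# (lower, upper, log_scale) for each parameter, as quoted in the text.
# log_scale=True means uniform on the log-transformed value.
PRIORS = {
    "t": (500.0, 15000.0, False),
    "m": (1e-6, 1e-3, True),
    "cK": (1.0, 1e4, True),
    "r": (0.005, 1.0, True),
    "K": (4000.0, 1e5, True),
    "K0": (1.0, 1e5, True),
    "md": (1e-4, 0.5, True),
    "Kd": (500.0, 1e4, True),
    "cdKd": (1.0, 1e3, True),
    "q": (0.0, 1.0, False),
    "cdKd0": (1.0, 1e3, True),
}


def draw_parameters(rng):
    """Draw one parameter combination from the priors."""
    params = {}
    for name, (lo, hi, log_scale) in PRIORS.items():
        if log_scale:
            value = math.exp(rng.uniform(math.log(lo), math.log(hi)))
        else:
            value = rng.uniform(lo, hi)
        # Clamp to guard against floating-point overshoot at the bounds.
        params[name] = min(max(value, lo), hi)
    return params


def generations_to_kya(generations, years_per_generation=12):
    """Convert generations to thousands of years (12 y/generation is inferred, see lead-in)."""
    return generations * years_per_generation / 1000


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors of summary statistics."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def accept_closest(samples, observed, fraction=0.001):
    """Keep the given fraction of (params, stats) pairs closest to the observed statistics.

    fraction=0.001 corresponds to the 0.1% of simulations with the smallest distance.
    """
    ranked = sorted(samples, key=lambda s: euclidean(s[1], observed))
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[:n_keep]


rng = random.Random(42)
example = draw_parameters(rng)
print(sorted(example))              # parameter names
print(generations_to_kya(500))      # 6.0
print(generations_to_kya(15000))    # 180.0
```

In a full analysis, each drawn combination would be passed to the demographic simulator for every scenario, and the accepted set would then feed the GLM step that yields the posterior distributions and the per-scenario likelihoods whose ratios give the Bayes factors.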
