Showing posts with label “MCMCglmm”. Show all posts

Tuesday, June 12, 2012

classical frequentist and Bayesians are not the only schools of statistics useful for animal models

http://www.quantumforest.com/2011/11/coming-out-of-the-bayesian-closet/#comments


I would also like to point out that classical frequentist and Bayesian inference are not the only schools of statistics useful for animal models.

Classical frequentist inference uses the marginal likelihood (with the random effects integrated out), which includes the fixed parameters, and only the observations are treated as random.
Bayesian inference is a probabilistic framework that combines likelihood and prior information, and treats all parameters and observations as random.
A third important school is the one based on the Extended Likelihood Principle. Assuming simple statistical principles Bjørnstad (1996) showed that all information in the data about the random and fixed effects is included in a joint likelihood including three components: fixed parameters, unobserved random effects, and observations as random. Lee and Nelder's (1996) h-likelihood is an implementation of the Extended Likelihood Principle. The hglm package (that I have developed together with colleagues) is based on h-likelihood theory and allows fitting of animal models.



hglm: Hierarchical Generalized Linear Models
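The three schools described above can be set side by side in symbols. This is only a sketch under assumed notation that does not appear in the original comment: y for the observations, u for the unobserved random effects, β for the fixed parameters, θ for the variance parameters, and π for a prior.

```latex
% Marginal likelihood (classical frequentist): random effects integrated out
L(\beta, \theta; y) = \int f(y \mid u; \beta)\, f(u; \theta)\, du

% Bayesian posterior: all unknowns are random and a prior \pi is added
p(\beta, \theta, u \mid y) \propto f(y \mid u, \beta)\, f(u \mid \theta)\, \pi(\beta, \theta)

% h-likelihood (Lee and Nelder 1996): joint log-likelihood of y and u
h(\beta, \theta, u; y) = \log f(y \mid u; \beta) + \log f(u; \theta)
```

The h-likelihood keeps u in the objective instead of integrating it out. Lee and Nelder also require the random effects to be on a particular scale, which this sketch leaves aside.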

Tuesday, June 5, 2012

ASReml-R cookbook

http://apiolaza.net/asreml-r/


Recipes

  1. Very basic usage.
  2. Specifying model equations.
  3. Univariate analysis, including basic equivalent models, diallels, clonal trials and multiple-site as single trait.
  4. Extracting results like variance components, fixed and random effects, etc, from the fitted model.
  5. Covariance structures.
  6. Multiple environments.

Friday, June 1, 2012

significant additive genetic variance detected

We also detected significant additive genetic variance for the four morphological traits tested: adult weight, leg length, horn length, and scrotal circumference. The other fitted random effects were statistically significant sources of variance for all traits, except for scrotal circumference.

The multivariate genetic variance-covariance structure is predominantly characterized by positive relationships among the four adult sheep traits.

Tuesday, May 22, 2012

R2BayesX - Bayesian regression with R

R2BayesX, which has an easier interface.

Tuesday, January 3, 2012

Coefplot: New Package for Plotting Model Coefficients


http://blog.revolutionanalytics.com/2012/01/new-package-for-plotting-model-coefficients.html



Friday, August 19, 2011

Logistic random effects regression models: a comparison of statistical packages for binary and ordinal outcomes

Different statistical packages were compared on logistic random effects regression models.

Abstract: Background: Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models.

Methods: We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN (p[R]IGLS) and MIXOR, Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted.

Results: The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient.

Conclusions: On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "noninformative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain. 

meta-analysis of phenotypic effects - a case (with MCMCglmm used)

http://www.sciencedirect.com/science/article/pii/S0003347211001229
Dominance and plumage traits: meta-analysis and metaregression analysis 

Procedures of Meta-analysis and Metaregression

All meta-analyses were conducted in the statistical software S-Plus (TIBCO; http://www.tibco.com/) and R (version 2.11.1; R Development Core Team 2010), using LMMs to perform a random-effect meta-analysis (Nakagawa et al. 2007). In all analyses, we accounted for the hierarchical structure in the data (e.g. with multiple effect sizes from the same population) by including study population and species as nested random effects (Nakagawa et al. 2007). This statistical procedure allowed us to use multiple effect sizes from a study or population in the same analysis without violating the assumption of independence (Nakagawa & Hauber 2011). Our meta-analytical LMM was calculated as an intercept-only model with the restricted maximum likelihood (REML) method (nlme package; Pinheiro & Bates 2000). Reported P values are for the main effects (intercepts) only.
To determine whether a set of effect sizes was homogeneous, we calculated the residual heterogeneity QREML, as random-effects models were used (Nakagawa et al. 2007). When residual heterogeneity was significant, the variance among effect sizes was greater than expected from sampling error, suggesting the existence of important moderator variables. It is worth emphasizing that even though our metaregressions accounted for some moderator variables (Table 1), it is still possible that other (unaccounted) moderators could have introduced heterogeneity in the data. Furthermore, we decided not to include interaction terms among predictor variables in our metaregressions, as the models would be overparameterized (i.e. models would have too many parameters and not enough data points in each category to be robust; Ginzburg & Jensen 2004). We conducted contrast analyses (LMMs) for all metaregression models to check for the effect of different levels of an explanatory variable on the relationship between dominance and plumage. We show the results of contrast analyses only if the difference between levels of a variable (contrasts) was statistically significant (all other results are in Table A2 in Appendix 2).
We also conducted a randomization test to evaluate the importance of the variance components of our random factors (i.e. study and species). If a variance component was significantly different from zero, it indicated the existence of either study or species effects, the latter of which may, but does not necessarily, imply phylogenetic signal in the data (cf. Hadfield & Nakagawa 2010). We tested the null hypothesis that the variance component = 0, against the alternative hypothesis that the variance component > 0 (Nakagawa & Schielzeth 2010). We randomized the original effect size vector 100 000 times, each followed by fitting the meta-analytical LMM to estimate randomized random factor variance components. The P value was determined as the proportion of randomizations that yielded a variance component larger than or equal to the variance component of the original data. Furthermore, we conducted a phylogenetic meta-analysis described in Hadfield & Nakagawa (2010) to account for the lack of independence across species caused by their evolutionary relationships, using the MCMCglmm package in R (Hadfield 2010). As our conclusions did not change with the use of the phylogenetic meta-analysis, we only present results based on the general meta-analysis (the results of the phylogenetic meta-analysis can be found in Appendix 3).
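The randomization test for a variance component described above can be sketched in a few lines of Python. This is a rough illustration, not the authors' procedure: it swaps the REML-fitted meta-analytical LMM for a simple method-of-moments (balanced one-way ANOVA) estimate of the between-group variance, uses far fewer shuffles than the 100 000 in the paper, and all function names and the toy data are hypothetical.

```python
import random


def between_group_variance(values, groups):
    """Method-of-moments between-group variance for a balanced one-way design."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    k = len(by_group)
    n = len(values) // k  # observations per group (design assumed balanced)
    grand = sum(values) / len(values)
    means = {g: sum(vs) / len(vs) for g, vs in by_group.items()}
    ms_between = n * sum((m - grand) ** 2 for m in means.values()) / (k - 1)
    ms_within = sum((v - means[g]) ** 2 for v, g in zip(values, groups)) / (len(values) - k)
    # Truncate at zero, since a variance cannot be negative.
    return max(0.0, (ms_between - ms_within) / n)


def randomization_p_value(values, groups, n_perm=2000, seed=1):
    """P = share of shuffles whose variance component is >= the observed one."""
    rng = random.Random(seed)
    observed = between_group_variance(values, groups)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between effect sizes and groups
        if between_group_variance(shuffled, groups) >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy data (made up): 5 groups of 6 effect sizes with clear group differences.
rng = random.Random(42)
groups = [g for g in range(5) for _ in range(6)]
group_effects = [-1.0, -0.5, 0.0, 0.5, 1.0]
values = [group_effects[g] + rng.gauss(0.0, 0.2) for g in groups]

observed, p_value = randomization_p_value(values, groups)
print(observed, p_value)
```

The test is one-sided by construction: only shuffles that give a variance component at least as large as the observed one count against the null hypothesis that the component is zero.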



Monday, May 30, 2011

advantages of Mixed model/Bayesian/MCMCglmm - gathered from relevant references

1.
Phylogenetic mixed models have mainly been applied to traits which are assumed to be normally distributed (for exceptions, see Felsenstein, 2005; Naya et al., 2006). Generalized linear mixed models extend the linear mixed model to non-Gaussian responses, although model fitting has proved more difficult because the likelihood cannot be obtained in closed form. MCMC techniques solve this problem by breaking the high-dimensional joint distribution into a series of lower dimensional conditional distributions which are easier to sample from. By repeatedly sampling from these conditional distributions it is possible to very accurately approximate the complete joint distribution, and thereby extract things of interest (often marginal distributions).

Hadfield, J. D., & Nakagawa, S. (2010). General quantitative genetic methods for comparative biology: Phylogenies, taxonomies and multi-trait models for continuous and categorical characters. Journal of Evolutionary Biology, 23(3), 494-508.
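The idea in point 1, sampling a joint distribution through its lower-dimensional conditionals, is easiest to see on a toy case. Below is a minimal Gibbs sampler in Python for a standard bivariate normal with correlation rho. It is an illustration only and not how MCMCglmm is implemented; the function names, the target distribution and the settings are all assumptions made for the example.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # sd of each full conditional
    x, y = 0.0, 0.0
    draws = []
    for it in range(burn_in + n_samples):
        # Each full conditional is one-dimensional and easy to sample from.
        x = rng.gauss(rho * y, cond_sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_sd)  # y | x ~ N(rho * x, 1 - rho^2)
        if it >= burn_in:
            draws.append((x, y))
    return draws


def sample_correlation(draws):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(draws)
    mx = sum(x for x, _ in draws) / n
    my = sum(y for _, y in draws) / n
    sxy = sum((x - mx) * (y - my) for x, y in draws)
    sxx = sum((x - mx) ** 2 for x, _ in draws)
    syy = sum((y - my) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)


draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
print(round(sample_correlation(draws), 2))
```

The sampler never evaluates the joint density directly, yet the correlation of the draws should come out close to the rho of the target. Marginal summaries (for example the mean of x alone) are read off the same draws.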

2.
The major reason for the popularity of mixed modelling is probably its ability to account for statistical non-independence of data by having random effects as well as fixed effects (the name ‘mixed-effects’ originated from combining these two types of effects) (McCulloch and Searle, 2002). It is difficult to think of an example of any dataset where the data points would be truly independent from one another.


http://www.sciencedirect.com/science/article/pii/S0149763410001028

3.
An important advantage of the mixed model approach relates to the statistical inferences drawn from non-normal data. To date, in neurosciences, non-parametric (NP) tests such as Mann-Whitney and Kruskal–Wallis tests have been often used to deal with small sample sizes, where normality cannot be tested, or with truly non-normally distributed data (Janusonis, 2009).

4.
To test for a genetic change in the population, we used Bayesian methods. Use of Bayesian methods allows the time trends to be estimated directly instead of requiring statistics to be based on the best linear unbiased predictions (BLUPs) from a linear model, which incurs problems in terms of error propagation (Ovaskainen et al., 2008; Hadfield, 2010).

J. Evol. Biol. 23 (2010) 935–944
http://onlinelibrary.wiley.com/doi/10.1111/j.1420-9101.2010.01959.x/abstract

5.
Our dataset allowed us to estimate the variance in offspring sex ratio, both at ca 6 days post-hatching and at independence from parental care, and to test how much of the observed variation in sex ratio was accounted for by additive genetic variance as opposed to environmental/non-additive genetic and sampling variance. 

Variance components were estimated by fitting a generalized animal model. An animal model is a specific type of mixed model that explicitly takes into account the resemblance among all relatives. It models an individual's phenotype as a function of a number of fixed and random effects, including a random additive genetic ‘animal’ effect. The variance–covariance structure of the latter is proportional to the pairwise coefficients of relatedness among all individuals in the pedigree. Thereby an animal model allows us to fully exploit all pedigree data, and to simultaneously account for a number of potentially confounding environmental effects [3234]. Fitting an animal model, or any mixed model for that matter, with non-Gaussian traits using (restricted) maximum-likelihood techniques is challenging. Hence, we used Bayesian Markov chain Monte Carlo (MCMC) techniques implemented in the R package MCMCglmm [30,35]

Disentangling the effect of genes, the environment and chance on sex ratio variation in a wild bird population

http://rspb.royalsocietypublishing.org/content/early/2011/02/17/rspb.2010.2763.full
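In the animal model described in point 5, the covariance of the 'animal' effect is proportional to the relatedness among individuals. One common way to build that matrix from a pedigree is the tabular method for the additive relationship matrix A. The Python sketch below is for illustration only: the function name and the five-animal pedigree are hypothetical, and it is not the routine used by MCMCglmm.

```python
def relationship_matrix(pedigree):
    """Additive relationship matrix A by the tabular method.

    pedigree: list of (animal, sire, dam) tuples with parents listed before
    their offspring; None marks an unknown parent (a founder).
    """
    ids = [animal for animal, _, _ in pedigree]
    index = {animal: i for i, animal in enumerate(ids)}
    n = len(ids)
    A = [[0.0] * n for _ in range(n)]
    for i, (_, sire, dam) in enumerate(pedigree):
        s = index.get(sire)  # None when the sire is unknown
        d = index.get(dam)   # None when the dam is unknown
        for j in range(i):
            # Relationship with an earlier animal: average over both parents.
            a = 0.0
            if s is not None:
                a += 0.5 * A[j][s]
            if d is not None:
                a += 0.5 * A[j][d]
            A[i][j] = A[j][i] = a
        # Diagonal: 1 + inbreeding coefficient (half the parents' relationship).
        inbreeding = 0.5 * A[s][d] if s is not None and d is not None else 0.0
        A[i][i] = 1.0 + inbreeding
    return ids, A


# Hypothetical pedigree: two founders, two full sibs, one offspring of the sibs.
pedigree = [
    ("sire", None, None),
    ("dam", None, None),
    ("sib1", "sire", "dam"),
    ("sib2", "sire", "dam"),
    ("inbred", "sib1", "sib2"),
]
ids, A = relationship_matrix(pedigree)
for name, row in zip(ids, A):
    print(name, row)
```

Full sibs and parent-offspring pairs get 0.5, and the offspring of a full-sib mating gets a diagonal above 1, reflecting inbreeding. The additive genetic covariance among individuals is then this matrix scaled by the additive genetic variance.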

Saturday, May 14, 2011

Quantitative genetic parameters for wild stream-living brown trout - using MCMCglmm

Quantitative genetic parameters for wild stream-living brown trout: heritability and parental effects

http://onlinelibrary.wiley.com/doi/10.1111/j.1420-9101.2010.02028.x/full

Clearly, the same approach could be applied to pine trees.

Adaptability depends on the presence of additive genetic variance for important traits. Yet few estimates of additive genetic variance and heritability are available for wild populations, particularly so for fishes. Here, we estimate heritability of length-at-age for wild-living brown trout (Salmo trutta), based on long-term mark-recapture data and pedigree reconstruction based on large-scale genotyping at 15 microsatellite loci. We also tested for the presence of maternal and paternal effects using a Bayesian version of the Animal model. Heritability varied between 0.16 and 0.31, with reasonably narrow confidence bands, and the total phenotypic variance increased with age. When introducing dam as an additional random effect (accounting for c. 7% of total phenotypic variance), the level of additive genetic variance and heritability decreased (0.12–0.21). Parental size (both for sires and for dams) positively influenced length-at-age for juvenile trout – either through direct parental effects or through genotype-environment correlations. Length-at-age is a complex trait reflecting the effects of a number of physiological, behavioural and ecological processes. Our data show that fitness-related traits such as length-at-age can retain high levels of additive genetic variance even when total phenotypic variance is high.
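The heritabilities quoted above are ratios of variance components, and in a Bayesian animal model they can be computed draw by draw from the posterior. A rough Python sketch follows, assuming a simple decomposition into additive, dam and residual variance. The function names are hypothetical and the numbers are made up for illustration; they are not values from the paper.

```python
def heritability(v_additive, v_residual, v_dam=0.0):
    """Narrow-sense heritability: additive variance over total phenotypic variance."""
    v_phenotypic = v_additive + v_dam + v_residual
    return v_additive / v_phenotypic


def posterior_summary(values, lower=0.025, upper=0.975):
    """Median and equal-tailed interval from a list of posterior draws."""
    ordered = sorted(values)
    n = len(ordered)

    def pick(q):
        return ordered[min(n - 1, int(q * n))]

    return pick(0.5), pick(lower), pick(upper)


# Hypothetical posterior draws of (V_A, V_dam, V_R); NOT values from the paper.
draws = [
    (2.0, 0.7, 7.3),
    (2.4, 0.6, 7.0),
    (1.6, 0.8, 7.6),
    (2.2, 0.7, 7.1),
    (1.9, 0.9, 7.2),
]
# One heritability per draw, so uncertainty in every component carries through.
h2_draws = [heritability(va, vr, v_dam=vd) for va, vd, vr in draws]
median, low, high = posterior_summary(h2_draws)
print(median, low, high)
```

A real analysis would use thousands of draws from the fitted model; five are shown only to keep the example short.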