MCMCglmm first steps

Notes to self

Source: the tutorial "MCMCglmm: Markov chain Monte Carlo methods for Generalised Linear Mixed Models."

first model

library(MCMCglmm)
data(PlodiaPO)
model1 <- MCMCglmm(PO ~ 1, random = ~FSfamily, data = PlodiaPO, verbose = FALSE)

units

units is a reserved variable which has a factor level for each row of the response. The default in MCMCglmm is to specify the residual term as rcov = ~units. In this instance each data point corresponds to a unique level of units and therefore we simply interpret the units variance as we would the residual variance in most models. Often we're interested in knowing the proportion of total variance explained by the random effect, which would involve dividing the family variance by the sum of the two variance components.

"units" in MCMCglmm plays the role of the residual variance in lme4.

A nice thing is that we can get the HPD (highest posterior density) interval (a "credible interval", the Bayesian analogue of a frequentist confidence interval) of the proportion of variance explained by the random term by simply issuing:

(In this example FSfamily was the only random term in the model)

> HPDinterval(model1$VCV[, "FSfamily"]/(model1$VCV[, "FSfamily"] + model1$VCV[, "units"]))

This sounds deep and cool:

A valid posterior for any transformation of model parameters can be obtained by applying that transformation to each sample and analysing the result. This model was fitted without explicitly specifying a prior, so the default prior was used.
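A minimal base-R sketch of that idea, using made-up posterior draws in place of model1$VCV (the real samples would come from the fitted MCMCglmm object):

```r
set.seed(1)
# Hypothetical posterior draws standing in for model1$VCV[, "FSfamily"]
# and model1$VCV[, "units"]:
fam_var   <- rgamma(1000, shape = 2, rate = 4)
units_var <- rgamma(1000, shape = 5, rate = 4)

# Apply the transformation to every sample...
prop <- fam_var / (fam_var + units_var)

# ...and summarise the resulting posterior, e.g. with a 95% interval
# (on real output you would use HPDinterval() from the coda package):
quantile(prop, c(0.025, 0.975))
```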

priors

MCMCglmm does provide default priors, but it is safer to provide a proper but weak prior.
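A sketch of what such a prior might look like (not taken from the tutorial; V and nu are MCMCglmm's parameter names, and V = 1, nu = 0.002 is a conventional weakly informative choice):

```r
# Proper but weak inverse-Wishart priors for one residual (R) and one
# random-effect (G) variance component:
prior <- list(
  R = list(V = 1, nu = 0.002),
  G = list(G1 = list(V = 1, nu = 0.002))
)
# This would be passed to MCMCglmm() via the `prior` argument, e.g.:
# MCMCglmm(PO ~ 1, random = ~FSfamily, data = PlodiaPO, prior = prior)
```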

aggregated data = no random effects necessary

model:

data(PlodiaR)
model3 <- MCMCglmm(cbind(Pupated, Infected) ~ 1, family = "multinomial2", data = PlodiaR, verbose = FALSE)

we have not fitted a random effect. Because the data are counts for each family we cannot fit FSfamily levels as random effects because they would be confounded with the residuals. The residual variance is however estimated and we can interpret this over-dispersion as variation in the binomial probability across families. Figure 4 shows that there is ample over-dispersion (the variance is estimated to be well away from zero) leading us to believe that families vary in their probability of resistance.

If the "residual variance" had been zero, all families would have been perfectly described by the intercept; but that is not the case.

multinomial: append the number of response categories to the family name!

eg two levels, family = "multinomial2"

visualising the variance

If we assume that the intercept and the variance are correctly estimated, we can use them as the parameters of a normal variable, sample from it, and visualise the results; see figure 5 in the tutorial: "The predicted distribution of resistance probabilities across families assuming the posterior modes of model3 are correct."
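A base-R sketch of that procedure, with made-up values standing in for the posterior modes (in the real analysis they would come from posterior.mode(model3$Sol) and posterior.mode(model3$VCV)):

```r
set.seed(2)
# Hypothetical stand-ins for the posterior modes of model3:
intercept <- -1.0   # latent-scale (logit) mean
res_var   <-  0.8   # latent-scale variance

# Sample family effects on the latent scale and map them to
# probabilities with the inverse logit:
latent <- rnorm(10000, mean = intercept, sd = sqrt(res_var))
p      <- plogis(latent)

# hist(p) would then show the predicted distribution of resistance
# probabilities across families, in the spirit of the tutorial's figure 5
```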

(simple) binary models

family: "categorical".

Since the variance is determined by the mean, and is not an independent parameter, the residual variance is not identifiable and is defined solely by the prior. Hadfield uses 1 for the variance and 0 for the covariance; see fix = 1 below.

> data(PlodiaRB)
> prior = list(R = list(V = 1, nu = 0, fix = 1), G = list(G1 = list(V = 1, nu = 0)))
> model4 <- MCMCglmm(Pupated ~ 1, random = ~FSfamily, family = "categorical", data = PlodiaRB, prior = prior, verbose = FALSE)

An identical procedure for simple models such as this would be to specify a very informative prior (nu is large) around V:

> prior = list(R = list(V = 1, nu = 1e+06), G = list(G1 = list(V = 1, nu = 0)))

multivariate models

For multivariate models we will usually want to make use of the reserved variable trait, which indexes columns of the response. By fitting trait as a fixed effect we allow the two responses to have different means. I usually fit the model ~trait - 1 rather than just ~trait, as this fits trait-specific intercepts rather than an intercept for trait 1 (PO) and a contrast for trait 2 (Pupated).

Likewise, fitting a single family variance using ~FSfamily implies that families respond in the same way to both traits, so we usually want to form an interaction. The simplest would be ~trait:FSfamily, but this would imply that the family variance for each trait is equal and that the family effects for trait 1 are uncorrelated with the family effects for trait 2. More reasonable variance structures can be formed using idh and us: idh allows different variances across the traits but assumes that the family effects for the two traits are uncorrelated; us is the most general variance structure, allowing different variances across the traits as well as covariances between them.

prior = list(R = list(V = diag(2), nu = 0, fix = 2), G = list(G1 = list(V = diag(2), nu = 1)))
model5 <- MCMCglmm(cbind(PO, Pupated) ~ trait - 1,
  random = ~us(trait):FSfamily,
  rcov = ~idh(trait):units,
  family = c("gaussian", "categorical"),
  data = PlodiaPORB,
  prior = prior, verbose = FALSE)

idh(bar):bal appears to correspond roughly to uncorrelated random effects in lmer, e.g. (0 + bar || bal), while us(bar):bal corresponds to the fully parameterised covariance structure (0 + bar | bal).
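The rough correspondence can be written out as lme4-style formulas (a sketch; y, tr, and fam are placeholder names, and the double-bar syntax requires a reasonably recent lme4):

```r
# Hypothetical lme4 analogues of the MCMCglmm variance structures:
# us(tr):fam  -- per-level variances plus covariances between levels
f_us  <- y ~ tr - 1 + (0 + tr | fam)
# idh(tr):fam -- per-level variances, covariances constrained to zero
f_idh <- y ~ tr - 1 + (0 + tr || fam)
```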



Last modified: October 12, 2017