Last week I was at the European Ecological Federation conference in Rome, where I presented the results of one experiment that we ran last year (THE one big experiment of my PhD). You can find the slides here.

All in all it was a week with ups and downs: some sessions were highly interesting and generated intense discussion, while others were just no-question-next-talk sessions. Below is a summary of the things that stuck in my head over the week.

Day 1: I arrived pretty late at the conference so could only listen to a few talks in the evening session. Caroline Müller gave a nice talk on the combined effect of drought and plant chemotypes on insect herbivores (caterpillars). An intriguing result was that plants under shoot herbivory had higher concentrations of defensive molecules in the roots than plants with no herbivory. The reason why this happens is left to speculation.

Day 2: The day started at 8.30am sharp with two plenary lectures; the second one, given by Prof. Musso (a colleague from the TUM!), explored the ecology of architecture and how to develop new materials and homes to build sustainable towns, particularly how to develop incentives to promote low-consumption lifestyles in democratic cities and countries. I then went to a symposium on scale non-linearity of drivers of environmental changes. There Stefano Larsen gave a nice talk on temporal community shifts of stream invertebrates. His results show that local declines in species richness were due to extinctions of specialist species, which were then not able to re-colonize the area from the regional species pool. In the afternoon I went to the Biodiversity and Ecosystem session, which ran with no chair, leaving a pleading Martin Winter at the end of the session asking: “Somebody should close the session”!

Day 3: Again an early start at 8.30 (not so sharp this time). I went first to the tropical ecology session, which started with two great talks from Hannah Tuomisto on fern species distribution in the Amazonian forest and Jens-Christian Svenning on historical legacies in global palm species distribution. Jens-Christian did not read my previous post on partial residual plots, as he used them to picture most of the relationships he explored, too bad. I then ran to the agricultural ecology session where Emanuelle Porcher gave her talk on the effect of wheat genetic diversity on predation rates; she found a weak positive effect of genetic diversity and some seasonal variation that she explained by contrasting climatic conditions. The day continued with two plenaries, the second one by Christopher Kennedy on the metabolism of megacities, where he presented his approach of considering cities as ecosystems, analyzing the fluxes of energy entering and exiting cities and how they move among the compartments within them. His results show that despite the expectation that larger cities use energy more efficiently, larger cities consume more energy due to the greater wealth (GDP) concentrated there. Greater wealth causes larger amounts of waste and higher electricity use. This is a particularly challenging issue as more and more people are moving to cities, especially in Asia where a new consuming middle class is arising.

Day 4: On Thursday morning I went to the high nature value symposium, a concept I was not familiar with but which is basically a framework to identify agrosystems with potentially high diversity and rare/endangered ecosystem types. The talk by James Moran on the implementation of this concept in Ireland was very interesting: he developed a 10-point grading system to assess ecosystem health (relative, of course, to the habitat), and depending on the grade the farmer gets more or less money. Being pretty close to the 15-point grading system used to assess meat quality (and hence the price paid for a cow), this system helps farmers grasp the concept of ecosystem health. One quote from this speaker is also worth noting: “If you depend on somebody to translate your results it will be lost in translation”. I then went to a symposium on ecologists’ strategies at the science-policy interface, with plenty of great talks, some given by social scientists, others by ecologists involved in this area. One striking talk was by Zoe Nyssa on unexpected negative feedbacks of conservation action: she did a literature review on this issue in conservation journals and found many instances where conservation programs led to unexpected results. What was particularly interesting was the lively discussion after the end of the session, where ecologists and social scientists exchanged ideas on ways to improve communication between ecologists, policy-makers and society. I have been wanting for a long time to write a post on ecological advocacy, maybe this will motivate me … In the afternoon one quote from a chair asking a question made its way into my notebook: “How did you select your model? The current approach is to fit all possible models and compare them with AIC”. Hmmm, well, depending on your objectives this approach might work, but if you are trying to find mechanisms/test hypotheses this will most certainly not work (see here).
On Thursday evening I also went out to discover Rome's vibrant nightlife with the INGEE people; heavy rain earlier that evening apparently lowered the participation rate, but it was pretty relaxed and nice to chat with some Italian scientists.

So a nice little conference with plenty of things to keep oneself busy (but not too much) and some cool interactions.

UPDATE: Thanks to Ben's and Florian's comments I've updated the first part of the post.

A very brief post at the end of the field season on two little “details” that annoy me in papers/analyses that I see being done (sometimes) around me.

The first one concerns mixed effect models in which a grouping **factor** (say month or season) is fitted as both a fixed effect term and a random effect term (on the right side of the | in lme4 model syntax). I don’t really understand why anyone would want to do this, so instead of spending time writing equations let’s just make a simple simulation example and see what the consequences of doing this are:

```r
library(lme4)
set.seed(20150830)
#an example of a situation measuring plant biomass on four different months along a gradient of temperature
data <- data.frame(temp = runif(100, -2, 2), month = gl(n = 4, k = 25))
modmat <- model.matrix(~ temp + month, data)
#the coefficients
eff <- c(1, 2, 0.5, 1.2, -0.9)
data$biom <- rnorm(100, modmat %*% eff, 1)
#the simulated coefficients for month are 0.5, 1.2, -0.9

#a simple lm
m_fixed <- lm(biom ~ temp + month, data)
coef(m_fixed) #not too bad
## (Intercept)        temp      month2      month3      month4 
##   0.9567796   2.0654349   0.4307483   1.2649599  -0.8925088

#a lmm with month ONLY as random term
m_rand <- lmer(biom ~ temp + (1 | month), data)
fixef(m_rand)
## (Intercept)        temp 
##    1.157095    2.063714
ranef(m_rand)
## $month
##   (Intercept)
## 1  -0.1916665
## 2   0.2197100
## 3   1.0131908
## 4  -1.0412343
VarCorr(m_rand) #the estimated sd for the month coefficients
##  Groups   Name        Std.Dev.
##  month    (Intercept) 0.87720 
##  Residual             0.98016
sd(c(0, 0.5, 1.2, -0.9)) #the simulated one, not too bad!
## [1] 0.8831761

#now a lmm with month as both fixed and random term
m_fixedrand <- lmer(biom ~ temp + month + (1 | month), data)
fixef(m_fixedrand)
## (Intercept)        temp      month2      month3      month4 
##   0.9567796   2.0654349   0.4307483   1.2649599  -0.8925088
ranef(m_fixedrand) #very, VERY small
## $month
##     (Intercept)
## 1  0.000000e+00
## 2  1.118685e-15
## 3 -9.588729e-16
## 4  5.193895e-16
VarCorr(m_fixedrand)
##  Groups   Name        Std.Dev.
##  month    (Intercept) 0.40397 
##  Residual             0.98018

#how does it affect the estimation of the fixed effect coefficients?
summary(m_fixed)$coefficients
##               Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  0.9567796  0.2039313  4.691676 9.080522e-06
## temp         2.0654349  0.1048368 19.701440 2.549792e-35
## month2       0.4307483  0.2862849  1.504614 1.357408e-01
## month3       1.2649599  0.2772677  4.562233 1.511379e-05
## month4      -0.8925088  0.2789932 -3.199035 1.874375e-03
summary(m_fixedrand)$coefficients
##               Estimate Std. Error    t value
## (Intercept)  0.9567796  0.4525224  2.1143256
## temp         2.0654349  0.1048368 19.7014396
## month2       0.4307483  0.6390118  0.6740851
## month3       1.2649599  0.6350232  1.9919901
## month4      -0.8925088  0.6357784 -1.4038048
#the point estimates are not affected, but the standard errors around the intercept
#and the month coefficients are doubled; this makes it less likely that a significant
#p-value will be given for these effects, ie a higher chance to infer that there is
#no month effect when there is one

#and what if we simulate data as assumed by the model, ie a fixed effect of month
#plus an added random component?
rnd.eff <- rnorm(4, 0, 1.2)
mus <- modmat %*% eff + rnd.eff[data$month]
data$biom2 <- rnorm(100, mus, 1)
#a lmm model
m_fixedrand2 <- lmer(biom2 ~ temp + month + (1 | month), data)
fixef(m_fixedrand2) #weird coefficient values for the fixed effect of month
## (Intercept)        temp      month2      month3      month4 
##   -2.064083    2.141428    1.644968    4.590429    3.064715
c(0, eff[3:5]) + rnd.eff
#looking at the intervals between the intercept and the different levels, we realize
#that the fixed effect part of the model sucked in the added random part
## [1] -2.66714133 -1.26677658  1.47977624  0.02506236
VarCorr(m_fixedrand2)
##  Groups   Name        Std.Dev.
##  month    (Intercept) 0.74327 
##  Residual             0.93435
ranef(m_fixedrand2) #again very, VERY small
## $month
##     (Intercept)
## 1  1.378195e-15
## 2  7.386264e-15
## 3 -2.118975e-14
## 4 -7.752347e-15
#so this is basically not working; it does not make sense to have a grouping factor
#as both a fixed effect term and a random term (ie on the right-hand side of the |)
```

Take-home message: don’t put a grouping **factor** as both a fixed and a random effect term in your mixed effect model. lmer is not able to separate the fixed from the random part of the effect (and I don’t know how it could be done) and basically gives everything to the fixed effect, leaving very small random effects. The issue is a bit pernicious, because if you only looked at the standard deviation of the random term in the merMod summary output you would not guess that something is wrong. You need to actually look at the estimated random effects to realize that they are incredibly small. So when building complex models with many fixed and random terms, always check the estimated random effects.
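A quick way to screen a fitted model for this pathology is to compare the estimated random effects against a tolerance; here is a minimal sketch, where the helper name `check_ranef` is my own invention, not part of lme4:

```r
library(lme4)

#hypothetical helper (the name is mine): flag random effect terms whose estimated
#conditional modes are all essentially zero, a symptom of a grouping factor
#entered as both a fixed and a random term
check_ranef <- function(model, tol = 1e-8) {
  sapply(ranef(model), function(re) {
    all(abs(unlist(re)) < tol) #TRUE = suspiciously degenerate random effects
  })
}

#example on simulated data, in the same spirit as above
set.seed(1)
d <- data.frame(x = runif(100), g = gl(4, 25))
d$y <- rnorm(100, 1 + 2 * d$x + c(0, 0.5, 1.2, -0.9)[d$g], 1)
m_bad <- lmer(y ~ x + g + (1 | g), d) #grouping factor on both sides
check_ranef(m_bad)                    #g should be flagged as degenerate
```

Running such a check routinely on complex models is cheap and catches the problem that the summary output hides.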

Now this is only true for grouping **factors**; if you have a continuous variable (say year) that affects your response both through a long-term trend and through some between-level (between-year) variation, it actually makes sense (provided you have enough data points) to fit a model with this variable as both a fixed and a random term. Let’s look into this:

```r
#an example of a situation measuring plant biomass over 10 different years along a gradient of temperature
set.seed(10)
data <- data.frame(temp = runif(100, -2, 2), year = rep(1:10, each = 10))
modmat <- model.matrix(~ temp + year, data)
#the coefficients
eff <- c(1, 2, -1.5)
rnd_eff <- rnorm(10, 0, 0.5)
data$y <- rnorm(100, modmat %*% eff + rnd_eff[data$year], 1)

#a simple lm
m_fixed <- lm(y ~ temp + year, data)
summary(m_fixed)
## Call:
## lm(formula = y ~ temp + year, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.27455 -0.83566 -0.03557  0.92881  2.74613 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.41802    0.25085   5.653 1.59e-07 ***
## temp         2.11359    0.11230  18.820  < 2e-16 ***
## year        -1.54711    0.04036 -38.336  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.159 on 97 degrees of freedom
## Multiple R-squared:  0.9508, Adjusted R-squared:  0.9498 
## F-statistic: 937.1 on 2 and 97 DF,  p-value: < 2.2e-16
coef(m_fixed) #not too bad
## (Intercept)        temp        year 
##    1.418019    2.113591   -1.547111

#a lmm
m_rand <- lmer(y ~ temp + year + (1 | factor(year)), data)
summary(m_rand)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ temp + year + (1 | factor(year))
##    Data: data
## 
## REML criterion at convergence: 304.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0194 -0.7775  0.1780  0.6733  1.9903 
## 
## Random effects:
##  Groups       Name        Variance Std.Dev.
##  factor(year) (Intercept) 0.425    0.6519  
##  Residual                 1.004    1.0019  
## Number of obs: 100, groups:  factor(year), 10
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  1.40266    0.49539   2.831
## temp         2.01298    0.10191  19.752
## year        -1.54832    0.07981 -19.400
## 
## Correlation of Fixed Effects:
##      (Intr) temp  
## temp  0.031       
## year -0.885  0.015
fixef(m_rand) #very close to the lm estimates
## (Intercept)        temp        year 
##    1.402660    2.012976   -1.548319
plot(rnd_eff, ranef(m_rand)[[1]][, 1])
abline(0, 1) #not too bad
VarCorr(m_rand) #the estimated sd for the between-year variation
##  Groups       Name        Std.Dev.
##  factor(year) (Intercept) 0.65191 
##  Residual                 1.00189
```

Interestingly, we see in this case that the standard errors (and the related t-values) of the intercept and the year slope are twice as big (the t-values twice as small) in the LMM compared to the LM. This means that not taking the between-year random variation into account leads us to over-estimate the precision of the long-term temporal trend (we might conclude that there is a significant effect when there is a lot of unaccounted-for noise). I still don’t fully grasp how this works, but thanks to the commenters for pointing this out.

The second issue is maybe a bit older, but I saw it appear in a recent paper (which is a cool one except for this stats detail). After fitting a model with several predictors one wants to plot their effects on the response, and some people use partial residual plots to do this (wiki). The issue with these plots is that when two variables have a high covariance, the partial residual plot will tend to be over-optimistic concerning the effect of variable X on Y (ie the plot will look much nicer than it should). Again let’s do a little simulation on this:

```r
library(MASS)
set.seed(20150830)
#say we measure plant biomass in relation with measured temperature and number of sunny hours per week
#the variance-covariance matrix between temperature and sunny hours
sig <- matrix(c(2, 0.7, 0.7, 10), ncol = 2, byrow = TRUE)
#simulate some data
xs <- mvrnorm(100, c(5, 50), sig)
data <- data.frame(temp = xs[, 1], sun = xs[, 2])
modmat <- model.matrix(~ temp + sun, data)
eff <- c(1, 2, 0.2)
data$biom <- rnorm(100, modmat %*% eff, 0.7)
m <- lm(biom ~ temp + sun, data)
sun_new <- data.frame(sun = seq(40, 65, length = 20), temp = mean(data$temp))

#partial residual plot of sun
sun_res <- resid(m) + coef(m)[3] * data$sun
plot(data$sun, sun_res, xlab = "Number of sunny hours", ylab = "Partial residuals of Sun")
lines(sun_new$sun, coef(m)[3] * sun_new$sun, lwd = 3, col = "red")
```

```r
#plot of sun effect while controlling for temp
pred_sun <- predict(m, newdata = sun_new)
plot(biom ~ sun, data, xlab = "Number of sunny hours", ylab = "Plant biomass")
lines(sun_new$sun, pred_sun, lwd = 3, col = "red")
```

```r
#same stuff for temp
temp_new <- data.frame(temp = seq(1, 9, length = 20), sun = mean(data$sun))
pred_temp <- predict(m, newdata = temp_new)
plot(biom ~ temp, data, xlab = "Temperature", ylab = "Plant biomass")
lines(temp_new$temp, pred_temp, lwd = 3, col = "red")
```

The first graph is a partial residual plot. From this graph alone we would be tempted to say that the number of sunny hours has a large influence on the biomass. This conclusion is biased by the fact that the number of sunny hours covaries with temperature, and temperature has a large influence on plant biomass. So which is more important, temperature or sun? The way to resolve this is to plot the actual observations and to add a fitted regression line from a new dataset (sun_new in the example) where one variable is allowed to vary while all others are fixed to their means. This way we see how an increase in the number of sunny hours at an average temperature affects the biomass (the second figure). The final graph then shows the effect of temperature while controlling for the effect of the number of sunny hours.

Happy modelling!

Count data are widely collected in ecology, for example when one counts the number of birds or the number of flowers. These data naturally follow a Poisson or negative binomial distribution and are therefore sometimes tricky to fit with standard LMs. A traditional approach has been to log-transform such data and then fit LMs to the transformed data. Recently a paper advocated against the use of such transformations, since they lead to high bias in the estimated coefficients. More recently another paper argued that log-transformation of count data followed by an LM leads to a lower type I error rate (ie saying that an effect is significant when it is not) than GLMs. What should we do then?
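To make the bias argument concrete before diving into the full simulations, here is a minimal, self-contained sketch (my own toy example, not the Ives code) comparing the slope recovered by a log-transformed LM with the one from a negative binomial GLM on data simulated with a known slope:

```r
library(MASS) #for rnegbin and glm.nb
set.seed(42)
n <- 1000
x <- runif(n, -1, 1)
b0 <- 1; b1 <- 1 #true coefficients on the log scale
y <- rnegbin(n, mu = exp(b0 + b1 * x), theta = 1) #negative binomial counts

#log-transformed LM (adding 1 to avoid log(0))
coef_lm <- coef(lm(log(y + 1) ~ x))["x"]
#negative binomial GLM
coef_nb <- coef(glm.nb(y ~ x))["x"]

#the GLM slope should be much closer to the true value of 1
abs(coef_nb - b1) < abs(coef_lm - b1)
```

The log-transformed LM typically recovers a strongly attenuated slope here, which is the bias the rest of this post quantifies more systematically.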

Using a slightly changed version of the code published in the Ives 2015 paper, let’s explore the impact of using these different modelling strategies on the rejection of the null hypothesis and on the bias of the estimated coefficients.

```r
#load the libraries
library(MASS)
library(lme4)
library(ggplot2)
library(RCurl)
library(plyr)
#download and load the functions that will be used
URL <- "https://raw.githubusercontent.com/Lionel68/Jena_Exp/master/stats/ToLog_function.R"
download.file(URL, destfile = paste0(getwd(), "/ToLog_function.R"), method = "curl")
source("ToLog_function.R")
```

Following the paper from Ives and the code therein, I simulated a predictor (x) and a response (y) that follows a negative binomial distribution and is linearly related to x.

In the first case I look at the impact of varying the sample size on the rejection of the null hypothesis and on the bias in the estimated coefficient relating y to x.

```r
###### univariate NB case ############
base_theme <- theme(title = element_text(size = 20), text = element_text(size = 18))
#range over n
output <- compute.stats(NRep = 500, n.range = c(10, 20, 40, 80, 160, 320, 640, 1280))
ggplot(output, aes(x = n, y = Reject, color = Model)) + geom_path(size = 2) +
  scale_x_log10() +
  labs(x = "Sample Size (log-scale)", y = "Proportion of Rejected H0") + base_theme
ggplot(output, aes(x = n, y = Bias, color = Model)) + geom_path(size = 2) +
  scale_x_log10() +
  labs(x = "Sample Size (log-scale)", y = "Bias in the slope coefficient") + base_theme
```

For this simulation round the coefficient of the slope (b1) was set to 0 (no effect of x on y), and only the sample size varied. The top panel shows the average proportion of times that the p-value of the slope coefficient was lower than 0.05 (H0: b1 = 0 rejected). We see that for low sample sizes (<40) the negative binomial model has a higher proportion of rejected H0 (type I error rate), but this difference between the models disappears once sample sizes exceed 100. The bottom panel shows the bias (estimated value – true value) in the estimated coefficient. For very low sample size (n=10), Log0001, negative binomial and quasipoisson have higher bias than Log1 and LogHalf. For larger sample sizes the difference between the GLM team (NB and QuasiP) and the LM one (Log1 and LogHalf) gradually decreases, and both teams converge to a bias around 0. Only Log0001 behaves very badly. From what we see here it seems that Log1 and LogHalf are good choices for count data: they have low type I error and low bias along the whole sample size gradient.

The issue is that an effect of exactly 0 never exists in real life, where most effects are small (but non-zero), so the null hypothesis will never be true. Let’s look now at how the different models behave when we vary b1, first alone and then crossed with sample size variation.

```r
#range over b1
output2 <- compute.stats(NRep = 500, b1.range = seq(-2, 2, length = 17))
ggplot(output2, aes(x = b1, y = Reject, color = Model)) + geom_path(size = 2) +
  base_theme + labs(y = "Proportion of Rejected H0")
ggplot(output2, aes(x = b1, y = Bias, color = Model)) + geom_path(size = 2) +
  base_theme + labs(y = "Bias in the slope coefficient")
```

Here the sample size was set to 100. What we see in the top graph is that for a slope of exactly 0 all models have a similar average proportion of rejection of the null hypothesis. As b1 becomes smaller or bigger, the average proportion of rejection shows a very similar increase for all models except Log0001, which increases more slowly. These curves basically represent the power of the models to detect an effect and are very similar to Fig. 2 in the Ives 2015 paper. Now the bottom panel shows that all the LM models behave badly concerning their bias: they have only small bias for very small (close to 0) coefficients, and as the coefficient gets bigger the absolute bias increases. This means that even if the LM models are able to detect an effect with similar power, the estimated coefficient is wrong. This is due to the value added to the untransformed count data in order to avoid -Inf for 0s. I have no idea how one could arithmetically take these added values into account and then remove their effect …

Next let’s cross variation in the coefficient with sample size variation:

```r
#range over n and b1
output3 <- compute.stats(NRep = 500, b1.range = seq(-1.5, 1.5, length = 9),
                         n.range = c(10, 20, 40, 80, 160, 320, 640, 1280))
ggplot(output3, aes(x = n, y = Reject, color = Model)) + geom_path(size = 2) +
  scale_x_log10() + facet_wrap(~b1) + base_theme +
  labs(x = "Sample size (log-scale)", y = "Proportion of rejected H0")
ggplot(output3, aes(x = n, y = Bias, color = Model)) + geom_path(size = 2) +
  scale_x_log10() + facet_wrap(~b1) + base_theme +
  labs(x = "Sample size (log-scale)", y = "Bias in the slope coefficient")
```

The top panel shows one big issue of focusing only on the significance level: rejection of H0 depends not only on the size of the effect but also on the sample size. For example for b1=0.75 (a rather large value since we work on the exponential scale) less than 50% of all models rejected the null hypothesis for a sample size of 10. Of course as the effect sizes get larger the impact of the sample size on the rejection of the null hypothesis is reduced. However most real-world effects are small, so we need big sample sizes to be able to “detect” them using null hypothesis testing. The top graph also shows that NB is slightly better than the other models and that Log0001 again has the worst performance. The bottom graphs show something interesting: the bias is quasi-constant over the sample size gradient (maybe if we looked closer we would see some variation). Irrespective of how many data points you collect, the LMs will always have bigger bias than the GLMs (except for the artificial case of b1=0).

Finally, the big surprise in Ives 2015 was the explosion of type I error with increasing variation in individual-level random error (adding a random normally distributed value to the linear predictor of each data point and varying the standard deviation of these random values), as can be seen in Fig. 3 of the paper.

```r
#range over b1 and sd.eps
output4 <- compute.statsGLMM(NRep = 500, b1.range = seq(-1.5, 1.5, length = 9),
                             sd.eps.range = seq(0.01, 2, length = 10))
ggplot(output4, aes(x = sd.eps, y = Reject, color = Model)) + geom_path(size = 2) +
  facet_wrap(~b1) + base_theme + labs(x = "", y = "Proportion of rejected H0")
ggplot(output4, aes(x = sd.eps, y = Bias, color = Model)) + geom_path(size = 2) +
  facet_wrap(~b1) + base_theme +
  labs(x = "Standard Deviation of the random error", y = "Bias of the slope coefficient")
```

Before looking at the figure in detail, please note that a standard deviation of 2 in this context is very high; remember that these values are added to the linear predictor, which is exponentiated, so we end up with very large deviations. In the top panel there are two surprising results: the sign of the coefficient affects the pattern of null hypothesis rejection, and I do not see the explosion of rejection rates for NB or QuasiP that is presented in Ives 2015. In his paper Ives reported the LRT test for the NB models, while I am reporting the p-values from the model summary directly (Wald test). If some people around have computing power, it would be interesting to see whether changing the seed and/or increasing the number of replications leads to different patterns … The bottom panel shows again that the LMs’ biases are big; the NB and QuasiP models show an increase in bias with the standard deviation, but only when the coefficient is negative (I suspect some issue with exponentiating large positive random errors). As expected, the GLMM performs best in this context.

Pre-conclusion: in real life of course we would rarely have a model with only one predictor; most of the time we build larger models with complex interaction structures between the predictors. These will of course affect both H0 rejection and bias, but this is material for a next post :)

Let’s wrap it up; we’ve seen that even if log-transformation plus LM seems a good choice for achieving a lower type I error rate than GLMs, this advantage will be rather minimal when using empirical data (no effects are exactly 0) and potentially dangerous (large bias). Ecologists sometimes have the bad habit of turning their analysis into a star hunt (the standard R model output gives stars to significant effects), and focusing only on models that behave best regarding significance (but have large bias) does not seem a good strategy to me. More and more people call for increasing the predictive power of ecological models; we then need modelling techniques that are able to estimate the effects precisely (with low bias). In this context, transforming the data to make them somehow fit normality assumptions is sub-optimal; it is much more natural to think about what type of process generated the data (normal, Poisson, negative binomial, with or without hierarchical structure) and then model it accordingly. There are extensive discussions nowadays about the use and abuse of p-values in science, and I think that in our analyses we should slowly but surely shift our focus from “significance/p-values<0.05/star hunting” alone to a more balanced mix of effect sizes (or standardized slopes), p-values and R-squared.

With LMs and GLMs the predict function can return the standard error of the predicted values on either the observed data or on new data. This is then used to draw confidence or prediction intervals around the fitted regression lines. The confidence interval (CI) focuses on the regression line and can be interpreted as (assuming that we draw a 95% CI): “If we repeated our sampling X times, the regression line would fall within this interval 95% of the time”. The prediction interval, on the other hand, focuses on single data points and can be interpreted as (again assuming a 95% interval): “If we sampled X times at these particular values of the explanatory variables, the response value would fall within this interval 95% of the time”. The Wikipedia page has some nice explanations about the meaning of confidence intervals.
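For LMs these two intervals come straight out of predict via its interval argument; a minimal sketch on simulated data (my own toy example):

```r
set.seed(123)
#simulate a simple linear relationship
data <- data.frame(x = runif(50, 0, 10))
data$y <- rnorm(50, 1 + 0.5 * data$x, 1)
m <- lm(y ~ x, data)

newdat <- data.frame(x = seq(0, 10, length = 20))
#uncertainty around the regression line
ci_fit <- predict(m, newdata = newdat, interval = "confidence")
#uncertainty around new single observations
pi_fit <- predict(m, newdata = newdat, interval = "prediction")

#the prediction interval is always at least as wide as the confidence interval,
#since it adds the residual variation on top of the uncertainty in the line
all(pi_fit[, "upr"] - pi_fit[, "lwr"] >= ci_fit[, "upr"] - ci_fit[, "lwr"])
```

The same logic carries over to GLMs via the standard errors returned by predict(..., se.fit = TRUE); it is for GLMMs that things get harder, as discussed next.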

For GLMMs the predict function does not allow one to derive standard errors, the reason being (from the help page of predict.merMod): “There is no option for computing standard errors of predictions because it is difficult to define an efficient method that incorporates uncertainty in the variance parameters”. This means that there is for now no way for the computation of the standard errors of predicted values to account for the fact that the fitted random effect standard deviations are just estimates, and may be more or less well estimated. We can however still derive confidence or prediction intervals, keeping in mind that we might underestimate the uncertainty around the estimates.

```r
library(lme4)
#first case: a simple lmer, simulate 100 data points from 10 groups with one continuous fixed effect variable
x <- runif(100, 0, 10)
f <- gl(n = 10, k = 10)
data <- data.frame(x = x, f = f)
modmat <- model.matrix(~x, data)
#the fixed effect coefficients
fixed <- c(1, 0.5)
#the random effects
rnd <- rnorm(10, 0, 0.7)
#the simulated response values
data$y <- rnorm(100, modmat %*% fixed + rnd[f], 0.3)

#model
m <- lmer(y ~ x + (1 | f), data)

#first, CI and PI using a predict-like method, using code posted here: http://glmm.wikidot.com/faq
newdat <- data.frame(x = seq(0, 10, length = 20))
mm <- model.matrix(~x, newdat)
newdat$y <- mm %*% fixef(m) #predict(m,newdat,re.form=NA) would give the same results
pvar1 <- diag(mm %*% tcrossprod(vcov(m), mm))
tvar1 <- pvar1 + VarCorr(m)$f[1] #must be adapted for more complex models
newdat <- data.frame(newdat,
                     plo = newdat$y - 1.96 * sqrt(pvar1),
                     phi = newdat$y + 1.96 * sqrt(pvar1),
                     tlo = newdat$y - 1.96 * sqrt(tvar1),
                     thi = newdat$y + 1.96 * sqrt(tvar1))

#second version with bootMer
#we have to define a function that will be applied to the nsim simulations
#here we basically get a merMod object and return the fitted values
predFun <- function(.) mm %*% fixef(.)
bb <- bootMer(m, FUN = predFun, nsim = 200) #do this 200 times
#as we did this 200 times the 95% CI will be bordered by the 5th and 195th values
bb_se <- apply(bb$t, 2, function(x) x[order(x)][c(5, 195)])
newdat$blo <- bb_se[1, ]
newdat$bhi <- bb_se[2, ]

#plot
plot(y ~ x, data)
lines(newdat$x, newdat$y, col = "red", lty = 2, lwd = 3)
lines(newdat$x, newdat$plo, col = "blue", lty = 2, lwd = 2)
lines(newdat$x, newdat$phi, col = "blue", lty = 2, lwd = 2)
lines(newdat$x, newdat$tlo, col = "orange", lty = 2, lwd = 2)
lines(newdat$x, newdat$thi, col = "orange", lty = 2, lwd = 2)
lines(newdat$x, newdat$bhi, col = "darkgreen", lty = 2, lwd = 2)
lines(newdat$x, newdat$blo, col = "darkgreen", lty = 2, lwd = 2)
legend("topleft",
       legend = c("Fitted line", "Confidence interval", "Prediction interval", "Bootstrapped CI"),
       col = c("red", "blue", "orange", "darkgreen"), lty = 2, lwd = 2, bty = "n")
```

This looks pretty familiar, the prediction interval being always bigger than the confidence interval.

Now, in the help page for the predict.merMod function, the authors of the lme4 package wrote that bootMer should be the preferred method to derive confidence intervals from GLMMs. The idea is to simulate new data from the model N times and compute some statistic of interest. In our case we are interested in the bootstrapped fitted values, to get a confidence interval for the regression line. bb$t is a matrix with the observations in the columns and the different bootstrapped samples in the rows. To get the 95% CI for the fitted line we then need to take the [0.025*N, 0.975*N] values of the sorted bootstrapped values.
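The same percentile interval can be obtained a bit more conveniently with quantile instead of sorting and indexing by hand, which also generalizes to any nsim; a self-contained sketch, refitting a small model of the same structure as above:

```r
library(lme4)
set.seed(20150913) #arbitrary seed for this sketch
#refit a small example model (same structure as in the post)
x <- runif(100, 0, 10)
f <- gl(10, 10)
d <- data.frame(x = x, f = f)
d$y <- rnorm(100, 1 + 0.5 * x + rnorm(10, 0, 0.7)[f], 0.3)
m <- lmer(y ~ x + (1 | f), d)

#bootstrap the fitted values over 20 new x values
mm <- model.matrix(~x, data.frame(x = seq(0, 10, length = 20)))
bb <- bootMer(m, FUN = function(.) as.vector(mm %*% fixef(.)), nsim = 200)

#percentile CI via quantile instead of manual sorting/indexing
ci <- apply(bb$t, 2, quantile, probs = c(0.025, 0.975))
dim(ci) #2 rows (lower/upper bound) x 20 prediction points
```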

The bootstrapped CI falls pretty close to the “normal” CI, even though new random effect values were drawn for each bootstrapped sample (because use.u=FALSE by default in bootMer).

Now let’s turn to a more complex example, a Poisson GLMM with two crossed random effects:

```r
#second case: more complex design with two crossed RE and a poisson response
x <- runif(100, 0, 10)
f1 <- gl(n = 10, k = 10)
f2 <- as.factor(rep(1:10, 10))
data <- data.frame(x = x, f1 = f1, f2 = f2)
modmat <- model.matrix(~x, data)
fixed <- c(-0.12, 0.35)
rnd1 <- rnorm(10, 0, 0.7)
rnd2 <- rnorm(10, 0, 0.2)
mus <- modmat %*% fixed + rnd1[f1] + rnd2[f2]
data$y <- rpois(100, exp(mus))

m <- glmer(y ~ x + (1 | f1) + (1 | f2), data, family = "poisson")

#for GLMMs we have to back-transform the prediction after adding/removing the SE
newdat <- data.frame(x = seq(0, 10, length = 20))
mm <- model.matrix(~x, newdat)
y <- mm %*% fixef(m)
pvar1 <- diag(mm %*% tcrossprod(vcov(m), mm))
tvar1 <- pvar1 + VarCorr(m)$f1[1] + VarCorr(m)$f2[1] #must be adapted for more complex models
newdat <- data.frame(x = newdat$x,
                     y = exp(y),
                     plo = exp(y - 1.96 * sqrt(pvar1)),
                     phi = exp(y + 1.96 * sqrt(pvar1)),
                     tlo = exp(y - 1.96 * sqrt(tvar1)),
                     thi = exp(y + 1.96 * sqrt(tvar1)))

#second version with bootMer
predFun <- function(.) exp(mm %*% fixef(.))
bb <- bootMer(m, FUN = predFun, nsim = 200)
bb_se <- apply(bb$t, 2, function(x) x[order(x)][c(5, 195)])
newdat$blo <- bb_se[1, ]
newdat$bhi <- bb_se[2, ]

#plot
plot(y ~ x, data)
lines(newdat$x, newdat$y, col = "red", lty = 2, lwd = 3)
lines(newdat$x, newdat$plo, col = "blue", lty = 2, lwd = 2)
lines(newdat$x, newdat$phi, col = "blue", lty = 2, lwd = 2)
lines(newdat$x, newdat$tlo, col = "orange", lty = 2, lwd = 2)
lines(newdat$x, newdat$thi, col = "orange", lty = 2, lwd = 2)
lines(newdat$x, newdat$bhi, col = "darkgreen", lty = 2, lwd = 2)
lines(newdat$x, newdat$blo, col = "darkgreen", lty = 2, lwd = 2)
legend("topleft",
       legend = c("Fitted line", "Confidence interval", "Prediction interval", "Bootstrapped CI"),
       col = c("red", "blue", "orange", "darkgreen"), lty = 2, lwd = 2, bty = "n")
```

Again in this case the bootstrapped CI fell pretty close to the “normal” CI. We have now seen three different ways to derive intervals representing the uncertainty around the regression lines (CI) and the response points (PI). Choosing among them depends on what you want to see (the level of uncertainty around the fitted line vs the values new observations would take), but also, for complex models, on computing power, as bootMer can take some time to run for GLMMs with many observations and complex model structures.

Biological diversity (or biodiversity) is a complex concept with many different aspects to it, like species richness, evenness or functional redundancy. My field of research focuses on understanding the effects of changing plant diversity on the communities of higher trophic levels, but also on ecosystem function. Even if the founding papers of this area of research already hypothesized that all components of biodiversity could influence ecosystem function (see Fig. 1 in Chapin et al. 2000), the first experimental results focused on taxonomic diversity (i.e. species richness, Shannon diversity, Shannon evenness …). More recently the importance of functional diversity as the main driver of changes in ecosystem function has been emphasized by several authors (e.g. Reiss et al. 2009). The idea behind functional diversity is basically to measure a characteristic (a trait) of the organisms under study, for example the height of a plant or the body mass of an insect, and then derive an index of how diverse these trait values are at a particular site. A nice introduction to the topic is the chapter by Evan Weiher in Biological Diversity.

Just as taxonomic diversity has many different indices, so does functional diversity. The recent development of a new multidimensional framework (Villéger et al. 2008) and of an R package allows researchers to easily derive functional diversity indices from their datasets. But finding the right index for one's system can be rather daunting, and as several studies have shown there is no single best index (see Weiher, Ch. 13 in Biological Diversity) but rather a set of different indices, each showing a different facet of functional diversity, like functional richness, functional evenness or functional divergence.
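To make this concrete, here is a minimal sketch (not from the original post) of how such indices can be computed with the dbFD function from the FD package; the two traits and the two communities are made up for illustration:

```r
library(FD)

set.seed(42)
#a pool of 8 species measured on two hypothetical traits
traits <- data.frame(height = runif(8, 10, 100), #e.g. plant height (cm)
                     mass   = runif(8, 0.1, 5))  #e.g. body mass (g)
rownames(traits) <- paste0("sp", 1:8)

#community x species abundance matrix; species names must match the trait table
abund <- rbind(comm1 = c(5, 5, 5, 5, 5, 5, 5, 5),   #perfectly even community
               comm2 = c(20, 10, 5, 2, 1, 1, 1, 0)) #dominated by sp1
colnames(abund) <- rownames(traits)

fd <- dbFD(traits, abund, messages = FALSE)
fd$FRic #functional richness: volume of trait space filled
fd$FEve #functional evenness: regularity of abundance in trait space
fd$FDiv #functional divergence
```

The three indices illustrate the “different facets” point: two communities with identical species pools can share the same functional richness while differing strongly in functional evenness.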

Here I show a little Shiny App to graphically explore, in a two-dimensional trait space, the meaning of a set of functional diversity indices. The App is still in its infancy and many things could be added (e.g. variation in trait distributions …), but here it is:

#load libraries
library(shiny)
library(MASS)
library(geometry)
library(plotrix)
library(FD)

#launch the App
runGitHub("JenaExp", username = "Lionel68", subdir = "FD/Shiny_20150426")

All codes are available here: https://github.com/Lionel68/JenaExp/tree/master/FD

Literature:

Chapin III, F. Stuart, et al. “Consequences of changing biodiversity.” Nature 405.6783 (2000): 234-242.

Reiss, Julia, et al. “Emerging horizons in biodiversity and ecosystem functioning research.” Trends in ecology & evolution 24.9 (2009): 505-514.

Weiher, E. “A primer of trait and functional diversity.” Biological diversity, frontiers in measurement and assessment (2011): 175-193.

Villéger, Sébastien, Norman WH Mason, and David Mouillot. “New multidimensional functional diversity indices for a multifaceted framework in functional ecology.” Ecology 89.8 (2008): 2290-2301.

As always a more colourful version of this post is available on rpubs.

Even if linear models (LM) are very simple models that form the basis of many more complex ones, they still have assumptions that, if not met, render any interpretation drawn from the model plainly wrong. In my field of research most people were taught to check ANOVA assumptions using tests like Levene & co. This is, however, not the best way to check whether a model meets its assumptions, as p-values depend on sample size: with small samples we will almost never reject the null hypothesis, while with big samples even small deviations will lead to significant p-values (discussion). As ANOVA and linear models are two different ways to look at the same model (explanation), we can check ANOVA assumptions using the graphical checks of a linear model. In R this is easily done using *plot(model)*, but people often ask me what amount of deviation makes me reject a model. One easy way to see whether the model-checking graphs are off the charts is to simulate data from the model, fit the model to these newly simulated data, and compare the graphical checks from the simulated data with those from the real data. If you cannot tell the simulated and the real data apart then your model is fine; if you can, then try again!

EDIT: You can also make the conclusion from your visual inspection more rigorous by following the protocols outlined in this article. (Thanks to Tim for the comment)

Below is a little function that implements this idea:

lm.test <- function(m){
  require(plyr)
  #the model frame
  dat <- model.frame(m)
  #the model matrix
  f <- formula(m)
  modmat <- model.matrix(f, dat)
  #the standard deviation of the residuals
  sd.resid <- sd(resid(m))
  #sample size
  n <- dim(dat)[1]
  #simulate 8 response vectors from the model
  ys <- lapply(1:8, function(x) rnorm(n, modmat %*% coef(m), sd.resid))
  #refit the models
  ms <- llply(ys, function(y) lm(y ~ modmat[, -1]))
  #put the residuals and fitted values in a list
  df <- llply(ms, function(x) data.frame(Fitted = fitted(x), Resid = resid(x)))
  #select a random position from 2 to 8
  rnd <- sample(2:8, 1)
  #insert the original data into the list at that position
  df <- c(df[1:(rnd - 1)],
          list(data.frame(Fitted = fitted(m), Resid = resid(m))),
          df[rnd:8])
  #plot the residuals vs fitted values
  par(mfrow = c(3, 3))
  l_ply(df, function(x){
    plot(Resid ~ Fitted, x, xlab = "Fitted", ylab = "Residuals")
    abline(h = 0, lwd = 2, lty = 2)
  })
  #plot the QQ-plots of the residuals
  l_ply(df, function(x){
    qqnorm(x$Resid)
    qqline(x$Resid)
  })
  out <- list(Position = rnd)
  return(out)
}

This function prints the two basic checking plots: one looking at the spread of the residuals around the fitted values, the other looking at the normality of the residuals. The function returns the position of the real model in the 3×3 window, counting from left to right and from top to bottom (i.e. position 1 is the upper-left graph).

Let’s try the function:

#a simulated data frame of independent variables
dat <- data.frame(Temp = runif(100, 0, 20), Treatment = gl(n = 5, k = 20))
contrasts(dat$Treatment) <- "contr.sum"
#the model matrix
modmat <- model.matrix(~Temp * Treatment, data = dat)
#the coefficients
coeff <- rnorm(10, 0, 4)
#simulate response data
dat$Biomass <- rnorm(100, modmat %*% coeff, 1)
#the model
m <- lm(Biomass ~ Temp * Treatment, dat)
#model check
chk <- lm.test(m)

Can you find which one is the real one? I could not; here is the answer:

chk
## $Position
## [1] 4

Happy and safe modelling!

[UPDATED: I modified the code of the function a bit; you no longer need to pass the random-effect terms as character]

This article may also be found on RPubs: http://rpubs.com/hughes/63269

In its list of worst to best ways to test for effects in GLMM, the FAQ at http://glmm.wikidot.com/faq states that parametric bootstrapping is among the best options. PBmodcomp in the pbkrtest package implements such parametric bootstrapping by comparing a full model to a null one. The function simulates data (the response vector) from the null model, fits both the null and the full model to these data, and derives a likelihood ratio test (LRT) for each simulated dataset. We can then compare the observed LRT statistic to the null distribution generated from the many simulations and derive a p-value. The advantage of this method over the classical p-value derived from a chi-square test on the LRT is that the parametric bootstrap does not assume any parametric null distribution (like the chi-square) but instead derives its own null distribution from the model and the data at hand; we therefore do not assume that the LRT statistic is chi-square distributed. I have made a little function that wraps around PBmodcomp to compute bootstrapped p-values for each term in a model by adding them sequentially. This leads to ANOVA-like tables like those typically obtained when one uses the command anova on a glm object.
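Before wrapping it, it may help to see PBmodcomp used directly for a single full-vs-null comparison. A minimal sketch (not from the original post), using the sleepstudy data shipped with lme4:

```r
library(lme4)
library(pbkrtest)

#full and null models, fitted with ML (REML = FALSE) so they can be compared
full <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy, REML = FALSE)
null <- lmer(Reaction ~ 1 + (1 | Subject), sleepstudy, REML = FALSE)

#200 simulations for speed here; use >= 1000 for reportable results
pb <- PBmodcomp(full, null, nsim = 200, seed = 42)
summary(pb) #shows both the classical LRT p-value and the bootstrapped (PBtest) p-value
```

The wrapper below simply automates this comparison for every term in the model.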

#the libraries used
library(lme4)
library(arm)
library(pbkrtest)

#the function, with the following arguments:
#@model the merMod model fitted by lmer or glmer
#@w the weights used in a binomial fit
#@seed you can set a seed to find back the same results after bootstrapping
#@nsim the number of bootstrapped simulations; if set to 0, return the Chisq p-value from the LRT test
#@fixed should the variables be dropped based on the order given in the model (TRUE)
#       or should the dropping go from worst to best variable (FALSE)
anova_merMod <- function(model, w = FALSE, seed = round(runif(1, 0, 100), 0), nsim = 0, fixed = TRUE){
  require(pbkrtest)
  data <- model@frame
  if(w){
    weight <- data[, which(names(data) == "(weights)")]
    data <- data[, -which(names(data) == "(weights)")]
  }
  f <- formula(model)
  resp <- as.character(f)[2]
  #extract and rebuild the random-effect part of the formula
  rand <- lme4:::findbars(f)
  rand <- lapply(rand, function(x) as.character(x))
  rand <- lapply(rand, function(x) paste("(", x[2], x[1], x[3], ")", sep = " "))
  rand <- paste(rand, collapse = " + ")
  #generate a list of reduced model formulas
  fs <- list()
  fs[[1]] <- as.formula(paste(resp, "~ 1 +", rand))
  nb_terms <- length(attr(terms(model), "term.labels"))
  if(nb_terms > 1){
    #the next lines order the terms so that the most important term is added first
    #and the least important one at the end, going first through the interactions
    #and then through the main effects
    mat <- data.frame(term = attr(terms(model), "term.labels"),
                      SSQ = anova(model)[, 3], stringsAsFactors = FALSE)
    mat_inter <- mat[grep(":", mat$term), ]
    mat_main <- mat[!rownames(mat) %in% rownames(mat_inter), ]
    if(!fixed){
      mat_main <- mat_main[do.call(order, list(-mat_main$SSQ)), ]
      mat_inter <- mat_inter[do.call(order, list(-mat_inter$SSQ)), ]
      mat <- rbind(mat_main, mat_inter)
    }
    for(i in 1:nb_terms){
      tmp <- c(mat[1:i, 1], rand)
      fs[[i + 1]] <- reformulate(tmp, response = resp)
    }
  } else{
    mat <- data.frame(term = attr(terms(model), "term.labels"), stringsAsFactors = FALSE)
  }
  #fit the reduced models to the data
  fam <- family(model)[1]$family
  if(fam == "gaussian"){
    m_fit <- lapply(fs, function(x) lmer(x, data, REML = FALSE))
  } else if(fam == "binomial"){
    m_fit <- lapply(fs, function(x) glmer(x, data, family = fam, weights = weight))
  } else{
    m_fit <- lapply(fs, function(x) glmer(x, data, family = fam))
  }
  if(nb_terms == 1){
    if(fam == "gaussian"){
      m_fit[[2]] <- lmer(formula(model), data, REML = FALSE)
    } else if(fam == "binomial"){
      m_fit[[2]] <- glmer(formula(model), data, family = fam, weights = weight)
    } else{
      m_fit[[2]] <- glmer(formula(model), data, family = fam)
    }
  }
  #compare the nested models with one another and get LRT values
  #(ie the increase in the likelihood of the models as parameters are added)
  tab_out <- NULL
  for(i in 1:(length(m_fit) - 1)){
    if(nsim > 0){
      comp <- PBmodcomp(m_fit[[i + 1]], m_fit[[i]], seed = seed, nsim = nsim)
      term_added <- mat[i, 1]
      #here are reported the bootstrapped p-values, ie not assuming any parametric
      #distribution (like the chi-square) for the LRT values generated under the null model;
      #these p-values represent the proportion of simulated LRT values (under the null
      #model) that are larger than the observed one
      tmp <- data.frame(term = term_added, LRT = comp$test$stat[1],
                        p_value = comp$test$p.value[2])
      tab_out <- rbind(tab_out, tmp)
      print(paste("Variable ", term_added, " tested", sep = ""))
    } else{
      comp <- anova(m_fit[[i + 1]], m_fit[[i]])
      term_added <- mat[i, 1]
      tmp <- data.frame(term = term_added, LRT = comp[2, 6], p_value = comp[2, 8])
      tab_out <- rbind(tab_out, tmp)
    }
  }
  print(paste("Seed set to:", seed))
  return(tab_out)
}

You pass your merMod model to the function (the random-effect part is extracted automatically); if you fitted a binomial GLMM you also need to provide the weights as a vector (see the example below). You can then set a seed, and the last argument, nsim, is the number of bootstrap simulations: it defaults to 0, which skips the bootstrap and returns the classical chi-square p-values from the LRT. Small values (e.g. 100) are fine for rapid checking, but if you want to report these results larger values (e.g. 1000 or 10000) should be used.

Let’s look at a simple LMM example:

data(grouseticks)
m <- lmer(TICKS ~ cHEIGHT + YEAR + (1 | BROOD), grouseticks)
summary(m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: TICKS ~ cHEIGHT + YEAR + (1 | BROOD)
##    Data: grouseticks
## 
## REML criterion at convergence: 2755
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -3.406 -0.246 -0.036  0.146  5.807 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  BROOD    (Intercept) 87.3     9.34    
##  Residual             28.1     5.30    
## Number of obs: 403, groups: BROOD, 118
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   5.4947     1.6238    3.38
## cHEIGHT      -0.1045     0.0264   -3.95
## YEAR96        4.1910     2.2424    1.87
## YEAR97       -4.3304     2.2708   -1.91
## 
## Correlation of Fixed Effects:
##         (Intr) cHEIGH YEAR96
## cHEIGHT -0.091              
## YEAR96  -0.726  0.088       
## YEAR97  -0.714  0.052  0.518

anova_merMod(model = m, nsim = 100)
## [1] "Variable cHEIGHT tested"
## [1] "Variable YEAR tested"
## [1] "Seed set to: 23"
##      term      LRT    p_value
## 1 cHEIGHT 14.55054 0.00990099
## 2    YEAR 14.40440 0.00990099

The resulting table shows, for each term in the model, the likelihood ratio test statistic, which is basically the decrease in deviance when going from the null to the full model, and the p-value; you may look at the PBtest line in the Details section of ?PBmodcomp to see how it is computed.
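If you want to see what the LRT column contains, the statistic can be recomputed by hand as twice the difference in log-likelihood between the two nested models (both fitted with ML, not REML, so the likelihoods are comparable). A sketch using the grouseticks data from above:

```r
library(lme4)
data(grouseticks)

#refit the nested pair with ML (REML = FALSE)
m_full <- lmer(TICKS ~ cHEIGHT + YEAR + (1 | BROOD), grouseticks, REML = FALSE)
m_null <- update(m_full, . ~ . - YEAR)

#the LRT statistic: twice the log-likelihood difference
lrt <- as.numeric(2 * (logLik(m_full) - logLik(m_null)))

#the classical p-value assumes a chi-square distribution, here with 2 df
#(YEAR has 3 levels); the parametric bootstrap replaces this assumption
p_chisq <- pchisq(lrt, df = 2, lower.tail = FALSE)
```

The bootstrapped p-value answers the same question without assuming the chi-square shape, by comparing `lrt` to the LRT values simulated under the null model.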

Now let’s see how to use the function with binomial GLMM:

#simulate some binomial data
x1 <- runif(100, -2, 2)
x2 <- runif(100, -2, 2)
group <- gl(n = 20, k = 5)
rnd.eff <- rnorm(20, mean = 0, sd = 1.5)
p <- 1 + 0.5 * x1 - 2 * x2 + rnd.eff[group] + rnorm(100, 0, 0.3)
y <- rbinom(n = 100, size = 10, prob = invlogit(p))
prop <- y / 10

#fit a model
m <- glmer(prop ~ x1 + x2 + (1 | group), family = "binomial", weights = rep(10, 100))
summary(m)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: prop ~ x1 + x2 + (1 | group)
## Weights: rep(10, 100)
## 
##      AIC      BIC   logLik deviance df.resid 
##    288.6    299.1   -140.3    280.6       96 
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -2.334 -0.503  0.181  0.580  2.466 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  group  (Intercept) 1.38     1.18    
## Number of obs: 100, groups: group, 20
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    0.748      0.287    2.61   0.0092 ** 
## x1             0.524      0.104    5.02  5.3e-07 ***
## x2            -2.083      0.143  -14.56  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##    (Intr) x1    
## x1  0.090       
## x2 -0.205 -0.345

#takes some time
anova_merMod(m, w = rep(10, 100), nsim = 100)
## [1] "Variable x1 tested"
## [1] "Variable x2 tested"
## [1] "Seed set to: 98"
##   term      LRT p_value
## 1   x1   0.0429 0.80392
## 2   x2 502.0921 0.01961

For binomial models, the model must be fitted on proportion data and a vector of weights (i.e. the number of binomial trials) must be passed to the ‘w’ argument. Some warning messages may pop up at the end of the function; these come from convergence failures in PBmodcomp and do not affect the results. You may read the article on the pbkrtest package (http://www.jstatsoft.org/v59/i09/) to better understand where they come from.
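As an aside (not part of the original function's interface), lme4 also accepts a two-column (successes, failures) response for binomial models, which encodes the number of trials without a separate weights vector; the wrapper above, however, expects the proportion-plus-weights form. A self-contained sketch with made-up data:

```r
library(lme4)

set.seed(1)
#simulate 100 observations of 10 binomial trials each, in 20 groups
x1 <- runif(100, -2, 2)
group <- gl(20, 5)
rnd.eff <- rnorm(20, 0, 1.5)
p <- plogis(1 + 0.5 * x1 + rnd.eff[group])
y <- rbinom(100, size = 10, prob = p)

#two-column response: successes and failures per observation
m2 <- glmer(cbind(y, 10 - y) ~ x1 + (1 | group), family = "binomial")
```

Both parameterizations fit the same likelihood; the weights form is simply what anova_merMod needs to rebuild and refit the reduced models.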

Happy modeling, and as Ben Bolker says: “When all else fails, don’t forget to keep p-values in perspective: http://www.phdcomics.com/comics/archive.php?comicid=905 ”