This is something that bugged me for some time: how do we add up standard errors? This is relevant when you fit a model with interaction terms and you are interested not only in the deviation between different categories in your data (like male, female juveniles) but also in whether the effect of some covariates on … Continue reading Adding standard errors for interaction terms
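As a rough illustration of the question raised in this excerpt, here is a minimal sketch of one common way to get the standard error of a sum of two coefficients: use the variance of a sum, which includes the covariance term. The model, the `mtcars` variables and the object names (`cars`, `fit`, `slope_am1`, `se_am1`) are assumptions chosen for illustration and are not necessarily what the linked post uses.

```r
# Illustrative model on the built-in mtcars data (not the post's own data)
cars <- transform(mtcars, am = factor(am))
fit <- lm(mpg ~ wt * am, data = cars)

b <- coef(fit)   # coefficients: (Intercept), wt, am1, wt:am1
V <- vcov(fit)   # variance-covariance matrix of the coefficients

# Slope of wt when am == 1: main effect plus interaction term
slope_am1 <- unname(b["wt"] + b["wt:am1"])

# Var(a + b) = Var(a) + Var(b) + 2 * Cov(a, b)
se_am1 <- sqrt(V["wt", "wt"] + V["wt:am1", "wt:am1"] + 2 * V["wt", "wt:am1"])
```

Simply summing the two standard errors would ignore the covariance between the estimates, which is why the off-diagonal element of `vcov(fit)` appears in the formula.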

# Tag: Statistics

While reading the methods section of a recent article by Solivares et al., I came upon the following paragraph: "The inclusion of many predictors in statistical models increases the chance of type I error (false positives). To account for this we used a Bernoulli process to detect false discovery rates, where the probability (P) of … Continue reading How not to control for multiple testing

Science is also about convincing others, be it your supervisors or your collaborators, that what you intend to do is relevant to get the answers sought. This “what” can be anything from designing an experiment, to collecting samples or formatting the data. This post was inspired by a recent discussion with one of my supervisors … Continue reading Making a case for hierarchical generalized models

Below I will expand on previous posts on Bayesian regression modelling using STAN (see previous instalments here, here, and here). The topic of the day is modelling crossed and nested designs in hierarchical models using STAN in R. Crossed designs appear when we have more than one grouping variable and when data are recorded for each … Continue reading Crossed and Nested hierarchical models with STAN and R

Real-world data sometimes show complex structures that call for the use of special models. When data are organized in more than one level, hierarchical models are the most relevant tool for data analysis. One classic example is recording student performance across different schools: you might decide to record student-level variables (age, ethnicity, social … Continue reading Hierarchical models with RStan (Part 1)

[Updated 22nd January 2017, corrected mistakes for getting the fixed effect estimates of factor variables that need to be averaged out] Once models have been fitted, checked and re-checked, it is time to interpret them. The easiest way to do so is to plot the response variable versus the explanatory variables (I call them … Continue reading Plotting regression curves with confidence intervals for LM, GLM and GLMM in R

Count data are widely collected in ecology, for example when one counts the number of birds or the number of flowers. These data naturally follow a Poisson or negative binomial distribution and are therefore sometimes tricky to fit with standard LMs. A traditional approach has been to log-transform such data and then fit LMs to … Continue reading Count data: To Log or Not To Log

Before going into complex model building, looking at the relations in your data is a sensible step to understand how your different variables interact. Correlation looks at trends shared between two variables, and regression looks at how a response (dependent) variable changes with a predictor (independent variable). Correlation: As mentioned above, correlation looks at global movement … Continue reading Correlation and Linear Regression in R

Here you will learn about transposing, merging and ordering a data frame, changing the column order, removing a variable, subsetting and indexing. Transposing: this means turning the rows into columns and the columns into rows, which is done very easily in one line: `> data(mtcars)` `> mtcars<-t(mtcars)` Merging two data frames: it often happens that … Continue reading Manipulating data frame in R
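The one-line row/column swap quoted in this excerpt can be sketched a little more defensively as below. One detail worth knowing is that `t()` returns a matrix rather than a data frame, so a conversion back is needed if later steps expect a data frame. The object names `mtcars_t` and `mtcars_t_df` are hypothetical and only used here to avoid overwriting the original data.

```r
data(mtcars)

# t() swaps rows and columns; the result is a matrix, not a data frame
mtcars_t <- t(mtcars)

# Convert back if subsequent code expects a data frame
mtcars_t_df <- as.data.frame(mtcars_t)

# Compare the shapes: the two dimensions are swapped
dim(mtcars)
dim(mtcars_t_df)
```

Keeping the transposed copy under a separate name leaves `mtcars` intact for the other operations (merging, ordering, subsetting).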

In this article I will introduce you to the functions that make your life in R so much easier. For example purposes I will use the “mtcars” data frame. The ? function: this is the most important one; it prints the help content for a function, for example: `> ?plot` will provide you with … Continue reading The 6 functions that save your life (in R)