General linear models are one of the most widely used statistical tools in the biological sciences. This may be because they are so flexible and can address many different problems, because they provide useful outputs about statistical significance AND effect sizes, or just because they are easy to run in many common statistical packages. The maths underlying General Linear Models (and Generalized linear models, which are a related but different class of model) may seem mysterious to many, but is actually pretty accessible. You would have learned the basics in high school maths. We will cover some of those basics here. Linear equations: as the name suggests, General Linear Models rely on a linear equation, which in its basic form is simply yᵢ = α + βxᵢ + ϵᵢ — the equation for a straight line, with some error added on. If you aren’t that familiar with mathematical notation, notice a few things about this…
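The excerpt is truncated, but the linear equation it introduces is easy to demonstrate in R. This is a minimal sketch of my own (not code from the original post; the parameter values are illustrative):

```r
# Simulate data from the model y_i = alpha + beta * x_i + eps_i
set.seed(1)
n <- 50
alpha <- 2    # true intercept (illustrative)
beta <- 0.5   # true slope (illustrative)
x <- runif(n, 0, 10)
y <- alpha + beta * x + rnorm(n, sd = 1)  # straight line plus Gaussian error

# A general linear model recovers the intercept and slope
fit <- lm(y ~ x)
coef(fit)  # estimates should land near alpha = 2 and beta = 0.5
```

With only 50 points the estimates will not be exact, but they should be close to the true values, which is the sense in which the model "finds" the line through the noise.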

Original Post: General Linear Models The Basics

# Posts by Bluecology blog

## Checking residual distributions for non-normal GLMs

Quantile-quantile plots: if you are fitting a linear regression with Gaussian (normally distributed) errors, then one of the standard checks is to make sure the residuals are approximately normally distributed. It is a good idea to do these checks for non-normal GLMs too, to make sure your residuals approximate the model’s assumption. Here I explain how to create quantile-quantile plots for non-normal data, using an example of fitting a GLM with Student-t distributed errors. Such models can be appropriate when the residuals are overdispersed. First let’s create some data. We will make a linear predictor (ie the true regression line) eta and then simulate some data by adding residuals. We will simulate two data-sets that have the same linear predictor, but the first will have normally distributed errors and the second will have t-distributed errors: `n <- 100; phi <- 0.85; mu <- 0.5; set.seed(23); x <- rnorm(n); eta`…
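The excerpt's simulation code is cut off, but the idea it describes can be sketched in base R. This is my own reconstruction (the original post's exact simulation is truncated; the coefficient values below are assumptions borrowed from the visible fragment):

```r
# Two data-sets with the same linear predictor but different error distributions
set.seed(23)
n <- 100
x <- rnorm(n)
eta <- 0.5 + 0.85 * x            # linear predictor (illustrative values)
y_norm <- eta + rnorm(n)          # Gaussian residuals
y_t    <- eta + rt(n, df = 3)     # heavy-tailed t residuals (overdispersed)

# Fit a Gaussian lm to the t-distributed data and inspect the residuals:
# the QQ plot shows clear departures in the tails
r_t <- resid(lm(y_t ~ x))
qqnorm(r_t)
qqline(r_t)
```

The point of the post is that for genuinely non-normal GLMs you compare residuals against the *model's* assumed distribution, not the normal; the sketch above just shows why the standard normal QQ check flags heavy-tailed data.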

Original Post: Checking residual distributions for non-normal GLMs

## Some thoughts about Bayesian Kriging in INLA

I have been playing around with spatial modelling in the R INLA package. This blog just records a few thoughts I have had about using INLA for kriging (spatial interpolation). I am keen to discuss these ideas with others. Kriging is a super useful tool for ‘filling in the gaps’ between sampling sites, e.g. see this map. Handy if you want to make a map, or need to match up two spatial data sets that overlap in extent but have samples at different locations. You can do kriging the old-fashioned way in R, or even in ArcGIS. The advantage of using INLA, though, is that you can use Bayesian inference to do your kriging. This means you can interpolate non-normal error structures (like counts or presence/absence data). You can also include other fixed covariates, like a spatial layer…

Original Post: Some thoughts about Bayesian Kriging in INLA

## Impact of the conservation optimism hashtag

Impact of the conservation optimism hashtag The hashtag #conservationoptimism became popular during the recent International Congress for Conservation Biology symposium. Michael Burgass asked me what its twitter impact was, so here is a quick analysis. Michael also asked about #iamaconservationist, but that has too few tweets so far to make for a meaningful analysis (about 100). To be honest, not too much has happened on this hashtag so far, so it is hard to say too much from a broad-brush quantitative analysis like this, but here are a few stats. The first tweet I got was only on 2017-07-26, and until now there have been 813 tweets from 390 users, of which 133 are original tweets and the rest RTs. I think that spike on Wednesday corresponds to EJ Milner-Gulland’s plenary at ICCB. The distribution of tweets per…

Original Post: Impact of the conservation optimism hashtag

## Memorable dataviz with the R program, talk awarded people’s choice prize

“Memorable dataviz with the R program” awarded people’s choice prize For the past two years Dr Nick Hamilton has invited me to give a talk on creating data visuals with the R program at the wonderful UQ Winterschool in Bioinformatics. This year I was lucky enough to be awarded a prize for my talk – best speaker from a mid-career presenter, as voted by the audience. Nick and the UQ Winterschool team have been kind enough to post my talk on Vimeo, so I am sharing it here in the hope that others find it useful. You can also get all the talk notes (and code) on my blog here. I think it is something of a feat to have a talk win a people’s choice award when that talk is fundamentally about computer programming. The talk’s success speaks not…

Original Post: Memorable dataviz with the R program, talk awarded people’s choice prize

## What are people saying about ICCB2017 on Twitter?

What are people saying about International Congress for Conservation Biology 2017 on Twitter? Here’s a brief analysis of tweets from the International Congress for Conservation Biology to date. The conference started on the 23rd of July, and I was curious to see what people are talking about there and who is doing the talking on twitter. As of 25th July I could access 6978 tweets and retweets from the conference, starting on 2017-07-15. If you are wondering how these stats compare to other conferences, check out my post from ICRS last year. Who is talking: there have been about 1500 users on #ICCB2017 so far. Clearly, the people talking on twitter are a biased selection of people at ICCB2017 and may also include people not there (like me). As usual, a lot of people only tweet once or twice. The users…

Original Post: What are people saying about ICCB2017 on Twitter?

## What analysis program do conservation scientists use?

International Congress for Conservation Biology: What analysis program do conservation scientists use? With the International Congress for Conservation Biology starting 23rd July, I was wondering: what analysis programs are most commonly used by conservation scientists? And what do they use for spatial analysis and mapping? To find out, if you are a conservation scientist, please participate in these interactive polls: Favourite analysis programs; Favourite programs for spatial analysis. Please share this post with your friends and colleagues. I will follow up with a discussion once we get some results in, hopefully during ICCB2017 in Colombia. If you choose one of the ‘other’ options, let me know via twitter what program you do use.

Original Post: What analysis program do conservation scientists use?

## Data visuals notes for my talks in 2017

Data visuals: notes for my talks in 2017 Supplementary notes for CJ Brown’s talks on dataviz in 2017 for Griffith University’s honours students and the UQ Winterschool in Bioinformatics. Skip to the quiz. Visualising sexual dimorphism in elephant seals: I picked this example to demonstrate a simple bar chart for representing the sizes of different things. Comparing volume and area: compare these. Note that if we compare circles we should use area, not the radius or diameter, to scale their size. Exploration of data: let’s create a point cloud to demonstrate some data exploration techniques: `set.seed(42); x`… Can’t see a lot here. A linear model might help us explore if there is any trend going on. What about identifying extreme points that may be worth investigating further? We can pick out points that are greater than 2 SDs from the trend: `modresid sd2) plot(x, y, pch = 16, col = grey(0.5, 0.5),`…
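The code in the excerpt is truncated mid-expression, but the "points more than 2 SDs from the trend" idea is simple to sketch in base R. This is my own reconstruction, not the original talk code (data and object names are illustrative):

```r
# Point cloud with a linear trend
set.seed(42)
x <- rnorm(200)
y <- 2 * x + rnorm(200)

# Fit a linear model and flag points whose residuals exceed 2 SDs
mod <- lm(y ~ x)
modresid <- resid(mod)
extreme <- abs(modresid) > 2 * sd(modresid)

# Plot, highlighting the flagged points
plot(x, y, pch = 16, col = grey(0.5, 0.5))
points(x[extreme], y[extreme], pch = 16, col = "red")
abline(mod)
```

For roughly normal residuals you would expect about 5% of points to be flagged this way, so the highlighted points are candidates for a closer look rather than confirmed outliers.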

Original Post: Data visuals notes for my talks in 2017

## Smoothing a time-series with a Bayesian model

Smoothing a time-series with a Bayesian model Recently I looked at fitting a smoother to a time-series using Bayesian modelling. Now I will look at how you can control the smoothness by using more or less informative priors on the precision (1/variance) of the random effect. We will use the same dataset as the last post. To control the priors for an R-INLA model, we use the hyper argument (not hyperactive, but hyper-parameters): `library(INLA); f3`… We can control the level of smoothing through `param = c(theta1, 0.01)`. A value of 1 (theta1) is a reasonable starting point (based on the INLA documentation). Lower values will result in a smoother fit. The pc.param stands for penalized complexity parameters (you could also use a loggamma prior here). My understanding of penalized complexity priors is that they shrink the parameter estimate towards a ‘base model’ that is less flexible. In this case, we are shrinking the standard deviation (AKA the…

Original Post: Smoothing a time-series with a Bayesian model

## Quantifying the magnitude of a population decline with Bayesian time-series modelling

Quantifying the magnitude of a population decline with Bayesian time-series modelling Population abundances tend to vary year to year. This variation can make it hard to detect a change, and hard to quantify exactly what that change is. Bayesian time-series analysis can help us quantify a decline and put uncertainty bounds on it too. Here I will use the R-INLA package to fit a time-series model to a population decline. For instance, take the pictured time-series. Quantifying change as the difference between the first and last time-points is obviously misleading. Doing so would imply that abundance has declined by 77% from the historical value. Another approach would be to compare the averages of the first and last decades. Doing so would yield a 72% decline. A better way might be to model the population trend over time and then estimate our change from the model. An advantage of…
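The three ways of measuring a decline that the excerpt compares can be illustrated with a toy simulation (my own numbers, not the post's data, and plain `lm` on the log scale rather than the post's R-INLA model):

```r
# Noisy exponential decline: true decline over 30 years is about 44%
set.seed(5)
years <- 1:30
trend <- 100 * exp(-0.02 * (years - 1))
abund <- trend * exp(rnorm(30, sd = 0.3))   # lognormal observation noise

# 1. Endpoint comparison: hostage to two noisy values
endpoint_decline <- 1 - abund[30] / abund[1]

# 2. Decade-average comparison: less noisy, but ignores the trend shape
decade_decline <- 1 - mean(abund[21:30]) / mean(abund[1:10])

# 3. Model-based estimate: fit the trend, then compare fitted endpoints
fit <- lm(log(abund) ~ years)
pred <- exp(predict(fit, data.frame(years = c(1, 30))))
model_decline <- 1 - pred[[2]] / pred[[1]]
```

Re-running with different seeds shows the endpoint estimate swinging wildly while the model-based estimate stays near the true 44%, which is the excerpt's point: estimate change from a fitted trend, not from raw time-points.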

Original Post: Quantifying the magnitude of a population decline with Bayesian time-series modelling