In this example, we compare the Bayesian model output with the linear model fit. Although the interpretation of these two fits is different, we expect similar numerical results provided the priors are not informative.

See the introduction for an overview. Load the libraries (you may need to install them first):

library(ggplot2)
library(INLA)
library(faraway)
library(gridExtra)
library(brinla)
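
If any of these are missing: ggplot2, faraway and gridExtra are on CRAN, while INLA and brinla are distributed elsewhere. A minimal installation sketch, assuming the usual INLA repository URL and the julianfaraway/brinla GitHub location are still current:

install.packages(c("ggplot2", "faraway", "gridExtra"))
# INLA is distributed from its own repository, not CRAN (URL assumed current)
install.packages("INLA",
                 repos = c(getOption("repos"),
                           INLA = "https://inla.r-inla-download.org/R/stable"),
                 dep = TRUE)
# brinla is assumed to be on GitHub at julianfaraway/brinla
remotes::install_github("julianfaraway/brinla")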

Data

Load in the Chicago Insurance dataset as analyzed in Linear Models with R:

data(chredlin, package="faraway")
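
It can be worth a quick look at the data before plotting; a minimal check (output not shown):

str(chredlin)
summary(chredlin)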

We take involact as the response. Make some plots of the data:

p1 <- ggplot(chredlin, aes(race, involact)) + geom_point()
p2 <- ggplot(chredlin, aes(fire, involact)) + geom_point()
p3 <- ggplot(chredlin, aes(theft, involact)) + geom_point()
p4 <- ggplot(chredlin, aes(age, involact)) + geom_point()
p5 <- ggplot(chredlin, aes(income, involact)) + geom_point()
grid.arrange(p1, p2, p3, p4, p5)

Linear model analysis

Fit the standard linear model for reference purposes:

lmod <- lm(involact ~ race + fire + theft + age + log(income),  chredlin)
sumary(lmod)
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.18554    1.10025   -1.08  0.28755
race         0.00950    0.00249    3.82  0.00045
fire         0.03986    0.00877    4.55  4.8e-05
theft       -0.01029    0.00282   -3.65  0.00073
age          0.00834    0.00274    3.04  0.00413
log(income)  0.34576    0.40012    0.86  0.39254

n = 47, p = 6, Residual SE = 0.335, R-Squared = 0.75
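
For comparison with the Bayesian credible intervals later, we can also record the classical 95% confidence intervals; this is an optional extra step:

# classical 95% confidence intervals for the coefficients
confint(lmod)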

INLA default model

Run the default INLA model and compare to the lm output:

formula <- involact ~ race + fire + theft + age + log(income)
imod <- inla(formula, family="gaussian", data=chredlin)
cbind(imod$summary.fixed[,1:2],summary(lmod)$coef[,1:2])
                  mean        sd   Estimate Std. Error
(Intercept) -1.1853880 1.1019176 -1.1855396  1.1002549
race         0.0095020 0.0024934  0.0095022  0.0024896
fire         0.0398555 0.0087800  0.0398560  0.0087661
theft       -0.0102944 0.0028224 -0.0102945  0.0028179
age          0.0083354 0.0027483  0.0083356  0.0027440
log(income)  0.3457059 0.4007275  0.3457615  0.4001234

The first two columns are the posterior means and SDs from the INLA fit, while the third and fourth are the corresponding values from the lm fit. They are almost identical, as expected, because the default INLA priors on the fixed effects are effectively non-informative, so the posterior means and SDs track the least squares estimates and standard errors closely.
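
If we want to quantify the agreement rather than eyeball it, a small optional check like the following reports the largest discrepancies and lines up the 95% intervals (the quantile column names assume INLA's usual labelling):

# largest absolute differences between the two sets of estimates and SEs
max(abs(imod$summary.fixed[,1] - coef(lmod)))
max(abs(imod$summary.fixed[,2] - summary(lmod)$coef[,2]))
# 95% credible intervals alongside the classical confidence intervals
cbind(imod$summary.fixed[,c("0.025quant","0.975quant")], confint(lmod))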

We can get the posterior for the SD of the error as:

bri.hyperpar.summary(imod)
                                    mean       sd  q0.025    q0.5  q0.975    mode
SD for the Gaussian observations 0.33242 0.036498 0.27002 0.32913 0.41322 0.32269

which compares with the linear model estimate of:

summary(lmod)$sigma
[1] 0.33453

which gives a very similar result. But notice that the Bayesian output gives us more than just a point estimate.
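
The same information can be pulled directly from the INLA object. The model is parameterized in terms of the error precision, so we can transform its posterior marginal to the SD scale with the standard INLA marginal functions; a sketch, assuming the usual name of the hyperparameter:

# transform the posterior marginal of the precision to the SD scale
prec.marg <- imod$marginals.hyperpar[["Precision for the Gaussian observations"]]
sd.marg <- inla.tmarginal(function(x) 1/sqrt(x), prec.marg)
inla.zmarginal(sd.marg)
# for example, the posterior probability that the error SD exceeds 0.4
1 - inla.pmarginal(0.4, sd.marg)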

The function inla.hyperpar() is advertised to improve the estimates of the hyperparameters (just the error SD here), but it makes very little difference in this example:

bri.hyperpar.summary(inla.hyperpar(imod))
                                    mean       sd  q0.025    q0.5  q0.975    mode
SD for the Gaussian observations 0.33245 0.036554 0.27004 0.32914 0.41339 0.32274

Plots of the posteriors

For the error SD:

bri.hyperpar.plot(imod)

For the fixed parameters:

bri.fixed.plot(imod)
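
The same plots can be built by hand from the stored marginals if more control over the appearance is needed. A sketch for a single coefficient (race):

# posterior density of the race coefficient, drawn from its marginal
race.marg <- data.frame(imod$marginals.fixed[["race"]])
ggplot(race.marg, aes(x, y)) + geom_line() +
  labs(x = "race coefficient", y = "posterior density")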