rstanarm.Rmd

# (PART\*) Part II: rstanarm {-}


# Getting Started with rstanarm

- <span class="pack">rstan</span> installation required.
- Installed as any other R package


# Basic GLM

Attendance behavior of high school juniors at two schools  

Target:

- Number of days of absence

Predictors:

- Type of program in which the student is enrolled: 
    - General, Vocational, Academic
- Standardized test in math
- Gender

## Traditional GLM

```{r basic_poisson_data_prep, eval=FALSE, echo=-10}
library(tidyverse)
attendance = haven::read_dta("https://stats.idre.ucla.edu/stat/stata/dae/nb_data.dta")
attendance <- attendance %>% 
  mutate(
    prog = factor(prog, levels = 1:3, labels = c("General", "Academic", "Vocational")),
    prog = fct_relevel(prog, c('Vocational', 'General', 'Academic')),
    gender = factor(gender, labels = c('Female', 'Male')),
    id = factor(id)
  )
saveRDS(attendance, 'data/attendance.Rds')
```


We'll use Poisson regression[^poisson] to model the count of the number of days absent


```{r basic_poisson, echo=-1, eval=-3}
attendance = readRDS('data/attendance.Rds')
attendance_glm <- glm(daysabs ~ math + gender + prog,
                      data = attendance, 
                      family = poisson)
summary(attendance_glm)
```

```{r basic_poisson_result, echo=FALSE, eval=TRUE}
tidy(attendance_glm) %>% 
  kable_df()
```


## rstanarm: GLM

<span class="pack">rstanarm</span> uses the same nomenclature and general approach as base R

```{r bayesian_poisson0, echo=FALSE}
attendance_bglm <- stan_glm(daysabs ~ math + gender + prog,
                            data = attendance, 
                            family = poisson)
attendance_brms <- brm(daysabs ~ math + gender + prog,
                       data = attendance, 
                       family = poisson, cores=4, thin=10)
save(attendance, attendance_bglm, attendance_brms, attendance_brms_add_re, attendance_brms_inter, file = 'data/attendance_bayes.RData')
```

```{r bayesian_poisson}
library(rstanarm)
attendance_bglm <- stan_glm(daysabs ~ math + gender + prog,
                            data = attendance, 
                            family = poisson)
summary(attendance_bglm, digits = 2, prob=c(.025, .5, .975))
```

```{r bayesian_poisson_pretty, eval=TRUE, echo=FALSE, cache.rebuild=F}
load('data/attendance_bayes.RData')
summary(attendance_bglm, digits = 2, prob=c(.5, .025, .975))
```

Summary Info:

This is the same as you see in every other regression model:

- mean: the point estimate for the parameter
- sd: standard error for the point estimate
- quantiles: are whatever you want, but here represent the median and 95% <span class="emph">uncertainty inteval</span>

Additional:

- mean_PPD: mean of the posterior predictive distribution (hopefully on par with the mean of the target variable (`daysabs`))
- log-posterior: similar to the log-likelihood from maximum likelihood, but for the Bayesian case


Diagnostics for quick eyeball inspection:

- <span class="emph">Monte Carlo Standard Error</span>: The standard error of the *mean* of the posterior draws. Want mcse than 10% of the posterior standard deviation.
- <span class="emph">$n_{eff}$</span>: is an estimate of the effective number of independent draws from the posterior distribution of the estimand of interest. Because the draws within a chain are not independent if there is autocorrelation, the effective sample size will be smaller than the total number of iterations. Should be greater than 10% of max.
- <span class="emph">$\hat{R}$</span>: measures the ratio of the average variance of samples within each chain to the variance of the pooled samples across chains; if all chains are at equilibrium, these will be the same and R̂  will be one. Desire less than 1.1.

### Adding more options

Typical configuration would involve setting priors, as well as MCMC options such as iterations, warm-up, etc.

```{r bayesian_poisson_options, echo=T}
attendance_bglm <- stan_glm(daysabs ~ math + gender + prog,
                            data = attendance, 
                            family = poisson, 
                            prior = student_t(df = 7), 
                            prior_intercept = student_t(df = 7),
                            iter = 5000,
                            warmup = 2000,
                            thin = 10,
                            cores = 4, 
                            seed = 1234)
```


## rstanarm: Mixed Model

Let's look at a mixed model for another demonstration

- The average reaction time per day for subjects in a sleep deprivation study
- On day 0 the subjects had their normal amount of sleep
- Subsequently restricted to 3 hours of sleep per night
- The observations represent the average reaction time on a series of tests 

We'll have a random intercept and random coefficient for Days

```{r mixed, eval=TRUE}
library(lme4)
sleepstudy_lmer <- lmer(Reaction ~ Days + (1 + Days|Subject), 
                        data = sleepstudy)
summary(sleepstudy_lmer)
```

Again, <span class="pack">rstanarm</span> sticks with the same style

```{r bayesian_mixed, echo=1:3}
sleepstudy_blmer <- stan_lmer(Reaction ~ Days + (1 + Days|Subject), 
                              data = sleepstudy)
summary(sleepstudy_blmer, digits = 3)
save(sleepstudy_lmer, sleepstudy_blmer, sleepstudy_brms, file = 'data/sleepstudy_bayes.RData')
```

```{r load_mixed_models, echo=FALSE, eval=TRUE}
load('data/sleepstudy_bayes.RData')
print(sleepstudy_blmer, digits = 3)
```


In the Bayesian model, the random effects are not BLUPS, but are parameters estimates in the model

In this case, we see a little more shrinkage relative to the standard approach

The following are obtained from the same <span class="func">ranef</span> function used in <span class="pack">lme4</span>

```{r ranef_compare, eval=TRUE, echo=FALSE}
data_frame(lme4 = ranef(sleepstudy_lmer)[[1]][['(Intercept)']],
           bayesian = ranef(sleepstudy_blmer)[[1]][['(Intercept)']]) %>% 
  kable_df(digits=1)
```


## rstanarm: Other Models

- ANOVA
- Beta regression
- Conditional logistic
- GLM including negative binomial models
- Generalized additive models
- Nonlinear and Generalized mixed models
- 'Joint' models for longitudinal and time-to-event (e.g. survival)
- Multivariate
- Ordinal models


# Priors

## Default priors

Depends on the model

For most models: 

- intercepts are treated differently
- regression coefficients have mean zero with some specific variance
- scale parameters (e.g. residual variance) will have appropriate priors

Basically, <span class="pack">rstanarm</span> is going to be okay for you to use defaults

If you want to change, there are plenty of resources about priors:

`?prior_summary`

`?priors`


http://mc-stan.org/rstanarm/articles/priors.html

https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations


## Getting priors

```{r getting_priors, eval=TRUE}
prior_summary(attendance_bglm)
```

## Setting priors

One can set priors with the appropriate arguments to the model function


| Argument       | Used in  | Applies to |
| ------------- | ------------- | ------------- |
| `prior_intercept` | All modeling functions except `stan_polr` and `stan_nlmer`| Model intercept, after centering predictors.|
| `prior` |  All modeling functions| Regression coefficients. Does _not_ include coefficients that vary by group in a multilevel model (see `prior_covariance`).|
| `prior_aux` | `stan_glm`\*, `stan_glmer`\*, `stan_gamm4`, `stan_nlmer`| Auxiliary parameter, e.g. error SD (interpretation depends on the GLM).|
| `prior_covariance` | `stan_glmer`\*, `stan_gamm4`, `stan_nlmer`| Covariance matrices in multilevel models with varying slopes and intercepts. See the [`stan_glmer` vignette](http://mc-stan.org/rstanarm/articles/glmer.html) for details on this prior.|


<br> 

The `stan_polr`, `stan_betareg`, and `stan_gamm4` functions also provide
additional arguments specific only to those models:

| Argument       | Used only in  | Applies to |
| ------------- | ------------- | ------------- |
| `prior_smooth` | `stan_gamm4` | Prior for hyper-parameters in GAMs (lower values yield less flexible smooth functions). |
| `prior_counts` | `stan_polr` | Prior counts of an _ordinal_ outcome (when predictors at sample means). |
| `prior_z`  |  `stan_betareg`| Coefficients in the model for `phi`.|
| `prior_intercept_z` | `stan_betareg`| Intercept in the model for `phi`. |
| `prior_phi` | `stan_betareg`| `phi`, if not modeled as function of predictors. |


### Example

```{r setting_priors}
attendance_bglm <- stan_glm(daysabs ~ math + gender + prog,
                            data = attendance, 
                            family = poisson, 
                            prior = student_t(df = 7))
```


[^poisson]: If not familiar with Poisson regression, we are modeling the log counts as a function of the covariates.  Often the exponentiated coefficients are reported.  For example, `exp(coef(attendance_glm)['genderMale'])` is `r round(exp(coef(attendance_glm)['genderMale']),3)`. Subtracting 1 tells us there is a `r round(exp(coef(attendance_glm)['genderMale']) -1, 3)*100`% decrease in the incident rate of days absent as we switch from female to male.