  • Fitting Bayesian Linear Mixed Models for continuous and binary data using Stan: A quick tutorial

    I want to give a quick tutorial on fitting Linear Mixed Models (hierarchical models) with a full variance-covariance matrix for random effects (what Barr et al 2013 call a maximal model) using Stan. 

    For a longer version of this tutorial, see: Sorensen, Hohenstein, Vasishth, 2016

    Prerequisites: You need to have R installed (RStudio is optional but recommended), and you need to have the rstan package installed; see here. I am also assuming you have fit lmer models like this one before:

    lmer(log(rt) ~ 1+RCType+dist+int+(1+RCType+dist+int|subj) + (1+RCType+dist+int|item), dat)
    

    If you don't know what the above code means, first read chapter 4 of my lecture notes.

    The code and data format needed to fit LMMs in Stan

    The data

    I assume you have a 2x2 repeated-measures design with a continuous dependent measure such as reading time (rt), and that you want a main-effects and interaction contrast coding. Let's say your main effects are RCType and dist, and the interaction is coded as int. All these contrasts are coded ±1. If you don't know what contrast coding is, see these notes and read section 4.3 (although it's best to read the whole chapter). I am using an excerpt of an example data-set from Husain et al. 2014.

    "subj" "item" "rt""RCType" "dist" "int"
    1       14    438  -1        -1      1
    1       16    531   1        -1     -1
    1       15    422   1         1      1
    1       18   1000  -1        -1      1 
    ...
    

    Assume that these data are stored in R as a data-frame with name rDat.
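
    The R code further down uses nlevels() and as.integer() on subj and item, so these two columns should be factors. A minimal sketch of reading in and checking the data, assuming the excerpt is stored in a whitespace-separated file (the file name husain_excerpt.txt is just for illustration):

    # Read in the data (file name is illustrative)
    rDat <- read.table("husain_excerpt.txt", header = TRUE)
    # subj and item must be factors for nlevels()/as.integer() below
    rDat$subj <- factor(rDat$subj)
    rDat$item <- factor(rDat$item)
    # Sanity check: the contrast columns should contain only -1 and +1
    lapply(rDat[, c("RCType", "dist", "int")], unique)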

    The Stan code

    Copy the following Stan code into a text file and save it as matrixModel.stan. For continuous data like reading times or EEG, you never need to touch this file again: you will only ever specify the design matrix X and the structure of the data. The rest is all taken care of.

    data {
      int<lower=0> N;               //no trials
      int<lower=1> P;               //no fixefs
      int<lower=0> J;               //no subjects
      int<lower=1> n_u;             //no subj ranefs
      int<lower=0> K;               //no items
      int<lower=1> n_w;             //no item ranefs
      int<lower=1,upper=J> subj[N]; //subject indicator
      int<lower=1,upper=K> item[N]; //item indicator
      row_vector[P] X[N];           //fixef design matrix
      row_vector[n_u] Z_u[N];       //subj ranef design matrix
      row_vector[n_w] Z_w[N];       //item ranef design matrix
      vector[N] rt;                 //reading time
    }
    
    parameters {
      vector[P] beta;               //fixef coefs
      cholesky_factor_corr[n_u] L_u;  //cholesky factor of subj ranef corr matrix
      cholesky_factor_corr[n_w] L_w;  //cholesky factor of item ranef corr matrix
      vector<lower=0>[n_u] sigma_u; //subj ranef std
      vector<lower=0>[n_w] sigma_w; //item ranef std
      real<lower=0> sigma_e;        //residual std
      vector[n_u] z_u[J];           //spherical subj ranef
      vector[n_w] z_w[K];           //spherical item ranef
    }
    
    transformed parameters {
      vector[n_u] u[J];             //subj ranefs
      vector[n_w] w[K];             //item ranefs
      {
        matrix[n_u,n_u] Sigma_u;    //subj ranef cov matrix
        matrix[n_w,n_w] Sigma_w;    //item ranef cov matrix
        Sigma_u = diag_pre_multiply(sigma_u,L_u);
        Sigma_w = diag_pre_multiply(sigma_w,L_w);
        for(j in 1:J)
          u[j] = Sigma_u * z_u[j];
        for(k in 1:K)
          w[k] = Sigma_w * z_w[k];
      }
    }
    
    model {
      //priors
      L_u ~ lkj_corr_cholesky(2.0);
      L_w ~ lkj_corr_cholesky(2.0);
      for (j in 1:J)
        z_u[j] ~ normal(0,1);
      for (k in 1:K)
        z_w[k] ~ normal(0,1);
      //likelihood
      for (i in 1:N)
        rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
    }
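
    The model above saves only the Cholesky factors L_u and L_w, not the random-effects correlation matrices themselves. If you want to inspect the correlations directly, one option (a sketch, not part of the original model) is to append a generated quantities block to the file:

    generated quantities {
      corr_matrix[n_u] Cor_u;     //subj ranef correlation matrix
      corr_matrix[n_w] Cor_w;     //item ranef correlation matrix
      Cor_u = multiply_lower_tri_self_transpose(L_u);
      Cor_w = multiply_lower_tri_self_transpose(L_w);
    }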
    

    Define the design matrix

    Since we want to test the main effects RCType and dist and their interaction int, our design matrix is built as follows:

    # Make design matrix
    X <- unname(model.matrix(~ 1 + RCType + dist + int, rDat))
    attr(X, "assign") <- NULL
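
    As a quick, optional sanity check, X should now be an N-by-4 numeric matrix whose columns are the intercept and the three ±1 contrasts:

    head(X)   # intercept, RCType, dist, int
    dim(X)    # should be nrow(rDat) by 4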
    

    Prepare data for Stan

    Stan expects the data as a list, not as a data frame (unlike lmer). So we set it up as follows:

    # Make Stan data
    stanDat <- list(N = nrow(X),
                    P = ncol(X),
                    n_u = ncol(X),
                    n_w = ncol(X),
                    X = X,
                    Z_u = X,
                    Z_w = X,
                    J = nlevels(rDat$subj),
                    K = nlevels(rDat$item),
                    rt = rDat$rt,
                    subj = as.integer(rDat$subj),
                    item = as.integer(rDat$item))
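
    Before fitting, it can help to confirm that the list has the shapes the Stan program expects; a quick check:

    str(stanDat)   # N, P, n_u, n_w, J, K should all be filled in
    stopifnot(stanDat$P == 4, length(stanDat$rt) == stanDat$N)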
    

    Load library rstan and fit Stan model

    library(rstan) 
    rstan_options(auto_write = TRUE)
    options(mc.cores = parallel::detectCores())
    
    # Fit the model
    matrixFit <- stan(file = "matrixModel.stan", data = stanDat,
                      iter = 2000, chains = 4)
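
    Before interpreting anything, check that the sampler behaved well: the print output in the next section reports Rhat, which should be close to 1 for every parameter, and the traceplots should look like well-mixed chains. A quick sketch:

    # Traceplots for the fixed-effect coefficients
    traceplot(matrixFit, pars = "beta")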
    

    Examine posteriors

    print(matrixFit)
    

    This print output is overly verbose. I wrote a simple function to get the essential information quickly.

    stan_results <- function(m, params){
      m_extr <- extract(m, pars = params)
      means <- lapply(m_extr, mean)
      quantiles <- lapply(m_extr,
                          function(x) quantile(x, probs = c(0.025, 0.975)))
      means <- data.frame(means)
      quants <- data.frame(quantiles)
      summry <- t(rbind(means, quants))
      colnames(summry) <- c("mean", "lower", "upper")
      summry
    }
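
    If you are not sure what the exact parameter names are, one way to list them (a sketch) is:

    # Full list of parameter names in the fitted object
    head(rownames(summary(matrixFit)$summary), 20)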
    

    For example, if I want to see only the posteriors of the four beta parameters, I can write:

    stan_results(matrixFit, params=c("beta[1]","beta[2]","beta[3]","beta[4]"))
    

    For more details, such as interpreting the results and computing things like Bayes factors, see Nicenboim and Vasishth 2016.

    FAQ: What if I don't want to fit a lognormal?

    In the Stan code above, I assume a lognormal likelihood for the reading times:

     rt[i] ~ lognormal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
    

    If this upsets you deeply and you want to use a normal distribution (and in fact, for EEG data this makes sense), go right ahead and change the lognormal to normal:

     rt[i] ~ normal(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]], sigma_e);
    

    FAQ: What if my dependent measure is binary (0,1) responses?

    Use this Stan code instead of the one shown above. Here, I assume that you have a column called response in the data, which has 0,1 values; these are the trial-level binary responses.

    data {
      int<lower=0> N;               //no trials
      int<lower=1> P;               //no fixefs
      int<lower=0> J;               //no subjects
      int<lower=1> n_u;             //no subj ranefs
      int<lower=0> K;               //no items
      int<lower=1> n_w;             //no item ranefs
      int<lower=1,upper=J> subj[N]; //subject indicator
      int<lower=1,upper=K> item[N]; //item indicator
      row_vector[P] X[N];           //fixef design matrix
      row_vector[n_u] Z_u[N];       //subj ranef design matrix
      row_vector[n_w] Z_w[N];       //item ranef design matrix
      int<lower=0,upper=1> response[N]; //trial-level binary response
    }
    
    parameters {
      vector[P] beta;               //fixef coefs
      cholesky_factor_corr[n_u] L_u;  //cholesky factor of subj ranef corr matrix
      cholesky_factor_corr[n_w] L_w;  //cholesky factor of item ranef corr matrix
      vector<lower=0>[n_u] sigma_u; //subj ranef std
      vector<lower=0>[n_w] sigma_w; //item ranef std
      vector[n_u] z_u[J];           //spherical subj ranef
      vector[n_w] z_w[K];           //spherical item ranef
    }
    
    transformed parameters {
      vector[n_u] u[J];             //subj ranefs
      vector[n_w] w[K];             //item ranefs
      {
        matrix[n_u,n_u] Sigma_u;    //subj ranef cov matrix
        matrix[n_w,n_w] Sigma_w;    //item ranef cov matrix
        Sigma_u = diag_pre_multiply(sigma_u,L_u);
        Sigma_w = diag_pre_multiply(sigma_w,L_w);
        for(j in 1:J)
          u[j] = Sigma_u * z_u[j];
        for(k in 1:K)
          w[k] = Sigma_w * z_w[k];
      }
    }
    
    model {
      //priors
      beta ~ cauchy(0,2.5);
      sigma_u ~ cauchy(0,2.5);
      sigma_w ~ cauchy(0,2.5);
      L_u ~ lkj_corr_cholesky(2.0);
      L_w ~ lkj_corr_cholesky(2.0);
      for (j in 1:J)
        z_u[j] ~ normal(0,1);
      for (k in 1:K)
        z_w[k] ~ normal(0,1);
      //likelihood
      for (i in 1:N)
        response[i] ~ bernoulli_logit(X[i] * beta + Z_u[i] * u[subj[i]] + Z_w[i] * w[item[i]]);
    }
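
    On the R side very little changes: pass the response column instead of rt, and point stan() at the new file. A sketch, assuming the binary-response Stan code above has been saved as matrixModelBinary.stan (the file name is just for illustration) and that rDat contains the 0,1 column response:

    # Make Stan data for the binary model
    stanDatBin <- list(N = nrow(X),
                       P = ncol(X),
                       n_u = ncol(X),
                       n_w = ncol(X),
                       X = X,
                       Z_u = X,
                       Z_w = X,
                       J = nlevels(rDat$subj),
                       K = nlevels(rDat$item),
                       response = rDat$response,
                       subj = as.integer(rDat$subj),
                       item = as.integer(rDat$item))

    # Fit the binary-response model
    binaryFit <- stan(file = "matrixModelBinary.stan", data = stanDatBin,
                      iter = 2000, chains = 4)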
    

    For reproducible example code

    See here.
