The solution to this problem will be shown at the exercise classes; hence, no report is required from students.
In this exercise we will demonstrate how to sample an MCMC via Just Another Gibbs Sampler; download it from the JAGS website. Also, install library(runjags) to be able to run JAGS from R. The capabilities of the latest version of JAGS are described in the accompanying manual.
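Assuming JAGS itself is already installed on the system, a minimal sketch of the R-side setup could look as follows (testjags() reports whether runjags has found a working JAGS installation):
# One-time setup (sketch; uncomment the installs on first use)
# install.packages("runjags") # interface for running JAGS from R
# install.packages("rjags")   # low-level interface used by method = "rjags"
library(runjags)
testjags() # checks that a JAGS installation was found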
The data file toenail.txt (values separated by spaces) comes from a longitudinal dermatological clinical study, whose main objective was to compare the efficacy of two treatments of toenail infection.
# Setting up the directories (change depending on your structure)
WD <- getwd()
DATA <- file.path(WD, "data")
FIG <- file.path(WD, "fig")
# Loading the data from directory DATA
toenail <- read.csv(file.path(DATA, "toenail.txt"), sep="")
head(toenail, 20)
It contains the following variables:
* idnr - identification number of a patient;
* infect - indicator of severity of infection (0 \(=\) no or weak infection, 1 \(=\) medium or severe infection);
* trt - treatment indicator (0 \(=\) treatment A, 1 \(=\) treatment B);
* time - time of a visit (months);
* visit - number of a visit.
Let \(Y_{i,j}\) denote a random variable representing the indicator of the infection severity for the \(i\)th patient at the \(j\)th visit (\(i=1,\ldots,n\), \(j=1,\ldots,n_{i}\)), which was conducted at time \(t_{i,j}\) months. Let \(x_{i}\in\{0,\,1\}\) denote the treatment indicator of the \(i\)th patient.
Assume the following (hierarchical) model (genuine parameters and regressors are not indicated in the conditions when specifying the distributions): \[\begin{alignat*}{2} B_i &\sim \mathsf{N}(\beta_0,\,\tau_0^{-1}), &\quad &i=1,\ldots,n,\\[1ex] Y_{i,j}\,|\,B_i &\sim \mathcal{A}\Bigl(\pi(B_i)\Bigr), &\quad &i=1,\ldots,n,\;j=1,\ldots,n_i,\\[1ex] \log\biggl\{\frac{\pi(B_i)}{1 - \pi(B_i)}\biggr\} &= B_i + \beta_1\,x_i + \beta_2\,t_{i,j} + \beta_3\,x_i\,t_{i,j}, &\quad &i=1,\ldots,n,\;j=1,\ldots,n_i. \end{alignat*}\] In non-Bayesian terminology this would be a logistic regression with a random intercept. As primary (genuine) parameters, consider the following: \[\begin{equation*} \boldsymbol{\psi} = \left(\boldsymbol{\beta}^\top,\,\tau_0\right)^\top,\qquad \boldsymbol{\beta} = \left(\beta_0,\,\beta_1,\,\beta_2,\,\beta_3\right)^\top. \end{equation*}\] Assume the following prior distribution for the primary parameters: \[\begin{gather*} p( \boldsymbol{\beta},\,\tau_0) = p( \boldsymbol{\beta})\,p(\tau_0), \\[1ex] \boldsymbol{\beta} \sim \mathsf{N}_4\bigl(\boldsymbol{0},\,\mbox{diag}(10^2,\ldots,10^2)\bigr), \;\;\tau_0 \sim \Gamma(1,\,0.005). \end{gather*}\] As latent data, consider the random-effects values \(\mathbf{B} = \left(B_1,\ldots,B_n\right)^\top\).
Next, answer the following questions:
Implement the above model in JAGS and, using library(runjags), generate two Markov chains whose limiting distribution is the posterior distribution for the model under consideration.
Draw the trajectories (traceplots) of the primary parameters of the model and also of the model deviance; it is necessary to add "deviance" among the monitored parameters. Next, monitor also the following variables: "pd", "popt", "dic", "ped". Their meaning will be explained later. Draw both chains in one plot with two different colors. Draw the estimates of the autocorrelation functions (for at least one of the generated chains).
Assess whether the convergence of the Markov chain to the limit distribution can be assumed and whether the chain exhibits acceptable autocorrelation.
Assess whether, given the variability of the posterior distribution for \(\boldsymbol{\beta}\), the prior distribution used for \(\boldsymbol{\beta}\) can be considered as weakly informative.
Calculate the basic characteristics of the posterior distribution for the following parameters:
* \(d_0 = \tau_0^{-1/2}\) - the standard deviation of the random intercepts;
* \(\gamma_1 = \beta_2\) - the slope in time under treatment A;
* \(\gamma_2 = \beta_2 + \beta_3\) - the slope in time under treatment B;
* \(\gamma_3(t) = \beta_1 + t\,\beta_3\) - the effect of treatment B compared to A at time \(t\).
For the above-defined parameters \(d_0\), \(\gamma_1\), \(\gamma_2\), \(\gamma_3\), calculate 95% credible intervals (ET as well as HPD) and plot estimates of the posterior densities.
For parameter \(\gamma_3\) calculate (using the generated Markov chain) a value \(p\) which satisfies \[\begin{equation*} p = \inf\bigl\{\alpha:\,0\notin C(\alpha)\bigr\}, \end{equation*}\] where \(C(\alpha)\) is the \((1 - \alpha)100\)% ET credible interval for \(\gamma_3\).
Remember that the calculated \(p\) can be interpreted as a P-value of a test of the null hypothesis \(\gamma_3 = 0\).
For the purpose of this task we need to introduce the notation \(\boldsymbol{\beta}_{-0} = (\beta_1, \beta_2, \beta_3)^\top\) which excludes the coefficient used in the prior for random effects. Moreover, \(\mathbf{z}_{ij} = (x_i, t_{ij}, x_i t_{ij})^\top\).
First, we need to write down all the pdfs:
\[\begin{align*} p(\beta_0) &\propto \exp\left\{- \frac{1}{2} \frac{1}{10^2} \beta_0^2\right\}, \\ p(\boldsymbol{\beta}_{-0}) &\propto \exp\left\{- \frac{1}{2} \frac{1}{10^2} \|\boldsymbol{\beta}_{-0}\|^2 \right\}, \\ p(\tau_0) &\propto \tau_0^{1-1} \exp\{-0.005 \tau_0\}, \\ p(\mathbf{B} | \beta_0, \tau_0) &\propto \prod\limits_{i=1}^n \sqrt{\frac{\tau_0}{2\pi}} \exp \left\{- \frac{\tau_0}{2} (B_i - \beta_0)^2 \right\} \propto \tau_0^{\frac{n}{2}} \exp \left\{- \frac{\tau_0}{2} \sum\limits_{i=1}^n (B_i - \beta_0)^2 \right\}, \\ p\left(\mathbb{Y} | \mathbf{B}, \boldsymbol{\beta}_{-0}\right) &\propto \prod\limits_{i=1}^n \prod\limits_{j=1}^{n_i} \left[\sigma(B_i + \boldsymbol{\beta}_{-0}^\top \mathbf{z}_{ij})\right]^{Y_{ij}} \left[\sigma(-B_i - \boldsymbol{\beta}_{-0}^\top \mathbf{z}_{ij})\right]^{1 - Y_{ij}}, \end{align*}\] where \(\sigma(z) = \dfrac{1}{1+ \exp(-z)}\) is the sigmoid function (inverse function to logit).
The full posterior is, by Bayes' theorem, proportional to \[ p(\mathbf{B}, \boldsymbol{\beta}, \tau_0 | \mathbb{Y}) \,\propto\, p\left(\mathbb{Y} | \mathbf{B}, \boldsymbol{\beta}_{-0}\right) \, p(\mathbf{B} | \beta_0, \tau_0) \, p(\tau_0) \, p(\boldsymbol{\beta}_{-0}) \, p(\beta_0) \] and so are the full-conditional distributions: \[\begin{align*} p(\beta_0 | \cdots) &\propto \exp\left\{- \frac{1}{2} \left[\frac{1}{10^2} \beta_0^2 + \tau_0 \sum\limits_{i=1}^n (B_i - \beta_0)^2\right]\right\}, \\ p(\boldsymbol{\beta}_{-0} | \cdots) &\propto \exp\left\{- \frac{1}{2} \frac{1}{10^2} \|\boldsymbol{\beta}_{-0}\|^2 \right\} \prod\limits_{i=1}^n \prod\limits_{j=1}^{n_i} \left[\sigma(B_i + \boldsymbol{\beta}_{-0}^\top \mathbf{z}_{ij})\right]^{Y_{ij}} \left[\sigma(-B_i - \boldsymbol{\beta}_{-0}^\top \mathbf{z}_{ij})\right]^{1 - Y_{ij}}, \\ p(\tau_0 | \cdots) &\propto \tau_0^{\frac{n}{2} + 1-1} \exp\left\{-\tau_0 \left[0.005 + \frac{1}{2}\sum\limits_{i=1}^n (B_i - \beta_0)^2\right]\right\}, \\ p(\mathbf{B} | \cdots) &\propto \prod\limits_{i=1}^n \left[ \exp \left\{- \frac{\tau_0}{2} (B_i - \beta_0)^2 \right\} \prod\limits_{j=1}^{n_i} \left[\sigma(B_i + \boldsymbol{\beta}_{-0}^\top \mathbf{z}_{ij})\right]^{Y_{ij}} \left[\sigma(-B_i - \boldsymbol{\beta}_{-0}^\top \mathbf{z}_{ij})\right]^{1 - Y_{ij}} \right]. \end{align*}\]
For \(\beta_0\) we see that on the log scale we have a quadratic function of \(\beta_0\), which means that the full-conditional distribution of \(\beta_0\) is a normal distribution. In particular, \[ \beta_0 \left|\, \mathbb{Y}, \mathbf{B}, \tau_0 \right. \sim \mathsf{N} \left( \dfrac{\frac{1}{10^2} \cdot 0 + \tau_0 n \overline{B}_n}{\tau^\star}, \left[\tau^\star\right]^{-1} \right), \qquad \text{where} \quad \tau^\star = \frac{1}{10^2} + \tau_0 n. \] Then, the full-conditional distribution of \(\tau_0\) is a gamma distribution: \[ \tau_0 |\, \mathbf{B}, \beta_0 \sim \Gamma\left(1+\frac{n}{2}, 0.005 + \frac{1}{2}\sum\limits_{i=1}^n (B_i - \beta_0)^2\right). \] However, for the remaining parameters we clearly cannot recognize the pdf of any well-known distribution, because they appear both in \(\exp(\cdot)\) and in \(\sigma(\cdot)\). Hence, approximate methods of sampling from these distributions have to be used. Luckily, JAGS is able to do that for us; see the manual.
The only thing we can say about the full-conditional distribution of \(\mathbf{B}\) is that it factorizes into \(n\) independent blocks, one for each random intercept.
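To illustrate the two recognizable full conditionals, hand-coded Gibbs updates for \(\beta_0\) and \(\tau_0\) could look as follows (a sketch with our own function names; JAGS performs such updates internally):
# Direct draws from the derived full conditionals, given the current
# vector of random intercepts B
draw_beta0 <- function(B, tau0, prior_mean = 0, prior_prec = 1/10^2){
  tau_star <- prior_prec + tau0 * length(B) # posterior precision tau*
  mu_star <- (prior_prec * prior_mean + tau0 * length(B) * mean(B)) / tau_star
  rnorm(1, mean = mu_star, sd = 1/sqrt(tau_star))
}
draw_tau0 <- function(B, beta0, a = 1, b = 0.005){
  rgamma(1, shape = a + length(B)/2, rate = b + 0.5 * sum((B - beta0)^2))
}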
First, we need to prepare the data for JAGS and be careful about indexing within the given data:
table(toenail[, "idnr"]) # n_i - observations per ID
(nsubj <- length(unique(toenail[, "idnr"]))) # 294 unique IDs
# Not all numbers between 1 - 383 are used in the idnr column!
# Create new IDs with values 1 - 294
toenail <- toenail[order(toenail[, "idnr"], toenail[, "visit"]), ] # order observations by id and by visit
tidnr <- table(toenail[, "idnr"]) # n_i ordered by idnr
toenail[, "id"] <- rep(1:nsubj, tidnr) # replicate new ID n_i times
tail(toenail) # last i: idnr=383, but id=294
(n <- nrow(toenail)) # total number of observations
# Definition of a list that will be given to JAGS
# Variables within JAGS will be called by these names
Data <- list(n = n, # sum(n_i)
nsubj = nsubj, # n
id = toenail[, "id"], # id
y = toenail[, "infect"], # binary response Y_{ij}
trt = toenail[, "trt"], # binary covariate x_i
time = toenail[, "time"]) # time covariate t_{ij}
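A few optional sanity checks (our own additions) before handing the list to JAGS:
stopifnot(length(Data$y) == Data$n, # one response value per observation
          all(Data$id %in% 1:Data$nsubj), # IDs re-coded to 1..nsubj
          all(Data$y %in% c(0, 1)), # binary response
          all(Data$trt %in% c(0, 1))) # binary treatment indicator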
Next, we need to implement the hierarchical model for JAGS, which will be a very long string. This is the place to describe how the data are generated, including the prior distributions. The distributions are defined in JAGS terminology and parametrization, which may differ from the one in R:
* The normal distribution is dnorm, but it is parametrized through precision, not standard deviation.
* Alternative (Bernoulli) distributions with success probability p are parametrized as dbern(p).
* The gamma distribution is dgamma().
* Distributions are assigned via the stochastic relation ~, while deterministic relations are given by <- (for auxiliary quantities).
* Overall, the notation resembles R code: vectors via c(), sequences via :, e.g. 1:n; the for cycle has the same syntax, including combining commands with {...}.
Model <- "
model{
for (i in 1:n){
eta[i] <- b[id[i]] + betaTrt * trt[i] + betaTime * time[i] + betaTrtTime * trt[i] * time[i]
pi[i] <- exp(eta[i]) / (1 + exp(eta[i]))
y[i] ~ dbern(pi[i])
}
for (i in 1:nsubj){
b[i] ~ dnorm(beta0, tau0)
}
beta0 ~ dnorm(0, 0.01)
betaTrt ~ dnorm(0, 0.01)
betaTime ~ dnorm(0, 0.01)
betaTrtTime ~ dnorm(0, 0.01)
tau0 ~ dgamma(1, 0.005)
d0 <- 1 / sqrt(tau0)
}
"
Within this definition we consider the following notation:
* eta[1:n] - predictor values for each observation given the current values of the other parameters,
* b[1:nsubj] - random intercepts for each patient,
* betaTrt - coefficient for the treatment effect (difference between B and A) at time 0,
* betaTime - coefficient (slope) for time with treatment A,
* betaTrtTime - interaction coefficient for treatment and time,
* pi[1:n] - auxiliary vector of individual probabilities,
* beta0 - prior mean of the random intercepts,
* tau0 - prior precision of the random intercepts,
* d0 - prior standard deviation (transformation of tau0),
* other variables (y, n, nsubj, id, trt, time) are taken from the list Data.
Now we need to propose starting values for the model parameters. The values should be in a list with names corresponding to the parameter names. Suitable initial values can be found by a frequentist approach with a simpler model.
# Logistic regression without random effects (assuming independence among all observations)
m0 <- glm(infect ~ trt*time, family = binomial(link = logit), data = toenail)
summary(m0)
# Each chain is started with different values (list of lists)
# .RNG.... is setting up the seed for each chain + type of the random number generator
# ... for reproducibility purposes
Init <- list(list(.RNG.seed = 20251110, .RNG.name = "lecuyer::RngStream",
beta0 = -0.5, betaTrt = 0, betaTime = -0.2, betaTrtTime = 0, tau0 = 1),
list(.RNG.seed = 20251117, .RNG.name = "lecuyer::RngStream",
beta0 = 0, betaTrt = -0.05, betaTime = -0.4, betaTrtTime = 0.1, tau0 = 2))
# Parameters to monitor (including deviance and related metrics)
parmDev <- c("deviance", "pd", "popt", "dic", "ped")
parameters.toenail <- c(parmDev, "beta0", "betaTrt", "betaTime", "betaTrtTime", "d0")
# Libraries that are essential to run JAGS via R
library(runjags)
library(coda) # environment for monitoring MCMC samples
# Suppress administrative output (just try it)
# runjags.options(silent.runjags = TRUE, silent.jags = FALSE)
# Set runjags to use multiple cores in parallel
# library(parallel)
# my_cluster <- makeCluster(2)
# runjags.options(method = "rjparallel")
runjags.options(method = "rjags") # sets it back to run everything on just one core
Finally, everything is ready. To sample the MCMC we just need to call run.jags(). For reproducibility we first set the seed. Then we call run.jags, where we declare the number of chains, the length of the burnin, etc. Sampling took about 1 minute on a decent PC. The MCMC length below is perhaps too short; the sample should be at least 10 times larger, i.e., for real results use at least sample = 30000, and perhaps the burnin should also be slightly longer.
# Comment/uncomment with Ctrl+Shift+C
set.seed(20010911)
jagsLogit <- run.jags(model = Model, # model definition as a string
monitor = parameters.toenail, # vector of monitored parameters
data = Data, # list of used data
inits = Init, # list of initial values for each chain
adapt = 1000, # number of adaptive iterations (at the start)
n.chains = 2, # number of chains
thin = 1, # thinning (keep only every thin-th value)
# method = "parallel", # parallel sampling of the chains
# cl = my_cluster, # computation on more cores
burnin = 1500, # length of the burnin period
sample = 5000) # the length of the sample used for inference
# stopCluster(my_cluster)
# After time-consuming sampling, it is always a good idea to save the sampled values for future use
save(list = "jagsLogit", file = file.path(DATA, "mcmc_toenail.RData"))
load(file.path(DATA, "mcmc_toenail.RData"))
Let’s explore the contents of jagsLogit:
# str(jagsLogit) # quite a long list of computed values
attributes(jagsLogit)
# The sampled states are within `$mcmc` - which is a list for each chain
class(jagsLogit$mcmc) # mcmc.list - class from coda package
head(jagsLogit$mcmc[[1]]) # first 6 sampled states in first chain (after burnin)
jagsLogit$burnin
jagsLogit$sample
jagsLogit$thin
jagsLogit$model
jagsLogit$monitor
jagsLogit$timetaken
jagsLogit$end.state # last sampled values for each parameter
# Could be used for continuation in sampling, if insufficient
jagsLogit2 <- run.jags(model = Model, monitor = parameters.toenail, data = Data,
inits = jagsLogit$end.state, # starting a new chain from the last value
adapt = 0, # already adapted
n.chains = length(jagsLogit$mcmc), # the same as before
thin = jagsLogit$thin, # the same as before
burnin = 0, # no burnin required
sample = 100)
head(jagsLogit2$mcmc[[1]])
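Alternatively (a sketch; see help(runjags::extend.jags)), runjags can continue an existing run directly, keeping the model, data and monitors:
# jagsLogit2b <- extend.jags(jagsLogit, sample = 100)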
Basic posterior summary
print(jagsLogit, digits = 3) # runjags summaries
jagsLogit$summary$statistics # coda posterior Mean, SD, Naive SE, corrected SE
jagsLogit$summary$quantiles # coda posterior quantiles
jagsLogit$summaries # combination + MCMC diagnostic parameters
# print(jagsLogit$deviance.table) # contribution to deviance by each observation (too long)
print(jagsLogit$deviance.sum) # deviance and related quantities
A brief check of convergence and properties of the chain is performed with
* traceplots,
* posterior CDFs based on different chains,
* histograms of the sampled values, and
* the autocorrelation function.
The author of the runjags package provides plotting functions designed for these outputs.
# plot(jagsLogit, layout = c(2, 2)) # all plots together (too much)
# This way we can plot all types of plots for each monitored variable separately:
plot(jagsLogit, vars = "beta0")
plot(jagsLogit, vars = "betaTrt")
plot(jagsLogit, vars = "betaTime")
plot(jagsLogit, vars = "betaTrtTime")
plot(jagsLogit, vars = "d0")
plot(jagsLogit, vars = "deviance")
Only traceplots for selected parameters
Parm <- c("beta0", "betaTrt", "betaTime", "betaTrtTime", "d0", "deviance")
plot(jagsLogit, vars = Parm, plot.type = "trace", layout = c(2, 3))
Only ECDF for selected parameters
plot(jagsLogit, vars = Parm, plot.type = "ecdf", layout = c(2, 3))
Only histograms for selected parameters
plot(jagsLogit, vars = Parm, plot.type = "histogram", layout = c(2, 3))
Only autocorrelation function (ACF) for selected parameters
plot(jagsLogit, vars = Parm, plot.type = "autocorr", layout = c(2, 3))
Crosscorrelations between chains of different parameters
plot(jagsLogit, plot.type = "crosscorr")
Gelman-Rubin convergence diagnostic
gelman.diag(jagsLogit$mcmc) # see help(coda::gelman.diag) for details
Geweke’s diagnostic (for beta’s only, separate set of plots for each chain)
geweke.diag(jagsLogit$mcmc) # see help(coda::geweke.diag) for details
par(mfrow = c(2, 3))
geweke.plot(jagsLogit$mcmc[[1]]) # see help(coda::geweke.plot) for details
geweke.plot(jagsLogit$mcmc[[2]])
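In addition, the effective sample size (roughly, the number of independent draws the autocorrelated chains are worth) can be inspected via coda:
effectiveSize(jagsLogit$mcmc) # see help(coda::effectiveSize) for details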
Let us now focus more on the estimates for regression coefficients. The basic summary is found here:
jagsLogit$summary
The posterior SDs for the beta coefficients can be extracted in the following way:
beta_names <- rownames(jagsLogit$summary$statistics)[grep("^beta", rownames(jagsLogit$summary$statistics))]
(beta_post_sd <- jagsLogit$summary$statistics[beta_names, "SD"])
10 / beta_post_sd # How many times the prior SD is larger than the posterior
We clearly see that the prior distribution has much larger variability and thus can be considered weakly informative (the prior does not determine the results).
The credible intervals included in the summary are actually equal-tailed (ET) credible intervals. These are constructed simply by applying quantile with probabilities \(\frac{\alpha}{2}\) and \(1-\frac{\alpha}{2}\) to the relevant samples. Compare for yourself:
(betaET <- jagsLogit$summary$quantiles[beta_names, paste0(c(2.5, 97.5),"%")])
apply(jagsLogit$mcmc[[1]][, beta_names], 2, quantile, prob = c(0.025, 0.975)) # based only on chain 1
beta_ET_manual <- t(apply(rbind(jagsLogit$mcmc[[1]][, beta_names],
jagsLogit$mcmc[[2]][, beta_names]),
2, quantile, prob = c(0.025, 0.975)))
all.equal(betaET, beta_ET_manual) # all values are equal up to a tolerance of about 1e-8
Highest Posterior Density (HPD) credible intervals are already calculated and available in the output.
print(jagsLogit$hpd)
print(jagsLogit$HPD) # the same
HPDinterval(jagsLogit$mcmc) # computation for each chain separately
See help(coda::HPDinterval) for some details. For each parameter, the interval is constructed from the ECDF of the sample as the shortest interval for which the difference of the ECDF values at the endpoints equals the nominal probability. Assuming that the distribution is not severely multimodal, this is the HPD interval.
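For intuition, a minimal sketch of this shortest-interval construction (our own helper function, not the actual coda implementation):
hpd_manual <- function(x, prob = 0.95){
  x <- sort(x)
  n <- length(x)
  m <- floor(n * prob) # the interval must span m inter-point gaps
  widths <- x[(m + 1):n] - x[1:(n - m)] # widths of all candidate intervals
  i <- which.min(widths) # start of the shortest candidate
  c(lower = x[i], upper = x[i + m])
}
# Compare, e.g., with the HPD interval for beta0 above:
hpd_manual(as.numeric(jagsLogit$mcmc[[1]][, "beta0"]))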
Once the samples are cleared of high autocorrelation (by a higher thin), essentially any approach for approximating a characteristic of a distribution can be used to approximate the same characteristic of the posterior distribution. For example, let us use kernel density estimates for the coefficients:
COL <- c("blue", "red")
par(mfrow = c(2,2), mar = c(4,4,2,0.5))
for(beta in beta_names){
dens <- lapply(jagsLogit$mcmc, function(mcmc){density(mcmc[,beta])})
rangex <- range(unlist(lapply(dens, function(d){range(d$x)})))
rangey <- range(unlist(lapply(dens, function(d){range(d$y)})))
plot(0,0,type = "n", xlim = rangex, ylim = rangey,
xlab = beta, ylab = "Density", main = beta)
for(i in 1:length(dens)){
lines(dens[[i]], col = COL[i], lty = 2)
abline(v = jagsLogit$summary$quantiles[beta,c("2.5%", "97.5%")], col = "darkgreen", lty = 3)
abline(v = jagsLogit$HPD[beta, c("Lower95", "Upper95")], col = "goldenrod", lty = 3)
}
legend("topleft", legend = 1:length(dens), title = "Chain", col = COL, lty = 2, bty = "n")
legend("topright", c("ET", "HPD"), title = "95% CI", col = c("darkgreen", "goldenrod"), lty = 3, bty = "n")
}
Or even a bivariate version including the contours:
library(MASS)
# ?MASS::kde2d
dens <- lapply(jagsLogit$mcmc, function(mcmc){kde2d(x = mcmc[,"betaTime"],
y = mcmc[,"betaTrtTime"],
n = 40)})
# rangex <- range(unlist(lapply(dens, function(d){range(d$x)})))
# rangey <- range(unlist(lapply(dens, function(d){range(d$y)})))
rangez <- range(unlist(lapply(dens, function(d){range(d$z)})))
par(mfrow = c(1,2), mar = c(4,4,2,0.5))
for(ch in 1:length(dens)){
image(dens[[ch]],
zlim = rangez, # xlim = rangex, ylim = rangey,
main = paste("Chain", ch), xlab = "betaTime", ylab = "betaTrtTime")
contour(dens[[ch]], add = TRUE)
}
When we read the assignment carefully, we decode that
* \(d_0 = \tau_0^{-1/2}\) = d0,
* \(\gamma_1 = \beta_2\) = betaTime,
* \(\gamma_2 = \beta_2+\beta_3\) = betaTime + betaTrtTime,
* \(\gamma_3(t) = \beta_1+ t\,\beta_3\) = betaTrt + t * betaTrtTime (B compared to A), which depends on the time t.
The first two parameters are already included among our sampled states. Their posterior summary is already at our disposal within the jagsLogit object.
summary(jagsLogit, vars = c("d0", "betaTime"))
The posterior density estimate + credible intervals (based on both chains):
COL <- c("blue", "red")
par(mfrow = c(1,2), mar = c(4,4,2,0.5))
for(var in c("d0", "betaTime")){
dens <- lapply(jagsLogit$mcmc, function(mcmc){density(mcmc[,var])})
rangex <- range(unlist(lapply(dens, function(d){range(d$x)})))
rangey <- range(unlist(lapply(dens, function(d){range(d$y)})))
plot(0,0,type = "n", xlim = rangex, ylim = rangey,
xlab = var, ylab = "Density", main = var)
for(i in 1:length(dens)){
lines(dens[[i]], col = COL[i], lty = 2)
abline(v = jagsLogit$summary$quantiles[var,c("2.5%", "97.5%")], col = "darkgreen", lty = 3)
abline(v = jagsLogit$HPD[var,c("Lower95", "Upper95")], col = "goldenrod", lty = 3)
}
legend("topleft", legend = 1:length(dens), title = "Chain", col = COL, lty = 2, bty = "n")
legend("topright", c("ET", "HPD"), title = "95% CI", col = c("darkgreen", "goldenrod"), lty = 3, bty = "n")
}
The other parameters are not included among the monitored parameters (unless time t=0). One way to obtain all necessary estimates would be to include them in the Model definition, the same way d0 was. But then we would have to run the sampling algorithm once again, which is time- and energy-consuming.
However, we already have the samples at our disposal. The quantities of interest are simple transformations of the available parameters, so their samples are the corresponding deterministic transformations of the available samples.
t <- 5 # chosen value for time
my_mcmc <- list()
for(ch in 1:2){
my_mcmc[[ch]] <- data.frame(jagsLogit$mcmc[[ch]][,"betaTime"] + jagsLogit$mcmc[[ch]][,"betaTrtTime"],
jagsLogit$mcmc[[ch]][,"betaTrt"] + t * jagsLogit$mcmc[[ch]][,"betaTrtTime"])
colnames(my_mcmc[[ch]]) <- c("gamma2", "gamma3t")
start_end <- attr(jagsLogit$mcmc[[ch]], "mcpar")
my_mcmc[[ch]] <- mcmc(my_mcmc[[ch]], # creating my own class(mcmc)
start = start_end[1], # jagsLogit$burnin + 1
end = start_end[2], # jagsLogit$burnin + jagsLogit$sample
thin = jagsLogit$thin) # start_end[3]
}
class(my_mcmc)
my_mcmc <- as.mcmc.list(my_mcmc) # now of class mcmc.list
(sum_my_mcmc <- summary(my_mcmc)) # uses corresponding summary method for class mcmc.list
With the new mcmc.list object we can use all methods directly, just as for any mcmc object.
par(mfrow = c(1,2), mar = c(4,4,2,0.5))
plot(my_mcmc, trace = TRUE, density = FALSE)
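Any other coda method applies to the derived parameters as well, for example the autocorrelation plot and the effective sample size:
autocorr.plot(my_mcmc[[1]]) # ACF of gamma2 and gamma3t (chain 1)
effectiveSize(my_mcmc)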
It remains to compute the HPD credible intervals (the ET ones are already within the summary).
# chains separately
HPDinterval(my_mcmc)
# both chains together
(gammaHPD <- HPDinterval(as.mcmc(rbind(my_mcmc[[1]], my_mcmc[[2]]))))
COL <- c("blue", "red")
par(mfrow = c(1,2), mar = c(4,4,2,0.5))
for(var in c("gamma2", "gamma3t")){
dens <- lapply(my_mcmc, function(mcmc){density(mcmc[,var])})
rangex <- range(unlist(lapply(dens, function(d){range(d$x)})))
rangey <- range(unlist(lapply(dens, function(d){range(d$y)})))
plot(0,0,type = "n", xlim = rangex, ylim = rangey,
xlab = var, ylab = "Density", main = var)
for(i in 1:length(dens)){
lines(dens[[i]], col = COL[i], lty = 2)
abline(v = sum_my_mcmc$quantiles[var,c("2.5%", "97.5%")], col = "darkgreen", lty = 3)
abline(v = gammaHPD[var,c("lower", "upper")], col = "goldenrod", lty = 3)
}
legend("topleft", legend = 1:length(dens), title = "Chain", col = COL, lty = 2, bty = "n")
legend("topright", c("ET", "HPD"), title = "95% CI", col = c("darkgreen", "goldenrod"), lty = 3, bty = "n")
}
Once we have samples of the model parameters, exploring the posterior distribution of any parametric function of them is only a matter of transformation; other parametric functions could be of interest as well, as the example below shows.
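For instance (our own choice for illustration), the posterior of the treatment odds ratio \(\exp\{\gamma_3(t)\}\) at time t = 5 is obtained by exponentiating the already computed samples:
OR_t <- exp(c(my_mcmc[[1]][, "gamma3t"], my_mcmc[[2]][, "gamma3t"]))
quantile(OR_t, probs = c(0.025, 0.5, 0.975)) # posterior median and 95% ET interval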
The goal is to find the smallest \(\alpha\) (equivalently, the largest coverage) for which 0 is not contained in the ET interval. We will showcase this on \(\gamma_3(t)\).
Let's start slowly by finding the posterior median:
(gamma3t_med <- sum_my_mcmc$quantiles["gamma3t", "50%"])
(which_side <- ifelse(gamma3t_med < 0, "upper", "lower"))
Now we know which end of the interval (lower or upper) will touch 0 first. Let us formulate a function that measures how far the upper bound is from zero.
# First, for simplicity, take all samples of the parameter into one vector
gamma3t <- c(my_mcmc[[1]][,"gamma3t"], my_mcmc[[2]][,"gamma3t"])
dist_ET_from_0 <- function(alpha){
quantile(gamma3t, prob = 1-alpha/2) - 0 # not a distance (also negative values)
}
The Bayesian version of the p-value is now the root of the defined function:
gamma3t_pvalue <- uniroot(dist_ET_from_0, interval = c(0, 1))
gamma3t_pvalue$root
# check, whether it truly satisfies the property
quantile(gamma3t, probs = c(gamma3t_pvalue$root/2, 1-gamma3t_pvalue$root/2))
Let us summarize the process in a function:
bayes_pvalue <- function(x, x0 = 0){
medx <- median(x)
rangex <- range(x)
if((rangex[1]-x0) * (rangex[2]-x0) > 0){ # same signs (all sampled values on the same side from x0)
return(0) # we are pretty sure, parameter x is far away from x0
}else{ # opposite signs (x0 between sampled values)
if(medx < x0){
dist_ET_from_x0 <- function(alpha){
quantile(x, prob = 1-alpha/2) - x0
}
}else{
dist_ET_from_x0 <- function(alpha){
quantile(x, prob = alpha/2) - x0
}
}
root <- uniroot(dist_ET_from_x0, interval = c(0, 1))
return(root$root)
}
}
Let's check whether it works (on both chains pooled):
# betaTrt
jagsLogit$summary$quantiles["betaTrt",]
bayes_pvalue(c(jagsLogit$mcmc[[1]][,"betaTrt"], jagsLogit$mcmc[[2]][,"betaTrt"]))
# betaTime
jagsLogit$summary$quantiles["betaTime",]
bayes_pvalue(c(jagsLogit$mcmc[[1]][,"betaTime"], jagsLogit$mcmc[[2]][,"betaTime"]))
bayes_pvalue(c(jagsLogit$mcmc[[1]][,"betaTime"], jagsLogit$mcmc[[2]][,"betaTime"]), -0.3)
# betaTrtTime
jagsLogit$summary$quantiles["betaTrtTime",]
bayes_pvalue(c(jagsLogit$mcmc[[1]][,"betaTrtTime"], jagsLogit$mcmc[[2]][,"betaTrtTime"]))
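Finally, applying the function to the pooled samples of \(\gamma_3(t)\) should reproduce gamma3t_pvalue$root up to the uniroot tolerance:
bayes_pvalue(gamma3t) # gamma3t (t = 5), both chains pooled above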