# Applications

## Benefit-Risk Analysis

Costa and Drury [14] present a simulation involving mixed bivariate outcomes to demonstrate the use of joint copula models for benefit-risk analysis. The simulation assumes a two-arm parallel design with total sample size of n=200 and 1:1 randomization to either treatment $$t=1$$ (placebo) or $$t=2$$ (active drug). The bivariate response for individual $$i$$ is $$y_i=(y_{i1},y_{i2})$$ where $$y_{i1}$$ is the efficacy outcome assumed to be normal with mean $$\mu_{t}$$ and variance $$\sigma_{t}^2$$ and $$y_{i2}$$ is a binary indicator of whether the participant experienced an adverse event (AE) which follows a Bernoulli distribution with probability $$p_{t}$$. A vector of indicators $$x_i=(x_{i1},x_{i2})$$ specifies treatment in either the placebo or active arm. Probit regression – which assumes a latent normal distribution underlying the binary observations – was used for the marginal safety outcome while an OLS model was used for efficacy. A normal conditional copula that allows different dependence in each treatment group was used to connect the two marginal models. The overall model specification is: $\begin{gather*} y_{i1} \sim Normal(\mu_{t},\sigma_{t}), \,\, \mu_{t} = x_{i1}\beta_{11} + x_{i2}\beta_{12}, \,\, \sigma_{t} = x_{i1}s_1 + x_{i2}s_2\\ y_{i2} \sim Bernoulli(p_{t}), \,\, \Phi^{-1}(p_i) = x_{i1}\beta_{21} + x_{i2}\beta_{22}\\ H_{\theta_{t}}(y_{i1},y_{i2})=C_{\theta_{t}}^{Norm}(F_1(y_{i1}|\mu_{t},\sigma_{t}),F_2(y_{i2}|p_{t})), \,\, \theta_{t} = x_{i1}\omega_{1} + x_{i2} \omega_{2} \end{gather*}$

where the $$\beta_{jt}$$ are effect parameters, $$s_t$$ are dispersion parameters, and $$\omega_t$$ are copula dependency parameters for treatment $$t$$. The copula dependence parameter, $$\theta_t$$, is the poly-serial correlation which measures the correlation between the normal efficacy outcome and the latent normal distribution assumed to underlie the binary safety outcome. The Pearson correlation between the outcomes is $$\rho_t=\theta_t\phi[\Phi^{-1}(p_t)]/\sqrt{p_t(1-p_t)}$$ where $$\phi$$ is the standard normal density. Note that the correlation is a function of both $$\theta_t$$ and the marginal $$p_t$$ parameter.

The values used for the simulation are representative of data from a real respiratory clinical trial. For the placebo group $$\mu_1 = -150$$, $$\sigma_1^2=100^2$$, $$p_1=0.1$$ and $$\rho_1=0.1$$ and for the treatment group $$\mu_2 = -50$$, $$\sigma_2^2=100^2$$, $$p_2=0.4$$ and $$\rho_2=0.6$$. The efficacy, AE rate, and correlation between them are different by intervention group with weak dependence for placebo and moderate dependence for active drug.

## Benefit-Risk application

# Simulate data
set.seed(4283)

# function to get copula parameter given rho and p; see Costa section 3.1.2
getTheta <- function(rho,p){  (rho*sqrt(p*(1-p))) / dnorm(qnorm(p)) }

# number of samples per arm
n<-100

# placebo group
mu_1 <- -150
sigma2_1 <- 100^2
p_1 <- 0.1
rho_1 <- 0.1

# normal copula
nc_p<-normalCopula( getTheta(rho=rho_1, p=p_1)  )

pbo_dist <- mvdc(nc_p, margins = c("norm","binom"),
paramMargins = list(list(mean = mu_1, sd = sqrt(sigma2_1)),
list(size = 1, prob = p_1)) )

pbo_samps<-rMvdc(n, pbo_dist)

if (0){ # check simulated values
mean(pbo_samps[,1]) # mu_1
sd(pbo_samps[,1]) # sigma_1
mean(pbo_samps[,2]) #p_1
cor(pbo_samps[,1],pbo_samps[,2]) # rho_1 (Pearson corr)
}

# treatment
mu_2 <- -50
sigma2_2 <- 100^2
p_2 <- 0.4
rho_2 <- 0.6

# normal copula
nc_t<-normalCopula( getTheta(rho=rho_2, p=p_2)  )

trt_dist <- mvdc(nc_t, margins = c("norm","binom"),
paramMargins = list(list(mean = mu_2, sd = sqrt(sigma2_2)),
list(size = 1, prob = p_2)) )

trt_samps<-rMvdc(n, trt_dist)

if (0){ # check simulated values
mean(trt_samps[,1]) # mu_2
sd(trt_samps[,1]) # sigma_2
mean(trt_samps[,2]) #p_2
cor(trt_samps[,1],trt_samps[,2]) # rho_2 (Pearson corr)
}

#combine placebo and treatment data
dat <- rbind(pbo_samps,trt_samps) %>% cbind(sort(rep(c(0,1),n)),
sort(rep(c(0,1),n),decreasing=TRUE),
sort(rep(c(0,1),n))) %>% as.data.frame()
names(dat) <- c("efficacy","safety","treatment","trt1","trt2")

dat_lab <- dat
dat_lab %<>% mutate(treatment=factor(treatment, labels=c("placebo","active")),
safety=factor(safety, labels=c("no AE","AE")))

The histograms of efficacy by treatment and AE status for the simulated data in Figure 8 confirm that the highest efficacy was observed for those in the active treatment group with an adverse event.

ggplot(dat_lab, aes(x=efficacy, fill=treatment)) +
geom_histogram(bins=20, alpha=0.75) + facet_grid(treatment~safety)

Weakly informative normal priors were used for $$\beta$$ parameters and an inverse gamma with shape and scale parameters both equal to 0.001 was used for $$\sigma$$. For the copula parameter $$\theta$$, a flat uniform distribution on the interval [-1,1] was used. To explore the effect of fitting the joint copula model, each marginal model was also fit separately assuming independence between the outcomes. The R package rstan [59] was used to draw samples from the posterior distribution using the default No-U-Turn-Sampler (NUTS) MCMC algorithm, an implementation of Hamiltonian Monte Carlo. For each model, 4 chains with 1000 warmup iterations and 2000 sampling iterations were used resulting in 8000 samples from each posterior distribution. Diagnostics for the MCMC samples are shown in the Technical Supplement.

## Stan code for B-R model
# http://mc-stan.org/rstan/
rstan_options(auto_write = TRUE)

# marginal models assuming independence
mod0a_code <- "
data {
int N;
matrix[N, 2] x;
vector[N] y1;
}
parameters {
// params for continuous (efficacy) outcome
vector[2] beta1;
real<lower=0> sigma;
}
model {
vector[N] mu;

// priors
beta1 ~ normal(0,1000);
sigma ~ inv_gamma(0.001,0.001);

// marginal for continuous (efficacy) outcome
mu = beta1[1]*x[,1] + beta1[2]*x[,2];
y1 ~ normal(mu, sigma);
}
"

mod0b_code <- "
data {
int N;
matrix[N, 2] x;
int<lower=0, upper=1> y2[N];
}
parameters {
//params for binary (safety) outcome
vector[2] beta2;
}
model {
vector[N] p;

// priors
beta2 ~ normal(0,1000);

// marginal for binary (safety) outcome
p = beta2[1]*x[,1] + beta2[2]*x[,2];
y2 ~ bernoulli(Phi(p));
}
generated quantities {
vector[2] p;

p[1] = Phi(beta2[1]);
p[2] = Phi(beta2[2]);
}
"

# joint model
mod_code <- "
data {
int N;
matrix[N, 2] x;
vector[N] y1;
int<lower=0, upper=1> y2[N];
}
parameters {
// params for continuous (efficacy) outcome
vector[2] beta1;
vector<lower=0>[2] s;

//params for binary (safety) outcome
vector[2] beta2;

// copula dependence param
vector<lower=-1, upper=1>[2] omega;
}
model {
vector[N] mu;
vector[N] sigma;
vector[N] p;
vector[N] theta;

// priors
beta1 ~ normal(0,1000);
beta2 ~ normal(0,1000);
s ~ inv_gamma(0.001,0.001);

// marginal for continuous (efficacy) outcome
mu = beta1[1]*x[,1] + beta1[2]*x[,2];
sigma = s[1]*x[,1] + s[2]*x[,2];

// marginal for binary (safety) outcome
p = Phi(beta2[1]*x[,1] + beta2[2]*x[,2]);

// copula dependence parameter
theta = omega[1]*x[,1]+omega[2]*x[,2];

// build log-likelihood
for(i in 1:N){
target += normal_lpdf(y1[i]|mu[i],sigma[i]);
if (y2[i]==0) {
target += normal_lcdf((inv_Phi(1-p[i])-theta[i]*
inv_Phi(normal_cdf(y1[i],mu[i],sigma[i])))/sqrt(1-theta[i]^2)|0,1);
} else {
target += normal_lccdf((inv_Phi(1-p[i])-theta[i]*
inv_Phi(normal_cdf(y1[i],mu[i],sigma[i])))/sqrt(1-theta[i]^2)|0,1);
}
}

}
generated quantities {
vector[2] mu;
vector[2] p;
vector[2] theta;
vector[2] rho;

mu[1] = beta1[1];
mu[2] = beta1[2];

p[1] = Phi(beta2[1]);
p[2] = Phi(beta2[2]);

theta[1] = omega[1];
theta[2] = omega[2];

rho[1] = theta[1]*exp(normal_lpdf(inv_Phi(p[1])|0,1))/sqrt(p[1]*(1-p[1]));
rho[2] = theta[2]*exp(normal_lpdf(inv_Phi(p[2])|0,1))/sqrt(p[2]*(1-p[2]));
}
"

Table 2 summarizes the mean, Monte Carlo standard error, standard deviation, and quantiles of the posterior distributions for parameters $$\mu_t$$, $$p_t$$, $$\rho_t$$, and $$\theta_t$$ from the normal copula model. The number of effective samples – a function of the correlation between sample draws – along with Rhat – a measure of MCMC convergence – are also shown. The model performs reasonable well with posterior median estimates close to the true values for all parameters except the correlation between outcomes $$\rho_1$$ and adverse event rate $$p_1$$ in the placebo group.

br_tab <- round(summary(br_fit, pars=c("mu","p","rho","theta"))$summary,2)  Table 2: Benefit-Risk Copula Model Posterior Summary Mean MCSE Mean SD 2.5% 25% 50% 75% 97.5% eff. num. samps Rhat $$\mu_1$$ -152.97 0.11 10.17 -172.84 -159.84 -153.08 -146.05 -133.05 8000.00 1 $$\mu_2$$ -49.60 0.13 9.88 -68.88 -56.19 -49.58 -42.92 -30.33 5976.45 1 $$p_1$$ 0.17 0.00 0.04 0.11 0.15 0.17 0.20 0.25 8000.00 1 $$p_2$$ 0.38 0.00 0.05 0.29 0.35 0.38 0.41 0.48 5486.21 1 $$\rho_1$$ 0.01 0.00 0.09 -0.17 -0.05 0.01 0.08 0.19 8000.00 1 $$\rho_2$$ 0.61 0.00 0.05 0.49 0.58 0.61 0.64 0.69 7065.79 1 $$\theta_1$$ 0.02 0.00 0.14 -0.26 -0.08 0.02 0.11 0.29 8000.00 1 $$\theta_2$$ 0.77 0.00 0.06 0.63 0.74 0.78 0.82 0.88 7167.84 1 Using the posterior samples, the difference in mean efficacy response $$\mu_2-\mu_1$$ and the difference in probability of adverse events $$p_2 - p_1$$ was calculated for the normal copula and independence models. Figures 9 and 10 plot the results along with overlaid density curves and marginal histograms. The marginal histograms are nearly identical for both models, but the efficacy and safety treatment differences from the copula model have a clear positive dependence while there is no relationship between them in the independence model as expected. #calculate treatment effect (efficacy) and risk difference (safety) posterior_mu <- extract(br_fit, pars=c("mu[1]","mu[2]")) mu <- do.call(cbind.data.frame, posterior_mu) %>% mutate(mu_diff=mu[2]-mu[1]) posterior_p <- extract(br_fit, pars=c("p[1]","p[2]")) p <- do.call(cbind.data.frame, posterior_p) %>% mutate(p_diff=p[2]-p[1]) diffs<-cbind(mu,p) # assuming independence model posterior_mu0 <- extract(fit0a, pars=c("beta1[1]","beta1[2]")) mu0 <- do.call(cbind.data.frame, posterior_mu0) %>% mutate(mu_diff=beta1[2]-beta1[1]) posterior_p0 <- extract(fit0b, pars=c("p[1]","p[2]")) p0 <- do.call(cbind.data.frame, posterior_p0) %>% mutate(p_diff=p[2]-p[1]) diffs0<-cbind(mu0,p0) save(br_fit, br_tab, diffs, diffs0, file="br_mod_out.RData") # scatterplot with histogram margins mu_diff_hist <- plot_ly(x=diffs$mu_diff, type="histogram", nbinsx = 25,
color=I("steelblue"), showlegend=FALSE)

p_diff_hist <- plot_ly(y=diffs$p_diff,type="histogram", nbinsy = 25, color=I("steelblue"),showlegend=FALSE) scatterplt <- plot_ly(x=diffs$mu_diff,y=diffs$p_diff) %>% add_histogram2dcontour(showscale=FALSE, ncontours=10, contours = list(coloring='none'), color=I("steelblue"), line=list(width=2,smoothing=1.1), showlegend=FALSE) %>% add_markers(x = diffs$mu_diff, y = diffs\$p_diff, color=I("black"),
marker=list(size=3), alpha=.25,showlegend=FALSE) %>%
layout(xaxis=list(title ="Treatment Difference (Efficacy)"),
yaxis=list(title = "Treatment Difference (Safety)"))

plt_emp <- plotly_empty(type="scatter",mode="markers")

marg_plot<-subplot(mu_diff_hist, plt_emp, scatterplt, p_diff_hist,
nrows = 2, heights = c(.2, .8), widths = c(.8,.2),
shareX=TRUE, shareY=TRUE)

marg_plot