
6-7: Sensitivity Analysis and Bounds – Solutions
April 18, 2024
Sensitivity Analysis Rationale
So far in our explorations of observational studies, we have considered various methods of accounting for measured confounders W, as well as some cases in which we can account for unmeasured confounders U (e.g., if we have measured a suitable instrumental variable).
However, it is quite often the case that we are simply unable to control for unmeasured confounding in any meaningful way. Another strategy, then, is to quantify the uncertainty due to unmeasured confounding. Doing so is commonly referred to as Sensitivity Analysis. (Note: The term "Sensitivity Analysis" can be used much more broadly to refer to methods of varying any modeling assumption, but here we use this more specific definition.)
Recall that we already use methods of quantifying uncertainty due to random variation (e.g. standard error, confidence intervals, hypothesis testing). However, this random uncertainty is distinct from the systematic uncertainty due to unmeasured confounding.
Simulation
Let us again consider a version of our AspiTyleCedrin example. Much like in our Instrumental Variables lab, in this version both exposure to AspiTyleCedrin and the outcome of experiencing a migraine are affected by watching cable news, since AspiTyleCedrin ads are commonly shown on cable news channels, and stress from excessive cable news watching can trigger migraines. However, in this case we have not been able to measure an instrumental variable such as living near a pharmacy which sells AspiTyleCedrin.
Thus we have the following variables:
Endogenous variables:
• A: Treatment variable indicating whether individual i took AspiTyleCedrin (Ai = 1) or not (Ai = 0).
• Y: Outcome variable indicating whether the individual experienced a migraine (Yi_obs = 1) or not (Yi_obs = 0).
• W: Variable representing sex assigned at birth, with W = 0 indicating AMAB (assigned male at birth), W = 1 indicating AFAB (assigned female at birth), and W = 2 indicating an X on the birth certificate, possibly representing an intersex individual or left blank.

Exogenous variables:
• U: Unmeasured confounding variable, cable news watching, which affects both the exposure A and the outcome Y.
And our DAG is as follows:

[DAG: W → A, W → Y, U → A, U → Y, A → Y]
Simulate the dataset:

# specify the number of observations we want for our simulated data
# ----------
n = 1e4 # Number of individuals

# NOTE: Again, don't worry too much about how we're creating this dataset,
# this is just an example.

# create simulated dataframe
# ----------
df <- data.frame(U = rbernoulli(n, p = 0.34),
                 W = sample(0:2, size = n, replace = TRUE,
                            prob = c(0.49, 0.50, 0.01))) %>%
  # create covariates
  mutate(Y_0 = as.numeric(rbernoulli(n,
                                     p = (0.87 + 0.035*(W > 0) +
                                            0.05*(U > 0)))),
         Y_1 = as.numeric(rbernoulli(n,
                                     p = (0.34 + 0.035*(W > 0) +
                                            0.3*(U > 0)))),
         A = as.numeric(rbernoulli(n,
                                   p = (0.03 + 0.06*(W > 0) +
                                          0.21*(U == 1)))),
         ITE = Y_1 - Y_0,
         Y = as.numeric((A & Y_1) | (!A & Y_0)))

## Warning: `rbernoulli()` was deprecated in purrr 1.0.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

# summarize
# ----------
head(df)

##       U W Y_0 Y_1 A ITE Y
## 1 FALSE 1   1   1 0   0 1
## 2 FALSE 1   1   0 0  -1 1
## 3  TRUE 0   1   0 1  -1 0
## 4  TRUE 1   0   1 0   1 0
## 5 FALSE 1   1   0 0  -1 1
## 6 FALSE 0   1   1 0   0 1
summary(df)

##      U                W              Y_0              Y_1
##  Mode :logical   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000
##  FALSE:6618      1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000
##  TRUE :3382      Median :1.0000   Median :1.0000   Median :0.0000
##                  Mean   :0.5202   Mean   :0.8997   Mean   :0.4643
##                  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000
##                  Max.   :2.0000   Max.   :1.0000   Max.   :1.0000
##        A               ITE                 Y
##  Min.   :0.0000   Min.   :-1.0000   Min.   :0.0000
##  1st Qu.:0.0000   1st Qu.:-1.0000   1st Qu.:1.0000
##  Median :0.0000   Median : 0.0000   Median :1.0000
##  Mean   :0.1311   Mean   :-0.4354   Mean   :0.8517
##  3rd Qu.:0.0000   3rd Qu.: 0.0000   3rd Qu.:1.0000
##  Max.   :1.0000   Max.   : 1.0000   Max.   :1.0000
# calculate "true" ATE and ATT as a reference – remember we'd likely not have these in practice
# ----------
ATE_true <- mean(df$ITE)
ATT_true <- mean((df %>% filter(A == 1))$ITE)
ATE_true

## [1] -0.4354

# shrink dataframe to only include treatment (A), covariates (W: sex at birth), and outcome
# ----------
df <- df %>% select(A, W, Y)
Recall that the Average Treatment Effect (ATE) is the average difference in the pair of potential outcomes averaged over the entire population of interest (at a particular moment in time), or rather, it is just the average (or expected value) of the individual-level treatment effect.
ATE = E[Yi(1) − Yi(0)]
QUESTION 1: Use the group_by() and summarize() functions to find the estimated average treatment effect using the following formula. Compare this result to the actual ATE (saved as ATE_true).

ÂTE = E[Yi | Ai = 1, Wi = wi] − E[Yi | Ai = 0, Wi = wi]

# calculate mean of outcome by treatment
# ----------
est <- df %>%
  # group by treatment
  group_by(A) %>%
  # calculate mean of each outcome by treatment
  summarise(E_Y = mean(Y))

# calculate the ATE
# ----------
ATE_crude <- est$E_Y[2] - est$E_Y[1]
ATE_crude

## [1] -0.32444

ÂTE ≈ ATE_crude

The following code chunk uses the base R function confint() to estimate the ATE in a similar manner, but also allows us to calculate confidence intervals and p-values for this estimate:

# estimate the treatment effect assuming no confounding
# ----------
ATE_est <- df %>%
  # run a linear model
  lm(Y ~ A, data = .)

# calculate confidence interval for the ATE using confint()
confint(ATE_est)

##                   2.5 %     97.5 %
## (Intercept)  0.8871234  0.9013448
## A           -0.3440787 -0.3048014

# compare to "true" ATE
ATE_true

## [1] -0.4354

QUESTION 2: Report the estimate calculated above, as well as the 95% confidence interval and p-value.

ANSWER: ÂTE ≈ −0.324 (the coefficient on A), with 95% CI ≈ [−0.344, −0.305] and a p-value far below any conventional threshold.

QUESTION 3: What would you conclude from the information you reported in the previous question if you did not know the true ATE? Given what you do know about the true ATE and the DAG, do you think the confidence interval appropriately captures the level of uncertainty of that estimate? Explain.

ANSWER: These results show a very narrow confidence interval and a tiny p-value, which would generally lead us to be very confident in this estimate. However, we can see that the true ATE value is not included in this CI, and furthermore we know from the DAG that there is unmeasured confounding U that we have not controlled for. This is a stark reminder that our interpretation of measures such as p-values, CIs, etc. is only valid if our assumptions are correct! Here we know they are not. The CI and p-value calculated above account only for the random uncertainty (and the systematic uncertainty from the measured confounder W). Now we will consider different methods of quantifying the uncertainty due to the unmeasured confounding U.
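As an aside, the estimate, CI, and p-value can all be pulled out of a fitted model programmatically. The sketch below uses a small simulated stand-in for the lab's df (the object names fit, cf, p_val are our own, not from the lab):

```r
# Sketch: extracting the estimate, CI, and p-value from an lm() fit.
# `df` here is a small stand-in for the lab's simulated data.
set.seed(94)
df <- data.frame(A = rbinom(1e4, 1, 0.13))
df$Y <- as.numeric(runif(1e4) < 0.90 - 0.32 * df$A)

fit <- lm(Y ~ A, data = df)

cf <- summary(fit)$coefficients   # columns: Estimate, Std. Error, t value, Pr(>|t|)
est <- cf["A", "Estimate"]        # point estimate of the treatment coefficient
p_val <- cf["A", "Pr(>|t|)"]      # p-value for that coefficient
ci <- confint(fit)["A", ]         # 95% confidence interval for it

round(c(estimate = est, lower = ci[1], upper = ci[2]), 3)
```

Indexing by the coefficient name ("A") rather than by row position keeps this robust if covariates are later added to the model.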
Manski Bounds

One method for quantifying the uncertainty due to the unmeasured confounding U is to place logical bounds on the necessary parameters and propagate those bounds through to the parameter of interest.

First, let us consider the ATE in more detail. We can rewrite the formula from above as:

ATE = E[Yi(1) − Yi(0)]
    = E[Yi(1)] − E[Yi(0)]
    = μt − μc

where μt is simply the average outcome under the counterfactual in which everyone received the treatment (A = 1 for everyone) and μc is simply the average outcome under the counterfactual in which no one received the treatment (A = 0 for everyone).

NOTE: For simplicity we will ignore W for now.

Note that by the definition of expectation we can further break down these μ values like so:

μt = P(A = 1) · E[Yi(1) | A = 1] + P(A = 0) · E[Yi(1) | A = 0]
μc = P(A = 1) · E[Yi(0) | A = 1] + P(A = 0) · E[Yi(0) | A = 0]

μt = p · μt,1 + (1 − p) · μt,0
μc = p · μc,1 + (1 − p) · μc,0

where:
• p is the probability of treatment.
• μt,1 is the average outcome under treatment among those who actually receive the treatment.
• μt,0 is the average outcome under treatment among those who do not receive the treatment.
• μc,1 is the average outcome under no treatment among those who actually receive the treatment.
• μc,0 is the average outcome under no treatment among those who do not receive the treatment.

In practice, we can reasonably estimate p, μt,1, and μc,0, but not μt,0 and μc,1.

QUESTION 4: Explain why the previous statement is true.

ANSWER: We can estimate p from the proportion of treated individuals in our dataset. Furthermore, our observed data does contain the outcome under treatment among individuals receiving the treatment and the outcome under no treatment among individuals not receiving the treatment, but it is impossible for us to observe what would have happened to individuals under the opposite treatment condition. However, we do know that since the outcome Y is binary, μt,0 and μc,1 must by definition lie inside the interval [0, 1].
This knowledge allows us to construct bounds on μt and μc like so:
μt ∈ [p · μt,1, p · μt,1 + (1 − p)]
μc ∈ [(1 − p) · μc,0, (1 − p) · μc,0 + p]
QUESTION 5: Show how the bounds for μt and μc above follow from bounds of [0, 1] on μt,0 and μc,1.

ANSWER:

LL_μt = μt |[μt,0 = 0] = p · μt,1 + (1 − p) · 0
                       = p · μt,1
UL_μt = μt |[μt,0 = 1] = p · μt,1 + (1 − p) · 1
                       = p · μt,1 + (1 − p)

LL_μc = μc |[μc,1 = 0] = p · 0 + (1 − p) · μc,0
                       = (1 − p) · μc,0
UL_μc = μc |[μc,1 = 1] = p · 1 + (1 − p) · μc,0
                       = p + (1 − p) · μc,0
We can then use these bounds of μt and μc in the formula for the ATE to get bounds for the ATE itself.
ATE = μt − μc

LL_ATE = LL_μt − UL_μc = [p · μt,1] − [p + (1 − p) · μc,0]
UL_ATE = UL_μt − LL_μc = [p · μt,1 + (1 − p)] − [(1 − p) · μc,0]
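Two properties of these bounds can be checked directly from the expressions above (a derivation we add for clarity, not from the original handout): the interval always has width one, and it always brackets zero.

```latex
\begin{aligned}
UL_{ATE} - LL_{ATE}
  &= \big[p\,\mu_{t,1} + (1-p) - (1-p)\,\mu_{c,0}\big]
   - \big[p\,\mu_{t,1} - p - (1-p)\,\mu_{c,0}\big] \\
  &= (1-p) + p = 1, \\[4pt]
LL_{ATE} &= \underbrace{p\,(\mu_{t,1} - 1)}_{\le\, 0}
          - \underbrace{(1-p)\,\mu_{c,0}}_{\ge\, 0} \;\le\; 0, \\
UL_{ATE} &= \underbrace{p\,\mu_{t,1}}_{\ge\, 0}
          + \underbrace{(1-p)\,(1 - \mu_{c,0})}_{\ge\, 0} \;\ge\; 0,
\end{aligned}
```

using only that $p$, $\mu_{t,1}$, and $\mu_{c,0}$ all lie in $[0, 1]$.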
QUESTION 6: Why are the bounds for the ATE calculated in this way? For example, why not find the difference of both lower bounds and both upper bounds, respectively?
ANSWER: The ATE is the difference between the average outcome under treatment and the average outcome under control, so this difference is at its smallest when the average outcome under treatment is as low as possible and the average outcome under control is as high as possible. Hence the lower bound pairs the lower bound of μt with the upper bound of μc, and vice versa for the upper bound.
We now have a formula for the ATE using only p, μt,1 and μc,0, which we said earlier can be reasonably estimated from our data. Thus we can use plug-in estimators for these values to estimate the bounds for the ATE, like so:
LL_ATE = [p̂ · μ̂t,1] − [p̂ + (1 − p̂) · μ̂c,0]
UL_ATE = [p̂ · μ̂t,1 + (1 − p̂)] − [(1 − p̂) · μ̂c,0]
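These plug-in formulas can be packaged as a small helper function. This is our own sketch (the name manski_bounds is not from the lab), assuming binary 0/1 vectors A and Y:

```r
# Sketch: Manski bounds for the ATE with a binary outcome.
# Assumes A (treatment) and Y (outcome) are 0/1 vectors.
manski_bounds <- function(A, Y) {
  p     <- mean(A)          # estimated probability of treatment
  mu_t1 <- mean(Y[A == 1])  # mean outcome among the treated
  mu_c0 <- mean(Y[A == 0])  # mean outcome among the untreated
  c(lower = p * mu_t1 - (p + (1 - p) * mu_c0),
    upper = p * mu_t1 + (1 - p) - (1 - p) * mu_c0)
}

# toy example: the bounds always have width 1 and contain 0
manski_bounds(A = c(1, 1, 0, 0), Y = c(1, 0, 1, 1))  # → lower -0.75, upper 0.25
```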
QUESTION 7: Calculate these bounds on the ATE using the data.

# get necessary statistics
# ----------
p_hat <- nrow(df %>% filter(A == 1)) / nrow(df)  # filter only treated units, calculate % under treatment
mu.t1_hat <- df %>% filter(A == 1) %>% summarise(mean = mean(Y)) %>% pull(mean)  # calculate outcome mean
mu.c0_hat <- df %>% filter(A == 0) %>% summarise(mean = mean(Y)) %>% pull(mean)  # calculate outcome mean
# calculate lower and upper bounds of ATE
# ----------
LL_ATE <- p_hat*mu.t1_hat - (p_hat + (1 - p_hat)*mu.c0_hat)
UL_ATE <- p_hat*mu.t1_hat + (1 - p_hat) - (1 - p_hat)*mu.c0_hat
LL_ATE
UL_ATE

## [1] -0.8334
## [1] 0.1666

ATE ∈ [LL_ATE, UL_ATE]

Note that by definition this interval must contain zero and must have a width of one. Do not confuse this with a confidence interval! Also note that this does not include the random variation associated with our estimates.

QUESTION 8: Compare this interval to the true ATE.

ANSWER: The interval does indeed include the true ATE value. What's more, while the interval crosses zero, it clearly has more negative range than positive, which is encouraging considering the true value is in fact negative.

Rosenbaum Sensitivity Analysis

While Manski bounds are useful for estimating a "worst case" range of possible values, since zero is necessarily included they are not especially informative. It is useful, then, to be able to use outside knowledge or other reasonable assumptions about possible ranges of values to restrict this interval.

The Rosenbaum-Rubin approach incorporates a bit more complexity than do Manski bounds. First, we imagine the unmeasured confounding U as binary with some probability q, where:

q = P(Ui = 1) = 1 − P(Ui = 0)

We then specify models for the relationships between the unmeasured confounding U and the other variables: treatment assignment A, outcome under treatment Y(1), and outcome under no treatment Y(0). Note that we are still ignoring the measured confounder W for now.
Since A, Y(1), and Y(0) are all binary, we can specify logistic models for these relationships, like so:

logit[P(Ai = 1 | Ui = u)] = γ0 + γ1 · u
logit[P(Yi(1) = 1 | Ui = u)] = α0 + α1 · u
logit[P(Yi(0) = 1 | Ui = u)] = β0 + β1 · u

or, equivalently:

P(Ai = 1 | Ui = u) = logit⁻¹[γ0 + γ1 · u] = e^(γ0 + γ1·u) / (1 + e^(γ0 + γ1·u))
P(Yi(1) = 1 | Ui = u) = logit⁻¹[α0 + α1 · u] = e^(α0 + α1·u) / (1 + e^(α0 + α1·u))
P(Yi(0) = 1 | Ui = u) = logit⁻¹[β0 + β1 · u] = e^(β0 + β1·u) / (1 + e^(β0 + β1·u))

Now we recall that we can write the ATE as:

ATE = μt − μc
    = [p · μt,1 + (1 − p) · μt,0] − [p · μc,1 + (1 − p) · μc,0]
    = p · (μt,1 − μc,1) + (1 − p) · (μt,0 − μc,0)

Following the derivations in Chapter 22 of Imbens/Rubin, we can write each of these parameters (p, μt,1, μt,0, μc,1, μc,0) in terms of the parameters seen above (q, γ0, γ1, α0, α1, β0, β1). Chapter 22.4 details the translations of our reasonably estimable values p, μt,1, and μc,0, and shows that these relationships allow us to find estimated values for γ0, α0, and β0. If we then postulate values for q, γ1, α1, and β1, we can estimate values for μt,0 and μc,1.

Our sensitivity analysis then comes down to the decisions we make in postulating q, γ1, α1, and β1. For example, if we fix q = p and let γ1 → ∞, α1 → ∞, and β1 → ∞, we find:

ATE = p · μt,1 − p − (1 − p) · μc,0

which is equivalent to the lower limit derived from the Manski bounds! Furthermore, if we fix q = p and let γ1 → ∞, α1 → −∞, and β1 → −∞, we find:

ATE = p · μt,1 + (1 − p) − (1 − p) · μc,0

which is equivalent to the upper limit derived from the Manski bounds! Therefore we can see the Manski bounds as simply a special case of this type of sensitivity analysis.

This approach has been implemented via a Shiny App here. This app allows you to upload your data and adjust two of the parameters discussed above (q, γ1, α1, β1) while holding the other two constant.

QUESTION 9: Upload our data to the app and play around with a few different values/ranges of the four parameters. Take a screenshot of at least one of your iterations and include it below. Discuss what you see and interpret it in terms of our original analysis. (Note: You will need to modify the format of our dataset slightly before uploading – read the instructions on the app webpage for details.)

# create dataframe for uploading
# ----------
df.shiny <- df %>%
  mutate(Treatment = A,
         Outcome = Y) %>%
  select(-W)
# save as .csv
# ----------
write_csv(df.shiny, "df.shiny_new.csv")
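Before turning to the app, we can sketch the core computation ourselves. The function below is our own illustration of the Imbens/Rubin Chapter 22 approach (rosenbaum_ate and its argument names are not from the lab or the app): it takes the estimable quantities p, μt,1, μc,0 plus postulated q, γ1, α1, β1, solves for γ0, α0, β0 with uniroot(), and returns the implied ATE.

```r
# Sketch of the Rosenbaum/Imbens-Rubin sensitivity computation.
# Inputs: estimable p, mu_t1, mu_c0; postulated q, g1 (gamma1), a1 (alpha1), b1 (beta1).
ilogit <- function(x) 1 / (1 + exp(-x))

rosenbaum_ate <- function(p, mu_t1, mu_c0, q, g1, a1, b1) {
  # solve gamma0 so that P(A = 1) = q*ilogit(g0 + g1) + (1 - q)*ilogit(g0) = p
  g0 <- uniroot(function(g) q * ilogit(g + g1) + (1 - q) * ilogit(g) - p,
                c(-30, 30))$root
  # distribution of U within treatment arms (Bayes' rule)
  pU1_A1 <- q * ilogit(g0 + g1) / p
  pU1_A0 <- q * (1 - ilogit(g0 + g1)) / (1 - p)
  # solve alpha0 so that E[Y(1) | A = 1] matches mu_t1
  a0 <- uniroot(function(a) pU1_A1 * ilogit(a + a1) +
                  (1 - pU1_A1) * ilogit(a) - mu_t1, c(-30, 30))$root
  # solve beta0 so that E[Y(0) | A = 0] matches mu_c0
  b0 <- uniroot(function(b) pU1_A0 * ilogit(b + b1) +
                  (1 - pU1_A0) * ilogit(b) - mu_c0, c(-30, 30))$root
  # the two inestimable means are now implied by the postulated model
  mu_t0 <- pU1_A0 * ilogit(a0 + a1) + (1 - pU1_A0) * ilogit(a0)
  mu_c1 <- pU1_A1 * ilogit(b0 + b1) + (1 - pU1_A1) * ilogit(b0)
  p * (mu_t1 - mu_c1) + (1 - p) * (mu_t0 - mu_c0)
}

# sanity check: with no U effects (g1 = a1 = b1 = 0) the implied ATE
# is just mu_t1 - mu_c0 (here approximately -0.38)
rosenbaum_ate(p = 0.13, mu_t1 = 0.52, mu_c0 = 0.90, q = 0.34,
              g1 = 0, a1 = 0, b1 = 0)
```

Sweeping g1, a1, and b1 over a grid of postulated values then traces out how far the implied ATE can move, which is exactly what the app visualizes.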
Another technique we will look at is the E-value. The basic logic of the E-value is that it is the strength of association an unobserved confounder would need to have with both the treatment and the outcome in order to explain away an observed effect estimate. A small E-value implies that only a small amount of confounding is necessary to erase the results, whereas a large E-value implies that a large amount of confounding would be necessary.
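The E-value has a simple closed form: for a risk ratio RR (inverted first when the effect is protective, i.e. RR < 1), E = RR + sqrt(RR · (RR − 1)). A quick sketch of that arithmetic (e_value is our own helper, not a function from any package):

```r
# Sketch: E-value for a point estimate on the risk-ratio scale.
# For protective effects (RR < 1) the RR is inverted before applying the formula.
e_value <- function(rr) {
  rr <- ifelse(rr < 1, 1 / rr, rr)
  rr + sqrt(rr * (rr - 1))
}

e_value(0.3434709)  # point estimate → about 5.27
e_value(0.3717417)  # CI limit closer to the null → about 4.82
```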
To run the E-value calculation, we can use the EValue package. The main function we will look at is evalues.RR(), which evaluates the E-value using a risk ratio (the probability of an outcome occurring in the exposed group relative to the probability of the outcome in the unexposed group). To run this analysis we simply need one line:
# calculate E-value's risk ratio: prob. outcome in treatment / prob. outcome in control
# ----------
evalues.RR(est = 0.3434709, lo = 0.3152002, hi = 0.3717417)

##              point     lower     upper
## RR       0.3434709 0.3152002 0.3717417
## E-values 5.2705042        NA 4.8222394

Using our ATE estimates from earlier, we see that we get an E-value of 5.27. This E-value implies that, to fully explain away our estimate, a confounder would need to be associated with a 5.27-fold increase in the risk of experiencing a migraine AND be 5.27 times more prevalent among people who received the drug.

We can use the bias_plot() function to get a sense of the range of values for which unobserved confounding would invalidate our inference.

# look at a range of values
# ----------
bias_plot(0.3434709, xmax = 20)

[Bias plot: the curve RR_EU · RR_UD / (RR_EU + RR_UD − 1) = RR_obs, with the joint point (5.27, 5.27) marked; the horizontal axis runs to 20.]
In the example above, we see that if the exposure-confounder parameter (RR_EU) were about 3.5, meaning that the confounder(s) is 3.5 times more likely among the treatment group (people who receive the drug), then the confounder-outcome parameter (RR_UD, for migraine) would have to be roughly 20 for it to even be possible that confounding explains the entire observed association (remember that in the E-value literature, E is the exposure or treatment and D is the outcome).
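The joint point on this curve can be verified numerically. The bias factor for a candidate pair (RR_EU, RR_UD) is RR_EU · RR_UD / (RR_EU + RR_UD − 1), and confounding can fully explain the association only when this factor reaches the (inverted) observed risk ratio. A quick check (bias_factor is our own helper):

```r
# Sketch: bias factor from the E-value literature for a candidate confounder.
bias_factor <- function(rr_eu, rr_ud) (rr_eu * rr_ud) / (rr_eu + rr_ud - 1)

rr_obs <- 1 / 0.3434709   # invert the protective observed risk ratio (~2.91)

# at the joint point (5.27, 5.27), the bias factor just reaches rr_obs
bias_factor(5.2705042, 5.2705042)  # → about 2.91
```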
QUESTION 10: Do you find this figure to be reassuring?
ANSWER: Probably not: it is easy to imagine an exogenous variable that increases the likelihood that someone both takes the drug and develops a migraine anyway, and a 5-fold increase in this risk is not especially high. As with p-values, though, we should be careful about using arbitrary cutoffs to determine whether our results are valid.
Finally, another similar sensitivity option is konfound. The logic is similar to that of the E-value: how strong would the correlational strength of a possible confounder need to be to invalidate inference? To get at this, the output returns an estimate of the percent of the estimate that would have to be due to bias to invalidate the inference. One added bonus with this method is that it will give you the number of observations that would have to be replaced with a case that has a null effect to invalidate the inference, which is a very intuitive way to explain sensitivity to readers.
Take a look at the documentation and a useful tutorial.
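The headline konfound numbers can be reproduced by hand. In Frank et al.'s framework, the threshold effect is the critical value times the standard error; the percent bias needed to invalidate the inference is one minus the threshold over the estimate, and RIR scales that share by n. A sketch of that arithmetic (variable names ours; konfound itself uses the t rather than the normal critical value, which differs only in rounding here):

```r
# Sketch: reproduce konfound's percent-bias and RIR figures by hand.
est <- -0.3244    # estimated effect (coefficient on A)
se  <- 0.010019   # its standard error
n   <- 10000      # number of observations

threshold <- qnorm(0.975) * se         # smallest effect still significant (~0.02)
pct_bias  <- 1 - threshold / abs(est)  # share of the estimate that must be bias
rir       <- round(pct_bias * n)       # observations to replace with null cases

c(pct_bias = round(100 * pct_bias, 3), RIR = rir)  # → 93.947 and 9395
```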
# before we have conducted an analysis
# ------------------------------------------------

# estimate a model
# ----------
ATE_est <- df %>%
  # run a linear model
  lm(Y ~ A, data = .)

# konfound
# ----------
konfound(ATE_est, A)

## Robustness of Inference to Replacement (RIR):
## To invalidate an inference, 93.947 % of the estimate would have to be due to bias.
## This is based on a threshold of -0.02 for statistical significance (alpha = 0.05).
## To invalidate an inference, 9395 observations would have to be replaced with cases
## for which the effect is 0 (RIR = 9395).
## See Frank et al. (2013) for a description of the method.
## Citation: Frank, K.A., Maroulis, S., Duong, M., and Kelcey, B. (2013).
## What would it take to change an inference?
## Using Rubin's causal model to interpret the
## robustness of causal inferences.
## Educational Evaluation and Policy Analysis, 35, 437-460.

We can use pkonfound() when we have values from an already-conducted analysis. One notable application is that this allows you to estimate sensitivity for models you might encounter in a paper.

# when we already conducted an analysis
# ------------------------------------------------

# pkonfound
# ----------
pkonfound(est_eff = -0.3244, std_err = 0.010019, n_obs = 10000, n_covariates = 1)

## Robustness of Inference to Replacement (RIR):
## To invalidate an inference, 93.946 % of the estimate would have to be due to bias.
## This is based on a threshold of -0.02 for statistical significance (alpha = 0.05).
## To invalidate an inference, 9395 observations would have to be replaced with cases
## for which the effect is 0 (RIR = 9395).
## See Frank et al. (2013) for a description of the method.
## Citation: Frank, K.A., Maroulis, S., Duong, M., and Kelcey, B. (2013).
## What would it take to change an inference?
## Using Rubin's causal model to interpret the
## robustness of causal inferences.
## Educational Evaluation and Policy Analysis, 35, 437-460.
## For other forms of output, run
## ?pkonfound and inspect the to_return argument
## For models fit in R, consider use of konfound().

# threshold plot
# share "above threshold" represents share of estimate not confounded
pkonfound(est_eff = -0.3244, std_err = 0.010019, n_obs = 10000, n_covariates = 1,
          to_return = "thresh_plot")

[Threshold plot: the estimated effect (in absolute value) against the significance threshold; the share "above threshold" is the share of the estimate not attributable to confounding, and the share "below threshold" would have to be due to bias.]

# correlation plot
# 0.697 represents the correlation needed of a confounder with treatment & outcome
pkonfound(est_eff = -0.3244, std_err = 0.010019, n_obs = 1000, to_return = "corr_plot")

[Correlation plot: to invalidate the inference, a confounding variable would need partial correlations Rx.cv|Z = 0.835 with the predictor of interest and Ry.cv|Z = 0.835 with the outcome, so that Rx.cv|Z · Ry.cv|Z = 0.697.]

References

• Imbens, G. W., & Rubin, D. B. (2018). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. New York: Cambridge University Press.
• Frank, K. A., Maroulis, S. J., Duong, M. Q., & Kelcey, B. M. (2013). "What Would It Take to Change an Inference? Using Rubin's Causal Model to Interpret the Robustness of Causal Inferences." Educational Evaluation and Policy Analysis, 35(4), 437–460. doi: 10.3102/0162373713493129.
• Crosnoe, R. (2009). "Low-Income Students and the Socioeconomic Composition of Public High Schools." American Sociological Review, 74(5), 709–730.
• https://carohaensch.shinyapps.io/tippingsens/