Subject Types and Potential Outcomes

What is one-sided non-compliance?

Can non-compliance exist in the control group?

What does the notation \(d_i(1) = 0\) mean?


Types of One-Sided Non-Compliance


How would you read the following potential outcomes?

\(E[Y_i(z=1,d=0)|d_i(1)=0]\)

\(d_i(1) = 0\) is shorthand for “Never Takers” under one-sided non-compliance. The expression gives the expected outcome of Never Takers in the treatment group.


How do we find compliers in the control group?

We cannot identify individual compliers in the control group, but we can back out their average untreated outcome. Never Takers reveal themselves in the treatment group (they are assigned to treatment but remain untreated), so we can estimate \(E[Y_i(0)|NT]\) and \(π_{NT}\) there and solve:

\[E[Y_i(0)|Complier] = \frac{E[Y_i|z=0] - E[Y_i(0)|NT] \times π_{NT}}{π_{C}}\]


If we can back out the outcome for compliers in the control group, why would we need to use a placebo design?

Backing out the complier mean combines several estimated quantities, so it is noisy. A placebo design identifies compliers directly in both arms (via compliance with a placebo treatment), allowing a more precise complier-to-complier comparison.


Excludability violations


The excludability assumption states \(Y_i(z,d)=Y_i(d)\). Put another way: \(Y_i(z=1,d)=Y_i(z=0,d)=Y_i(d)\). Or, potential outcomes respond to treatments, not treatment assignments.

Look at the potential outcomes table below and identify the “types” that violate the excludability assumption (i.e. subjects whose potential outcomes respond to treatment assignment \(z\)):



Answer: Type “B” violates excludability because \(Y_i(z=1,d=0) \neq Y_i(z=0,d=0)\).


Complier Average Causal Effects:


\[CACE = \frac{ITT_Y}{ITT_D}\]


Where \(ITT_Y\) is the intent-to-treat effect of assignment on the outcome \(Y\), and \(ITT_D\) is the effect of assignment on treatment receipt \(D\) (under one-sided non-compliance, the share of compliers).


Example of calculating CACE


Compliance and Outcome         Control   Treatment
Contacted and Voted                  0         700
Contacted and Abstained              0         100
Not contacted and Voted           1100         600
Not contacted and Abstained        900         600


\(ITT_Y\) = proportion assigned to treatment who voted minus proportion assigned to control who voted =


\[ITT_Y = \frac{700 + 600}{2000} - \frac{1100}{2000} = 0.1\]

\(ITT_D\) = proportion assigned to treatment who received treatment minus proportion assigned to control who received treatment (the second term is zero because non-compliance is one-sided) =


\[ITT_D = \frac{700 + 100}{2000} - \frac{0 + 0}{2000} = 0.4\]

\[CACE = \frac{ITT_Y}{ITT_D} = \frac{0.1}{0.4} = 0.25\]
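This arithmetic can be checked with a few lines of base R, using the counts from the table above:

```r
# Cell counts from the compliance-and-outcome table above
n_treat   <- 700 + 100 + 600 + 600    # assigned to treatment (2,000)
n_control <- 1100 + 900               # assigned to control (2,000)

# ITT_Y: difference in voting rates by assignment
itt_y <- (700 + 600) / n_treat - 1100 / n_control

# ITT_D: difference in contact rates by assignment (zero in control)
itt_d <- (700 + 100) / n_treat - 0 / n_control

cace <- itt_y / itt_d
c(itt_y = itt_y, itt_d = itt_d, cace = cace)   # 0.10, 0.40, 0.25
```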

The CACE and instrumental variables


Angrist and Pischke, Mostly Harmless Econometrics, Chapter 4: Instrumental Variables and Causality (p.85):


Examples of IV estimation in political science (non-experimental):


Because IV estimation and CACE estimation are technically identical, we can use the same tools to estimate the CACE as are used in instrumental variables, namely two-stage least squares (2SLS).


Let’s import data with a treatment assignment \(Z\), treatment \(D\), and outcome \(Y\), and estimate the CACE three different ways.


# Load packages (estimatr provides lm_robust and iv_robust, used below)
library(dplyr)
library(knitr)
library(kableExtra)
library(estimatr)

# Import data
dat5 <- read.csv("http://hdl.handle.net/10079/kh189dd")

# Rename variables
dat5 <- dat5 %>%
  rename(Z = watch, D = watchdps, Y = ochange)

# Output table using kable
kable(head(dat5)) %>% kable_styling(position = "center")
Z D Y
0 0 0
0 0 0
1 0 0
1 1 1
1 1 0
0 0 1


First, as the (difference-in-means in the outcome) / (difference-in-means in treatment receipt).


# Difference in means
itt <- with(dat5, mean(Y[Z == 1]) - mean(Y[Z == 0]))
ittd <- with(dat5, mean(D[Z == 1]) - mean(D[Z == 0]))
cace <- itt / ittd
cace
## [1] 0.2787494



Second, let’s do this using regression. We will use one regression to estimate the \(ITT_Y\) and another to estimate the \(ITT_D\), then take their ratio. Example of calculating the CACE this way:


Calculate the \(ITT_Y\) using OLS.


model_itt <- lm_robust(Y ~ Z, data = dat5)
kable(tidy(model_itt)) %>% kable_styling(position = "center")
term estimate std.error statistic p.value conf.low conf.high df outcome
(Intercept) 0.4181818 0.0221928 18.843102 0.0000000 0.3746318 0.4617318 998 Y
Z 0.0570657 0.0314219 1.816111 0.0696533 -0.0045949 0.1187263 998 Y


Next, let’s calculate the \(ITT_D\) using OLS.


model_ittd <- lm_robust(D ~ Z, data = dat5)
kable(tidy(model_ittd)) %>% kable_styling(position = "center")
term estimate std.error statistic p.value conf.low conf.high df outcome
(Intercept) 0.1616162 0.0165615 9.758530 0 0.1291168 0.1941156 998 D
Z 0.2047205 0.0271084 7.551908 0 0.1515244 0.2579166 998 D


Now let’s calculate the \(CACE\) as the ratio of the two.


# Calculate the CACE
itt <- model_itt$coefficients[2]
itt_d <- model_ittd$coefficients[2]

itt/itt_d
##         Z 
## 0.2787494
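The name “two-stage least squares” comes from running the two stages literally: regress \(D\) on \(Z\), then regress \(Y\) on the fitted values. A minimal base-R sketch on simulated data (simulated because dat5 requires a download; the one-sided non-compliance structure, the 40% compliance rate, and the effect of 0.25 are illustrative assumptions):

```r
set.seed(42)
n <- 1000
Z <- rbinom(n, 1, 0.5)            # random assignment
complier <- rbinom(n, 1, 0.4)     # 40% compliers (assumed)
D <- Z * complier                 # one-sided non-compliance: treated only if assigned
Y <- 0.25 * D + rnorm(n)          # true CACE of 0.25

# First stage: treatment receipt on assignment; keep the fitted values
d_hat <- fitted(lm(D ~ Z))

# Second stage: outcome on fitted treatment receipt
tsls <- coef(lm(Y ~ d_hat))["d_hat"]

# Identical to the ratio of the two ITT coefficients
wald <- coef(lm(Y ~ Z))["Z"] / coef(lm(D ~ Z))["Z"]
all.equal(unname(tsls), unname(wald))   # TRUE
```

Note that the second-stage standard errors from a manual lm are wrong, which is one reason to use iv_robust or ivreg in practice.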


Third, we can use “iv_robust” from the estimatr package (part of the DeclareDesign suite), or “ivreg” from the AER package, to do 2SLS in one step.


model_cace <- iv_robust(Y ~ D | Z, data = dat5)
kable(tidy(model_cace)) %>% kable_styling(position = "center")
term estimate std.error statistic p.value conf.low conf.high df outcome
(Intercept) 0.3731314 0.0435674 8.564455 0.0000000 0.2876371 0.4586257 998 Y
D 0.2787494 0.1530055 1.821826 0.0687807 -0.0215001 0.5789988 998 Y


Adding covariates. You must explicitly specify all exogenous regressors on both sides of the bar.


# Add covariates
dat5$X1 <- rnorm(nrow(dat5), mean = 0, sd = 2)
dat5$X2 <- with(dat5,
                ifelse(Y == 1,
                       rbinom(nrow(dat5), size = 1, prob = 0.7),
                       rbinom(nrow(dat5), size = 1, prob = 0.3)))

# Right: covariates appear on both sides of the bar
iv_robust_cov_right <- iv_robust(Y ~ D + X1 + X2 | Z + X1 + X2, data = dat5)
# Wrong: covariates appear only as instruments
iv_robust_cov_wrong <- iv_robust(Y ~ D | Z + X1 + X2, data = dat5)

# Output results
kable(tidy(iv_robust_cov_right)) %>% kable_styling(position = "center")
term estimate std.error statistic p.value conf.low conf.high df outcome
(Intercept) 0.1849672 0.0395286 4.679328 0.0000033 0.1073983 0.2625360 996 Y
D 0.2974963 0.1402557 2.121100 0.0341599 0.0222658 0.5727269 996 Y
X1 -0.0175552 0.0074307 -2.362528 0.0183422 -0.0321368 -0.0029736 996 Y
X2 0.3885044 0.0293627 13.231204 0.0000000 0.3308845 0.4461243 996 Y
kable(tidy(iv_robust_cov_wrong)) %>% kable_styling(position = "center")
term estimate std.error statistic p.value conf.low conf.high df outcome
(Intercept) 0.3400422 0.0431174 7.886423 0.0000000 0.2554310 0.4246534 998 Y
D 0.4036144 0.1513270 2.667168 0.0077732 0.1066589 0.7005699 998 Y


Manipulating levels of compliance


Suppose that we run an experiment in which some units are assigned to control \((Z=0)\), others to a condition in which minimal effort is made to treat units \((Z=1)\), and a third condition in which maximal effort is made to treat units \((Z=2)\). In essence, we are manipulating compliance rates, and want to compare treatment effects for different types of compliers. Under one-sided non-compliance, we can define types and estimands.

Hint: this should be helpful on question 9 of your problem set.


Type                 Z = 0   Z = 1   Z = 2
Never Takers             0       0       0
Minimal Compliers        0       1       1
Maximal Compliers        0       0       1


1.) First, let’s estimate the treatment effect for all compliers.

\[E[Y_i|Z=2]−E[Y_i|Z=0] = \]

\[ E[Y_i(0)|Z=2,NT] \times π_{NT} + E[Y_i(1)|Z=2,Min] \times π_{Min} + E[Y_i(1)|Z=2,Max] \times π_{Max} \]

\[ − (E[Y_i(0)|Z=0,NT] \times π_{NT} + E[Y_i(0)|Z=0,Min] \times π_{Min} + E[Y_i(0)|Z=0,Max] \times π_{Max})\]


Because of random assignment, each experimental group is a “random subset” of the population with similar proportions of each type (\(π\)’s). Due to the excludability assumption, \(E[Y_i(0)|Z=2,NT]=E[Y_i(0)|Z=0,NT]\). This is because potential outcomes respond to \(d\), not \(z\), and \(d_i(z=2)=d_i(z=0)=0\) for Never Takers. The difference therefore simplifies to:


\[E[Y_i(Z=2,D=1)−Y_i(Z=0,D=0)|Min] \times π_{Min} + E[Y_i(Z=2,D=1) − Y_i(Z=0,D=0)|Max] \times π_{Max} = \]

\[E[Y_i(1)−Y_i(0)|Complier] \times π_{Min+Max} \]


So:

\[E[Y_i(1)−Y_i(0)|Complier] \times π_{Min+Max} = CACE \times π_{C}\]


\[CACE = \frac{E[Y_i|Z=2]−E[Y_i|Z=0]}{π_{C}}\]


2.) Now let’s estimate the treatment effect for minimal compliers. We will compare the first treatment group to the control group:

\[E[Y_i|Z=1] − E[Y_i|Z=0] = \] \[= E[Y_i(0)|Z=1,NT] \times π_{NT} + E[Y_i(1)|Z=1,Min] \times π_{Min} + E[Y_i(0)|Z=1,Max] \times π_{Max} - \]

\[(E[Y_i(0)|Z=0,NT] \times π_{NT} + E[Y_i(0)|Z=0,Min] \times π_{Min} + E[Y_i(0)|Z=0,Max] \times π_{Max}) = \] \[E[Y_i(1) − Y_i(0)|Min] \times π_{Min}\]


So:

\[CACE_{Min} = \frac{E[Y_i | Z = 1] − E[Y_i | Z = 0]}{π_{Min}}\]


3.) To estimate the treatment effect on maximal compliers, compare the second and first treatment groups:

\[E[Y_i|Z=2] − E[Y_i|Z=1] = \] \[E[Y_i(0)|Z=2,NT] \times π_{NT} + E[Y_i(1)|Z=2,Min] \times π_{Min} + E[Y_i(1)|Z=2,Max] \times π_{Max} - \]

\[(E[Y_i(0)|Z=1,NT] \times π_{NT} + E[Y_i(1)|Z=1,Min] \times π_{Min} + E[Y_i(0)|Z=1,Max] \times π_{Max}) = \]

\[E[Y_i(1) − Y_i(0)|Max] \times π_{Max}\]


So:

\[CACE_{Max} = \frac{E[Y_i | Z = 2] − E[Y_i | Z = 1]}{π_{Max}} \]
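A quick simulation can confirm that these ratio estimators recover both effects. The type shares (20/50/30) and effect sizes (2 and 5) below are made up for illustration:

```r
set.seed(1)
n <- 30000
type <- sample(c("NT", "Min", "Max"), n, replace = TRUE,
               prob = c(0.2, 0.5, 0.3))
Z <- sample(0:2, n, replace = TRUE)

# Treatment receipt follows the type table above
D <- ifelse(type == "Min" & Z >= 1, 1,
            ifelse(type == "Max" & Z == 2, 1, 0))

# True effects: 2 for minimal compliers, 5 for maximal compliers
tau <- ifelse(type == "Min", 2, ifelse(type == "Max", 5, 0))
Y <- rnorm(n) + tau * D

# Estimate the type shares from observed compliance
pi_min <- mean(D[Z == 1])                     # minimal compliers
pi_max <- mean(D[Z == 2]) - mean(D[Z == 1])   # maximal compliers

cace_min <- (mean(Y[Z == 1]) - mean(Y[Z == 0])) / pi_min   # close to 2
cace_max <- (mean(Y[Z == 2]) - mean(Y[Z == 1])) / pi_max   # close to 5
```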


Practice experiments


Let’s practice conducting a power analysis using DeclareDesign.


First, we will define our research design.


simple_design <- 
  
  # M: model
  
  # a 100-unit population with an unobserved shock 'e'
  declare_population(N = 100, u = rnorm(N)) +
  
  # two potential outcomes, Y_Z_0 and Y_Z_1
  # Y_Z_0 is the control potential outcome (what would happen if the unit is untreated)
  #   it is equal to the unobserved shock 'u'
  # Y_Z_1 is the treated potential outcome 
  #   it is equal to the control potential outcome plus a treatment effect of 0.25
  declare_potential_outcomes(Y_Z_0 = u, 
                             Y_Z_1 = Y_Z_0 + rnorm(N, mean = 0.25, sd = 0.1)) +
  
  # I: inquiry
  
  # we are interested in the average treatment effect in the population (PATE)
  declare_estimand(PATE = mean(Y_Z_1 - Y_Z_0)) +
  
  # D: data strategy
  
  # sampling: we randomly sample 50 of the 100 units in the population
  declare_sampling(n = 50) +
  
  # assignment: we randomly assign half of the 50 sampled units to treatment (half to control)
  declare_assignment(prob = 0.5) +
  
  # reveal outcomes: construct outcomes from the potential outcomes named Y depending on 
  #   the realized value of their assignment variable named Z
  declare_reveal(outcome_variables = Y, assignment_variables = Z) +
  
  # A: answer strategy
  
  # calculate the difference-in-means of Y depending on Z 
  # we link this estimator to PATE because this is our estimate of our inquiry
  declare_estimator(Y ~ Z, model = difference_in_means, estimand = "PATE")


Now let’s tell DeclareDesign what properties of our design we are interested in. We probably want to know power here more than anything.


Power: the probability that a study yields a statistically significant result, i.e. how often, over many hypothetical replications, you correctly reject a false null hypothesis.
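This definition can be illustrated by hand before bringing in DeclareDesign: simulate the experiment many times and compute the share of significant results. A base-R sketch loosely matching the design above (25 units per arm, a true effect of 0.25, standard-normal noise):

```r
set.seed(2)
significant <- replicate(1000, {
  Z <- sample(rep(0:1, each = 25))   # 25 treated, 25 control
  Y <- rnorm(50) + 0.25 * Z          # true effect of 0.25
  t.test(Y ~ Z)$p.value < 0.05       # did we reject at the 5% level?
})
mean(significant)                    # simulated power: low for this design
```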



# Select diagnosands
simple_design_diagnosands <- 
  declare_diagnosands(select = c(bias, rmse, power))


Now we ask DeclareDesign to return the properties we are interested in.


# Diagnose the design
simple_design_diagnosis <- 
  diagnose_design(simple_design, 
                  diagnosands = simple_design_diagnosands, 
                  sims = 1000)

simple_design_diagnosis
## 
## Research design diagnosis based on 1000 simulations. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
## 
##   Design Label Estimand Label Estimator Label Term N Sims   Bias   RMSE
##  simple_design           PATE       estimator    Z   1000   0.01   0.28
##                                                           (0.01) (0.01)
##   Power
##    0.14
##  (0.01)


Looks like our experiment is underpowered. Let’s rewrite the code above more compactly and try again with a larger sample size.


# Or more compactly
# Let's always assign half of our units to treatment
N = 1000
m = N/2

simple_design <- 
  declare_population(N = N, u = rnorm(N)) +
  declare_potential_outcomes(Y_Z_0 = u, Y_Z_1 = Y_Z_0 + 0.25) +
  declare_estimand(PATE = mean(Y_Z_1 - Y_Z_0)) +
  declare_sampling(n = m) +
  declare_assignment(prob = 0.5) +
  declare_reveal(outcome_variables = Y, assignment_variables = Z) +
  declare_estimator(Y ~ Z, model = difference_in_means, estimand = "PATE")

simple_design_diagnosands <- 
  declare_diagnosands(select = c(bias, rmse, power))

simple_design_diagnosis <- 
  diagnose_design(simple_design, 
                  diagnosands = simple_design_diagnosands, 
                  sims = 1000)

simple_design_diagnosis
## 
## Research design diagnosis based on 1000 simulations. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
## 
##   Design Label Estimand Label Estimator Label Term N Sims   Bias   RMSE
##  simple_design           PATE       estimator    Z   1000   0.00   0.09
##                                                           (0.00) (0.00)
##   Power
##    0.81
##  (0.01)
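As a sanity check, the closed-form calculation from stats::power.t.test gives roughly the same answer for this design (250 units per arm, an effect of 0.25, a standard deviation of 1):

```r
# Analytic power for a two-sample t-test, 250 per arm
power.t.test(n = 250, delta = 0.25, sd = 1)$power   # ~0.80
```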


Candle experiment


# Candle experiment
N = 50
m = N/2

# Histogram
hist(rnorm(N, mean = 180, sd = 10))

candle_design <- 
  declare_population(N = N, u = rnorm(N, mean = 180, sd = 10)) +
  declare_potential_outcomes(Y_Z_0 = u, 
                             Y_Z_1 = Y_Z_0 + rnorm(N, mean = 60, sd = 5)) +
  declare_estimand(PATE = mean(Y_Z_1 - Y_Z_0)) +
  declare_sampling(n = m) +
  declare_assignment(prob = 0.5) +
  declare_reveal(outcome_variables = Y, assignment_variables = Z) +
  declare_estimator(Y ~ Z, model = difference_in_means, estimand = "PATE")

candle_diagnosands <- 
  declare_diagnosands(select = c(bias, rmse, power))

candle_diagnosis <- 
  diagnose_design(candle_design, 
                  diagnosands = candle_diagnosands, 
                  sims = 1000)

candle_diagnosis
## 
## Research design diagnosis based on 1000 simulations. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
## 
##   Design Label Estimand Label Estimator Label Term N Sims   Bias   RMSE
##  candle_design           PATE       estimator    Z   1000   0.00   4.30
##                                                           (0.13) (0.10)
##   Power
##    1.00
##  (0.00)
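The candle design's power of 1.00 is unsurprising: the assumed effect (around 60) dwarfs the noise (control sd of 10, plus effect noise with sd 5), so even a dozen units per arm is plenty. The same power.t.test cross-check:

```r
# Effect of ~60 against sd of ~11 (sqrt(10^2 + 5^2)), ~12 per arm
power.t.test(n = 12, delta = 60, sd = sqrt(10^2 + 5^2))$power   # essentially 1
```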