Title: Instrumented Difference-in-Differences Decomposition
Version: 0.1.0
Description: Implements a decomposition of the two-way fixed effects instrumental variable estimator into all possible Wald difference-in-differences estimators. Provides functions to summarize the contribution of different cohort comparisons to the overall two-way fixed effects instrumental variable estimate, with or without controls. The method is described in Miyaji (2024) <doi:10.48550/arXiv.2405.16467>.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.1
URL: https://github.com/shomiyaji/twfeiv-decomp
BugReports: https://github.com/shomiyaji/twfeiv-decomp/issues
Suggests: testthat (>= 3.0.0)
Config/testthat/edition: 3
Depends: R (>= 3.5)
LazyData: true
Imports: dplyr, Formula, AER, stats, magrittr
NeedsCompilation: no
Packaged: 2025-09-04 20:07:31 UTC; shomi
Author: Sho Miyaji [aut, cre]
Maintainer: Sho Miyaji <sho.miyaji@yale.edu>
Repository: CRAN
Date/Publication: 2025-09-22 11:50:02 UTC

Description

Prints a summary of the decomposition results.

Usage

print_summary(data, return_df = FALSE)

Arguments

data

A data.frame.

return_df

Logical. If TRUE, returns the summary data.frame.

Value

Prints the summary to the console. If return_df = TRUE, also returns the summary data.frame.


Example simulation data

Description

A toy dataset included in the package to illustrate the use of the twfeiv_decomp() function. This is artificial data and does not represent real observations.

Usage

simulation_data

Format

A data frame with 60 rows and 7 variables:

id

Individual identifier (1–10)

time

Time period (2000–2005)

instrument

Binary instrumental variable

treatment

Treatment variable

outcome

Outcome variable

control1

Control variable 1

control2

Control variable 2
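The row count is consistent with a balanced panel of 10 individuals observed over 6 periods. A minimal base R sketch of such a grid follows, assuming the panel is balanced; the object names are hypothetical and the grid is constructed for illustration, not read from the packaged data.

```r
# Hypothetical balanced panel skeleton: one row per id-time pair.
# This is an illustration only, not the packaged simulation_data.
panel_grid <- expand.grid(id = 1:10, time = 2000:2005)

# 10 individuals x 6 periods
n_rows <- nrow(panel_grid)

# In a balanced panel, every id appears once in every period.
obs_per_id <- table(panel_grid$id)
is_balanced <- all(obs_per_id == length(unique(panel_grid$time)))
```

The same two checks can be applied to simulation_data to confirm whether it is balanced.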

Examples

data(simulation_data)
head(simulation_data)

DID-IV decomposition

Description

twfeiv_decomp() decomposes the two-way fixed effects instrumental variable (TWFEIV) estimator into all possible Wald difference-in-differences (Wald-DID) estimators.
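As a rough illustration of what a single Wald-DID estimate is, the sketch below assumes the usual definition: the difference-in-differences of the outcome divided by the difference-in-differences of the treatment, comparing an exposed cohort with an unexposed one. The cell means and helper names are hypothetical and are not produced by the package.

```r
# Hypothetical cell means for one exposed/unexposed cohort pair
# (illustrative numbers, not package output).
outcome_means <- c(exposed_pre = 1.0, exposed_post = 2.0,
                   unexposed_pre = 1.0, unexposed_post = 1.4)
treatment_means <- c(exposed_pre = 0.1, exposed_post = 0.5,
                     unexposed_pre = 0.1, unexposed_post = 0.2)

# Difference-in-differences of a named vector of four cell means:
# (change in the exposed cohort) - (change in the unexposed cohort).
did <- function(m) {
  unname((m["exposed_post"] - m["exposed_pre"]) -
         (m["unexposed_post"] - m["unexposed_pre"]))
}

# Wald-DID: reduced-form DID scaled by the first-stage DID.
wald_did <- did(outcome_means) / did(treatment_means)
```

With these numbers the outcome DID is 0.6 and the treatment DID is 0.3, so the ratio is about 2.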

Usage

twfeiv_decomp(formula, data, id_var, time_var, summary_output = FALSE)

Arguments

formula

A formula object of the form Y ~ D + controls | controls + Z, where:

  • Y is the outcome variable,

  • D is the treatment variable,

  • Z is a binary instrumental variable, and

  • controls are optional control variables. Do not include fixed effects (e.g., individual or time dummies) in the control variables.
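To make the roles of the two formula parts concrete, here is a minimal base R sketch that splits a formula of this form at the `|` and classifies its variables. It is only an illustration of the formula layout, not a description of how twfeiv_decomp() parses its input; the variable names come from simulation_data and the object names are hypothetical.

```r
# A two-part formula of the form Y ~ D + controls | controls + Z.
f <- outcome ~ treatment + control1 + control2 | control1 + control2 + instrument

# `~` binds more loosely than `|`, so the right-hand side is a single
# call to `|` whose two arguments are the parts before and after the bar.
rhs <- f[[3]]
stopifnot(identical(rhs[[1]], as.name("|")))

first_part  <- all.vars(rhs[[2]])  # treatment and controls
second_part <- all.vars(rhs[[3]])  # controls and instrument

# Controls appear in both parts; the treatment only before the bar,
# the instrument only after it.
outcome    <- all.vars(f[[2]])
controls   <- intersect(first_part, second_part)
treatment  <- setdiff(first_part, second_part)
instrument <- setdiff(second_part, first_part)
```

A control that is listed on only one side of the bar would be misclassified by this sketch as a treatment or an instrument, which is one reason to repeat the controls in both parts.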

data

A data frame containing all variables used in the formula, as well as the variables specified by id_var and time_var.

id_var

The name of the id variable in data.

time_var

The name of the time variable in data.

summary_output

Logical. If TRUE, prints a summary table showing, for each design type, the total weight and the weighted average of the Wald-DID estimates. If FALSE (the default), no summary is printed.

Value

If no control variables are included in the formula, the function returns a data frame named exposed_unexposed_combinations which contains the Wald-DID estimates and corresponding weights for each exposed/unexposed cohort pair.

If control variables are included, the function returns a list named decomposition_list containing:

within_IV_coefficient

Numeric. The coefficient from the within-IV regression.

between_IV_coefficient

Numeric. The coefficient from the between-IV regression.

Omega

Numeric. The weight on the within-IV coefficient in the TWFEIV estimator, such that TWFEIV = Omega * within_IV_coefficient + (1 - Omega) * between_IV_coefficient.

exposed_unexposed_combinations

A data.frame with the between-IV coefficients and corresponding weights for each exposed/unexposed cohort pair.
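The weighted-average relation among the first three list elements can be checked by hand. The sketch below uses a hypothetical list with made-up values in place of real function output; only the three element names are taken from the description above.

```r
# Hypothetical stand-in for the first three elements of the returned list
# (illustrative values, not results from twfeiv_decomp()).
decomposition_list <- list(
  within_IV_coefficient  = 0.8,
  between_IV_coefficient = 1.2,
  Omega                  = 0.25
)

# TWFEIV = Omega * within + (1 - Omega) * between
twfeiv <- with(
  decomposition_list,
  Omega * within_IV_coefficient + (1 - Omega) * between_IV_coefficient
)
```

With real output, the value computed this way should match the TWFEIV coefficient from a direct two-way fixed effects IV regression, up to numerical precision.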

Examples

# Load example dataset
data(simulation_data)
head(simulation_data)

# Example without controls
decomposition_result_without_controls <- twfeiv_decomp(
  outcome ~ treatment | instrument,
  data = simulation_data,
  id_var = "id",
  time_var = "time"
)

# Example with controls
decomposition_result_with_controls <- twfeiv_decomp(
  outcome ~ treatment + control1 + control2 | control1 + control2 + instrument,
  data = simulation_data,
  id_var = "id",
  time_var = "time"
)