Treatment assignment is not randomized in observational studies of recurrent events, which can lead to confounding bias when estimating causal effects. The mean cumulative count (MCC) can be estimated after applying propensity score methods to address confounding, following the approach described by Gaber, et al.1
This vignette demonstrates how to:
The standard (unweighted) Dong-Yasui estimator provides unbiased estimates of the MCC in randomized trials or when there is no confounding bias. However, in observational studies, we need to account for measured confounders that affect both treatment assignment and the recurrent outcome.
Key principle: Use weights when you want to estimate the causal effect of treatment on recurrent event burden, not just describe the observed association.
We’ll use the survival::bladder1 dataset and create a
binary treatment variable for demonstration:
library(mccount)
library(dplyr)
library(WeightIt)
library(MatchIt)
library(patchwork)
# Create example data with binary treatment
bladder_nested <- survival::bladder1 |>
mutate(status = if_else(status > 2, 2, status)) |>
filter(treatment %in% c("placebo", "thiotepa")) |>
tidyr::nest(.by = c(id, treatment, number, size)) |>
mutate(treatment_binary = if_else(treatment == "thiotepa", 1, 0))# Estimate MCC with IPTW weights
mcc_weighted <- mcc(
data = bladder_example,
id_var = "id",
time_var = "stop",
cause_var = "status",
by = "treatment",
weights = "iptw_weights",
method = "equation"
)
#> Warning: Found 7 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> First 5 IDs: 13, 15, 16, 19, 24
#> Total affected: 7 participants
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
#> Warning: Found 2 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> ! ID: 83, 104
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
# Display results
summary(mcc_weighted)
#>
#> ── Summary of Mean Cumulative Count Results ────────────────────────────────────
#> ℹ Method: Dong-Yasui Equation Method
#> ℹ Weighted estimation: Yes
#> ℹ Total participants: 86
#> ℹ Overall observation period: [0, 64]
#>
#> ── Summary by Group (treatment) ──
#>
#> ── Group: placebo
#> Participants in group: 48
#> Group observation period: [0, 64]
#> Time to MCC = 1.0: 17
#> Time to maximum MCC: 53
#> MCC at end of follow-up: 2.6759
#> Events of interest: 87
#> Competing risk events: 11
#> Censoring events: 30
#>
#> ── Group: thiotepa
#> Participants in group: 38
#> Group observation period: [0, 59]
#> Time to MCC = 1.0: 29
#> Time to maximum MCC: 47
#> MCC at end of follow-up: 1.4006
#> Events of interest: 45
#> Competing risk events: 11
#> Censoring events: 25# Estimate unweighted MCC for comparison
mcc_unweighted <- mcc(
data = bladder_example,
id_var = "id",
time_var = "stop",
cause_var = "status",
by = "treatment",
method = "equation"
)
#> Warning: Found 7 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> First 5 IDs: 13, 15, 16, 19, 24
#> Total affected: 7 participants
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
#> Warning: Found 2 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> ! ID: 83, 104
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
# Extract final MCC values for comparison
weighted_final <- mcc_final_values(mcc_weighted)
unweighted_final <- mcc_final_values(mcc_unweighted)
# Create comparison table
comparison_table <- data.frame(
Method = c("Unweighted", "IPTW Weighted"),
Control_MCC = c(
cards::round5(unweighted_final[1], digits = 2),
cards::round5(weighted_final[1], digits = 2)
),
Treated_MCC = c(
cards::round5(unweighted_final[2], digits = 2),
cards::round5(weighted_final[2], digits = 2)
)
)
knitr::kable(comparison_table)| Method | Control_MCC | Treated_MCC |
|---|---|---|
| Unweighted | 2.61 | 1.55 |
| IPTW Weighted | 2.68 | 1.40 |
The weighted estimates represent the causal effect of treatment on recurrent event burden, adjusted for measured confounding.
For propensity score matching:
# Perform 1:1 nearest neighbor matching
match_nn <- matchit(
treatment_binary ~ size + number,
data = bladder_nested
)If you run match_nn in the console, you’ll see that only
76 out of the original 86 patients were matched using the nearest
neighbor approach, which can change the estimand we can estimate with
nearest neighbor matching from the average treatment effect among the
treated (ATT) to the average treatment among the remaining matched
sample (ATM). See {MatchIt} for more details regarding
matching methods and causal estimands.
# Extract matched data (no additional weights needed)
matched_nn_data <- match.data(match_nn) |>
tidyr::unnest(data)
# Estimate MCC on matched data without additional weights
mcc_nn_matched <- mcc(
data = matched_nn_data,
id_var = "id",
time_var = "stop",
cause_var = "status",
by = "treatment_binary",
method = "equation"
# No weights argument needed for simple 1:1 matching (all weights are 1)
)
#> ℹ Converting numeric grouping variable "by_var" to <factor>
#> ℹ Found 2 unique groups: 0 and 1
#> Warning: Found 5 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> ! ID: 13, 16, 19, 34, 44
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
#> Warning: Found 2 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> ! ID: 83, 104
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
summary(mcc_nn_matched)
#>
#> ── Summary of Mean Cumulative Count Results ────────────────────────────────────
#> ℹ Method: Dong-Yasui Equation Method
#> ℹ Total participants: 76
#> ℹ Overall observation period: [0, 64]
#>
#> ── Summary by Group (treatment_binary) ──
#>
#> ── Group: 0
#> Participants in group: 38
#> Group observation period: [0, 64]
#> Time to MCC = 1.0: 22
#> Time to maximum MCC: 53
#> MCC at end of follow-up: 2.5943
#> Events of interest: 63
#> Competing risk events: 7
#> Censoring events: 26
#>
#>
#> ── Group: 1
#> Participants in group: 38
#> Group observation period: [0, 59]
#> Time to MCC = 1.0: 27
#> Time to maximum MCC: 47
#> MCC at end of follow-up: 1.5463
#> Events of interest: 45
#> Competing risk events: 11
#> Censoring events: 25# Perform full matching (creates matching weights)
match_obj <- matchit(
treatment_binary ~ size + number,
data = bladder_nested,
method = "full", # Full matching creates weights
estimand = "ATE"
)
# Check matching balance
summary(match_obj)
#>
#> Call:
#> matchit(formula = treatment_binary ~ size + number, data = bladder_nested,
#> method = "full", estimand = "ATE")
#>
#> Summary of Balance for All Data:
#> Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance 0.4495 0.4358 0.2312 1.6411 0.0477
#> size 1.9211 2.0625 -0.0986 0.7794 0.0230
#> number 2.3158 1.9167 0.2232 2.0164 0.0587
#> eCDF Max
#> distance 0.1217
#> size 0.0636
#> number 0.1217
#>
#> Summary of Balance for Matched Data:
#> Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance 0.4422 0.4414 0.0141 1.0362 0.0094
#> size 1.9884 1.9767 0.0081 0.8714 0.0150
#> number 2.1008 2.0698 0.0173 1.1531 0.0177
#> eCDF Max Std. Pair Dist.
#> distance 0.0465 0.046
#> size 0.0310 0.268
#> number 0.0465 0.086
#>
#> Sample Sizes:
#> Control Treated
#> All 48. 38.
#> Matched (ESS) 44.04 31.83
#> Matched 48. 38.
#> Unmatched 0. 0.
#> Discarded 0. 0.
# Extract matched data with weights
matched_data <- match_data(match_obj) |>
tidyr::unnest(data)
# The 'weights' column contains the matching weights
head(matched_data[c("id", "treatment", "weights")])
#> # A tibble: 6 × 3
#> id treatment weights
#> <int> <fct> <dbl>
#> 1 1 placebo 1.67
#> 2 2 placebo 1.12
#> 3 3 placebo 1.12
#> 4 4 placebo 1.12
#> 5 5 placebo 2.23
#> 6 6 placebo 1.12# Estimate MCC using matching weights
mcc_matched <- mcc(
data = matched_data,
id_var = "id",
time_var = "stop",
cause_var = "status",
by = "treatment",
weights = "weights", # Use matching weights from MatchIt
method = "equation"
)
#> Warning: Found 7 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> First 5 IDs: 13, 15, 16, 19, 24
#> Total affected: 7 participants
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
#> Warning: Found 2 participants where last observation is an event of interest
#> (`cause_var` = 1)
#> ! ID: 83, 104
#> ℹ `mcc()` assumes these participants are censored at their final `time_var`
#> ℹ If participants were actually censored or experienced competing risks after
#> their last event, add those observations to ensure correct estimates
summary(mcc_matched)
#>
#> ── Summary of Mean Cumulative Count Results ────────────────────────────────────
#> ℹ Method: Dong-Yasui Equation Method
#> ℹ Weighted estimation: Yes
#> ℹ Total participants: 86
#> ℹ Overall observation period: [0, 64]
#>
#> ── Summary by Group (treatment) ──
#>
#> ── Group: placebo
#> Participants in group: 48
#> Group observation period: [0, 64]
#> Time to MCC = 1.0: 17
#> Time to maximum MCC: 53
#> MCC at end of follow-up: 2.5748
#> Events of interest: 87
#> Competing risk events: 11
#> Censoring events: 30
#>
#> ── Group: thiotepa
#> Participants in group: 38
#> Group observation period: [0, 59]
#> Time to MCC = 1.0: 27
#> Time to maximum MCC: 47
#> MCC at end of follow-up: 1.6084
#> Events of interest: 45
#> Competing risk events: 11
#> Censoring events: 25p_unwt <- plot(mcc_unweighted) +
geom_line_mcc(mcc_unweighted) +
labs(subtitle = element_blank(), color = "Treatment") +
scale_y_continuous(limits = c(0, 2.75)) +
ggtitle("Unweighted")
p_wt <- plot(mcc_weighted) +
geom_line_mcc(mcc_weighted) +
ggtitle("IPTW") +
labs(subtitle = "Estimand: ATE", color = "Treatment") +
scale_y_continuous(limits = c(0, 2.75)) +
theme(axis.title.y = element_blank())
p_mwt <- plot(mcc_matched) +
geom_line_mcc(mcc_matched) +
ggtitle("Full Matching") +
labs(subtitle = "Estimand: ATE", color = "Treatment") +
scale_y_continuous(limits = c(0, 2.75)) +
theme(axis.title.y = element_blank())
combined <- p_unwt | p_wt | p_mwt
combined +
plot_layout(guides = "collect") &
theme(legend.position = "bottom")Weighted MCC estimation assumes the standard causal inference assumptions:
The Gaber, et al. paper2 uses stabilized weights, which have the form:
\[ W_i = P(A = a) / P(A = a | L_i) \]
Stabilized weights typically have better finite sample properties than unstabilized weights (\(1 / P(A = a | L_i)\)) because they tend to be less extreme and more stable.
This vignette covers how to use weights from IPTW or matching to get an adjusted MCC point estimate. To get associated confidence intervals, you will need to perform bootstrapping (which isn’t covered in this vignette).
Gaber CE, Edwards JK, Lund JL, Peery AF, Richardson DB, Kinlaw AC. Inverse Probability Weighting to Estimate Exposure Effects on the Burden of Recurrent Outcomes in the Presence of Competing Events. Am J Epidemiol. 2023;192(5):830-839. doi: 10.1093/aje/kwad031↩︎
Gaber CE, Edwards JK, Lund JL, Peery AF, Richardson DB, Kinlaw AC. Inverse Probability Weighting to Estimate Exposure Effects on the Burden of Recurrent Outcomes in the Presence of Competing Events. Am J Epidemiol. 2023;192(5):830-839. doi: 10.1093/aje/kwad031↩︎