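To illustrate the use of the functions defined above, we
simulate data following the setting described in Müller and Van Keilegom (2019). In this example,
the probability of being uncured follows a logistic model, \(\phi(x) = \exp(\theta_1 + \theta_2 x) / \{1 + \exp(\theta_1 + \theta_2 x)\}\) with \(\theta = (1, 1)\), so that the cure rate is \(1 - \phi(x)\). The data are generated as follows: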
set.seed(1234)
theta0 = c(1,1)
gamma0 = 0.5
gamma1 = 0.5
cT = 0
n <- 200
maxT = 0.02*n
X = runif(n,-1,1)
Z = runif(n,-1,1)
phi = exp(theta0[1]+theta0[2]*X)
phi = phi/(1+phi)
B = (runif(n) <= phi)
aT = (X+1)^cT
lambdaX = exp(gamma0+gamma1*X)
bT = lambdaX^(-1/aT)
tau = qweibull(0.9,shape=mean(aT),scale=mean(bT))
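# Cured subjects (B == 0) keep a very large event time
Y = rep(100000,n)
count = 0
for (j in 1:n) {
  if (B[j]==1) {
    stop = 0
    # Redraw until the event time does not exceed tau
    while (stop==0) {
      Y[j] = rweibull(1,shape=aT[j],scale=bT[j])
      # While the counter allows it, set times beyond tau to tau
      if ((Y[j] > tau) && (count <= maxT)) {
        Y[j] = tau
        count = count + 1
      }
      if (Y[j] <= tau) stop = 1
    }
  }
}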
aC = 1
bC = 1.5
C = rweibull(n,shape=aC,scale=bC)
C = replace(C,C>tau,tau+0.001)
T = apply(cbind(Y,C),1,min)
Delta = as.numeric(Y <= C)
#Covariate hypothesis test for the cure rate with one covariate
testcov(X, T, Delta, method = "All", P = 499)
## ============== Covariate hypothesis test for the cure rate ============
## Method: All
##
## MDCU: Martingale difference correlation unbiased with permutations
## Test statistic: 0.03986
## p-value: 0.002
## Number of permutations: 499
##
## MDCV: Martingale difference correlation biased with permutations
## Test statistic: 0.04746
## p-value: 0.008
## Number of permutations: 499
##
## FMDCU: Fast Chi-square test based on MDC unbiased
## Test statistic: 0.03986
## p-value: 0.0027426
##
## GOFT
## Test statistic: 0.769
## p-value: 0.03006
## Number of bootstrap replicates: 499
##
## ========================================================================
#Covariate hypothesis test for the cure rate with two covariates
testcov2(X, T, Z, Delta, P = 499)
## ---------------------------------------------------------------------
## Covariate hypothesis test for the cure rate with two covariates
## ---------------------------------------------------------------------
## Hypotheses:
## H0: The conditional mean of the cure status given X adjusting on Z does not depend on X,
## i.e., E[nu|X,Z] = E[nu|Z].
## H1: The conditional mean of the cure status depends on X adjusting on Z
##
## Test Statistic: 0.02869
## p-value: 0.004
## Number of permutations: 499
testcov2(Z, T, X, Delta, P = 499)
## ---------------------------------------------------------------------
## Covariate hypothesis test for the cure rate with two covariates
## ---------------------------------------------------------------------
## Hypotheses:
## H0: The conditional mean of the cure status given Z adjusting on X does not depend on Z,
## i.e., E[nu|Z,X] = E[nu|X].
## H1: The conditional mean of the cure status depends on Z adjusting on X
##
## Test Statistic: -0.002226
## p-value: 0.744
## Number of permutations: 499
#Goodness-of-fit test for the cure rate
goft(X, T, Delta, model = "logit")
## ---------------------------------------------------------------------
## Goodness-of-fit test for the cure rate in a mixture cure model
## -------------------------------------------------------------------
##
## Model: logit
##
## Hypotheses:
## H0: The model fits the data, i.e., p(x) = p_theta(x)
## H1: The model does not fit the data, i.e., p(x) != p_theta(x)
##
## Test Statistic: 0.1192
## p-value: 0.4168
## Number of bootstrap replications: 499
# plotCure(X, T, Delta, density = FALSE)
First, we test whether the cure fraction depends on
the covariate \(X\), applying all available methods (method = "All").
By default, the function uses method = "FMDCU", which corresponds to the fast chi-square
test based on the martingale difference correlation. In all cases, the
resulting p-values are below 0.05, which provides evidence against the
null hypothesis of no covariate effect on the cure rate. This agrees with
the data-generating mechanism, in which the cure probability depends on
\(X\) through a logistic function.
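The permutation p-values reported above can be read through a simplified sketch. The code below is not the package's implementation: it assumes a fully observed cure indicator (here the hypothetical variable nu), whereas the tests in this section must handle censoring. It uses the sample martingale difference divergence of Shao and Zhang (2014) as test statistic, and the helper names mdd2 and perm_pvalue are made up for this illustration. The p-value formula (1 + count) / (P + 1) is also an assumption; under it, the smallest attainable p-value with P = 499 is 1/500 = 0.002.

```r
# Sample martingale difference divergence (V-statistic form):
# MDD^2(y | x) = -(1/n^2) * sum_{k,l} (y_k - ybar) (y_l - ybar) |x_k - x_l|
mdd2 <- function(x, y) {
  yc <- y - mean(y)              # centred response
  D <- abs(outer(x, x, "-"))     # pairwise distances |x_k - x_l|
  -mean(outer(yc, yc) * D)
}

# Permutation p-value: permuting y breaks any dependence of y on x
perm_pvalue <- function(x, y, P = 499) {
  obs <- mdd2(x, y)
  perm <- replicate(P, mdd2(x, sample(y)))
  (1 + sum(perm >= obs)) / (P + 1)
}

set.seed(1)
x <- runif(200, -1, 1)
# Hypothetical, fully observed cure indicator with P(cured | x) = 1 - plogis(1 + x)
nu <- rbinom(200, 1, 1 - plogis(1 + x))
perm_pvalue(x, nu, P = 499)
```

Because the p-value is a ratio of counts, it can only take values on a grid of width 1/(P + 1), which is why increasing P gives a finer resolution at the cost of more computation.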
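Second, we test whether the cure fraction depends on one covariate
after adjusting for a second one. The order of the covariates is crucial in this test:
testcov2(X, T, Z, Delta) is not equivalent to testcov2(Z, T, X, Delta).
The first call tests the effect of \(X\) adjusting for \(Z\) and rejects
the null hypothesis (p-value 0.004), whereas the second tests the effect
of \(Z\) adjusting for \(X\) and does not reject it (p-value 0.744), as
expected, since the cure probability was simulated to depend on \(X\) only.
This example illustrates the asymmetry of the test and the importance
of specifying the covariates in the correct order.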
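As a sanity check on why the two orderings should disagree, the sketch below regenerates the first draws of the simulation and fits an ordinary logistic regression of the uncured indicator B on both covariates. This is possible only in a simulation, where B is known; in real data the cure status is latent, which is why the dedicated tests are needed. The use of glm() here is an illustrative assumption and is not part of the testing procedure.

```r
set.seed(1234)
n <- 200
X <- runif(n, -1, 1)
Z <- runif(n, -1, 1)              # drawn independently of X and never used in phi
phi <- plogis(1 + X)              # probability of being uncured, theta0 = c(1, 1)
B <- as.numeric(runif(n) <= phi)  # latent uncured indicator

# Logistic regression of the (normally unobserved) indicator on both covariates
fit <- glm(B ~ X + Z, family = binomial)
round(summary(fit)$coefficients, 3)
```

Since Z plays no role in generating B, any apparent effect of Z in such a fit reflects sampling noise only, while X enters the model directly.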
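Finally, we test whether a logistic model fits the cure rate. The
p-value of 0.4168 gives no evidence against the logistic model, in line
with the data-generating mechanism. Note that: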
Amico, Mailis, and Ingrid Van Keilegom. 2018. “Cure Models in
Survival Analysis.” Annual Review of Statistics and Its
Application 5 (1): 311–42.
Amico, M, I Van Keilegom, and B Han. 2021. “Assessing Cure Status
Prediction from Survival Data Using Receiver Operating Characteristic
Curves.” Biometrika 108 (3): 727–40.
Gretton, Arthur, Kenji Fukumizu, Choon Teo, Le Song, Bernhard Schölkopf,
and Alex Smola. 2008. “A Kernel Statistical Test of
Independence.” Advances in Neural Information Processing
Systems 20: 585–92.
Gretton, Arthur, Ralf Herbrich, Alexander Smola, Olivier Bousquet,
Bernhard Schölkopf, and Aapo Hyvärinen. 2005. “Kernel Methods for
Measuring Independence.” Journal of Machine Learning
Research 6 (12): 2075–2129.
Heller, Ruth, Yair Heller, and Malka Gorfine. 2013. “A Consistent
Multivariate Test of Association Based on Ranks of Distances.”
Biometrika 100 (2): 503–10.
Laska, Eugene M, and Morris J Meisner. 1992. “Nonparametric
Estimation and Testing in a Cure Model.” Biometrics 48
(4): 1223–34.
López-Cheda, Ana, Ricardo Cao, M Amalia Jácome, and Ingrid Van Keilegom.
2017. “Nonparametric Incidence Estimation and Bootstrap Bandwidth
Selection in Mixture Cure Models.” Computational Statistics
& Data Analysis 105: 144–65.
López-Cheda, Ana, M Amalia Jácome, and Ricardo Cao. 2017.
“Nonparametric Latency Estimation for Mixture Cure Models.”
Test 26 (2): 353–76.
López-Cheda, Ana, M Amalia Jácome, and Ignacio López-de-Ullibarri. 2021.
“Npcure: An R Package for Nonparametric Inference in
Mixture Cure Models.” R Journal 13 (1): 21–41.
Maller, Ross, Sidney Resnick, Soudabeh Shemehsavar, and Muzhi Zhao.
2024. “Mixture Cure Model Methodology in Survival Analysis: Some
Recent Results for the One-Sample Case.” Statistics
Surveys 18: 82–138.
Monroy-Castillo, Blanca E., M. A. Jácome, Ricardo Cao, and Ingrid Van
Keilegom. 2025. “Covariate Hypothesis Tests for the Cure Rate in
Mixture Cure Models Based on Martingale Difference Correlation.”
Submitted.
Morbiducci, Marta, Alessandra Nardi, and Carla Rossi. 2003.
“Classification of ‘Cured’ Individuals in Survival
Analysis: The Mixture Approach to the Diagnostic–Prognostic
Problem.” Computational Statistics & Data Analysis
41 (3-4): 515–29.
Müller, Ursula U, and Ingrid Van Keilegom. 2019. “Goodness-of-Fit
Tests for the Cure Rate in a Mixture Cure Model.”
Biometrika 106 (1): 211–27.
Park, Trevor, Xiaofeng Shao, and Shun Yao. 2015. “Partial
Martingale Difference Correlation.” Electronic Journal of
Statistics 9 (1): 1492–1517.
Patilea, Valentin, and Ingrid Van Keilegom. 2020. “A General
Approach for Cure Models in Survival Analysis.” The Annals of
Statistics 48 (4): 2323–46.
Peng, Yingwei, and Binbing Yu. 2021. Cure Models: Methods,
Applications, and Implementation. Chapman and Hall/CRC.
Pfister, Niklas, Peter Bühlmann, Bernhard Schölkopf, and Jonas Peters.
2018. “Kernel-Based Tests for Joint Independence.”
Journal of the Royal Statistical Society Series B: Statistical
Methodology 80 (1): 5–31.
Shao, Xiaofeng, and Jingsi Zhang. 2014. “Martingale Difference
Correlation and Its Use in High-Dimensional Variable Screening.”
Journal of the American Statistical Association 109 (507):
1302–18. https://doi.org/10.1080/01621459.2014.887012.
Shen, Cencheng, Sambit Panda, and Joshua T Vogelstein. 2022. “The
Chi-Square Test of Distance Correlation.” Journal of
Computational and Graphical Statistics 31 (1): 254–62.
Szekely, Gabor J., Maria L. Rizzo, and Nail K. Bakirov. 2007.
“Measuring and Testing Dependence by Correlation of
Distances.” The Annals of Statistics 35 (6): 2769–94.