| Type: | Package |
| Title: | Longitudinal Surrogate Marker Analysis |
| Version: | 1.1 |
| Description: | Assess the proportion of treatment effect explained by a longitudinal surrogate marker as described in Agniel D and Parast L (2021) <doi:10.1111/biom.13310>; and estimate the treatment effect on a longitudinal surrogate marker as described in Wang et al. (2025) <doi:10.1093/biomtc/ujaf104>. A tutorial for this package can be found at https://www.laylaparast.com/longsurr. |
| License: | GPL-2 | GPL-3 [expanded from: GPL] |
| Imports: | stringr, splines, mgcv, Rsurrogate, dplyr, here, tidyr, fs, KernSmooth, stats, fdapace, grf, lme4, mvnfast, plyr, tibble, magrittr, glue, purrr, readr, refund, fda, fda.usc, survival, MASS |
| NeedsCompilation: | no |
| Packaged: | 2025-11-09 11:33:25 UTC; parastlm |
| Author: | Layla Parast [aut, cre], Denis Agniel [aut], Xuan Wang [aut] |
| Maintainer: | Layla Parast <parast@austin.utexas.edu> |
| Depends: | R (≥ 3.5.0) |
| Repository: | CRAN |
| Date/Publication: | 2025-11-09 12:40:02 UTC |
Example data for semiparametric joint estimation functions
Description
Simulated example data for semiparametric joint estimation functions
Usage
data("data_sjm")
Format
A list with 200 observations on the following:
delta |
numeric vector containing the event indicator for each observation |
obsT |
numeric matrix containing the time that the surrogate marker was measured for each observation; number of rows is equal to the number of observations (200) and number of columns is equal to the maximum number of surrogate markers measured (15) |
Y |
numeric matrix containing the surrogate marker measurements over time for each observation; same dimension as obsT |
Time |
numeric vector containing the observed event or censoring time for each observation |
Treatment |
numeric vector containing the treatment indicator for each observation with 1 for treated and 0 for control |
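The layout described above can be sketched in base R. The following is a minimal illustration, not the code used to simulate data_sjm: the names visit_times, visit_values, pad_na, and toy_sjm are hypothetical, and padding unmeasured visits with NA is an assumption borrowed from the obsT argument description of the estimation functions.

```r
# Hypothetical ragged input: each subject has a different number of visits.
visit_times  <- list(c(0, 0.5, 1.0), c(0, 0.4), c(0, 0.3, 0.6, 0.9))
visit_values <- list(c(2.1, 2.4, 2.9), c(1.8, 1.7), c(2.0, 2.2, 2.1, 2.6))

max_visits <- max(lengths(visit_times))

# Pad a numeric vector with NA up to a fixed length (assumed convention).
pad_na <- function(v, len) c(v, rep(NA_real_, len - length(v)))

# One row per subject, one column per visit slot.
obsT <- t(vapply(visit_times,  pad_na, numeric(max_visits), len = max_visits))
Y    <- t(vapply(visit_values, pad_na, numeric(max_visits), len = max_visits))

toy_sjm <- list(
  delta     = c(1, 0, 1),        # event indicator
  obsT      = obsT,              # measurement times, NA where not measured
  Y         = Y,                 # marker values, same dimension as obsT
  Time      = c(1.2, 0.5, 1.0),  # observed event or censoring time
  Treatment = c(1, 0, 1)         # 1 = treated, 0 = control
)
str(toy_sjm)
```

Here the number of columns is set by the subject with the most visits, which mirrors how the 15 columns of obsT and Y in data_sjm are described.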
Examples
data(data_sjm)
names(data_sjm)
Estimate the surrogate value of a longitudinal marker
Description
Estimate the surrogate value of a longitudinal marker
Usage
estimate_surrogate_value(y_t, y_c, X_t, X_c, method = c("gam", "linear",
"kernel"), k = 3, var = FALSE, bootstrap_samples = 50, alpha = 0.05)
Arguments
y_t |
vector of n1 outcome measurements for treatment group |
y_c |
vector of n0 outcome measurements for control or reference group |
X_t |
n1 x T matrix of longitudinal surrogate measurements for treatment group, where T is the number of time points |
X_c |
n0 x T matrix of longitudinal surrogate measurements for control or reference group, where T is the number of time points |
method |
method for dimension-reduction of longitudinal surrogate, either 'gam', 'linear', or 'kernel' |
k |
number of eigenfunctions to use in semimetric |
var |
logical, if TRUE then standard error estimates and confidence intervals are provided |
bootstrap_samples |
number of bootstrap samples to use for standard error estimation, used if var = TRUE, default is 50 |
alpha |
alpha level, default is 0.05 |
Value
a tibble containing estimates of the treatment effect (Deltahat), the residual treatment effect (Deltahat_S), and the proportion of treatment effect explained (R); if var = TRUE, then standard errors of Deltahat_S and R are also provided (Deltahat_S_se and R_se), and quantile-based 95% confidence intervals for Deltahat_S and R are provided (Deltahat_S_ci_l [lower], Deltahat_S_ci_h [upper], R_ci_l [lower], R_ci_u [upper])
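To make the returned quantities concrete, the sketch below computes the proportion of treatment effect explained as R = 1 - Deltahat_S / Deltahat and forms a quantile-based interval from bootstrap replicates. All numbers are hypothetical, the replicates are simulated rather than produced by the package, and taking the standard error as the standard deviation of the replicates is an assumption about how such estimates are typically formed.

```r
set.seed(1)

# Hypothetical point estimates (illustrative numbers only).
Deltahat   <- 2.0                 # overall treatment effect
Deltahat_S <- 0.5                 # residual treatment effect
R <- 1 - Deltahat_S / Deltahat    # proportion of treatment effect explained

# Hypothetical bootstrap replicates of R, simulated here for illustration.
boot_R <- rnorm(50, mean = R, sd = 0.05)

# Standard error and quantile-based interval at level 1 - alpha.
alpha <- 0.05
R_se  <- sd(boot_R)
R_ci  <- quantile(boot_R, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

R      # 0.75 for these inputs
R_se
R_ci
```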
References
Agniel D and Parast L (2021). Evaluation of Longitudinal Surrogate Markers. Biometrics, 77(2): 477-489.
Examples
library(dplyr)
data(full_data)
wide_ds <- full_data %>%
dplyr::select(id, a, tt, x, y) %>%
tidyr::spread(tt, x)
wide_ds_0 <- wide_ds %>% filter(a == 0)
wide_ds_1 <- wide_ds %>% filter(a == 1)
X_t <- wide_ds_1 %>% dplyr::select(`-1`:`1`) %>% as.matrix
y_t <- wide_ds_1 %>% pull(y)
X_c <- wide_ds_0 %>% dplyr::select(`-1`:`1`) %>% as.matrix
y_c <- wide_ds_0 %>% pull(y)
estimate_surrogate_value(y_t = y_t, y_c = y_c, X_t = X_t, X_c = X_c,
method = 'gam', var = FALSE)
estimate_surrogate_value(y_t = y_t, y_c = y_c, X_t = X_t, X_c = X_c,
method = 'linear', var = TRUE, bootstrap_samples = 50)
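The reshaping step in the example above turns long-format data into the n x T matrices expected by X_t and X_c. A base R sketch of the same idea on a small hypothetical data set follows; long_ds, X_wide, and arm are illustrative names, and tapply is used here in place of the tidyr call in the example.

```r
# Hypothetical long-format data: 3 subjects, each measured at times -1, 0, 1.
long_ds <- data.frame(
  id = rep(1:3, each = 3),
  a  = rep(c(0, 1, 1), each = 3),
  tt = rep(c(-1, 0, 1), times = 3),
  x  = c(0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 0.7, 0.6, 1.0)
)

# Rows are subjects, columns are time points, cells are surrogate values.
X_wide <- with(long_ds, tapply(x, list(id, tt), mean))

# Treatment arm of each subject, in the same row order as X_wide.
arm <- as.vector(with(long_ds, tapply(a, id, unique)))

# Split into the treatment and control matrices.
X_t <- X_wide[arm == 1, , drop = FALSE]
X_c <- X_wide[arm == 0, , drop = FALSE]

dim(X_t)
dim(X_c)
```

Each subject contributes exactly one row, so the outcome vectors y_t and y_c need to be ordered by the same subject IDs as the rows of X_t and X_c.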
Example data to illustrate functions
Description
Simulated nonsmooth data to illustrate functions
Usage
data("full_data")
Format
A data frame with 10100 observations on the following 5 variables.
id |
a unique person ID |
a |
treatment group, 0 or 1 |
tt |
time |
x |
surrogate marker value |
y |
primary outcome |
Pre-smooth sparse longitudinal data
Description
Pre-smooth sparse longitudinal data
Usage
presmooth_data(obs_data, ...)
Arguments
obs_data |
data.frame or tibble containing the observed data, with columns |
... |
additional arguments passed on to |
Value
list containing matrices X_t and X_c, which are the smoothed surrogate values for the treated and control groups, respectively, for use in downstream analyses
Examples
library(dplyr)
data(full_data)
obs_ds <- group_by(full_data, id)
obs_data <- sample_n(obs_ds, 5)
obs_data <- ungroup(obs_data)
head(obs_data)
presmooth_X <- presmooth_data(obs_data)
Resampling for Semiparametric Joint Linear Model
Description
Resamples data for variance estimation for the semiparametric joint linear model estimator using weights
Usage
resam(v, X, Time, Delta, obsT, Y)
Arguments
v |
resampling or perturbation weight, must be the same length as X |
X |
numeric vector containing the treatment indicator for each observation with 1 for treated and 0 for control |
Time |
numeric vector containing the observed event or censoring time for each observation |
Delta |
numeric vector containing the event indicator for each observation |
obsT |
numeric matrix containing the time that the surrogate marker was measured for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, the corresponding entry should be 0 or NA. |
Y |
numeric matrix containing the surrogate marker measurements over time for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, as determined by the obsT entry, the Y at that time will be ignored. |
Value
Returns a numeric vector of resampled estimates.
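The general idea of variance estimation by perturbation weights can be sketched without the package. In the illustration below, weighted_diff is a hypothetical stand-in for the real estimator, and drawing the weights from a standard exponential distribution is an assumption; this page does not state which weight distribution the package uses.

```r
set.seed(42)

# Hypothetical data: binary treatment indicator and a numeric outcome.
n <- 200
X <- rbinom(n, size = 1, prob = 0.5)
outcome <- 1 + 0.5 * X + rnorm(n)

# Weighted difference in means; a stand-in for the real estimator.
weighted_diff <- function(v, X, outcome) {
  weighted.mean(outcome[X == 1], v[X == 1]) -
    weighted.mean(outcome[X == 0], v[X == 0])
}

# Re-estimate under fresh weights; one weight per observation,
# so length(v) equals length(X).
n_resample <- 100
perturbed <- replicate(n_resample, {
  v <- rexp(n, rate = 1)  # assumed weight distribution (mean 1, variance 1)
  weighted_diff(v, X, outcome)
})

# The spread of the perturbed estimates approximates the standard error.
se_hat <- sd(perturbed)
se_hat
```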
Resampling for Semiparametric Joint Nonlinear Model
Description
Resamples data for variance estimation for the semiparametric joint nonlinear model estimator using weights
Usage
resam_nonlinear(v, X, Time, Delta, obsT, Y, gap_time)
Arguments
v |
resampling or perturbation weight, must be the same length as X |
X |
numeric vector containing the treatment indicator for each observation with 1 for treated and 0 for control |
Time |
numeric vector containing the observed event or censoring time for each observation |
Delta |
numeric vector containing the event indicator for each observation |
obsT |
numeric matrix containing the time that the surrogate marker was measured for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, the corresponding entry should be 0 or NA. |
Y |
numeric matrix containing the surrogate marker measurements over time for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, as determined by the obsT entry, the Y at that time will be ignored. |
gap_time |
number indicating gap time for slope estimation |
Value
Returns a numeric vector of resampled estimates.
Semiparametric Joint Modeling of the Treatment Effect on a Longitudinal Surrogate with a Linear Model
Description
Semiparametric joint modeling of the treatment effect on a longitudinal surrogate using both a Cox proportional hazards model and linear model
Usage
sjm_linear_estimate(X, Time, Delta, obsT, Y, n.resample=100, var = FALSE)
Arguments
X |
numeric vector containing the treatment indicator for each observation with 1 for treated and 0 for control |
Time |
numeric vector containing the observed event or censoring time for each observation |
Delta |
numeric vector containing the event indicator for each observation |
obsT |
numeric matrix containing the time that the surrogate marker was measured for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, the corresponding entry should be 0 or NA. |
Y |
numeric matrix containing the surrogate marker measurements over time for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, as determined by the obsT entry, the Y at that time will be ignored. |
n.resample |
number of resampled estimates used for variance estimation; default is 100. |
var |
logical indicating whether the user would like variance estimates and confidence intervals; default is FALSE. |
Value
A list of estimates is returned:
est |
vector of point estimates where the first entry is the hazard ratio from the Cox model, the second entry is the estimated treatment effect on the surrogate marker at baseline, and the third entry is the estimated treatment effect on the slope of the surrogate marker, i.e., the surrogate marker trajectory |
SE |
if var is TRUE, a vector of standard error estimates corresponding to the returned point estimates |
CI_lower |
if var is TRUE, a vector of estimates for the lower bound of the 95% confidence interval for the quantities corresponding to the returned point estimates |
CI_upper |
if var is TRUE, a vector of estimates for the upper bound of the 95% confidence interval for the quantities corresponding to the returned point estimates |
Author(s)
Xuan Wang
References
Wang X, Zhou J, Parast L, Greene T (2025). Semiparametric Joint Modeling to Estimate the Treatment Effect on a Longitudinal Surrogate with Application to Chronic Kidney Disease Trials. Biometrics, 81(3): ujaf104.
Examples
data(data_sjm)
sjm_linear_estimate(X=data_sjm$Treatment, Time = data_sjm$Time,
Delta = data_sjm$delta, obsT = data_sjm$obsT, Y = data_sjm$Y)
sjm_linear_estimate(X=data_sjm$Treatment, Time =
data_sjm$Time, Delta = data_sjm$delta, obsT = data_sjm$obsT,
Y = data_sjm$Y, n.resample=5, var=TRUE)
Semiparametric Joint Modeling of the Treatment Effect on a Longitudinal Surrogate with a Nonlinear Model
Description
Semiparametric joint modeling of the treatment effect on a longitudinal surrogate using both a Cox proportional hazards model and a splines-based model
Usage
sjm_nl_estimate(X, Time, Delta, obsT, Y, gap_time = 0.1, n.resample = 100,
var = FALSE)
Arguments
X |
numeric vector containing the treatment indicator for each observation with 1 for treated and 0 for control |
Time |
numeric vector containing the observed event or censoring time for each observation |
Delta |
numeric vector containing the event indicator for each observation |
obsT |
numeric matrix containing the time that the surrogate marker was measured for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, the corresponding entry should be 0 or NA. |
Y |
numeric matrix containing the surrogate marker measurements over time for each observation; number of rows should be equal to the number of observations and number of columns should be equal to the maximum number of surrogate markers measured. If the surrogate marker was not measured, as determined by the obsT entry, the Y at that time will be ignored. |
gap_time |
number indicating gap time for slope estimation; default is 0.1. |
n.resample |
number of resampled estimates used for variance estimation; default is 100. |
var |
logical indicating whether the user would like variance estimates and confidence intervals; default is FALSE. |
Value
A list of estimates is returned:
est |
estimated hazard ratio from the Cox model |
est_t |
vector of estimated treatment effects on the slope of the surrogate marker, i.e., the surrogate marker trajectory, on a grid constructed from the given gap time |
t_grid |
vector of grid times corresponding to the returned estimates |
SE_est |
if var is TRUE, standard error estimate of the hazard ratio |
SE_est_t |
if var is TRUE, standard error estimate of the estimated treatment effect on the slope of the surrogate marker |
CI_lower_est |
if var is TRUE, lower bound of the 95% confidence interval for the hazard ratio |
CI_upper_est |
if var is TRUE, upper bound of the 95% confidence interval for the hazard ratio |
CI_lower_est_t |
if var is TRUE, lower bound of the 95% confidence interval for the treatment effect on the slope of the surrogate marker |
CI_upper_est_t |
if var is TRUE, upper bound of the 95% confidence interval for the treatment effect on the slope of the surrogate marker |
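As a rough illustration of how a gap time can define both a time grid and a slope, the sketch below builds an evenly spaced grid and computes a forward-difference slope for two hypothetical marker trajectories. How the package actually constructs t_grid and estimates slopes is not stated on this page, so the grid, the trajectories traj_control and traj_treated, and the slope helper are all assumptions made for illustration.

```r
gap_time <- 0.2
t_max <- 1  # hypothetical end of follow-up

# Assumed evenly spaced grid with spacing equal to the gap time.
t_grid <- seq(0, t_max, by = gap_time)

# Hypothetical smooth marker trajectories for each arm.
traj_control <- function(t) 2 + 0.5 * t
traj_treated <- function(t) 2 + 0.5 * t + 0.3 * t^2

# Forward-difference slope over one gap.
slope <- function(f, t, gap) (f(t + gap) - f(t)) / gap

# Difference in slopes between arms at each grid time.
slope_effect <- slope(traj_treated, t_grid, gap_time) -
  slope(traj_control, t_grid, gap_time)

data.frame(t_grid = t_grid, slope_effect = slope_effect)
```

A smaller gap time gives a finer grid and a more local slope, at the cost of noisier estimates when measurements are sparse.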
Author(s)
Xuan Wang
References
Wang X, Zhou J, Parast L, Greene T (2025). Semiparametric Joint Modeling to Estimate the Treatment Effect on a Longitudinal Surrogate with Application to Chronic Kidney Disease Trials. Biometrics, 81(3): ujaf104.
Examples
data(data_sjm)
sjm_nl_estimate(X=data_sjm$Treatment, Time = data_sjm$Time,
Delta = data_sjm$delta, obsT = data_sjm$obsT, Y = data_sjm$Y, gap_time=0.2)
sjm_nl_estimate(X=data_sjm$Treatment, Time =
data_sjm$Time, Delta = data_sjm$delta, obsT = data_sjm$obsT,
Y = data_sjm$Y, gap_time = 0.2, n.resample=5, var=TRUE)