Help for package SurvMetrics

Type:

Package

Title:

Predictive Evaluation Metrics in Survival Analysis

Version:

0.5.1

Date:

2025-01-16

Author:

Hanpu Zhou [aut], Xuewei Cheng [aut], Sizheng Wang [aut], Yi Zou [aut], Hong Wang [aut, cre]

Maintainer:

Hong Wang <wh@csu.edu.cn>

Description:

An implementation of popular evaluation metrics that are commonly used in survival prediction including Concordance Index, Brier Score, Integrated Brier Score, Integrated Square Error, Integrated Absolute Error and Mean Absolute Error. For a detailed information, see (Ishwaran H, Kogalur UB, Blackstone EH and Lauer MS (2008) <doi:10.1214/08-AOAS169>) , (Moradian H, Larocque D and Bellavance F (2017) <doi:10.1007/s10985-016-9372-1>), (Hanpu Zhou, Hong Wang, Sizheng Wang and Yi Zou (2023) <doi:10.32614/rj-2023-009>) for different evaluation metrics.

License:

GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]

Encoding:

UTF-8

Imports:

survminer, stats, survival, MASS,pec

Suggests:

knitr, rmarkdown, caret, ggplot2, ggpubr, randomForestSRC, testthat (≥ 3.0.0)

NeedsCompilation:

URL:

https://github.com/whcsu/SurvMetrics

Packaged:

2025-02-04 12:39:33 UTC; flood

VignetteBuilder:

knitr

Config/testthat/edition:

Depends:

R (≥ 3.5.0)

RoxygenNote:

7.3.2

Repository:

CRAN

Date/Publication:

2025-02-05 11:50:28 UTC

SurvMetrics: Predictive Evaluation Metrics in Survival Analysis

Description

An implementation of popular evaluation metrics that are commonly used in survival prediction including Concordance Index, Brier Score, Integrated Brier Score, Integrated Square Error, Integrated Absolute Error and Mean Absolute Error. For a detailed information, see (Ishwaran H, Kogalur UB, Blackstone EH and Lauer MS (2008) doi:10.1214/08-AOAS169) , (Moradian H, Larocque D and Bellavance F (2017) doi:10.1007/s10985-016-9372-1), (Hanpu Zhou, Hong Wang, Sizheng Wang and Yi Zou (2023) doi:10.32614/rj-2023-009) for different evaluation metrics.

Author(s)

Maintainer: Hong Wang wh@csu.edu.cn

Authors:

Hanpu Zhou 974637739@qq.com
Xuewei Cheng xwcheng@hunnu.edu.cn
Sizheng Wang 964555106@qq.com
Yi Zou 2584695843@qq.com

Brier

Description

The Brier Score was proposed by Glenn W. Brier in 1950 which is a proper score function that measures the accuracy of probabilistic predictions, usually used to measure the accuracy of a model fit for survival data. Brier can calculate the value of Brier Score at any timepoint, regardless of whether it is the event time.

Usage

Brier(object, pre_sp, t_star = -1)

Arguments

object

object of class Surv in the testing set created by Surv function.

pre_sp

a vector of predicted values of survival probabilities of each observation in testing set at time t_star.

t_star

the timepoint at which the Brier score you want to calculate.

Details

The Brier Score is the mean square difference between the true classes and the predicted probabilities. So the Brier Score can be thought of as a cost function. Therefore, the lower the Brier Score is for a set of predictions, the better the predictions are calibrated. The Brier Score takes on a value between zero and one, since this is the square of the largest possible difference between a predicted probability and the actual outcome. As we all know, for the cencoring samples, we do not know the real time of death, so the residual cannot be directly calculated when making the prediction. So the Brier Score is widely used in survival analysis.

The Brier Score is a strictly proper score (Gneiting and Raftery, 2007), which means that it takes its minimal value only when the predicted probabilities match the empirical probabilities.

Judging from the sparse empirical evidence, predictions of duration of survival tend to be rather inaccurate. More precision is achieved by using patient-specific survival probabilities and the Brier score as predictions to discriminate future survivors from failures.

Value

the Brier Score at time t_star

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Graf, Erika, Schmoor, Claudia, Sauerbrei, & Willi, et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statist. Med., 18(1718), 2529-2545.

Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78.

Gneiting, T. , & Raftery, A. E. . (2007). Strictly Proper Scoring Rules, Prediction, and Estimation.

Examples

library(survival)
time <- rexp(50)
status <- sample(c(0, 1), 50, replace = TRUE)
pre_sp <- runif(50)
t_star <- runif(1)
Brier(Surv(time, status), pre_sp, t_star)

Concordance index for right censored survival time data

Description

Concordance index is a rank correlation measures between a variable X and a possibly censored variable Y, with event/censoring indicator. In survival analysis, a pair of patients is called concordant if the risk of the event predicted by a model is lower for the patient who experiences the event at a later timepoint. The concordance probability (C-index) is the frequency of concordant pairs among all pairs of subjects. It can be used to measure and compare the discriminative power of a risk prediction models.

Usage

Cindex(object, predicted, t_star = -1)

Arguments

object

object of class Surv created by Surv function.

predicted

a vector containing the predicted survival time or probability of each observation.

t_star

the timepoint at which the C-index you want to calculate.

Details

Pairs with identical observed times, where one is uncensored and one is censored, are always considered usuable (independent of the value of tiedOutcomeIn), as it can be assumed that the event occurs at a later timepoint for the censored observation.

For uncensored response the result equals the one obtained with the functions rcorr.cens and rcorrcens from the Hmisc package (see examples).

Value

Estimates of the C-index.

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Ishwaran, H. , Kogalur, U. B. , Blackstone, E. H. , & Lauer, M. S. . (2008). Random survival forests. Journal of Thoracic Oncology Official Publication of the International Association for the Study of Lung Cancer, 2(12), 841-860.

Kang, L. , Chen, W. , Petrick, N. A. , & Gallas, B. D. . (2015). Comparing two correlated c indices with right-censored survival outcome: a one-shot nonparametric approach. Statistics in Medicine, 34(4).

TA Gerds, MW Kattan, M Schumacher, and C Yu. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Statistics in Medicine, Ahead of print:to appear, 2013. DOI = 10.1002/sim.5681

Wolbers, M and Koller, MT and Witteman, JCM and Gerds, TA (2013) Concordance for prognostic models with competing risks Research report 13/3. Department of Biostatistics, University of Copenhagen

Andersen, PK (2012) A note on the decomposition of number of life years lost according to causes of death Research report 12/2. Department of Biostatistics, University of Copenhagen

Paul Blanche, Michael W Kattan, and Thomas A Gerds. The c-index is not proper for the evaluation of-year predicted risks. Biostatistics, 20(2): 347–357, 2018.

Examples

library(survival)
time <- c(1, 1, 2, 2, 2, 2, 2, 2)
status <- c(0, 1, 1, 0, 1, 1, 0, 1)
predicted <- c(2, 3, 3, 3, 4, 2, 4, 3)
Cindex(Surv(time, status), predicted)

Concordance index in the Presence of Competing Risks

Description

The C-index (Concordance index) of the prognostic model in the presence of competing risks according to Marcel, W et al.(2014).

Usage

CindexCR(time, status, predicted, Cause_int = 1)

Arguments

time

minimum value of deletion time and survival time.

status

the status indicator, for models with competing risks, the status indicator is 0=censored, 1=event at time, 2= competing risks at time.

predicted

a vector of predicted values or the survival time of survival probabilities of each observation.

Cause_int

event type of interest, the default value is 1.

Value

Estimates of the C-index in the presence of competing risks.

Author(s)

HanPu Zhou zhouhanpu@csu.edu.cn

References

Marcel, W. , Paul, B. , Koller, M. T. , Witteman, J. , & Gerds, T. A. . (2014).Concordance for prognostic models with competing risks. Biostatistics(3), 526.

Examples

time <- c(4, 7, 5, 8)
status <- rep(1, 4)
predicted <- c(3, 5, 7, 10)
Cause_int <- 1
CindexCR(time, status, predicted, Cause_int)

Gt

Description

G(t)=P(C>t) denote the Kaplan-Meier estimate of the censoring distribution which is used to adjust for censoring. Gt is used to calculate G(t) at any timepoint you want.

Usage

Gt(object, timepoint)

Arguments

object

object of class Surv created by Surv function.

timepoint

any point in time you want to get the Kaplan–Meier estimate of the censoring.

Value

The Kaplan–Meier estimate of the censoring in (0,1).

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Graf, Erika, Schmoor, Claudia, Sauerbrei, & Willi, et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statist. Med., 18(1718), 2529-2545.

Kaplan, E. L. , & Meier, P. . (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457-481.

Examples

library(survival)
time <- rexp(50)
status <- sample(c(0, 1), 50, replace = TRUE)
pre_sp <- runif(50)
timepoint <- runif(1)
Gt(Surv(time, status), timepoint)

IAEISE

Description

Two ways of the continuous-time approach to continuous-time identification based on least-squares and least-absolute errors are proposed. Integrate Absolute Error and Integrate Square Error.To evaluate the performance of survival models methods Lower values of IAE or ISE indicate better performances.

Usage

IAEISE(object, sp_matrix, IRange = c(-2, -1))

Arguments

object

object of class Surv on the testing set created by Surv function.

sp_matrix

a matrix of predicted values of survival probabilities for the testing set.

IRange

a vector contains all discrete time points corresponding to the predicted probability in sp_matrix. Or the scale you want to get the IAE and ISE; .

Value

Estimates of the IAE and ISE

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Marron, J. S. , & Wand, M. P. . (1992). Exact mean integrated squared error. Annals of Statistics, 20(2), 712-736.

HooraMoradian, DenisLarocque, & FranoisBellavance. (2017). L1 splitting rules in survival forests. Lifetime Data Analysis, 23(4), 671–691.

Kowalczuk, & Z. (1998). Integrated squared error and integrated absolute error in recursive identification of continuous-time plants. Control 98 Ukacc International Conference on (Vol.1998, pp.693-698). IET.

Examples


library(survival)
library(SurvMetrics)
set.seed(123)
N <- 100
mydata <- SDGM4(N, p = 20, c_step = -0.5)
index.train <- sample(1:N, 2 / 3 * N)
data.train <- mydata[index.train, ]
data.test <- mydata[-index.train, ]

time_interest <- sort(data.train$time[data.train$status == 1])
sp_matrix <- matrix(sort(runif(nrow(data.test) * length(time_interest)),
  decreasing = TRUE
), nrow = nrow(data.test))
object <- Surv(data.test$time, data.test$status)

# a vector for all the distinct time
IAEISE(object, sp_matrix, time_interest)
# a range
IAEISE(object, sp_matrix, c(12, 350))

IBS

Description

IBS is an integrated version of the Brier which is used to calculate the integration of the Brier Score. The Brier Score is the mean square difference between the true classes and the predicted probabilities. Basically, the IBS is an integrated weighted squared distance between the estimated survival function and the empirical survival function. The inverse probability censoring weighting(IPCW) is used to adjust for censoring.

Usage

IBS(object, sp_matrix, IBSrange = c(-2, -1))

Arguments

object

object of class Surv in the testing set created by Surv function.

sp_matrix

a matrix or data.frame of predicted values of survival probabilities for the testing set.

IBSrange

a vector contains all discrete time points corresponding to the predicted probability in sp_matrix. Or the scale you want to get the IBS; and if it is a single point the return value will be the Brier Score at the timepoint.

Details

The percentage of censored observations increases in time, and this will surely affect the dispersion of the empirical Brier Score. The question of how censoring in finite samples acts on the distribution of our measures of inaccuracy is an interesting subject. Our recommendation is to choose t* in a way that censoring is not too heavy (for example, the median follow-up time). We also prefer measures with integrated loss functions since they will reflect inaccuracy over an interval rather than just at one point in time. In addition, the corresponding empirical measures are likely to have lower dispersion, because censored observations contribute their estimated event-free probabilities to the integrand until the censoring occurs.

Value

The integration of brierscore

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

HooraMoradian, DenisLarocque, & FranoisBellavance. (2017). \(l_1\) splitting rules in survival forests. Lifetime Data Analysis, 23(4), 671–691.

Graf, Erika, Schmoor, Claudia, Sauerbrei, & Willi, et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statist. Med., 18(1718), 2529-2545.

Brier, G. W. . (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78.

Gneiting, T. , & Raftery, A. E. . (2007). Strictly Proper Scoring Rules, Prediction, and Estimation.

Examples

library(survival)
library(SurvMetrics)
set.seed(123)
N <- 100
mydata <- SDGM4(N, p = 20, c_step = -0.5)
index.train <- sample(1:N, 2 / 3 * N)
data.train <- mydata[index.train, ]
data.test <- mydata[-index.train, ]

time_interest <- sort(data.train$time[data.train$status == 1])
sp_matrix <- matrix(sort(runif(nrow(data.test) * length(time_interest)),
  decreasing = TRUE
), nrow = nrow(data.test))
object <- Surv(data.test$time, data.test$status)

# the default time points
IBS(object, sp_matrix, time_interest)
# a time range
IBS(object, sp_matrix, c(18:100))

Mean Absolute Error

Description

A somewhat naive criterion that is sometimes used consists of simply omitting all censored cases from the data set. For survival analysis problems, the mean absolute error (MAE) can be defined as an average of the differences between the predicted time values and the actual observation time values. Only the samples for which the event occurs are being considered in this metric.

Usage

MAE(object, pre_time)

Arguments

object

object of class Surv created by Surv function.

pre_time

a vector of predicted values of survival time of each observation.

Details

Condition: MAE can only be used for the evaluation of survival models which can provide the event time as the predicted target value.

Value

the value of Mean Absolute Error

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Matsuo, K. , Purushotham, S. , Jiang, B. , Mandelbaum, R. S. , Takiuchi, T. , & Liu, Y. , et al. (2018). Survival outcome prediction in cervical cancer: cox models vs deep-learning model. American Journal of Obstetrics & Gynecology. Coyle, E. J. , & Lin, J. H. . (1988). Stack filters and the mean absolute error criterion. IEEE Trans Acoustics Speech Signal Processing, 36(8), 1244-1254.

Examples

library(survival)
time <- rexp(50)
status <- sample(c(0, 1), 50, replace = TRUE)
pre_time <- rexp(50)
MAE(Surv(time, status), pre_time)

SDGM1

Description

Survival data generation method. An example of the proportional hazards model where in the Cox model is expected to perform best.

Usage

SDGM1(N = 200, p = 15, c_mean = 0.4)

Arguments

N

The sample size of the simulated dataset.

p

The covariate dimension of the simulated dataset.

c_mean

The parameter which is used to control the censoring rate.

Value

the simulated dataset

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.

Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.

Ishwaran, H. , Kogalur, U. B. , Gorodeski, E. Z. , Minn, A. J. , & Lauer, M. S. . (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489), 205-217.

Examples

SDGM1(N = 200, p = 15, c_mean = 0.4)

SDGM2

Description

Survival data generation method. The dataset represents mild violations of the proportional hazards assumption.

Usage

SDGM2(N = 200, p = 15, u_max = 4)

Arguments

N

The sample size of the simulated dataset.

p

The covariate dimension of the simulated dataset.

u_max

The parameter which is used to control the censoring rate.

Value

the simulated dataset

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.

Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.

Examples

SDGM2(N = 200, p = 15, u_max = 4)

SDGM3

Description

Survival data generation method. The proportional hazards assumption is strongly violated in this dataset.

Usage

SDGM3(N = 200, p = 15, u_max = 7)

Arguments

N

The sample size of the simulated dataset.

p

The covariate dimension of the simulated dataset.

u_max

The parameter which is used to control the censoring rate.

Value

the simulated dataset

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.

Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.

Examples

SDGM3(N = 200, p = 15, u_max = 7)

SDGM4

Description

Survival data generation method. An example of the proportional hazards model where in the Cox model is expected to perform best.

Usage

SDGM4(N = 200, p = 15, c_step = 0.4)

Arguments

N

The sample size of the simulated dataset.

p

The covariate dimension of the simulated dataset.

c_step

The parameter which is used to control the censoring rate.

Value

the simulated dataset

Author(s)

Hanpu Zhou zhouhanpu@csu.edu.cn

References

Steingrimsson, J. A. , Diao, L. , & Strawderman, R. L. . (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association, 114.

Zhu, R. , & Kosorok, M. R. . (2012). Recursively imputed survival trees. Journal of the American Statistical Association, 107(497), 331-340.

Ishwaran, H., Kogalur, U. B., Gorodeski, E.Z., Minn, A.J., & Lauer, M. S. . (2010). High-dimensional variable selection for survival data. Journal of the American Statistical Association, 105(489), 205-217.

Examples

SDGM4(N = 200, p = 15, c_step = 0.4)

SurvMetrics: Predictive Evaluation Metrics in Survival Analysis

Description

Author(s)

See Also

Brier

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Concordance index for right censored survival time data

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Concordance index in the Presence of Competing Risks

Description

Usage

Arguments

Value

Author(s)

References

Examples

Gt

Description

Usage

Arguments

Value

Author(s)

References

Examples

IAEISE

Description

Usage

Arguments

Value

Author(s)

References

Examples

IBS

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Mean Absolute Error

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

SDGM1

Description

Usage

Arguments

Value

Author(s)

References

Examples

SDGM2

Description

Usage

Arguments

Value

Author(s)

References

Examples