The R package provides an efficient, object-oriented R6 interface for offline change point detection, implemented in C++ for high performance. This was created as part of the Google Summer of Code 2025 program (see edelweiss611428/rupturesRcpp at gsoc-2025 for the project archive).
+------------------------------------------------------------+ | | | Google Summer of Code 2025 Program | | | | Project: rupturesRcpp | | Contributor: @edelweiss611428 | | Mentors: @tdhock & @deepcharles | | Organisation: The R Project for Statistical Computing | | | +------------------------------------------------------------+
To install the newest version of the package, use the following R code:
library(devtools)
install_github("edelweiss611428/rupturesRcpp") To detect change-points using rupturesRcpp you need
three main components:
costFunc)binSeg,
Window, PELT)Each component is implemented using an R6-based
object-oriented design for modularity and maintainability.
Create a costFunc object by specifying the desired
(supported) cost function and optional parameters:
library("rupturesRcpp")
costFuncObj = costFunc$new("L2")
costFunc$pass() #output attributes corresponding to the specified cost function.$costFunc [1] "L2"
The following table shows the list of supported cost functions. Here,
n is segment length.
| Cost function | Description | Parameters/active bindings | Dimension | Time complexity |
|---|---|---|---|---|
"L1" |
Sum of L1 distances to the segment-wise median; robust
to outliers. |
costFunc |
multi |
O(nlog(n)) |
"L2" |
Sum of squared L2 distances to the segment-wise mean;
faster but less robust than L1. |
costFunc |
multi |
O(1) |
"SIGMA" |
Log-determinant of empirical covariance; models varying mean&variance. | costFunc, addSmallDiag,
epsilon |
multi |
O(1) |
"VAR" |
Sum of squared residuals from a vector autoregressive model with constant noise variance. | costFunc, pVAR |
multi |
O(1) |
"LinearL2" |
Sum of squared residuals from a linear regression model with constant noise variance. | costFunc, intercept |
multi |
O(1) |
If active binding costFunc is modified by assigning to
costFuncObj$costFunc and the required parameters are
missing, the default parameters will be used.
costFuncObj$costFunc = "VAR"
costFuncObj$pass()$costFunc [1] "VAR" $pVAR [1] 1
After initialising a costFunc object, create a
segmentation object such as binSeg, Window, or
PELT.
| R6 Class | Method | Description | Parameters/active bindings |
|---|---|---|---|
binSeg |
Binary Segmentation | Recursively splits the signal at points that minimise the cost. | minSize, jump, costFunc,
tsMat, covariates |
Window |
Slicing Window | Detects change-points using local gains over sliding windows. | minSize, jump, radius,
costFunc,tsMat, covariates |
PELT |
Pruned Exact Linear Time | Optimal segmentation with pruning for linear-time performance. | minSize, jump, costFunc,
tsMat, covariates |
The covariates argument is optional and only required
for models involving both dependent and independent variables (e.g.,
"LinearL2"). If not provided, the model is force-fitted
using only an intercept term (i.e., a column of ones).
A PELT object, for example, can be initialised as
follows:
detectionObj = PELT$new(minSize = 1L, jump = 1L, costFunc = costFuncObj)All segmentation objects implement the following methods:
$describe(printConfig): Views the (current)
configurations of the object.$fit(tsMat, covariates): Constructs a C++
detection module corresponding to the current configurations.$predict(pen): Performs change-point detection given a
linear penalty value.$eval(a,b): Evaluates the cost of a segment (a,b].$plot(d, endPts,...): Plots change-point segmentation
in ggplot style.Active bindings (such as minSize or tsMat)
can be modified at any time—either before or after the object is created
via the $ operator. For consistency, if the object has
already been fitted, modifying any active bindings will automatically
trigger the re-fitting process.
detectionObj$minSize = 2L #Before fitting
detectionObj$fit(a_time_series_matrix) #Fitted
detectionObj$minSize = 1L #After fitting - automatically trigger `$fit()`To demonstrate the package usage, we first consider a simple 2d time series with two piecewise Gaussian regimes and varying variance.
set.seed(1)
tsMat = cbind(c(rnorm(100,0), rnorm(100,5,5)),
c(rnorm(100,0), rnorm(100,5,5)))As our example involves regimes with varying variance, a suitable
costFunc option is "SIGMA". Since the
segmentation objects’ interfaces are similar, it is sufficient to
demonstrate the usage of binSeg only.
SIGMAObj = costFunc$new("SIGMA", addSmallDiag = TRUE, epsilon = 1e-6)
binSegObj = binSeg$new(minSize = 1L, jump = 1L, costFunc = SIGMAObj)
binSegObj$fit(tsMat) Once fitted, $predict() and $eval() can be
used. To view the configurations of the binSeg object, we
can use $describe().
binSegObj$describe(printConfig = TRUE) Binary Segmentation (binSeg) minSize : 1L jump : 1L costFunc : "SIGMA" addSmallDiag : TRUE epsilon : 1e-06 fitted : TRUE n : 200L p : 2L
To obtain an estimated segmentation, we can use the
$predict() method and specify a non-negative penalty value
pen, which should be properly tuned. This returns a sorted
integer vector of end-points, including the number of observations by
design.
Here, we set pen = 100.
binSegObj$predict(pen = 100)[1] 100 200
After running $predict(), the segmentation output is
temporarily saved to the binSeg object, allowing users to
use the $plot() method without specifying
endPts.
binSegObj$plot(d = 1:2,
main = "method: binSeg; costFunc: SIGMA; pen: 100")binSeg object through its active
bindingsYou can also modify a binSeg object’s fields through its
active bindings. To demonstrate this, we consider a piecewise vector
autoregressive example with constant noise variance.
set.seed(1)
tsMat = matrix(c(filter(rnorm(100), filter = 0.9, method = "recursive"),
filter(rnorm(100), filter = -0.9, method = "recursive")))Here, the most suitable cost function is "VAR". Instead
of creating a new binSeg object, we will modify the current
binSegObj as follows:
VARObj = costFunc$new("VAR")
binSegObj$tsMat = tsMat
binSegObj$costFunc = VARObjModifying tsMat (or any other bindings) will
automatically trigger self$fit() if a tsMat
has already existed.
binSegObj$describe(printConfig = TRUE)Binary Segmentation (binSeg) minSize : 1L jump : 1L costFunc : "VAR" pVAR : 1L fitted : TRUE n : 200L p : 1L
We can then perform binary segmentation with pen = 25
and plot the segmentation results.
binSegObj$predict(pen = 25)
binSegObj$plot(d = 1L,
main = "method: binSeg; costFunc: VAR; pen: 25")"L1" cost module, potentially allowing
queries in O(log(n)) time using data structures such as a
persistent segment tree with O(nlog(n))
precomputation."Poisson"
and "Linear-L1").Opt and BottomUp).$eval() methods for parameter
estimation.costFactory class for users focusing solely
on fast cost computation and parameter estimation.$plot() method for models involving both
dependent and independent variables.We welcome all contributions, whether big or small. If you encounter a bug or have a feature request, please open an issue to let us know.
Feel free to fork the repository and make your changes. For significant updates, it’s best to discuss them with us first. When your changes are ready, submit a pull request.
Thanks for helping us improve this project!
This project is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) License.