| Title: | Pipeline Tools Inspired by 'GNU Make' |
| Version: | 0.2.2 |
| Description: | A suite of tools for transforming an existing workflow into a self-documenting pipeline with very minimal upfront costs. Segments of the pipeline are specified in much the same way a 'Make' rule is, by declaring an executable recipe (which might be an R script), along with the corresponding targets and dependencies. When the entire pipeline is run through, only those recipes that need to be executed will be. Meanwhile, execution metadata is captured behind the scenes for later inspection. |
| License: | GPL (≥ 3) |
| URL: | https://kinto-b.github.io/makepipe/, https://github.com/kinto-b/makepipe |
| BugReports: | https://github.com/kinto-b/makepipe/issues |
| Imports: | cli, nomnoml, R6, utils, roxygen2 |
| Suggests: | knitr, covr, testthat (≥ 3.0.0), withr, rmarkdown, webshot2, visNetwork |
| Config/testthat/edition: | 3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.2.3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2025-01-05 16:50:57 UTC; kinto |
| Author: | Kinto Behr [aut, cre, cph] |
| Maintainer: | Kinto Behr <kinto.behr@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2025-01-07 10:30:02 UTC |
Pipeline visualisations
Description
A Pipeline object is automatically constructed as calls to
make_*() are made. It stores the relationships between targets,
dependencies, and sources.
Public fields
| segments | A list of Segment objects |
Methods
Public methods
Method add_source_segment()
Add an edge to edges
Add any nodes in private$edges that are missing from
private$nodes into private$nodes
Reconstruct Pipeline edges from Segment edges. Called primarily to update outofdateness
Add a pipeline segment corresponding to a make_with_source()
call
Usage
Pipeline$add_source_segment( source, targets, dependencies, packages, envir, force )
Arguments
| source | The path to an R script which makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the source or recipe. |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| new_edge | A data.frame constructed with new_edge() |
Returns
The SegmentSource added to the Pipeline
Method add_recipe_segment()
Add a pipeline segment corresponding to a make_with_recipe()
call
Usage
Pipeline$add_recipe_segment( recipe, targets, dependencies, packages, envir, force )
Arguments
| recipe | A language object which, when evaluated, makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the source or recipe. |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
Returns
The SegmentRecipe added to the Pipeline
Method build()
Build all targets
Usage
Pipeline$build(quiet = getOption("makepipe.quiet"))
Arguments
| quiet | A logical determining whether or not messages are signaled |
Returns
self
Method clean()
Clean all targets
Usage
Pipeline$clean()
Returns
self
Method touch()
Touch all targets, updating file modification time to current time. This is useful when you know your targets are all up-to-date but makepipe doesn't (e.g. after a negligible change was made to your source code).
Usage
Pipeline$touch()
Returns
self
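As a rough illustration of what touching amounts to, the sketch below resets a file's modification time with base R's Sys.setFileTime(). The helper touch_files() is hypothetical and is not taken from makepipe's source; it only shows the idea under the assumption that targets are ordinary files on disk.

```r
# Hypothetical helper: a stand-in for what "touching" a set of targets amounts to.
touch_files <- function(paths) {
  existing <- paths[file.exists(paths)]
  if (length(existing) > 0) {
    # Set the modification time of every existing file to the current time
    Sys.setFileTime(existing, Sys.time())
  }
  invisible(existing)
}

target <- tempfile(fileext = ".Rds")
saveRDS(1:3, target)

# Pretend the target was last written an hour ago
Sys.setFileTime(target, Sys.time() - 3600)
before <- file.mtime(target)

touch_files(target)
after <- file.mtime(target)

# The touched file now looks freshly written
after > before
```

Because only timestamps change, the file contents are left exactly as they were.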
Method annotate()
Apply annotations to Pipeline
Usage
Pipeline$annotate(labels = NULL, notes = NULL)
Arguments
| labels | A named character vector mapping nodes in the Pipeline onto labels to display beside them. |
| notes | A named character vector mapping nodes in the Pipeline onto notes to display beside the labels (nomnoml) or as tooltips (visNetwork). |
Method refresh()
Refresh Pipeline to check outofdateness
Usage
Pipeline$refresh()
Method nomnoml()
Display the pipeline with nomnoml
Usage
Pipeline$nomnoml(
  direction = c("down", "right"),
  arrow_size = 1,
  edge_style = c("hard", "rounded"),
  bend_size = 0.3,
  font = "Courier",
  font_size = 12,
  line_width = 3,
  padding = 16,
  spacing = 40,
  leading = 1.25,
  stroke = "#33322E",
  fill_arrows = FALSE,
  gutter = 5,
  edge_margin = 0
)
Arguments
| direction | The direction the flowchart should go in |
| arrow_size | The arrowhead size |
| edge_style | The arrow edge style |
| bend_size | The degree of rounding in the arrows (requires edge_style=rounded) |
| font | The name of a font to use |
| font_size | The font size |
| line_width | The line width for arrows and box outlines |
| padding | The amount of padding within boxes |
| spacing | The amount of spacing between boxes |
| leading | The amount of spacing between lines of text |
| stroke | The color of arrows, text, and box outlines |
| fill_arrows | Whether arrow heads are full triangles (TRUE) or angled (FALSE) |
| gutter | The amount of space to leave around the flowchart |
| edge_margin | The amount of space to leave between boxes and arrows |
Returns
self
Method visnetwork()
Display the pipeline with visNetwork
Usage
Pipeline$visnetwork(...)
Arguments
| ... | Arguments (other than nodes and edges) to pass to visNetwork::visNetwork() |
Returns
self
Method text_summary()
Display a text summary of the pipeline
Usage
Pipeline$text_summary()
Returns
self
Method print()
Display the pipeline
Usage
Pipeline$print(...)
Arguments
| ... | Arguments (other than nodes and edges) to pass to visNetwork::visNetwork() |
Returns
self
Method save_visnetwork()
Save pipeline visNetwork
Usage
Pipeline$save_visnetwork(file, selfcontained = TRUE, background = "white", ...)
Arguments
| file | File to save HTML into |
| selfcontained | Whether to save the HTML as a single self-contained file (with external resources base64 encoded) or a file with external resources placed in an adjacent directory. |
| background | Text string giving the html background color of the widget. Defaults to white. |
| ... | Arguments (other than nodes and edges) to pass to visNetwork::visNetwork() |
Returns
self
Method save_nomnoml()
Save pipeline nomnoml
Usage
Pipeline$save_nomnoml(file, width = NULL, height = NULL, ...)
Arguments
| file | File to save the png into |
| width | Image width |
| height | Image height |
| ... | Arguments to pass to self$nomnoml() |
Returns
self
Method save_text_summary()
Save a text summary of the pipeline
Usage
Pipeline$save_text_summary(file)
Arguments
| file | File to save text summary into |
Returns
self
Method clone()
The objects of this class are cloneable with this method.
Usage
Pipeline$clone(deep = FALSE)
Arguments
| deep | Whether to make a deep clone. |
See Also
Other pipeline:
pipeline-accessors,
pipeline-vis
Segment
Description
A Segment object is automatically constructed and attached to
the Pipeline when a call to make_*() is made. It stores the relationships
between targets, dependencies, and sources.
Public fields
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| envir | The environment in which to execute the instructions. |
| result | An object, whatever is returned by executing the instructions |
| executed | A logical, whether or not the instructions were executed |
| execution_time | A difftime, the time taken to execute the instructions |
| label | A short label for the segment |
| note | A description of what the segment does |
Active bindings
| edges | Get edges connecting the dependencies, instructions, and targets |
| nodes | Get nodes corresponding to dependencies, instructions, and targets |
| text_summary | A plain text summary of the Segment |
Methods
Public methods
Method new()
Initialise a new Segment
Usage
Segment$new( id, targets, dependencies, packages, envir, force, executed, result, execution_time )
Arguments
| id | An integer that uniquely identifies the segment |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the instructions. |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| executed | A logical, whether or not the instructions were executed |
| result | An object, whatever is returned by executing the instructions |
| execution_time | A difftime, the time taken to execute the instructions |
Method print()
Printing method
Usage
Segment$print()
Method update_result()
Update the Segment with new execution information
Usage
Segment$update_result(executed, execution_time, result)
Arguments
| executed | A logical, whether or not the instructions were executed |
| execution_time | A difftime, the time taken to execute the instructions |
| result | An object, whatever is returned by executing the instructions |
Method annotate()
Apply annotations to Segment
Usage
Segment$annotate(label = NULL, note = NULL)
Arguments
| label | A short label for the segment |
| note | A description of what the segment does |
Method clone()
The objects of this class are cloneable with this method.
Usage
Segment$clone(deep = FALSE)
Arguments
| deep | Whether to make a deep clone. |
See Also
Other segment:
SegmentRecipe,
SegmentSource
Segment
Description
A Segment object is automatically constructed and attached to
the Pipeline when a call to make_*() is made. It stores the relationships
between targets, dependencies, and sources.
Super class
makepipe::Segment -> SegmentRecipe
Public fields
| recipe | A chunk of R code which makes the targets |
Methods
Public methods
Inherited methods
Method new()
Initialise a new Segment
Usage
SegmentRecipe$new( id, recipe, targets, dependencies, packages, envir, force, executed, result, execution_time )
Arguments
| id | An integer that uniquely identifies the segment |
| recipe | A chunk of R code which makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the instructions. |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| executed | A logical, whether or not the instructions were executed |
| result | An object, whatever is returned by executing the instructions |
| execution_time | A difftime, the time taken to execute the instructions |
Method update_result()
Update the Segment with new execution information
Usage
SegmentRecipe$update_result(executed, execution_time, result)
Arguments
| executed | A logical, whether or not the instructions were executed |
| execution_time | A difftime, the time taken to execute the instructions |
| result | An object, whatever is returned by executing the instructions |
Method execute()
Execute the Segment
Usage
SegmentRecipe$execute(envir = NULL, quiet = getOption("makepipe.quiet"), ...)
Arguments
| envir | The environment in which to execute the instructions. |
| quiet | A logical determining whether or not messages are signaled |
| ... | Additional parameters to pass to base::eval() |
Method clone()
The objects of this class are cloneable with this method.
Usage
SegmentRecipe$clone(deep = FALSE)
Arguments
| deep | Whether to make a deep clone. |
See Also
Other segment:
SegmentSource,
Segment
Segment
Description
A Segment object is automatically constructed and attached to
the Pipeline when a call to make_*() is made. It stores the relationships
between targets, dependencies, and sources.
Super class
makepipe::Segment -> SegmentSource
Public fields
| source | The path to an R script which makes the targets |
Methods
Public methods
Inherited methods
Method new()
Initialise a new Segment
Usage
SegmentSource$new( id, source, targets, dependencies, packages, envir, force, executed, result, execution_time )
Arguments
| id | An integer that uniquely identifies the segment |
| source | The path to an R script which makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the instructions. |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| executed | A logical, whether or not the instructions were executed |
| result | An object, whatever is returned by executing the instructions |
| execution_time | A difftime, the time taken to execute the instructions |
Method update_result()
Update the Segment with new execution information
Usage
SegmentSource$update_result(executed, execution_time, result)
Arguments
| executed | A logical, whether or not the instructions were executed |
| execution_time | A difftime, the time taken to execute the instructions |
| result | An object, whatever is returned by executing the instructions |
Method execute()
Execute the Segment
Usage
SegmentSource$execute(envir = NULL, quiet = getOption("makepipe.quiet"), ...)
Arguments
| envir | The environment in which to execute the instructions. |
| quiet | A logical determining whether or not messages are signaled |
| ... | Additional parameters to pass to base::source() |
Method clone()
The objects of this class are cloneable with this method.
Usage
SegmentSource$clone(deep = FALSE)
Arguments
| deep | Whether to make a deep clone. |
See Also
Other segment:
SegmentRecipe,
Segment
Parameters for make-like functions
Description
Parameters for make-like functions
Arguments
| source | The path to an R script which makes the targets |
| recipe | A chunk of R code which makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the source or recipe. |
| quiet | A logical determining whether or not messages are signaled |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| label | A short label for the source or recipe |
| build | A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
Register objects to be returned from make_with_source
Description
It is sometimes useful to have access to certain objects which are generated as side-products in a source script which yields as a main-product one or more targets. Typically these objects are used for checking that the targets were produced as expected.
Usage
make_register(value, name, quiet = FALSE)
Arguments
value |
A value to be registered in a source script and returned as part
of the |
name |
A variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning. |
quiet |
A logical determining whether or not warnings are signaled when
|
Value
value invisibly
Examples
## Not run:
# Imagine this is part of your source script:
x <- readRDS("input.Rds")
x <- do_stuff(x)
chk <- do_check(x)
make_register(chk, "x_check")
saveRDS(x, "output.Rds")
# The registered value is then available in your pipeline script under the
# name it was registered with ("x_check"):
step_one <- make_with_source(
  "source.R",
  "output.Rds",
  "input.Rds"
)
step_one$result$x_check
## End(Not run)
Create a pipeline using roxygen tags
Description
Instead of maintaining a separate pipeline script containing calls to
make_with_source(), you can add roxygen-like headers to the .R files in
your pipeline containing the @makepipe tag along with @targets,
@dependencies, and so on. These tags will be parsed by make_with_dir()
and used to construct a pipeline. You can call a specific part of the
pipeline that has been documented in this way using make_with_roxy().
Usage
make_with_dir(
  dir = ".",
  recursive = FALSE,
  build = TRUE,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet")
)
make_with_roxy(
  source,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  build = TRUE
)
Arguments
| dir | A character vector of full path names; the default corresponds to the working directory |
| recursive | A logical determining whether or not to recurse into subdirectories |
| build | A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
| envir | The environment in which to execute the source or recipe. |
| quiet | A logical determining whether or not messages are signaled |
| source | The path to an R script which makes the targets |
Details
Other than @makepipe, which is used to tell whether a given script should
be included in the pipeline, the tags recognised mirror the arguments to
make_with_source(). In particular,
- @targets and @dependencies are for declaring outputs and inputs respectively. The expected format is a comma-separated list of strings like @targets "out1.Rds", "out2.Rds", but R code like @targets file.path(DIR, "out.Rds") (evaluated in envir) works too.
- @packages is for declaring the packages that the targets depend on. The expected format is @packages pkg1 pkg2 etc.
- @force is for declaring whether or not execution should be forced. The expected format is a logical like TRUE or FALSE.
See the getting started vignette for more information.
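To make the tag format concrete, the sketch below writes a small tagged script into a temporary directory. The file names and the script body are placeholders, and the trailing NULL after the header follows the usual roxygen convention of giving a comment block something to attach to; treat that detail as an assumption rather than a documented requirement.

```r
# Hypothetical script contents; all file names are placeholders
script <- c(
  "#' @makepipe",
  "#' @targets \"data/merged_data.Rds\"",
  "#' @dependencies \"data/raw_data.Rds\", \"data/raw_pop.Rds\"",
  "#' @force FALSE",
  "NULL",  # assumed roxygen convention: the header block attaches to NULL
  "",
  "dat <- readRDS(\"data/raw_data.Rds\")",
  "pop <- readRDS(\"data/raw_pop.Rds\")",
  "saveRDS(merge(dat, pop, by = \"id\"), \"data/merged_data.Rds\")"
)

pipeline_dir <- tempfile("pipeline-")
dir.create(pipeline_dir)
script_path <- file.path(pipeline_dir, "merge_data.R")
writeLines(script, script_path)

# Only scripts carrying the @makepipe tag are meant to be included
tagged <- grepl("@makepipe", readLines(script_path), fixed = TRUE)
any(tagged)
```

With makepipe installed, a directory of such scripts could then be handed to make_with_dir(pipeline_dir, build = FALSE), or a single script to make_with_roxy(script_path, build = FALSE), to construct the pipeline without executing it.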
Value
A Pipeline object
See Also
Other make:
make_with_recipe(),
make_with_source()
Examples
## Not run:
# Create a pipeline from scripts in the working dir without executing it
p <- make_with_dir(build = FALSE)
p$build() # Then execute it yourself
## End(Not run)
Make targets out of dependencies using a recipe
Description
Make targets out of dependencies using a recipe
Usage
make_with_recipe(
  recipe,
  targets,
  dependencies = NULL,
  packages = NULL,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  force = FALSE,
  label = NULL,
  note = NULL,
  build = TRUE,
  ...
)
Arguments
| recipe | A chunk of R code which makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the source or recipe. |
| quiet | A logical determining whether or not messages are signaled |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| label | A short label for the source or recipe |
| note | A description of what the source or recipe does |
| build | A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
| ... | Additional parameters to pass to base::eval() |
Value
A Segment object containing execution metadata.
See Also
Other make:
make_with_dir(),
make_with_source()
Examples
## Not run:
# Merge files in fresh environment if raw data has been updated since last
# merged
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)
# Merge files in current environment if raw data has been updated since last
# merged. (If recipe executed, all objects bound in recipe will be available
# in current env).
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = environment()
)
# Merge files in global environment if raw data has been updated since last
# merged. (If recipe executed, all objects bound in recipe will be available
# in global env).
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = globalenv()
)
## End(Not run)
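The conditional execution described above can be pictured with a small base-R sketch. The helper run_if_stale() is hypothetical and handles a single target and a single dependency only; it is not makepipe's implementation, but it returns the same kind of metadata (executed, result, execution_time) that a Segment records.

```r
# Hypothetical helper illustrating the idea; not makepipe's implementation.
run_if_stale <- function(recipe, target, dependency,
                         envir = new.env(parent = parent.frame())) {
  recipe <- substitute(recipe)  # capture the code without evaluating it
  stale <- !file.exists(target) ||
    file.mtime(target) < file.mtime(dependency)
  if (!stale) {
    return(list(executed = FALSE, result = NULL, execution_time = NULL))
  }
  started <- Sys.time()
  result <- eval(recipe, envir = envir)
  list(executed = TRUE, result = result, execution_time = Sys.time() - started)
}

demo_dir <- tempfile("recipe-demo-")
dir.create(demo_dir)
raw <- file.path(demo_dir, "raw.Rds")
out <- file.path(demo_dir, "out.Rds")
saveRDS(1:10, raw)
Sys.setFileTime(raw, Sys.time() - 60)  # pretend the input is a minute old

# First call: the target is missing, so the recipe runs
first <- run_if_stale({
  x <- readRDS(raw)
  saveRDS(x * 2, out)
  sum(x)
}, out, raw)

# Second call: the target is newer than its dependency, so nothing runs
second <- run_if_stale({
  x <- readRDS(raw)
  saveRDS(x * 2, out)
  sum(x)
}, out, raw)

c(first = first$executed, second = second$executed)
```

Evaluating the recipe in a fresh environment whose parent is the caller mirrors the default envir = new.env(parent = parent.frame()) shown in the usage above: the recipe can see the caller's objects, but anything it binds stays out of the caller's workspace.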
Make targets out of dependencies using a source file
Description
Make targets out of dependencies using a source file
Usage
make_with_source(
  source,
  targets,
  dependencies = NULL,
  packages = NULL,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  force = FALSE,
  label = NULL,
  note = NULL,
  build = TRUE,
  ...
)
Arguments
| source | The path to an R script which makes the targets |
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
| envir | The environment in which to execute the source or recipe. |
| quiet | A logical determining whether or not messages are signaled |
| force | A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
| label | A short label for the source or recipe |
| note | A description of what the source or recipe does |
| build | A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
| ... | Additional parameters to pass to base::source() |
Value
A Segment object containing execution metadata.
See Also
Other make:
make_with_dir(),
make_with_recipe()
Examples
## Not run:
# Merge files in fresh environment if raw data has been updated since last
# merged
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)
# Merge files in current environment if raw data has been updated since last
# merged. (If source executed, all objects bound in source will be available
# in current env).
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = environment()
)
# Merge files in global environment if raw data has been updated since last
# merged. (If source executed, all objects bound in source will be available
# in global env).
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = globalenv()
)
## End(Not run)
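The effect of the envir argument can be seen with base::source() alone. The sketch below assumes the script is evaluated roughly as source(path, local = envir); that is an assumption about the mechanism, not a statement of makepipe's internals, and the script and object names are placeholders.

```r
# A throwaway script that binds one object
script <- tempfile(fileext = ".R")
writeLines("demo_merged_rows <- 3L", script)

# A fresh environment keeps the script's objects out of the caller's workspace
fresh_env <- new.env(parent = globalenv())
source(script, local = fresh_env)

in_fresh_env <- exists("demo_merged_rows", envir = fresh_env, inherits = FALSE)
in_global_env <- exists("demo_merged_rows", envir = globalenv(), inherits = FALSE)

c(fresh = in_fresh_env, global = in_global_env)
```

Passing envir = globalenv() or envir = environment() instead would leave the bound objects in the global or calling environment respectively, which is the difference the three examples above describe.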
Check if targets are out-of-date vis-a-vis their dependencies
Description
Check if targets are out-of-date vis-a-vis their dependencies
Usage
out_of_date(targets, dependencies, packages = NULL)
Arguments
| targets | A character vector of paths to files |
| dependencies | A character vector of paths to files which the targets depend on |
| packages | A character vector of names of packages which targets depend on |
Value
TRUE if any of targets are older than any of dependencies or if
any of targets do not exist; FALSE otherwise
Examples
## Not run:
out_of_date("data/processed_data.Rds", "data/raw_data.Rds")
## End(Not run)
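The rule stated under Value can be sketched with file modification times in base R. The helper is_stale() is hypothetical: it ignores the packages argument, assumes every dependency exists, and is not makepipe's implementation.

```r
# Hypothetical helper; not makepipe's actual implementation.
is_stale <- function(targets, dependencies) {
  # A missing target always needs to be (re)built
  if (!all(file.exists(targets))) return(TRUE)
  # "Any target older than any dependency" is the same as comparing the
  # oldest target against the newest dependency
  min(file.mtime(targets)) < max(file.mtime(dependencies))
}

demo_dir <- tempfile("stale-demo-")
dir.create(demo_dir)
dep <- file.path(demo_dir, "raw_data.Rds")
tgt <- file.path(demo_dir, "processed_data.Rds")

saveRDS(1:3, dep)
missing_target <- is_stale(tgt, dep)    # target does not exist yet

saveRDS(1:3, tgt)
Sys.setFileTime(dep, Sys.time() - 60)   # dependency older than target
fresh_target <- is_stale(tgt, dep)

Sys.setFileTime(dep, Sys.time() + 60)   # dependency newer than target
stale_target <- is_stale(tgt, dep)

c(missing = missing_target, fresh = fresh_target, stale = stale_target)
```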
Access and interface with Pipeline.
Description
get_pipeline(), set_pipeline() and reset_pipeline() access and modify
the current active pipeline, while all other helper functions do not affect
the active pipeline
Usage
is_pipeline(pipeline)
set_pipeline(pipeline)
get_pipeline()
reset_pipeline()
Arguments
pipeline |
A pipeline. See Pipeline for more details. |
See Also
Other pipeline:
Pipeline,
pipeline-vis
Examples
## Not run:
# Build up a pipeline from scratch and save it out
reset_pipeline()
# A series of `make_with_*()` blocks go here...
saveRDS(get_pipeline(), "data/my_pipeline.Rds")
# ... Later on we can read in and set the pipeline
p <- readRDS("data/my_pipeline.Rds")
set_pipeline(p)
## End(Not run)
Visualise the Pipeline.
Description
Produce a flowchart visualisation of the pipeline. Out-of-date targets will be coloured red, up-to-date targets will be coloured green, and everything else will be blue.
Usage
show_pipeline(
  pipeline = get_pipeline(),
  as = c("nomnoml", "visnetwork", "text"),
  labels = NULL,
  notes = NULL,
  ...
)
save_pipeline(
  file,
  pipeline = get_pipeline(),
  as = c("nomnoml", "visnetwork", "text"),
  labels = NULL,
  notes = NULL,
  ...
)
Arguments
pipeline |
A pipeline. See Pipeline for more details. |
| as | A string determining whether to use nomnoml, visNetwork, or a text summary |
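| labels | A named character vector mapping nodes in the Pipeline onto labels to display beside them. |
| notes | A named character vector mapping nodes in the Pipeline onto notes to display beside the labels (nomnoml) or as tooltips (visNetwork). |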
| ... | Arguments passed on to Pipeline$nomnoml() or Pipeline$visnetwork() |
file |
File to save png (nomnoml) or html (visnetwork) into |
Details
Labels and notes must be supplied as a named character vector where the
names correspond to the filepaths of nodes (i.e. targets, dependencies,
or source scripts).
See Also
Other pipeline:
Pipeline,
pipeline-accessors
Examples
## Not run:
# Run pipeline
make_with_source(
  "recode.R",
  targets = "data/1 data.R",
  dependencies = "data/0 raw_data.R"
)
make_with_source(
  "merge.R",
  targets = "data/2 data.R",
  dependencies = c("data/1 data.R", "data/0 raw_pop.R")
)
# Visualise pipeline with custom notes
show_pipeline(notes = c(
  "data/0 raw_data.R" = "Raw survey data",
  "data/0 raw_pop.R" = "Raw population data",
  "data/1 data.R" = "Survey data with recodes applied",
  "data/2 data.R" = "Survey data with demographic variables merged in"
))
## End(Not run)