Description
What's wrong
If an rset object was created using some package other than {rsample} (e.g. {spatialsample}), then rsample::.get_split_args() fails with an opaque error, unless that other package happens to be attached to the current session.
Minimal example
In a fresh R session:
# rset defined with a function from {rsample}
rsample::vfold_cv(spatialsample::boston_canopy) |>
  rsample::.get_split_args()
#> $v
#> [1] 10
#>
#> $repeats
#> [1] 1
#>
#> $breaks
#> [1] 4
#>
#> $pool
#> [1] 0.1
# rset defined with a function from some other package e.g. {spatialsample}
spatialsample::spatial_block_cv(spatialsample::boston_canopy) |>
  rsample::.get_split_args()
#> Error in get(fun, mode = "function", envir = envir): object 'spatial_block_cv' of mode 'function' was not found
# Attach the other package with library() or require(); loading it only (e.g. requireNamespace()) is not sufficient
library("spatialsample")
# Try again - this time it works!
spatialsample::spatial_block_cv(spatialsample::boston_canopy) |>
  rsample::.get_split_args()
#> $method
#> [1] "random"
#>
#> $v
#> [1] 10
#>
#> $relevant_only
#> [1] TRUE
#>
#> $repeats
#> [1] 1
#>
#> $expand_bbox
#> [1] 1e-05
Created on 2025-12-10 with reprex v2.1.1
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.2 (2023-10-31)
#> os Amazon Linux 2023.9.20251110
#> system aarch64, linux-gnu
#> ui X11
#> language (EN)
#> collate C.UTF-8
#> ctype C.UTF-8
#> tz Europe/London
#> date 2025-12-10
#> pandoc 3.6.3 @ /usr/lib/rstudio-server/bin/quarto/bin/tools/aarch64/ (via rmarkdown)
#> quarto 1.8.26 @ /usr/local/bin/quarto
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> ! package * version date (UTC) lib source
#> class 7.3-23 2025-01-01 [3] CRAN (R 4.3.2)
#> classInt 0.4-11 2025-01-08 [2] CRAN (R 4.3.2)
#> cli 3.6.5 2025-04-23 [2] CRAN (R 4.3.2)
#> codetools 0.2-20 2024-03-31 [3] CRAN (R 4.3.2)
#> DBI 1.2.3 2024-06-02 [2] CRAN (R 4.3.2)
#> digest 0.6.39 2025-11-19 [2] CRAN (R 4.3.2)
#> dplyr 1.1.4 2023-11-17 [2] CRAN (R 4.3.2)
#> e1071 1.7-16 2024-09-16 [2] CRAN (R 4.3.2)
#> evaluate 1.0.5 2025-08-27 [2] CRAN (R 4.3.2)
#> farver 2.1.2 2024-05-13 [2] CRAN (R 4.3.2)
#> fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.3.2)
#> fs 1.6.6 2025-04-12 [2] CRAN (R 4.3.2)
#> furrr 0.3.1 2022-08-15 [2] CRAN (R 4.3.2)
#> future 1.68.0 2025-11-17 [2] CRAN (R 4.3.2)
#> generics 0.1.4 2025-05-09 [2] CRAN (R 4.3.2)
#> ggplot2 4.0.1 2025-11-14 [2] CRAN (R 4.3.2)
#> globals 0.18.0 2025-05-08 [2] CRAN (R 4.3.2)
#> glue 1.8.0 2024-09-30 [2] CRAN (R 4.3.2)
#> gtable 0.3.6 2024-10-25 [2] CRAN (R 4.3.2)
#> htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.3.2)
#> KernSmooth 2.23-26 2025-01-01 [3] CRAN (R 4.3.2)
#> knitr 1.50 2025-03-16 [2] CRAN (R 4.3.2)
#> lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.3.2)
#> listenv 0.10.0 2025-11-02 [2] CRAN (R 4.3.2)
#> magrittr 2.0.4 2025-09-12 [2] CRAN (R 4.3.2)
#> parallelly 1.45.1 2025-07-24 [2] CRAN (R 4.3.2)
#> pillar 1.11.1 2025-09-17 [2] CRAN (R 4.3.2)
#> pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.3.2)
#> proxy 0.4-27 2022-06-09 [2] CRAN (R 4.3.2)
#> purrr 1.2.0 2025-11-04 [2] CRAN (R 4.3.2)
#> R6 2.6.1 2025-02-15 [2] CRAN (R 4.3.2)
#> RColorBrewer 1.1-3 2022-04-03 [2] CRAN (R 4.3.2)
#> Rcpp 1.1.0 2025-07-02 [2] CRAN (R 4.3.2)
#> reprex 2.1.1 2024-07-06 [2] CRAN (R 4.3.2)
#> rlang 1.1.6 2025-04-11 [2] CRAN (R 4.3.2)
#> rmarkdown 2.30 2025-09-28 [2] CRAN (R 4.3.2)
#> rsample 1.3.1 2025-07-29 [2] CRAN (R 4.3.2)
#> rstudioapi 0.17.1 2024-10-22 [2] CRAN (R 4.3.2)
#> R S7 0.2.1 <NA> [2] <NA>
#> scales 1.4.0 2025-04-24 [2] CRAN (R 4.3.2)
#> sessioninfo 1.2.3 2025-02-05 [2] CRAN (R 4.3.2)
#> sf 1.0-23 2025-11-28 [2] CRAN (R 4.3.2)
#> spatialsample * 0.6.1 2025-12-02 [2] CRAN (R 4.3.2)
#> tibble 3.3.0 2025-06-08 [2] CRAN (R 4.3.2)
#> tidyr 1.3.1 2024-01-24 [2] CRAN (R 4.3.2)
#> tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.3.2)
#> units 1.0-0 2025-10-09 [2] CRAN (R 4.3.2)
#> vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.3.2)
#> withr 3.0.2 2024-10-28 [2] CRAN (R 4.3.2)
#> xfun 0.54 2025-10-30 [2] CRAN (R 4.3.2)
#> yaml 2.3.11 2025-11-28 [2] CRAN (R 4.3.2)
#>
#> [1] /home/[me]/R/aarch64-amazon-linux-gnu-library/4.3
#> [2] /usr/lib64/R/site-library
#> [3] /usr/lib64/R/library
#>
#> * ── Packages attached to the search path.
#> R ── Package was removed from disk.
#>
#> ──────────────────────────────────────────────────────────────────────────────
Impact
As of tune 2.0.0, tune:::tune_grid_workflow() calls .get_split_args(), so this issue can affect tuning pipelines written with tidymodels workflows.
Here's a more realistic example of where we might run into this error...
# A not-very-useful, but simple, recipe with a single tuning parameter
recipe <- recipes::recipe(canopy_gain ~ mean_temp, data = spatialsample::boston_canopy) |>
  recipes::step_poly(mean_temp, degree = tune::tune())
# A very simple model
model <- parsnip::linear_reg()
# Combine these into a workflow
wf <- workflows::workflow(recipe, model)
# Tuning with resamples defined via {rsample} - no problem!
cv_rsample <- rsample::vfold_cv(spatialsample::boston_canopy, v = 3)
tune::tune_grid(wf, resamples = cv_rsample)
# Tuning with resamples defined via some other package - "object ... was not found"
cv_other <- spatialsample::spatial_block_cv(spatialsample::boston_canopy, v = 3)
tune::tune_grid(wf, resamples = cv_other)
Cause
When an rset is defined, the name of the defining function is included as the first element in the object's class attribute.
Within .get_split_args(), we extract that function name, and then attempt to retrieve its argument names using formals():
Lines 142 to 151 in 167783c
.get_split_args <- function(x, allow_strata_false = FALSE) {
  all_attributes <- attributes(x)
  function_used_to_create <- switch(
    all_attributes$class[[1]],
    "validation_set" = "initial_validation_split",
    "group_validation_set" = "group_initial_validation_split",
    "time_validation_set" = "initial_validation_time_split",
    all_attributes$class[[1]]
  )
  args <- names(formals(function_used_to_create))
Because we're passing a string to formals(), it looks up a function with that name using get(), starting from the calling environment and continuing through its enclosing environments to the search path:
formals
#> function (fun = sys.function(sys.parent()), envir = parent.frame())
#> {
#> if (is.character(fun))
#> fun <- get(fun, mode = "function", envir = envir)
#> .Internal(formals(fun))
#> }
But if the package containing that function isn't attached to the session, then R - understandably - fails to find a suitable object.
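The lookup behaviour described above can be reproduced without {rsample} at all. The sketch below uses {tools} as a stand-in for {spatialsample}, on the assumption that it is installed (it ships with R) but not attached, which is the default in a fresh session. The helper formals_from_namespace() is hypothetical and is not part of {rsample}; it only illustrates that resolving the name inside the package's namespace sidesteps the search path.

```r
# Hypothetical helper (not part of {rsample}): look up a function by name
# inside a package's namespace instead of relying on the search path.
# `pkg` must be installed, but does not need to be attached.
formals_from_namespace <- function(fun_name, pkg) {
  fun <- get(fun_name, envir = asNamespace(pkg), mode = "function")
  names(formals(fun))
}

# Passing a bare string relies on the search path, so this is expected to
# fail while {tools} is not attached. The error is caught and its message
# kept, so the script keeps running either way.
by_name <- tryCatch(
  names(formals("file_ext")),
  error = function(e) conditionMessage(e)
)

# Resolving the same name inside the {tools} namespace does not depend on
# what is attached.
by_namespace <- formals_from_namespace("file_ext", "tools")
by_namespace
```

This approach needs the package name to be known, which the function name alone does not provide, so it is one possible direction rather than a drop-in fix.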
Workaround
The trivial workaround is to ensure the other package is attached to the current session, e.g. with library().
But in certain contexts (e.g. {targets} pipelines, {box} modules), this doesn't align with best practices.
NB. my "more realistic" example in the Impact section above is much closer to how I encountered this error in the first place: we have a {targets} pipeline that prepares a spatial dataset, and then defines and tunes various tidymodels workflows using it. For various reasons, we tend to avoid using library() in these pipelines.
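A possible alternative to attaching the whole package is to bind only the one function under its own name in the global environment, since the by-name lookup in formals() passes through the global environment before it reaches the search path. The sketch below shows the idea with {tools} standing in for {spatialsample}. The helper bind_without_attaching() is hypothetical, and whether this is acceptable in a {targets} pipeline or on parallel workers is an assumption that has not been tested here.

```r
# Hypothetical helper: copy one exported function into an environment under
# its own name, so that a by-name lookup can find it without library().
bind_without_attaching <- function(fun_name, pkg, envir = globalenv()) {
  fun <- getExportedValue(pkg, fun_name)
  assign(fun_name, fun, envir = envir)
  invisible(fun)
}

# {tools} ships with R and is not attached by default, so it plays the role
# of the "other package" here.
bind_without_attaching("file_path_sans_ext", "tools")

# The string lookup now succeeds because the name is bound in the global
# environment, even though {tools} itself is still not attached.
args_found <- names(formals("file_path_sans_ext"))
args_found
```

For the case in this issue, the equivalent would be bind_without_attaching("spatial_block_cv", "spatialsample") before calling rsample::.get_split_args() or tune::tune_grid().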