Function to parse restart dates out of event files by infotroph · Pull Request #3828 · PecanProject/pecan

infotroph · 2026-02-23T17:12:47Z

Description

Single-PFT models like SIPNET need to reinitialize their parameterization when the plant type changes. Here's a function to get the list of dates and plant types from a (possibly multisite) events JSON.

Discussion needed: This version assumes the crop type is specified in each planting event, but this is not enforced by the existing events schema. Should we update the schema or have this function ignore all planting events with no crop type specified (even if this means returning an empty list)?
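The two options raised above can be sketched as follows. This is only an illustration, not the code in this PR: the JSON layout (`site_id`, `events`, `event_type`, `date`, `crop`) is an assumed shape that may not match the real events schema, and `planting_rows()` and `crop_starts()` are hypothetical helper names. Setting `on_missing = "error"` mimics a schema that requires a crop on every planting event, while `on_missing = "drop"` ignores planting events with no crop.

```r
# Hypothetical event layout for illustration; the real PEcAn events schema may differ.
events_json <- '[
  {"site_id": "S1", "events": [
    {"event_type": "planting", "date": "2020-04-15", "crop": "corn"},
    {"event_type": "harvest",  "date": "2020-10-01"},
    {"event_type": "planting", "date": "2021-05-01"},
    {"event_type": "planting", "date": "2022-04-20", "crop": "soy"}
  ]}
]'

# Keep the nested list structure rather than letting jsonlite simplify it.
sites <- jsonlite::fromJSON(events_json, simplifyVector = FALSE)

# Collect the planting events of one site into a data frame;
# a missing crop becomes NA instead of silently disappearing.
planting_rows <- function(site) {
  plantings <- Filter(function(ev) identical(ev$event_type, "planting"), site$events)
  data.frame(
    site_id = rep(site$site_id, length(plantings)),
    date = as.Date(vapply(plantings, function(ev) ev$date, character(1))),
    crop = vapply(
      plantings,
      function(ev) if (is.null(ev$crop)) NA_character_ else ev$crop,
      character(1)
    )
  )
}
plantings <- do.call(rbind, lapply(sites, planting_rows))

# on_missing = "error" mirrors a schema that requires crop;
# on_missing = "drop" ignores planting events with no crop.
crop_starts <- function(plantings, on_missing = c("error", "drop")) {
  on_missing <- match.arg(on_missing)
  no_crop <- is.na(plantings$crop)
  if (any(no_crop) && on_missing == "error") {
    stop(sum(no_crop), " planting event(s) have no crop specified")
  }
  plantings[!no_crop, , drop = FALSE]
}

crop_starts(plantings, on_missing = "drop")
```

With the "drop" policy, a file in which no planting event names a crop yields a zero-row table, which is the empty-result case mentioned above.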

Motivation and Context

Review Time Estimate

Immediately
Within one week
When possible

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
My name is in the list of CITATION.cff
I agree that PEcAn Project may distribute my contribution under any or all of
- the same license as the existing code,
- and/or the BSD 3-clause license.
I have updated the CHANGELOG.md.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
All new and existing tests passed.

dlebauer · 2026-02-24T01:36:05Z

I agree that the events schema should be updated. It makes sense for each planting event to be required to specify the associated crop or pft.

dlebauer

This looks good. Yes, it makes sense to require a crop identifier.

Before merging, please:

- update or create a new schema that requires a crop identifier
- create an example and ensure it validates
- enable the currently skipped test
- either:
  - pull out helper functions for events.json <--> table
  - create an issue, or ask me to do so

dlebauer · 2026-02-24T03:39:44Z

+#' events_to_crop_cycle_starts(evts)
+#' }
+events_to_crop_cycle_starts <- function(event_json) {
+  jsonlite::read_json(event_json) |>


I think it would be helpful to have helper functions that convert events.json to and from tables, e.g. events_json_to_table() and events_table_to_json(). Please either implement them or convert this comment to an issue.

infotroph · Feb 24, 2026

Where else do you expect to want to use these, and for what fraction of event usage? If we'll ~always want to process events in table format, then maybe they should be stored that way instead of unnesting from JSON all the time.

mdietze · Feb 24, 2026

Each event type is already being generated in a "tidy" tabular format, so staying tabular is easier and more compact for us. But the crux of the problem comes when you have to interleave different event types chronologically, as each event type has different variables associated with it. We solved that in SIPNET by not having column headers -- you just have to know from the metadata the position of the different variables in each row. That remains an option here, but we'll lose the ability for the data frame to play nicely with lots of R tools (e.g., tidyverse). Other options are a wide format (all possible event column names, most irrelevant for most events) or a long format (e.g., datetime, site, variable, value), which will result in each event taking up multiple rows.
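To make the wide and long options concrete, here is a small sketch. The per-event variables (`seed_rate`, `amount_mm`, `frac_removed`) are made-up names for illustration and are not taken from the events schema. The wide table has one row per event with many `NA` cells; the long table has one row per recorded value, so an event with two variables occupies two rows and mixed types have to be stored as character.

```r
# Per-event-type tables; column names other than site_id/date are hypothetical.
planting <- data.frame(
  site_id = "S1", date = as.Date("2020-04-15"), event_type = "planting",
  crop = "corn", seed_rate = 8
)
irrigation <- data.frame(
  site_id = "S1", date = as.Date(c("2020-06-01", "2020-07-01")),
  event_type = "irrigation", amount_mm = c(25, 30)
)
harvest <- data.frame(
  site_id = "S1", date = as.Date("2020-10-01"), event_type = "harvest",
  frac_removed = 0.9
)

# Wide: interleave chronologically; columns missing from a type are filled with NA.
wide <- dplyr::bind_rows(planting, irrigation, harvest) |>
  dplyr::arrange(site_id, date)

# Long: one row per (event, variable); values are coerced to character
# because crop is text while the other variables are numeric.
long <- wide |>
  tidyr::pivot_longer(
    cols = c(crop, seed_rate, amount_mm, frac_removed),
    names_to = "variable",
    values_to = "value",
    values_transform = list(value = as.character),
    values_drop_na = TRUE
  )

wide
long
```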

infotroph · Feb 27, 2026

This sounds to me like we don't yet have a single target table format, so I think it's premature to try to write the helper for it in this PR. It seems very plausible that the format will be context-specific: Here I unnested first to get a wide (and sparse!) table and it was fine, but in other cases we might want to filter the still-nested events by event type/crop/etc so that we can unnest to a form with fewer NAs.
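The difference between unnesting first and filtering first can be sketched like this. The event fields and the `to_table()` helper are hypothetical and only serve the illustration: unnesting everything gives a sparse table, whereas filtering the nested events by type before converting gives a table with only the columns that type uses.

```r
# Nested events as they might look after parsing JSON (field names are assumptions).
events <- list(
  list(event_type = "planting",   date = "2020-04-15", crop = "corn"),
  list(event_type = "irrigation", date = "2020-06-01", amount_mm = 25),
  list(event_type = "harvest",    date = "2020-10-01", frac_removed = 0.9),
  list(event_type = "planting",   date = "2021-05-01", crop = "soy")
)

# Hypothetical helper: one row per event, NA where an event lacks a field.
to_table <- function(evs) dplyr::bind_rows(lapply(evs, as.data.frame))

# Unnest first: every column any event uses, mostly NA.
all_wide <- to_table(events)

# Filter the still-nested events first, then unnest: no irrelevant columns.
is_planting <- function(ev) identical(ev$event_type, "planting")
planting_only <- to_table(Filter(is_planting, events))

sum(is.na(all_wide))
sum(is.na(planting_only))
```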

infotroph · 2026-02-25T11:38:40Z

schema update is in #3836

infotroph · 2026-02-27T20:26:22Z

@dlebauer Can you elaborate on "create example and ensure it validates"? I don't follow -- remember that this function consumes JSON rather than producing it.

ashiklom · 2026-03-11T00:18:05Z

+    dplyr::group_by(.data$site_id, .data$crop_cycle_id) |>
+    dplyr::slice_min(.data$date) |>
+    dplyr::select("site_id", "date", "crop")


Two quick notes (from trying these changes in my restart work):

This will return a grouped data frame because we never ungroup. I suggest either adding an ungroup call to the end of the pipe or (my personal preference) replacing the group_by with the by argument to slice_min to only group that one operation.
dplyr::slice_min(.data$date, by = c("site_id", "crop_cycle_id")) |>

I think #3836 (Update events schema: require crop id when planting, remove pft from site) will start to require crop_code, right? I'm not sure what the relationship between crop and crop_code is, but assuming we'll be using crop_code instead of crop, this function will need to be adjusted accordingly. Since the two PRs are closely related, I suggest sequencing this one after #3836 and future-proofing it to use crop_code right away.
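The grouping behavior in the first note can be reproduced on a toy table. The data below are invented for illustration; the column names follow the diff excerpt. Without an ungroup step the result stays grouped, and `select()` also keeps the grouping column `crop_cycle_id` even though it was not requested. Both suggested fixes return a plain three-column table. The `by` argument of `slice_min()` needs dplyr 1.1.0 or later.

```r
library(dplyr)

# Toy events table, invented for illustration.
events <- data.frame(
  site_id = c("S1", "S1", "S1", "S2"),
  crop_cycle_id = c(1, 1, 2, 1),
  date = as.Date(c("2020-04-15", "2020-10-01", "2021-05-01", "2020-05-10")),
  crop = c("corn", "corn", "soy", "wheat")
)

# As in the diff excerpt: the result stays grouped, and select() re-adds
# the missing grouping variable crop_cycle_id with a message.
grouped <- events |>
  group_by(site_id, crop_cycle_id) |>
  slice_min(date) |>
  select("site_id", "date", "crop")

# Option 1: ungroup before selecting.
ungrouped <- events |>
  group_by(site_id, crop_cycle_id) |>
  slice_min(date) |>
  ungroup() |>
  select("site_id", "date", "crop")

# Option 2: group only the slice_min() step via `by`.
per_operation <- events |>
  slice_min(date, by = c(site_id, crop_cycle_id)) |>
  select("site_id", "date", "crop")

is_grouped_df(grouped)
names(grouped)
is_grouped_df(per_operation)
```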

infotroph · Mar 11, 2026

Good call!

Yep, this is now waiting on schema finalization in #3836 (Update events schema: require crop id when planting, remove pft from site). I'll mark it as waiting.

first draft of fn to parse restart dates out of event files

0e85564

github-actions bot added tests modules labels Feb 23, 2026

infotroph changed the title ~~New function to parse restart dates out of event files~~ Function to parse restart dates out of event files Feb 23, 2026

dlebauer requested changes Feb 24, 2026

infotroph mentioned this pull request Feb 25, 2026

Update events schema: require crop id when planting, remove pft from site #3836

Open

infotroph commented Feb 27, 2026

Comment thread modules/data.land/tests/testthat/test-events_to_crop_cycle_starts.R Outdated

Update modules/data.land/tests/testthat/test-events_to_crop_cycle_sta…

7393801

…rts.R

mdietze approved these changes Feb 27, 2026

ashiklom reviewed Mar 11, 2026

infotroph added the status:blocked Waiting for another PR/issue (say which one in comments) label Mar 11, 2026

ashiklom approved these changes Mar 11, 2026

Merge branch 'develop' into sipnet-hop

2f69964

ashiklom mentioned this pull request Apr 17, 2026

SIPNET workflow for restarting with events #3919

Open

infotroph wants to merge 3 commits into PecanProject:develop from infotroph:sipnet-hop

4 participants
