Skip to content

add ca fertilization harmonization and ncc compost support#4002

Open
divine7022 wants to merge 75 commits into
PecanProject:developfrom
divine7022:ncc-compost
Open

add ca fertilization harmonization and ncc compost support#4002
divine7022 wants to merge 75 commits into
PecanProject:developfrom
divine7022:ncc-compost

Conversation

@divine7022
Copy link
Copy Markdown
Member

@divine7022 divine7022 commented May 18, 2026

Description

folds standalone N fertilization harmonization (was living at /projectnb/dietzelab/ccmmf/management/fertilization/harmonization.R) into data.land data-raw pipeline, then layers ncc (compost) sampling support on top and extends the events schema with an ncc event type

events_schema_v0.1.2.json picks up ncc as an allowed event_type with material, ncc_subtype, fert_subtype, and pft properties

separate workflow PR will consume these samplers from workflows/fertilization-statewide and workflows/ncc-statewide to emit ensemble events

cc @sarahkanee @mdietze @infotroph @dlebauer

Motivation and Context

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • My name is in the list of CITATION.cff
  • I agree that PEcAn Project may distribute my contribution under any or all of
    • the same license as the existing code,
    • and/or the BSD 3-clause license.
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@divine7022
Copy link
Copy Markdown
Member Author

high level, priors are out now, two ref tables (ca_compost_amendment, ca_n_application_rate) stay as bundled data for now. and new ag/management package vs bety priors vs whatever is separate conv

@dlebauer
Copy link
Copy Markdown
Member

new ag/management package vs bety priors vs whatever is separate conv

@divine7022 Could you please clarify?

Copy link
Copy Markdown
Member

@dlebauer dlebauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for these changes.

General changes requested:

Better documentation

(Make sure to review the data packaging guidance from the Data chapter of the "R-Packages" book)

Documentation should cover both the source (TSV) and package (RDA) data:

package datasets should be documented in roxygen, and should include

  • definitions of each field
  • information about where the data came from:
    • references
    • how the dataset was compiled, including assumptions
    • link to more details in source data README.

Source datasets can be documented in a data-raw/README.md and should include:
- where data came from
- what decisions were made during curation
- note that for package datasets that do not have a manual curation step, the provenance of package data is documented by R scripts.
- how to update both the TSV and .rda files

dataset names

Make sure naming is consistent, e.g. create_n_rate_data.R reads a file n_fertilizaton.tsv and generates a dataset ca_n_application_rate. This can get confusing.

consider renaming compost.tsv to organic_ammendments.tsv to bettr reflect the diversity of materials represented in the dataset.

@@ -0,0 +1,33 @@
Material C_MIN (C:N) C_MAX (C:N) C_Avg (C:N) C_Assumed (%) Total N (%) 4 week PAN (%) LowerN/HigherN RowsMIN_AppRate (tons/acre) RowsMIN_AppRate (lbs/acre) RowsMIN_Total_N (lbs N/acre) RowsMIN_Avail_N (lbs N/acre) RowsMIN_Total_C (lbs C/acre) RowsMAX_AppRate (tons/acre) RowsMAX_AppRate (lbs/acre) RowsMAX_Total_N (lbs N/acre) RowsMAX_Avail_N (lbs N/acre) RowsMAX_Total_C (lbs C/acre) TreesMIN_AppRate (tons/acre) TreesMIN_AppRate (lbs/acre) TreesMIN_Total_N (lbs N/acre) TreesMIN_Avail_N (lbs N/acre) TreesMIN_Total_C (lbs C/acre) TreesMAX_AppRate (tons/acre) TreesMAX_AppRate (lbs/acre) TreesMAX_Total_N (lbs N/acre) TreesMAX_Avail_N (lbs N/acre) TreesMAX_Total_C (lbs C/acre) Source
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table combines stoichiometric properties with assumed application-rate
scenarios.

  • would it make more sense to separate material properties from application-rates?
  • See more general requests about dataset documentation in my general review comments. Some specific questions are below
  • what columns come from the source, and what information was added during manual curation?
    • e.g. are the Rows* and Trees* application-rate values directly supported by
      the cited sources, or did sources provide more granular, species level rates?
  • Why does the build script expose only the Rows* values in
    ca_compost_amendment?
  • Is the intended distinction that synthetic N rates are crop-specific, while
    compost/organic amendment rates are only available at a broader crop-structure
    level?
  • Given the contents include manure, blood meal, paper, wood chips, straw, and
    other organic materials, would organic_amendments.tsv or
    compost_amendments.tsv be a clearer filename than compost.tsv?

# map each raw material name to one of the CalRecycle classes
# (14 CCR section 17852). biosolids is empty in the current table.
material_to_class <- function(m) {
s <- tolower(m)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does m come from? I don't see it in compost.tsv

Comment on lines +37 to +38
app_rate_min = .data$`RowsMIN_AppRate (lbs/acre)`,
app_rate_max = .data$`RowsMAX_AppRate (lbs/acre)`,
Copy link
Copy Markdown
Member

@dlebauer dlebauer May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are only Rows[MIN|MAX] and not Trees[MIN|MAX] used here? (see also general question about documentation)

@@ -0,0 +1,90 @@
PFT Group Crop PlantStage Season MINN MAXN Unit Source Notes
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do crop-names come from a controlled vocabulary? Any reason not to use the LandIQ names here?

Where do PFTs come from? On one hand, row/woody is a simple designation, and there may be reasons to include a grouping in this dataset. But caution should be used and rationale/motivation documented because PFT groupings are opinionated, and in this case alfalfa is not typically considered a 'row' crop (it is forage).

Copy link
Copy Markdown
Member

@dlebauer dlebauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@divine7022

To unblock #4003, which currently depends on this PR, you could

  • copy the .rda and .R files from this PR to the workflows/[fertilization|ncc]-statewide directories
  • update #4003 to use those resources

This is in line with one of the motivations for the workflows directory - as a place to maintain and develop workflows until individual components can be generalized and extracted into PEcAn package functions and data

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants