Report assumptions #606
Pull request overview
This PR introduces assumption-check reporting into the report package, adding a new report_assumptions() helper and integrating its output into both the human narrative (report_text.lm()) and AI-optimized (audience = "ai") report flows.
Changes:
- Added report_assumptions() to summarize influential observations, homoskedasticity, and collinearity (with human + AI output formats).
- Integrated assumption summaries into report() outputs for lm-family models and into the AI report block generation.
- Added a report_text() / report() method for performance::check_outliers() plus new test coverage and documentation/vignette updates.
Reviewed changes
Copilot reviewed 10 out of 16 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| vignettes/report_ai.Rmd | Updates vignette to recommend report(x, audience = "ai") instead of report_ai(). |
| tests/testthat/test-report_assumptions.R | Adds unit tests for report_assumptions() and its integration into report() / AI output. |
| R/report.lm.R | Adds assumptions argument and injects assumption summary sentence into report_text.lm(). |
| R/report.check_outliers.R | Adds reporting methods for check_outliers objects and helper method-name formatting. |
| R/report_assumptions.R | Implements new exported assumption-reporting function with human + AI output. |
| R/report_ai.R | Adds @exportS3Method tags and injects an ## Assumptions block into AI report output. |
| NEWS.md | Adds changelog entries describing the new assumptions reporting and related fixes. |
| NAMESPACE | Registers new S3 methods and exports report_assumptions (but stops exporting report_ai). |
| man/report.Rd | Updates audience documentation to describe AI output as report_ai class markdown. |
| man/report.compare.loo.Rd | Fixes Rd link target for brms::loo_compare. |
| man/report-package.Rd | Updates package authors section (generated doc change). |
| man/report_text.check_outliers.Rd | Adds generated docs for report_text.check_outliers() / report.check_outliers(). |
| man/report_assumptions.Rd | Adds generated docs for report_assumptions(). |
| man/reexports.Rd | Updates generated “reexports” help page formatting/links. |
| DESCRIPTION | Updates Collate order and adds RoxygenNote. |
| man/report_ai.Rd | Removes generated docs for report_ai() (no longer exported). |
Files not reviewed (6)
- man/reexports.Rd: Language not supported
- man/report-package.Rd: Language not supported
- man/report.Rd: Language not supported
- man/report.compare.loo.Rd: Language not supported
- man/report_assumptions.Rd: Language not supported
- man/report_text.check_outliers.Rd: Language not supported

DESCRIPTION
    'report.stanreg.R'
    'report.brmsfit.R'
    'report.character.R'
    'report.check_outliers.R'
    'report.compare.loo.R'

R/report_assumptions.R
    #' @export
    report_assumptions <- function(
      x,
      ...,
      audience = getOption("report_audience", "humans")
    ) {
      insight::check_if_installed("performance")
      audience <- match.arg(audience, c("humans", "ai"))
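
The signature above takes its default audience from a global option and then validates it with match.arg(). A minimal base-R sketch of that argument handling follows. The function resolve_audience() is a hypothetical stand-in written for illustration and is not part of the package.

```r
# Hypothetical stand-in that mirrors only the argument handling shown in the
# excerpt; it is not the package's implementation.
resolve_audience <- function(audience = getOption("report_audience", "humans")) {
  # match.arg() errors unless `audience` matches one of the allowed choices
  match.arg(audience, c("humans", "ai"))
}

# With the option unset, the fallback value "humans" is used
default_audience <- resolve_audience()

# Setting the option changes the default for subsequent calls
old <- options(report_audience = "ai")
option_audience <- resolve_audience()
options(old) # restore the previous (unset) state

# An unsupported value is rejected by match.arg(); capture the error
invalid_audience <- tryCatch(
  resolve_audience("robots"),
  error = function(e) "rejected"
)
```

Reading the default from an option lets a user switch every report in a session to the AI format once, while an explicit audience argument still overrides it per call.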
Code Review
This pull request introduces report_assumptions(), a new function for summarizing model assumption checks like influential observations and heteroskedasticity, with support for both human and AI audiences. These checks are now integrated into linear model reports, and several S3 methods have been updated with proper export tags. Reviewers suggested improving code robustness by explicitly extracting p-values to avoid logical errors in conditional checks, ensuring that optional arguments are passed to underlying functions, and using datawizard::text_concatenate() for better natural language formatting of lists.
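
The list-formatting point in the summary above can be illustrated in base R. The helper concat_and() below is hypothetical and written only for this sketch; the review itself recommends datawizard::text_concatenate(). The method labels are illustrative too.

```r
# Hypothetical base-R helper for illustration only; the review recommends
# datawizard::text_concatenate() for this job.
concat_and <- function(x, sep = ", ", last = " and ") {
  n <- length(x)
  if (n <= 1) {
    return(paste(x, collapse = ""))
  }
  # Join all but the final item with `sep`, then attach the final item with `last`
  paste0(paste(x[-n], collapse = sep), last, x[n])
}

# Illustrative labels, not taken from the reviewed code
method_labels <- c("Cook's distance", "Pareto", "Mahalanobis")

# Naive joining repeats "and" between every pair of items
naive <- paste(method_labels, collapse = " and ")

# The helper uses commas and a single "and" before the last item
joined <- concat_and(method_labels)
```

Whether a serial comma appears before the final "and" depends on the separator settings; this sketch omits it.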
I am having trouble creating individual review comments, so my feedback is listed here.
R/report_assumptions.R (64-66)
Calling as.numeric() on the result of performance::check_heteroskedasticity() (which is typically a data frame) may fail or produce unexpected results. Furthermore, if multiple p-values are returned, homosked_ok will be a vector, which will cause a warning or error in if (homosked_ok) (lines 122 and 178) in recent R versions. It is safer to extract the p-value column explicitly and use all() to ensure a single logical value.
    p_val <- as.numeric(heterosk$p)
    p_fmt <- insight::format_p(p_val)
    homosked_ok <- all(p_val >= 0.05)

R/report.check_outliers.R (57)
Using paste(..., collapse = " and ") can result in grammatically awkward strings when more than two methods are used (e.g., "A and B and C"). It is recommended to use datawizard::text_concatenate() which correctly handles lists of any length (e.g., "A, B, and C").
    method_str <- datawizard::text_concatenate(method_parts)

R/report.lm.R (760)
The ... arguments are not passed to report_assumptions(). This prevents users from passing custom parameters (such as specific outlier detection methods or thresholds) through the main report() call to the underlying performance functions.
    report_assumptions(x, ...),

R/report_assumptions.R (228)
Consider using datawizard::text_concatenate() here to ensure the list of terms is grammatically correct (e.g., using "and" for the last item instead of just a comma).
    datawizard::text_concatenate(collin_info$flagged_terms),

R/report_assumptions.R (266)
Using paste(..., collapse = " and ") is suboptimal for lists with more than two items. datawizard::text_concatenate() is preferred for generating natural language lists in easystats reports.
    method_str <- datawizard::text_concatenate(method_labels)

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
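
The first review point, that a vector-valued condition breaks if(), can be reproduced in base R. The sketch below uses made-up p-values. Only 0.029 appears in this thread; the second value is an assumption added to give a length-two vector.

```r
# Illustrative p-values; in the reviewed code they would come from the
# heteroskedasticity check rather than being typed in by hand.
p_val <- c(0.029, 0.410)

# Element-wise comparison yields one logical per p-value
per_test <- p_val >= 0.05

# A condition of length > 1 in if() is an error in R >= 4.2.0 and a warning
# in earlier versions, so capture whichever occurs
outcome <- tryCatch(
  if (per_test) "ok" else "violated",
  error = function(e) "error",
  warning = function(w) "warning"
)

# all() collapses the vector to a single TRUE/FALSE that if() accepts
homosked_ok <- all(per_test)
verdict <- if (homosked_ok) {
  "no heteroskedasticity detected"
} else {
  "heteroskedasticity detected"
}
```

Using all() makes the check conservative: one p-value below 0.05 is enough to report heteroskedasticity.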

Nice! Do we want this to be the default, though? I'd say the default should be not to include the assumptions, because usually you want to focus on results.

I would have said so a couple of months ago, but given the now mostly educational value of these reports (i.e., useful to show how to optimally phrase and formulate things), I'd be fine with keeping the default on, as it only adds a sentence. But I don't have a strong opinion either way.

I like reporting assumptions, but I think that reporting the results of various "tests" of assumptions goes against the "visually checking" philosophy we've been encouraging with functions like [...]

Ironically, the assumptions check probably works better for more complex models, where we rely on DHARMa.

I'd say we should not include it by default, but having a report() method for our assumption check objects would be great.

Started integrating assumption reports in the report:
We fitted a linear model (estimated using OLS) to predict disp with mpg and hp
(formula: disp ~ mpg + hp). The model's assumptions were checked: 2 influential
observations (5.88%) were detected (Cook's distance), heteroskedasticity was
detected (p = 0.029) and no collinearity was detected. The model explains a
statistically significant and substantial proportion of variance (R2 = 0.54,
F(2, 31) = 18.40, p < .001, adj. R2 = 0.51). The model's intercept,
corresponding to mpg = 0 and hp = 0, is at 49.34 (95% CI [-148.35, 247.02],
t(31) = 0.51, p = 0.614). Within this model:
  - [...] -0.30, 95% CI [-6.04, 5.45], t(31) = -0.10, p = 0.917; Std. beta = -0.02, 95% CI [-0.36, 0.32])
  - [...] CI [0.72, 1.97], t(31) = 4.36, p < .001; Std. beta = 0.72, 95% CI [0.39, 1.06])
Standardized parameters were obtained by fitting the model on a standardized
version of the dataset. 95% Confidence Intervals (CIs) and p-values were
computed using a Wald t-distribution approximation.
Created on 2026-05-25 with reprex v2.1.1