hist(group): add validation & empty-data handling; docstring, guide and tests#658
hist(group): add validation & empty-data handling; docstring, guide and tests#658Subat-01 wants to merge 1 commit intodata-8:masterfrom
Conversation
…and empty-group handling; add tests; fix folium tile attribution and BeautifyIcon textColor compatibility
| Constraints and behavior: | ||
| - ``group`` cannot be combined with ``bin_column``. | ||
| - ``group`` requires exactly one histogram value column. If more | ||
| than one value column is passed, a ``ValueError`` is raised. |
There was a problem hiding this comment.
No need to specify which exception is raised if a constraint is violated.
| - ``group`` cannot be combined with ``bin_column``. | ||
| - ``group`` requires exactly one histogram value column. If more | ||
| than one value column is passed, a ``ValueError`` is raised. | ||
| - If ``group`` does not reference an existing column (by label or |
| warnings.warn("It looks like you're making a grouped histogram with " | ||
| "a lot of groups ({:d}), which is probably incorrect." | ||
| .format(grouped.num_rows)) | ||
| if grouped.num_rows == 0: |
There was a problem hiding this comment.
What is the argument for this change in behavior?
| # This code is factored as a function for clarity only. | ||
| n = len(values_dict) | ||
| if n == 0: | ||
| # Create an empty figure to maintain a no-error contract on empty groups |
There was a problem hiding this comment.
What is the argument for this change in behavior? Why is it important to avoid exceptions in this case?
|
|
||
| Notes and constraints: | ||
| - `group` cannot be used together with `bin_column`. | ||
| - `group` expects exactly one numeric value column (e.g., `'height'`). Passing multiple value columns raises a `ValueError`. |
There was a problem hiding this comment.
I don't think the exception raised should be part of the API specification.
| Notes and constraints: | ||
| - `group` cannot be used together with `bin_column`. | ||
| - `group` expects exactly one numeric value column (e.g., `'height'`). Passing multiple value columns raises a `ValueError`. | ||
| - If `group` does not reference an existing column label or index, a `ValueError` is raised. |
There was a problem hiding this comment.
I don't think the exception raised should be part of the API specification.
| - `group` cannot be used together with `bin_column`. | ||
| - `group` expects exactly one numeric value column (e.g., `'height'`). Passing multiple value columns raises a `ValueError`. | ||
| - If `group` does not reference an existing column label or index, a `ValueError` is raised. | ||
| - If the data are empty for all groups, `hist` creates an empty figure and returns without error. |
|
Thanks for these improvements! I prefer that each pull request address one self-contained issue. Please split out the fixes to maps/tiles in a separate PR. I do prefer that we stick to .rst. |
Background
Table.histneeded clearer documentation and safer behavior for thegroupargument.Changes
Code
datascience/tables.py:histgroupis a legal column (label or index); otherwise raiseValueErrorwith a clear message.groupcannot be used withbin_column; and grouping allows only one numeric data column.t.hist('height', group='gender')t.hist('height', group='gender', side_by_side=True)Docs
docs/hist_grouping.md:.rstand add it to the TOC.)Tests
tests/test_hist_group.pyValueError.(Optional) Maps small fixes
datascience/maps.pyattrsupplied (Folium >= 0.20).text_color->textColoroption forBeautifyIconfor backward compatibility.Verification
Files changed (for quick review)
datascience/tables.py[around lines ~5310, ~5398, ~5420, ~5455]docs/hist_grouping.mdtests/test_hist_group.pydatascience/maps.pyCompatibility & Notes