From f0d81ebf4b9036673cc424a0252bf652b4cb0354 Mon Sep 17 00:00:00 2001
From: Jim Dowling <jim@hopsworks.ai>
Date: Thu, 21 May 2026 14:48:19 +0200
Subject: [PATCH 1/6] [HWORKS-2802] Document partitioned_by parameter on
 feature group creation https://hopsworks.atlassian.net/browse/HWORKS-2802
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a section to docs/user_guides/fs/feature_group/create.md
describing the storage-engine-native partitioned_by parameter for
Delta feature groups. Covers:

- Usage example with create_feature_group / get_or_create_feature_group.
- The CREATE TABLE … USING DELTA … GENERATED ALWAYS AS … contract:
  the storage layer derives the partition columns; the user's
  dataframe never carries them.
- Validation rules: mutual exclusion with partition_key, requires
  event_time.
- Partition pruning table — Delta auto-derives partition predicates
  from the GENERATED expressions for hierarchical specs (year /
  year+month / year+month+day / year+month+day+hour), so
  `fg.read(start_time=..., end_time=...)` and
  `fg.filter(fg.event_time >= ...)` prune at the partition level.
  Non-hierarchical specs (e.g. ["month"], ["year","week"]) are valid
  but skip the auto-derivation — only direct predicates on the
  grain columns prune. Recommend hierarchical specs.
- Online feature store behavior: derived columns live offline-only
  by default; online_partition_columns=true opts into online
  materialization. Until the onlinefs consumer filter ships, the
  backend rejects partitioned_by + online_enabled=true with the
  default online_partition_columns=false. Document both
  workarounds.
- Hudi: partitioned_by + HUDI is rejected at creation; Hudi support
  is tracked under a separate follow-up ticket.
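
As a rough illustration of the validation rules listed above (a sketch
only, not the backend's code; the function name and error messages are
hypothetical):

```python
VALID_GRAINS = {"hour", "day", "week", "month", "year"}


def validate_partitioned_by(partitioned_by, event_time=None, partition_key=None):
    """Raise ValueError if the arguments break one of the documented rules."""
    unknown = [g for g in partitioned_by if g not in VALID_GRAINS]
    if unknown:
        raise ValueError(f"unknown grains: {unknown}")
    if partition_key:
        raise ValueError("partitioned_by and partition_key are mutually exclusive")
    if event_time is None:
        raise ValueError("partitioned_by requires event_time")
    return list(partitioned_by)


print(validate_partitioned_by(["year", "month", "day"], event_time="tx_ts"))
```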

Signed-off-by: Jim Dowling <jim@logicalclocks.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/user_guides/fs/feature_group/create.md | 54 +++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/docs/user_guides/fs/feature_group/create.md b/docs/user_guides/fs/feature_group/create.md
index c6db36f3ef..c7c6a91d0f 100644
--- a/docs/user_guides/fs/feature_group/create.md
+++ b/docs/user_guides/fs/feature_group/create.md
@@ -102,6 +102,60 @@ MaxDirectoryItemsExceededException - The directory item limit is exceeded: limit
 
 By using partitioning the system will write the feature data in different subdirectories, thus allowing you to write 10240 files per partition.
 
+##### Time-grain partitioning with `partitioned_by` (Delta only)
+
+When the partition columns are derived from the feature group's `event_time`, the Python client can hand the backend the desired time grains and let the storage engine generate the partition columns automatically.
+Pass `partitioned_by=[...]` with one or more grains drawn from `hour`, `day`, `week`, `month`, and `year`.
+
+```python
+fg = fs.get_or_create_feature_group(
+    name="transactions",
+    version=1,
+    primary_key=["tx_id"],
+    event_time="tx_ts",
+    partitioned_by=["year", "month", "day"],
+    time_travel_format="DELTA",
+)
+fg.insert(df)  # df does not need year/month/day — Delta derives them
+```
+
+The example above is equivalent to manually decomposing `tx_ts` into three columns and passing `partition_key=["year", "month", "day"]`.
+The backend creates the table via `CREATE TABLE … USING DELTA … GENERATED ALWAYS AS …`, so the derived columns live entirely inside the storage layer; the source dataframe never carries them.
+
+`partitioned_by` and `partition_key` are mutually exclusive.
+`partitioned_by` requires `event_time` to be set.
+
+###### Partition pruning
+
+Delta auto-derives partition predicates from the GENERATED expressions when the user filters on the source column.
+Filtering on `event_time` ranges therefore prunes partitions for free on hierarchical specs:
+
+| `partitioned_by` | Prunes on `event_time` range? | Prunes on `year` / `month` / `day` filter? |
+| --- | --- | --- |
+| `["year"]` | ✅ | ✅ |
+| `["year", "month"]` | ✅ | ✅ |
+| `["year", "month", "day"]` | ✅ | ✅ |
+| `["year", "month", "day", "hour"]` | ✅ | ✅ |
+| `["month"]` (no year) | ⚠️ no — month alone is ambiguous across years | ✅ filter on month works |
+| `["year", "week"]` | ⚠️ year only — week isn't directly derivable from a date range | ✅ both columns prune |
+| `["day"]` (no year/month) | ⚠️ no — day-of-month is ambiguous | ✅ filter on day works |
+
+Prefer hierarchical specs (`["year"]`, `["year", "month"]`, `["year", "month", "day"]`) — they line up with the typical batch-pipeline access pattern and prune naturally.
+
+###### Online feature store
+
+By default, the derived partition columns live only in the offline storage; the online feature store does not get them.
+Pass `online_partition_columns=True` to materialize them in the online row as well.
+
+While the online-store filter (the `onlinefs` consumer that drops `offline_only` columns from the RonDB write) is still pending, the backend rejects `partitioned_by` together with `online_enabled=true` and the default `online_partition_columns=false` to avoid writing the grain columns to RonDB by accident.
+The two workarounds: keep the feature group offline-only, or set `online_partition_columns=True` to materialize the grains online explicitly.
+
+###### Hudi
+
+`partitioned_by` on `time_travel_format="HUDI"` feature groups is not yet supported and the backend rejects it at creation.
+Hudi needs a different mechanism (a `CustomKeyGenerator` + server-side `Transformer`) and is tracked under a separate follow-up ticket.
+Until that lands, use `time_travel_format="DELTA"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
+
 ##### Table format
 
 When you create a feature group, you can specify the table format you want to use to store the data in your feature group by setting the `time_travel_format` parameter.

From 6b0c36317e3d698c6f82eaee2b64952b6a4267ef Mon Sep 17 00:00:00 2001
From: Jim Dowling <jim@hopsworks.ai>
Date: Sun, 31 May 2026 15:18:16 +0200
Subject: [PATCH 2/6] [HWORKS-2802] Update partitioned_by docs for the
 real-column design https://hopsworks.atlassian.net/browse/HWORKS-2802

The partitioned_by section described Delta GENERATED ALWAYS AS columns and
storage-engine-side derivation, which is no longer how it works. Document
the real design: the client derives the grain columns from event_time and
writes them as real partition columns. Pruning works natively on grain
filters and, through predicate translation, on event_time ranges. Correct
the online-store note: online-enabled partitioned_by feature groups are
rejected entirely until HWORKS-2808, not only with the default
online_partition_columns=False.
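
For reviewers, client-side derivation amounts to something like the
sketch below. It is illustrative only and not the client's code: the
helper name is hypothetical, and treating "week" as the ISO-8601 week
number is an assumption.

```python
from datetime import datetime


def derive_grains(event_time, grains):
    """Return {grain: value} for one event_time timestamp."""
    values = {
        "year": event_time.year,
        "month": event_time.month,
        "day": event_time.day,
        "hour": event_time.hour,
        # Assumption: ISO-8601 week number; the real definition may differ.
        "week": event_time.isocalendar()[1],
    }
    return {grain: values[grain] for grain in grains}


row = {"tx_id": 1, "tx_ts": datetime(2026, 6, 11, 14, 30)}
row.update(derive_grains(row["tx_ts"], ["year", "month", "day"]))
print(row["year"], row["month"], row["day"])  # 2026 6 11
```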

Signed-off-by: Jim Dowling <jim@logicalclocks.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 docs/user_guides/fs/feature_group/create.md | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/docs/user_guides/fs/feature_group/create.md b/docs/user_guides/fs/feature_group/create.md
index c7c6a91d0f..8197c9245f 100644
--- a/docs/user_guides/fs/feature_group/create.md
+++ b/docs/user_guides/fs/feature_group/create.md
@@ -104,8 +104,8 @@ By using partitioning the system will write the feature data in different subdir
 
 ##### Time-grain partitioning with `partitioned_by` (Delta only)
 
-When the partition columns are derived from the feature group's `event_time`, the Python client can hand the backend the desired time grains and let the storage engine generate the partition columns automatically.
-Pass `partitioned_by=[...]` with one or more grains drawn from `hour`, `day`, `week`, `month`, and `year`.
+When the partition columns are derived from the feature group's `event_time`, hand the backend the desired time grains with `partitioned_by=[...]` and the Python client derives the partition columns for you.
+Pass one or more grains drawn from `hour`, `day`, `week`, `month`, and `year`.
 
 ```python
 fg = fs.get_or_create_feature_group(
@@ -116,19 +116,20 @@ fg = fs.get_or_create_feature_group(
     partitioned_by=["year", "month", "day"],
     time_travel_format="DELTA",
 )
-fg.insert(df)  # df does not need year/month/day — Delta derives them
+fg.insert(df)  # df does not need year/month/day — the client derives them
 ```
 
 The example above is equivalent to manually decomposing `tx_ts` into three columns and passing `partition_key=["year", "month", "day"]`.
-The backend creates the table via `CREATE TABLE … USING DELTA … GENERATED ALWAYS AS …`, so the derived columns live entirely inside the storage layer; the source dataframe never carries them.
+The grain columns are ordinary materialized partition columns: the client computes them from `event_time` on each write and the backend registers them as partition columns through the normal table-creation path.
+The source dataframe does not need to carry them.
 
 `partitioned_by` and `partition_key` are mutually exclusive.
 `partitioned_by` requires `event_time` to be set.
 
 ###### Partition pruning
 
-Delta auto-derives partition predicates from the GENERATED expressions when the user filters on the source column.
-Filtering on `event_time` ranges therefore prunes partitions for free on hierarchical specs:
+The grain columns are real partition columns, so a filter on a grain column (for example `year == 2026`) prunes partitions natively.
+A filter on an `event_time` range is rewritten into equivalent grain-column predicates by the query layer, so it prunes too on hierarchical specs:
 
 | `partitioned_by` | Prunes on `event_time` range? | Prunes on `year` / `month` / `day` filter? |
 | --- | --- | --- |
@@ -144,11 +145,9 @@ Prefer hierarchical specs (`["year"]`, `["year", "month"]`, `["year", "month", "
 
 ###### Online feature store
 
-By default, the derived partition columns live only in the offline storage; the online feature store does not get them.
-Pass `online_partition_columns=True` to materialize them in the online row as well.
-
-While the online-store filter (the `onlinefs` consumer that drops `offline_only` columns from the RonDB write) is still pending, the backend rejects `partitioned_by` together with `online_enabled=true` and the default `online_partition_columns=false` to avoid writing the grain columns to RonDB by accident.
-The two workarounds: keep the feature group offline-only, or set `online_partition_columns=True` to materialize the grains online explicitly.
+Online-enabled feature groups do not yet support `partitioned_by`.
+The online ingestion path does not exclude the offline-only grain columns from the Kafka/Avro schema, nor materialize them for the online write, so the backend rejects `partitioned_by` together with `online_enabled=true` until that work lands (tracked under a separate follow-up ticket).
+Keep the feature group offline-only to use `partitioned_by`.
 
 ###### Hudi
 

From 00494373b894da3d99750817ca6eb8682ac7b171 Mon Sep 17 00:00:00 2001
From: Jim Dowling <jim@hopsworks.ai>
Date: Wed, 10 Jun 2026 11:09:54 +0200
Subject: [PATCH 3/6] [HWORKS-2802] Drop key-generator detail from the Hudi
 partitioned_by note https://hopsworks.atlassian.net/browse/HWORKS-2802

The Hudi follow-up materializes the grain columns server-side and
partitions on them directly; the CustomKeyGenerator phrasing described
a mechanism the revised design no longer uses.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 docs/user_guides/fs/feature_group/create.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user_guides/fs/feature_group/create.md b/docs/user_guides/fs/feature_group/create.md
index 8197c9245f..97fa30189e 100644
--- a/docs/user_guides/fs/feature_group/create.md
+++ b/docs/user_guides/fs/feature_group/create.md
@@ -152,7 +152,7 @@ Keep the feature group offline-only to use `partitioned_by`.
 ###### Hudi
 
 `partitioned_by` on `time_travel_format="HUDI"` feature groups is not yet supported and the backend rejects it at creation.
-Hudi needs a different mechanism (a `CustomKeyGenerator` + server-side `Transformer`) and is tracked under a separate follow-up ticket.
+Hudi materializes the grain columns server-side in the streaming materialization job, and that work is tracked under a separate follow-up ticket.
 Until that lands, use `time_travel_format="DELTA"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
 
 ##### Table format

From 1dec9e01c46fdfb56935b56924f8c959c05646c7 Mon Sep 17 00:00:00 2001
From: Jim Dowling <jim@hopsworks.ai>
Date: Thu, 11 Jun 2026 06:41:05 +0200
Subject: [PATCH 4/6] [HWORKS-2802] Expand partitioned_by feature group docs
 https://hopsworks.atlassian.net/browse/HWORKS-2802

Flesh out the partitioned_by section into a reference for the shipped
feature: the parameter list (partitioned_by and online_partition_columns,
with their constraints), cross-session persistence and the round trip
through get_feature_group, the on-disk Hive layout, a read and
partition-pruning example with the hierarchical vs. non-hierarchical
matrix, a clickstream-by-hour example, and the current online and Hudi
limitations (online-enabled groups are rejected both at creation and
when online is enabled later).
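
The pruning matrix can be read as the following sketch of how an
event_time range might map to grain-column bounds. It is a hypothetical
illustration, not the query layer's code; the function name and the
inclusive bounds are assumptions.

```python
from datetime import datetime

HIERARCHY = ["year", "month", "day", "hour"]


def grain_bounds(start, end, partitioned_by):
    """Map an inclusive [start, end] range to (grain, low, high) bounds."""
    bounds = []
    for grain, expected in zip(partitioned_by, HIERARCHY):
        if grain != expected:
            break  # past the hierarchical prefix nothing can be derived
        low, high = getattr(start, grain), getattr(end, grain)
        bounds.append((grain, low, high))
        if low != high:
            break  # finer grains are unconstrained once a coarser one spans a range
    return bounds


start, end = datetime(2026, 6, 1), datetime(2026, 6, 11)
print(grain_bounds(start, end, ["year", "month", "day"]))
# [('year', 2026, 2026), ('month', 6, 6), ('day', 1, 11)]
print(grain_bounds(start, end, ["year", "week"]))  # [('year', 2026, 2026)]
print(grain_bounds(start, end, ["month"]))         # []
```

The bounds are a conservative superset: a range that crosses a year
boundary only constrains year, so no matching partition is skipped.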

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 docs/user_guides/fs/feature_group/create.md | 83 ++++++++++++++++-----
 1 file changed, 66 insertions(+), 17 deletions(-)

diff --git a/docs/user_guides/fs/feature_group/create.md b/docs/user_guides/fs/feature_group/create.md
index 97fa30189e..d07f7e3cd3 100644
--- a/docs/user_guides/fs/feature_group/create.md
+++ b/docs/user_guides/fs/feature_group/create.md
@@ -104,7 +104,8 @@ By using partitioning the system will write the feature data in different subdir
 
 ##### Time-grain partitioning with `partitioned_by` (Delta only)
 
-When the partition columns are derived from the feature group's `event_time`, hand the backend the desired time grains with `partitioned_by=[...]` and the Python client derives the partition columns for you.
+Most time-series feature groups want to partition by a time grain derived from `event_time`.
+Instead of decomposing the timestamp into `year` / `month` / `day` columns yourself and passing them as `partition_key`, declare the grains with `partitioned_by` and let Hopsworks derive the partition columns for you.
 Pass one or more grains drawn from `hour`, `day`, `week`, `month`, and `year`.
 
 ```python
@@ -116,20 +117,50 @@ fg = fs.get_or_create_feature_group(
     partitioned_by=["year", "month", "day"],
     time_travel_format="DELTA",
 )
-fg.insert(df)  # df does not need year/month/day — the client derives them
+fg.insert(df)  # df does not need year/month/day; they derive from tx_ts
 ```
 
-The example above is equivalent to manually decomposing `tx_ts` into three columns and passing `partition_key=["year", "month", "day"]`.
-The grain columns are ordinary materialized partition columns: the client computes them from `event_time` on each write and the backend registers them as partition columns through the normal table-creation path.
-The source dataframe does not need to carry them.
+The example above is equivalent to manually decomposing `tx_ts` into three columns and passing `partition_key=["year", "month", "day"]`, but you never write the grain columns yourself.
+The grain columns are ordinary materialized partition columns: the client computes them from `event_time` on each write, and the backend registers them as partition columns through the normal table-creation path (no Delta generated columns, no extra job).
+The source DataFrame must contain only your real features plus `event_time`; it must not carry the grain columns.
 
-`partitioned_by` and `partition_key` are mutually exclusive.
-`partitioned_by` requires `event_time` to be set.
+On disk the data lands in the standard Hive layout, one directory level per grain in the order you listed them:
 
-###### Partition pruning
+```text
+.../transactions_1/year=2026/month=06/day=11/<parquet files>
+```
+
+The grains become real features on the feature group, so they show up in the schema and in `fg.partition_key`, and you can filter on them directly.
+By default they are written only to the offline store (see [Online feature store](#online-feature-store) below).
+
+###### Parameters
+
+- `partitioned_by`: ordered, non-empty list of grains from `{"hour", "day", "week", "month", "year"}`, no duplicates.
+  Mutually exclusive with `partition_key`, and requires `event_time` to be set.
+  A grain must not collide with `event_time` or an existing feature name.
+- `online_partition_columns` (default `False`): when `True`, the derived grain columns are also written to the online store; when `False` they are offline-only.
+  Online serving with `partitioned_by` is not supported yet, so this is effectively always `False` today (see below).
+
+###### Persistence across sessions
+
+`partitioned_by` is stored on the feature group, so it round-trips without re-passing it:
 
-The grain columns are real partition columns, so a filter on a grain column (for example `year == 2026`) prunes partitions natively.
-A filter on an `event_time` range is rewritten into equivalent grain-column predicates by the query layer, so it prunes too on hierarchical specs:
+```python
+fg = fs.get_feature_group("transactions", version=1)
+fg.partitioned_by          # ["year", "month", "day"]
+fg.partition_key           # ["year", "month", "day"]
+```
+
+###### Reading and partition pruning
+
+Read the whole group, or a time slice; the grain columns appear as normal feature columns, populated from `event_time`:
+
+```python
+recent = fg.read(start_time="2026-06-01", end_time="2026-06-11")
+```
+
+The grain columns are real partition columns, so a filter on a grain column (for example `fg.filter(fg.year == 2026)`) prunes partitions natively.
+A filter on an `event_time` range is rewritten by the query layer into equivalent grain-column predicates, so on hierarchical specs `fg.read(start_time=..., end_time=...)` prunes as well, tightening to the finest grain the range allows (a window within a single month also bounds `day`):
 
 | `partitioned_by` | Prunes on `event_time` range? | Prunes on `year` / `month` / `day` filter? |
 | --- | --- | --- |
@@ -137,23 +168,41 @@ A filter on an `event_time` range is rewritten into equivalent grain-column pred
 | `["year", "month"]` | ✅ | ✅ |
 | `["year", "month", "day"]` | ✅ | ✅ |
 | `["year", "month", "day", "hour"]` | ✅ | ✅ |
-| `["month"]` (no year) | ⚠️ no — month alone is ambiguous across years | ✅ filter on month works |
-| `["year", "week"]` | ⚠️ year only — week isn't directly derivable from a date range | ✅ both columns prune |
-| `["day"]` (no year/month) | ⚠️ no — day-of-month is ambiguous | ✅ filter on day works |
+| `["month"]` (no year) | ⚠️ no, month alone is ambiguous across years | ✅ filter on month works |
+| `["year", "week"]` | ⚠️ year only, week is not directly derivable from a date range | ✅ both columns prune |
+| `["day"]` (no year/month) | ⚠️ no, day-of-month is ambiguous | ✅ filter on day works |
+
+Prefer hierarchical specs: `["year"]`, `["year", "month"]`, `["year", "month", "day"]`, `["year", "month", "day", "hour"]`.
+They line up with the typical batch-pipeline access pattern and prune naturally on both grain-column and `event_time`-range filters.
+Non-hierarchical specs are still valid; they just do not prune on an `event_time` range, only on a direct filter of the derived columns.
 
-Prefer hierarchical specs (`["year"]`, `["year", "month"]`, `["year", "month", "day"]`) — they line up with the typical batch-pipeline access pattern and prune naturally.
+###### Example: clickstream partitioned by the hour
+
+A high-volume event stream partitioned down to the hour, so a query for a few hours reads only those partitions:
+
+```python
+fg = fs.get_or_create_feature_group(
+    name="clickstream",
+    version=1,
+    primary_key=["event_id"],
+    event_time="event_time",
+    partitioned_by=["year", "month", "day", "hour"],
+    online_enabled=False,
+    time_travel_format="DELTA",
+)
+fg.insert(clickstream_df)  # only event_id / event_time / event fields
+```
 
 ###### Online feature store
 
 Online-enabled feature groups do not yet support `partitioned_by`.
-The online ingestion path does not exclude the offline-only grain columns from the Kafka/Avro schema, nor materialize them for the online write, so the backend rejects `partitioned_by` together with `online_enabled=true` until that work lands (tracked under a separate follow-up ticket).
+The online ingestion path does not exclude the offline-only grain columns from the Kafka/Avro schema, nor materialize them for the online write, so the backend rejects `partitioned_by` together with `online_enabled=True`, both at creation and when enabling online on an existing group.
 Keep the feature group offline-only to use `partitioned_by`.
 
 ###### Hudi
 
 `partitioned_by` on `time_travel_format="HUDI"` feature groups is not yet supported and the backend rejects it at creation.
-Hudi materializes the grain columns server-side in the streaming materialization job, and that work is tracked under a separate follow-up ticket.
-Until that lands, use `time_travel_format="DELTA"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
+Until Hudi support lands, use `time_travel_format="DELTA"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
 
 ##### Table format
 

From 49db202ff6afb31a1117aedd1b8993d333a70cb3 Mon Sep 17 00:00:00 2001
From: Jim Dowling <jim@hopsworks.ai>
Date: Sat, 13 Jun 2026 00:17:48 +0200
Subject: [PATCH 5/6] [HWORKS-2807] Document partitioned_by support on Iceberg
 https://hopsworks.atlassian.net/browse/HWORKS-2807

partitioned_by now works on DELTA and ICEBERG; NONE is rejected alongside
Hudi. Update the section heading, supported-formats note, and the Hudi
fallback guidance.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 docs/user_guides/fs/feature_group/create.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/user_guides/fs/feature_group/create.md b/docs/user_guides/fs/feature_group/create.md
index 002ce4879f..367224968d 100644
--- a/docs/user_guides/fs/feature_group/create.md
+++ b/docs/user_guides/fs/feature_group/create.md
@@ -102,11 +102,12 @@ MaxDirectoryItemsExceededException - The directory item limit is exceeded: limit
 
 By using partitioning the system will write the feature data in different subdirectories, thus allowing you to write 10240 files per partition.
 
-##### Time-grain partitioning with `partitioned_by` (Delta only)
+##### Time-grain partitioning with `partitioned_by` (Delta and Iceberg)
 
 Most time-series feature groups want to partition by a time grain derived from `event_time`.
 Instead of decomposing the timestamp into `year` / `month` / `day` columns yourself and passing them as `partition_key`, declare the grains with `partitioned_by` and let Hopsworks derive the partition columns for you.
 Pass one or more grains drawn from `hour`, `day`, `week`, `month`, and `year`.
+Supported on `time_travel_format="DELTA"` and `time_travel_format="ICEBERG"`.
 
 ```python
 fg = fs.get_or_create_feature_group(
@@ -201,8 +202,8 @@ Keep the feature group offline-only to use `partitioned_by`.
 
 ###### Hudi
 
-`partitioned_by` on `time_travel_format="HUDI"` feature groups is not yet supported and the backend rejects it at creation.
-Until Hudi support lands, use `time_travel_format="DELTA"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
+`partitioned_by` on `time_travel_format="HUDI"` feature groups is not yet supported and the backend rejects it at creation; so is `time_travel_format="NONE"` (plain Hive/parquet), which has no grain-materialization step.
+Until Hudi support lands, use `time_travel_format="DELTA"` or `"ICEBERG"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
 
 ##### Table format
 

From c2e8830da78850e058dbfb63906af37d28cc40c4 Mon Sep 17 00:00:00 2001
From: Jim Dowling <jim@hopsworks.ai>
Date: Sat, 13 Jun 2026 07:31:35 +0200
Subject: [PATCH 6/6] [HWORKS-2807] Document partitioned_by on Hudi + stream
 limitation https://hopsworks.atlassian.net/browse/HWORKS-2807

Non-stream Hudi feature groups now support partitioned_by (direct Spark
write); stream feature groups and NONE are rejected. Update the section
heading, supported-formats note, Hudi note, and add a stream note.
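
The resulting support matrix, written as a small sketch (illustrative
only; the function name is hypothetical, and the online_enabled check
reflects the existing online feature store note, not this patch):

```python
SUPPORTED_FORMATS = {"DELTA", "ICEBERG", "HUDI"}


def supports_partitioned_by(time_travel_format, stream=False, online_enabled=False):
    """Return True if the documented rules allow partitioned_by."""
    if stream or online_enabled:
        return False  # stream and online-enabled groups are rejected
    return time_travel_format.upper() in SUPPORTED_FORMATS  # NONE is rejected


print(supports_partitioned_by("HUDI"))               # True
print(supports_partitioned_by("HUDI", stream=True))  # False
print(supports_partitioned_by("NONE"))               # False
```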

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 docs/user_guides/fs/feature_group/create.md | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/docs/user_guides/fs/feature_group/create.md b/docs/user_guides/fs/feature_group/create.md
index 367224968d..f985c96472 100644
--- a/docs/user_guides/fs/feature_group/create.md
+++ b/docs/user_guides/fs/feature_group/create.md
@@ -102,12 +102,12 @@ MaxDirectoryItemsExceededException - The directory item limit is exceeded: limit
 
 By using partitioning the system will write the feature data in different subdirectories, thus allowing you to write 10240 files per partition.
 
-##### Time-grain partitioning with `partitioned_by` (Delta and Iceberg)
+##### Time-grain partitioning with `partitioned_by`
 
 Most time-series feature groups want to partition by a time grain derived from `event_time`.
 Instead of decomposing the timestamp into `year` / `month` / `day` columns yourself and passing them as `partition_key`, declare the grains with `partitioned_by` and let Hopsworks derive the partition columns for you.
 Pass one or more grains drawn from `hour`, `day`, `week`, `month`, and `year`.
-Supported on `time_travel_format="DELTA"` and `time_travel_format="ICEBERG"`.
+Supported on `time_travel_format="DELTA"`, `"ICEBERG"`, and `"HUDI"` for non-stream feature groups (see [Hudi](#hudi) and [Stream feature groups](#stream-feature-groups) below).
 
 ```python
 fg = fs.get_or_create_feature_group(
@@ -202,8 +202,15 @@ Keep the feature group offline-only to use `partitioned_by`.
 
 ###### Hudi
 
-`partitioned_by` on `time_travel_format="HUDI"` feature groups is not yet supported and the backend rejects it at creation; so is `time_travel_format="NONE"` (plain Hive/parquet), which has no grain-materialization step.
-Until Hudi support lands, use `time_travel_format="DELTA"` or `"ICEBERG"` to get time-grain partitioning, or partition Hudi groups explicitly via `partition_key=["year"]` with a `year` column the upstream pipeline computes.
+`partitioned_by` works on Hudi feature groups written directly by Spark (a non-stream feature group): the client materializes the grain columns and Hudi partitions on them.
+On the Python (non-Spark) engine, a Hudi feature group is created as a stream feature group, on which `partitioned_by` is not yet supported (see [Stream feature groups](#stream-feature-groups) below); use `time_travel_format="DELTA"` or `"ICEBERG"` there.
+`time_travel_format="NONE"` (plain Hive/parquet) is rejected because it has no grain-materialization step.
+
+###### Stream feature groups
+
+`partitioned_by` is not yet supported on stream feature groups (`stream=True`).
+Stream feature groups materialize through the DeltaStreamer job, which does not derive the grain columns yet, so the backend rejects `partitioned_by` on them at creation.
+Create a non-stream feature group to use `partitioned_by`.
 
 ##### Table format