Skip to content

perf(relationships): bulk tree persistence, lean hydration and delta-only reindex when saving related content#36149

Open
jcastro-dotcms wants to merge 8 commits into
mainfrom
issue-36147-Saving-contentlets-with-many-to-many-relationship-fields-and-large-related-content-volumes-is-extremely-slow
Open

perf(relationships): bulk tree persistence, lean hydration and delta-only reindex when saving related content#36149
jcastro-dotcms wants to merge 8 commits into
mainfrom
issue-36147-Saving-contentlets-with-many-to-many-relationship-fields-and-large-related-content-volumes-is-extremely-slow

Conversation

@jcastro-dotcms

Copy link
Copy Markdown
Member

This PR fixes: #36147

🎯 Problem

Saving a contentlet whose Content Type has several MANY_TO_MANY relationship fields became unusably slow as related-content volume grew: on a real-world dataset (~1,500 parents × ~20 children each, so each child relates back to ~350 parents), a single Save/Publish took from 40 seconds up to several minutes and fired tens of thousands of JDBC queries. Three compounding root causes:

  1. Per-row relationship persistence — every related record cost a getTree + saveTree pair (2–3 queries each) plus one DELETE per row.
  2. Unnecessary full hydration — the save path loaded complete contentlets (fields, permissions, version info) when it only needed identifiers or a count.
  3. Over-broad dependency reindexingdependenciesLeftToReindex() re-queued every currently-related contentlet for reindexing on every save, because the old-state/new-state comparison read the new DB state on both sides. Saving one child re-indexed hundreds of unchanged parents, each fully hydrated and remapped.

✨ Proposed Changes

Relationship persistence — bulk, batched, and permission-aware (ESContentletAPIImpl, TreeFactory)

  • deleteRelatedContent now splits into two paths: a fast identifier-only path for the system user / CMS Admins (single bulk DELETE, zero hydration, only removed parents loaded for reindex), and a permission-scoped path for limited users that preserves the legacy contract — relationships to content the user cannot READ are never deleted (bulk delete when the filter removes nothing; chunked IN deletes otherwise).
  • relateContent collects all rows and persists them through a new TreeFactory.insertTrees: a duplicate-safe batched upsert (dedup by the (child, parent, relation_type) PK, batched DELETE+INSERT wrapped in LocalTransaction so it is atomic for any caller). Duplicate records in the payload (same identifier twice, multiple languages) no longer risk a PK violation.
  • The tree-order base comes from a single MAX(tree_order)+1 lookup over surviving rows (replacing a full getRelatedContentFromIndex hydration), queried only on the parent branch where it is actually used.

Hydration elimination (ESContentletAPIImpl, VersionableFactory/Impl)

  • Related identifiers are read straight from the tree table via new TreeFactory identifier lookups — no contentlet hydration to compute delete scopes or counts.
  • Removed parents that genuinely need reindexing are loaded through a new batched, uncached VersionableFactory.findAllContentletVersionInfos(Collection) (chunked IN, 500 ids/statement) + the existing batch findContentlets, instead of one lookup per identifier. All SQL lives in factory classes per house architecture.

Delta-only dependency reindexing (ESMappingAPIImpl)

  • dependenciesLeftToReindex() now compares the search index (pre-save state) against the tree table (post-save state) and returns their true symmetric difference: only relationships actually added or removed. Unchanged parents are never re-queued — this is the biggest win for heavily related datasets.
  • Identifiers are deduped (the index holds one document per language/variant), so multilingual content no longer triggers spurious reindexing; a scroll-search fallback removes the silent 10,000-document truncation.

Hardening (review feedback)

  • assertTreeColumn allowlist guard on every TreeFactory helper that interpolates a column name into SQL (only parent/child accepted) — enforces what the Semgrep findings flagged conventionally.
  • Javadoc pass: class-level doc for TreeFactory, contracts on the bulk deletes (caller-enforced permissions), and a full explanation of the new reindex-delta model on dependenciesLeftToReindex.

✅ Checklist

  • Tests
  • Translations (N/A — backend only)
  • Security Implications Contemplated (add notes if applicable)

Security notes: permission semantics are preserved and now covered by an integration test (a limited user's save cannot delete relationships to content they cannot READ — this also fixes a data-loss risk present in the first draft of this PR). The Semgrep CUSTOM_INJECTION-2 findings are false positives (concatenated fragments are compile-time literals or generated ?,?,... placeholders; all user data is bound parameters) and are additionally guarded by the new column-name allowlist.

🧪 Testing

11 new integration tests, all green:

  • TreeFactoryTest (new class, 7 tests): batch upsert dedup, pre-existing row replacement, scoped deletes, >500-id chunking, ordered id lookups, next-tree-order.
  • ESContentletAPIImplTest (+3): limited-user permission preservation, duplicate-records save, append-after-preserved ordering. Full class: 90/90 passing.
  • ESMappingAPITest (+2): reindex delta returns only added+removed parents; multilingual parents are not spuriously re-indexed.

ℹ️ Additional Info

  • Verified against a production-scale dataset: re-saving heavily related content drops from minutes to seconds, with query counts reduced by orders of magnitude (bulk statements + delta-only reindex instead of per-row queries + full-graph reindex).
  • A manual backport to 24.12.27 LTS is prepared in parallel (adapted for -source 11), including the test suite.
  • Behavioral notes for reviewers: orphan tree rows (pointing to fully-deleted content) are now cleaned up by the bulk delete; tree_order values are rebased per save (relative ordering preserved — consumers only ORDER BY tree_order).

Screenshots

N/A — backend-only performance change.

🤖 Generated with Claude Code

@claude

claude Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Claude finished @jcastro-dotcms's task in 2m 23s —— View job


Rollback Safety Analysis

  • Read rollback-unsafe categories reference
  • Get full PR diff
  • Analyze diff against all unsafe categories
  • Post findings and apply label

Verdict: ✅ Safe To Rollback

After reviewing the full diff against every category in the rollback-unsafe reference, this PR does not match any unsafe category. Here is the analysis for each category checked:

Category Assessment
C-1 Structural Data Model Change No schema changes to tree, contentlet, multi_tree, or any core table. The tree table is queried and written to differently, but its structure is unchanged.
C-2 ES Mapping Change ESMappingAPIImpl changes only affect the reindex trigger logic (which documents get queued for reindex). No putMapping() calls, no field type changes, no new mapping fields, no ESMappingConstants field name constants changed.
C-3 Content JSON Model Version Bump No changes to CURRENT_MODEL_VERSION, ImmutableContentlet, or ContentletJsonAPIImpl.
C-4 DROP TABLE / DROP COLUMN No DDL drop statements anywhere in the diff.
H-1 One-Way Data Migration No runonce task, no UPDATE … SET bulk transform, no destructive data backfill.
H-2 RENAME TABLE / COLUMN None.
H-3 PK Restructuring None. The (child, parent, relation_type) PK of tree is used as the dedup key in insertTrees but is not structurally altered.
H-4 New Field Type None.
H-5 Storage Provider Change None.
H-6 DROP PROCEDURE / FUNCTION None.
H-7 NOT NULL Column Without Default No DDL changes at all.
H-8 VTL Viewtool Contract Change ESMappingAPIImpl.dependenciesLeftToReindex() is an internal method, not a VTL viewtool method. No public viewtool signatures changed.
M-1 Column Type Change None.
M-2 Push Publishing Bundle Format None.
M-3 REST/GraphQL API Contract No REST endpoint changes.
M-4 OSGi Interface Change VersionableFactory gains one new public abstract method (findAllContentletVersionInfos(Collection<String>)). This is an addition on an abstract class, not a removal or signature change. OSGi plugins extending VersionableFactory directly would need to implement the new method, but this is an internal factory class not expected to have external OSGi implementations.

Summary: All changes are confined to SQL query logic, batch operation helpers, and internal method additions. No database schema changes, no Elasticsearch mapping changes, no data transformations, and no API contract breaks. N-1 can start and operate normally after this code is deployed (or rolled back).

Label applied: AI: Safe To Rollback

@jcastro-dotcms

Copy link
Copy Markdown
Member Author

NOTE TO DEVS

This new PR was created to make the branch match a GitHub ticket number. The initial PR containing the original feedback form Claude Code can be found in the previous version, localed here:

Just to be clear, this new PR contains the exact same code changes from the previous one.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

❌ Codex Review failed — openai.gpt-5.5

The review job failed before producing output. See the run for details.

Run: #27433876927

@semgrep-code-dotcms-test

Copy link
Copy Markdown
Contributor

Semgrep found 15 CUSTOM_INJECTION-2 findings:

The method identified is susceptible to injection. The input should be validated and properly
escaped.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Saving contentlets with many-to-many relationship fields and large related-content volumes is extremely slow

1 participant