[ISSUE #10442] Optimize message properties encode/decode to reduce allocation#10444
[ISSUE #10442] Optimize message properties encode/decode to reduce allocation#10444wang-jiahua wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR focuses on reducing allocations and improving performance on the broker hot path by optimizing message property serialization/parsing and reusing buffers.
Changes:
- Added UTF-8 byte-level encode/decode paths for message properties to avoid intermediate
Stringallocations. - Introduced reusable
ThreadLocalbuffers (StringBuilder,char[], and a properties map) to reduce per-message allocations. - Added key/value interning helpers to reduce duplicated
Stringallocations when parsing properties.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| common/src/main/java/org/apache/rocketmq/common/message/MessageVersion.java | Simplifies magic-code to version mapping with direct comparisons. |
| common/src/main/java/org/apache/rocketmq/common/message/MessageExtBrokerInner.java | Adds cached UTF-8 propertiesData and lazy encoding path. |
| common/src/main/java/org/apache/rocketmq/common/message/MessageDecoder.java | Adds byte-based properties encode/decode, interning, and ThreadLocal reuse. |
| common/src/main/java/org/apache/rocketmq/common/message/MessageConst.java | Adds intern tables for frequent property keys to avoid substring allocations. |
| common/src/main/java/org/apache/rocketmq/common/message/MessageClientIDSetter.java | Reuses a ThreadLocal char[] buffer for uniq ID creation. |
| common/src/main/java/org/apache/rocketmq/common/message/Message.java | Removes side-effect allocation in getProperty and optimizes priority parsing. |
| common/src/main/java/org/apache/rocketmq/common/message/FlatPropertiesMap.java | Introduces a compact map for decoded properties with fast encoding support. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
有个问题要注意下, 这个map的put 也需要有后覆盖前的语义(包括entry的正确性),以及多次decode会不会有问题 |
1be93d8 to
a6359a4
Compare
oss-sentinel-ai
left a comment
There was a problem hiding this comment.
Review by github-manager-bot
Summary
This PR introduces performance optimizations for message properties encoding/decoding by replacing HashMap allocation with a compact FlatPropertiesMap backed by a flat Object[] array, adding ThreadLocal reuse for StringBuilder and char[] buffers, and interning high-frequency property keys.
Findings
- [Good] MessageDecoder.java:605-635 — ThreadLocal StringBuilder reuse with proper capacity cap (
REUSABLE_SB_CAP_LIMIT = 64KB) prevents unbounded memory growth. - [Good] MessageDecoder.java:644-673 — New
messageProperties2Bytes()method directly encodes to UTF-8 bytes, avoiding String allocation on the hot path. - [Good] MessageDecoder.java:675-720 — Custom UTF-8 encoder correctly handles edge cases including unpaired surrogates (matches JDK behavior).
- [Good] MessageDecoder.java:722-770 —
bytes2messageProperties()returns independent HashMap (not ThreadLocal-shared), preventing cross-decode corruption. - [Good] MessageConst.java:1-35 — Interning of high-frequency property keys reduces String allocation.
- [Good] MessageDecoderTest.java:431-556 — Comprehensive test coverage including ASCII, multi-byte UTF-8, surrogate pairs, and null handling.
Suggestions
- Consider adding a benchmark (JMH) to quantify the performance improvement on the encode/decode hot path.
- The
REUSABLE_SB_CAP_LIMITof 64KB is reasonable; document the rationale for this specific value.
Verdict
✅ Approved — Well-designed performance optimization with proper memory safety guards and comprehensive test coverage.
Automated review by github-manager-bot
a6359a4 to
e6fd9a2
Compare
|
test3 |
|
Overall suggestion: this PR currently bundles several independent optimizations with different risk profiles: public |
e6fd9a2 to
b7427cb
Compare
FlatPropertiesMap and optimize message properties encode/decode…uce allocation
Focuses on the properties encode/decode hot path (broker send + dispatch):
- MessageDecoder.string2messageProperties: right-size HashMap from
separator count instead of fixed capacity 128; intern well-known
property keys via length-bucketed regionMatches (skip substring+hash);
intern frequent values ("0","1","true","false","DefaultRegion").
- MessageDecoder.messageProperties2Bytes: direct UTF-8 byte serialization
of Map entries, skipping the StringBuilder→String→getBytes round-trip.
- MessageDecoder.bytes2messageProperties: parse directly from byte[]
with ASCII key matching, skipping intermediate String allocation.
- MessageDecoder.messageProperties2String: reuse ThreadLocal StringBuilder
with capacity cap to bound per-thread memory.
- MessageConst: add STRING_INTERN_MAP and STRING_INTERN_BY_LEN for
canonical key interning.
- MessageExtBrokerInner: add propertiesData byte[] cache field to
short-circuit propertiesString→byte[] re-encoding on the write path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
b7427cb to
e9e34a2
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #10444 +/- ##
=============================================
- Coverage 48.06% 48.01% -0.05%
- Complexity 13313 13337 +24
=============================================
Files 1377 1377
Lines 100632 100827 +195
Branches 12995 13050 +55
=============================================
+ Hits 48369 48417 +48
- Misses 46344 46448 +104
- Partials 5919 5962 +43 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot
Summary
Comprehensive performance optimization for message properties serialization: introduces direct byte-level encode/decode to skip intermediate String allocation, pre-computed key intern tables, ThreadLocal StringBuilder reuse, and cached propertiesData in MessageExtBrokerInner.
Findings
- [Critical]
MessageDecoder.java:610-618—REUSABLE_SBThreadLocal with 64KB cap is a good pattern. The cap prevents unbounded growth after a single oversized message while still benefiting from reuse. - [Critical]
MessageDecoder.java:660-720— Custom UTF-8 encoder (writeUtf8,utf8ByteLength) correctly handles surrogate pairs and unpaired surrogates (replacing with?). The benchmark shows 2.3x speedup and 50% less allocation. Thorough testing validates correctness against JDK's implementation. - [Critical]
MessageConst.java:182-196—STRING_INTERN_BY_LENpre-computes known property keys indexed by length. TheinternString()lookup avoidsString.substring()allocation for well-known keys. Good trade-off: a small static array for significant hot-path savings. - [Warning]
MessageExtBrokerInner.java:200-220—propertiesDatacaches pre-encoded UTF-8 bytes. ThegetEffectivePropertiesData()method returns the internal buffer directly (documented as package-private for performance). Ensure no caller mutates the returned array. ThedeleteProperty()correctly invalidates the cache. - [Warning]
MessageDecoder.java:830-850—VALUE_INTERN_BY_LENinterns common values like "0", "1", "true", "false". This is effective for properties with a small set of known values but adds complexity. The 128-length cap is reasonable. - [Info]
MessageDecoder.java:750-780—bytes2messageProperties()parses directly from UTF-8 bytes. TheextraEntriesparameter for HashMap pre-sizing is a nice touch to avoid rehashing.
Suggestions
- Add a comment on
getEffectivePropertiesData()warning that the returned array must not be mutated by callers. - Consider whether
propertiesDatashould betransientifMessageExtBrokerInneris ever serialized (unlikely but worth documenting).
Verdict: Excellent engineering. This is a well-designed, thoroughly tested optimization that targets the right hot path. The custom UTF-8 encoder is complex but well-tested, and the benchmark demonstrates meaningful improvements.
Automated review by github-manager-bot
Which Issue(s) This PR Fixes
Fixes #10442
Brief Description
Optimizes the message properties encode/decode hot path on the broker send and dispatch paths. All changes are within the
commonmodule'sMessageDecoder,MessageConst, andMessageExtBrokerInner.Changes
MessageDecoder.string2messageProperties: right-size HashMap from separator count instead of fixed capacity 128; intern well-known property keys via length-bucketedregionMatches(skip substring + hash lookup); intern frequent values ("0", "1", "true", "false", "DefaultRegion"). Added(String, int)overload for callers that need extra capacity for broker-injected entries.MessageDecoder.messageProperties2Bytes: direct UTF-8 byte serialization of Map entries, skipping theStringBuilder → String → getBytesround-trip on the broker write path.MessageDecoder.bytes2messageProperties: parse directly frombyte[]with ASCII key matching, skipping the intermediatenew String(bytes, ...)allocation on the dispatch path.MessageDecoder.messageProperties2String: reuseThreadLocal<StringBuilder>with capacity cap (64KB) to bound per-thread memory after oversized messages.MessageConst: addSTRING_INTERN_MAPandSTRING_INTERN_BY_LENfor canonical key interning.MessageExtBrokerInner: addpropertiesDatabyte[] cache field to short-circuitpropertiesString → byte[]re-encoding on the write path;getPropertiesData()returns defensive copy;getEffectivePropertiesData()(package-private) returns internal buffer for the encode hot path.Compatibility
string2messageProperties(String)remains the primary public API; the new(String, int)overload is additive.bytes2messagePropertiesandmessageProperties2Bytesare new public methods, additive only.messageProperties2Stringoutput is unchanged; the ThreadLocal is an internal optimization.MessageExtBrokerInner.propertiesDatais a new transient field; existing serialization is unaffected.How Did You Test This Change
New unit tests added in
MessageDecoderTest:testMessageProperties2BytesMatchesString— verifies byte-for-byte equivalence with the String path for ASCII, multi-byte UTF-8, and paired surrogates.testMessageProperties2StringAndBytesSkipNullKeyAndValue— verifies both paths skip null key/value entries consistently.testMessageProperties2BytesUnpairedSurrogateMatchesJdk— verifies unpaired surrogate handling matches JDK behavior.testBytes2messagePropertiesReturnsIndependentMap— verifies consecutive decodes on the same thread produce independent maps.testMessageProperties2BytesNullAndEmpty— verifies null/empty input handling.