Fix core dumps triggered by rocksdb compacting when shutdown bk#4706
Fix core dumps triggered by rocksdb compacting when shutdown bk#4706AnonHxy wants to merge 9 commits intoapache:masterfrom
Conversation
560ace4 to
bc91be9
Compare
...r/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java
Outdated
Show resolved
Hide resolved
ecaea19 to
7018ea1
Compare
1556a35 to
abee9aa
Compare
...keeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
Show resolved
Hide resolved
...keeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
Show resolved
Hide resolved
There was a problem hiding this comment.
There's a need to do some changes in KeyValueStorageRocksDB class too.
Usually RocksDB triggers compaction on it's own without explicit compaction and that type of compaction would continue to cause problems.
Claude suggested this type of pattern:
try {
// Signal compactions to stop
db.cancelAllBackgroundWork(true);
} catch (RocksDBException e) {
// log but continue — we still want to close
} finally {
// Always close column family handles first, then the DB
for (ColumnFamilyHandle cfh : columnFamilyHandles) {
cfh.close();
}
db.close();
}It might be sufficient to simply have this logic in KeyValueStorageRocksDB.close
| if (cache != null) { | ||
| cache.close(); | ||
| } | ||
| if (options != null) { |
There was a problem hiding this comment.
a minor detail is that columnFamilyDescriptors aren't closed. I'd assume that it leaks a tiny bit of memory in cases where the JVM process wouldn't terminate (most likely only in test scenarios). I'm not exactly sure about this. It's not a blocker for this PR.
There was a problem hiding this comment.
It's documented here: https://github.com/facebook/rocksdb/wiki/rocksjava-basics#opening-a-database-with-column-families . Not very great docs, but based on that, columnFamilyDescriptors should be closed after the db is closed.
| try { | ||
| closedLock.writeLock().lock(); | ||
| closed = true; | ||
| db.cancelAllBackgroundWork(true); |
There was a problem hiding this comment.
wrapping this call in a separate try-catch would be recommended
| // mock EntryLocationIndex is compacting | ||
| idx.compacting.set(true); | ||
| AtomicBoolean closeFlag = new AtomicBoolean(false); | ||
| AtomicLong closeEscapedMills = new AtomicLong(0); |
Descriptions of the changes in this PR:
Fix #4705
Motivation
Fix #4705
Changes
1.Setting the
compactingflag ofentryLocationIndexas true when shutdownLedgerStorageto stopping the subsequentcompact.2. When
LedgerStorageshutdown we will wait unit thecompactend.