HDDS-12864. All commit semantics in replication writes#10365
Conversation
yandrey321
left a comment
There was a problem hiding this comment.
Is there enough tests to cover new logic and error condition?
| * <p>{@link #applyTransaction(TransactionContext)} becomes a no-op for PutBlock because | ||
| * {@code persistPutBlock()} detects the already-written BCSID and returns early. | ||
| */ | ||
| private CompletableFuture<Message> writePutBlockData( |
There was a problem hiding this comment.
it should be refactored to smaller functions to improve readability
| stateMachineDataCache.put(entryIndex, write.getData()); | ||
| // Cache the full proto bytes so read() can return them for slow followers | ||
| // without having to reconstruct the proto from raw bytes. | ||
| stateMachineDataCache.put(entryIndex, requestProto.toByteString()); |
There was a problem hiding this comment.
how cache would be cleaned in case if write to disk fails?
How long data would be stored in cache and how it would be evicted over the time?
| executor); | ||
|
|
||
| writeChunkFutureMap.put(entryIndex, new WriteFutures(future, raftFuture, startTime)); | ||
| if (LOG.isDebugEnabled()) { |
There was a problem hiding this comment.
there are no expensive calls or string formatting for LOG.debug() call, so it would check the log level inside and shortcircuit the call.
| // Strip data for the WAL entry (logProto). | ||
| protoBuilder.setWriteChunk(WriteChunkRequestProto.newBuilder(write).clearData()); | ||
| // stateMachineData carries the full proto so followers can parse it directly. | ||
| builder.setStateMachineData(fullProtoWithData.toByteString()); |
There was a problem hiding this comment.
This seems an incompatible change since it changes the format of the stateMachineData. Then, an old datanode will not work with a new datanode.
There was a problem hiding this comment.
Do you mean a case like rolling upgrade when different versions of datanodes would be present at the same time? Let me think about it.
There was a problem hiding this comment.
Yes, we must rolling upgrade for datanodes.
| case PutBlock: | ||
| return writePutBlockData(requestProto, entry.getIndex(), | ||
| entry.getTerm(), writeStateMachineStartTime); |
There was a problem hiding this comment.
The change here is to write PutBlock to RocksDb when writing the RaftLog entry (via DataApi.write(..)). If the RaftLog entry is truncated, what will happen to the RocksDb record?
What changes were proposed in this pull request?
That's WiP patch for All commit replication writes.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-12864
How was this patch tested?
Later will be covered with UTs.