HDDS-14926. Allow QUASI_CLOSED containers in DiskBalancer with improved debug logs for containers #10022
Gargi-jais11 wants to merge 1 commit into apache:master
Conversation
@ChenSammi Please review the PR.
I think we should remove the config and have all containers that would otherwise be eligible for replication between datanodes (non-open containers) eligible for disk balancing. Since they can be safely copied between datanodes there is no reason to exclude them when copying between disks as well.
Thanks for the suggestion — the idea that anything safe to replicate between datanodes should be safe to move between disks on the same node is reasonable. |
#9964 largely applies to over-replicated containers. It is required because the balancer will create more over-replication as it runs if deletes are not processed. The disk balancer does not create over-replication because a datanode can only load one container at a time, and furthermore the datanodes themselves are not aware of whether a container is over-replicated. The only overlap between the two is handling quasi-closed containers. For quasi-closed containers I don't see a reason for either process to exclude them, and I'm not sure #9964 made the right call here. I can't find any documented reason why quasi-closed containers should be excluded from either process, only Jira descriptions saying it will be done with no supporting reasoning. @siddhantsangwan @sarvekshayr @sadanand48 what do you think?
Same here. Theoretically I think we could stop excluding quasi-closed containers. However, my guess, based on some discussions, is that we didn't want cases where the same container is acted upon by both the balancer and RM. Say the balancer schedules a move for a quasi-closed container while RM sends a command to close the container at the same time. If we don't do the bcsId check while the balancer moves the QC container, then the balancer selecting it (combined with deletes not being processed) can increase the number of replicas with a lower bcsId in the system. Still, that should not be a major issue; I think we just wanted to play it safe by only moving containers in an immutable state. If we have tested these scenarios, I believe we can include them in the default criteria. The origin of the check is in https://github.com/apache/ozone/pull/2441/changes#diff-b0ac1e8086a1da3247489a1a6ad742996052aa17a46544f1f6aaf24bfd92a762R76-R82 @siddhantsangwan please correct me if I am wrong
What changes were proposed in this pull request?
Currently DiskBalancer only supports moving CLOSED containers, but if a user wants to also move QUASI_CLOSED containers, we should support that as well.
1st Improvement:
The following scenario can happen in practice: a disk (say disk A) is 90% full, but only 10 of its containers are CLOSED and the rest are QUASI_CLOSED. The disk balancer can only attempt those 10 CLOSED containers. Even if it moves all 10 successfully, disk A might drop only from 90% to 85%, nowhere near balance. The QUASI_CLOSED containers are the real bulk occupying the space, and the disk balancer is completely blind to them.
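The gap in that scenario can be sketched with assumed numbers (the 0.5%-of-capacity-per-container figure below is an illustrative assumption, not from the PR):

```java
// Sketch with assumed numbers: disk at 90% utilization, 10 CLOSED
// containers eligible to move, each assumed to hold 0.5% of capacity.
public class ScenarioSketch {
  static double utilizationAfterMoving(double utilizationPct,
      int movedContainers, double perContainerPct) {
    return utilizationPct - movedContainers * perContainerPct;
  }

  public static void main(String[] args) {
    // Moving every eligible CLOSED container only gets 90% down to 85%.
    System.out.println(utilizationAfterMoving(90.0, 10, 0.5)); // 85.0
  }
}
```

However large the QUASI_CLOSED bulk is, the achievable reduction is capped by the few CLOSED containers, which is the motivation for widening eligibility.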
Added a config `hdds.datanode.disk.balancer.include.non.standard.containers` (default = false). If true, the balancer includes containers in non-standard states, i.e. QUASI_CLOSED, so containers in both the CLOSED and QUASI_CLOSED states are eligible for a move. If false (the default), the balancer only moves CLOSED containers.
2nd Improvement:
We need to add debug logs in the chooseContainer method, because a user might not understand why a container is still not moved even though they have over- and under-utilized volumes. In a recent escalation, balancer debug logs for containers that were not chosen helped a lot in identifying what state the container or volume was in.
I suggest adding debug logs like the following for containers that are not chosen:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14926
How was this patch tested?
Added unit test.