Skip to content

HDDS-14926. Allow QUASI_CLOSED containers in DiskBalancer with improved debug logs for containers#10022

Open
Gargi-jais11 wants to merge 1 commit intoapache:masterfrom
Gargi-jais11:HDDS-14926
Open

HDDS-14926. Allow QUASI_CLOSED containers in DiskBalancer with improved debug logs for containers#10022
Gargi-jais11 wants to merge 1 commit intoapache:masterfrom
Gargi-jais11:HDDS-14926

Conversation

@Gargi-jais11
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Currently DiskBalancer only supports CLOSED containers to be moved but if user wants to also move quasi closed containers then we should support that as well.

1st Improvement:
Below scenario can happen in real, so Disk balancer can only attempt those 10 CLOSED containers. Even if it moves all 10 successfully, disk A might drop only from 90% → 85% — nowhere near balance. The QUASI_CLOSED containers are the real bulk occupying the space and disk balancer is completely blind to them.

Disk A: 90% utilized 
├── 600 containers: QUASI_CLOSED 
└── 10 containers: CLOSED 
Disk B: 10% utilized 
Disk C: 11% utilized

Added a config hdds.datanode.disk.balancer.include.non.standard.containers default=false. If true, balancer include non-standard states, i.e, QUASI_CLOSED. So both CLOSED and QUASI_CLOSED state containers are eligible for move. If false (default), balancer only moves CLOSED containers.

2nd Improvement:
We need to add debug logs for chooseContainer method as user might not understand why if they have over and under utilised volume still container is not moved. This parts needs more clarification. Because I see in escalation it helped a lot with balancer debug logs for container not choose to identify what state the container or volume was in.
I suggest adding these logs for container not choose :

// UsedBytes less than 0
 LOG.debug("Skipping container {} from volume {}: bytes used is {}", containerId, src.getStorageDir().getPath(), containerData.getBytesUsed());
              
//  Skip containers already in progress
LOG.debug("Skipping container {} from volume {}: disk balancer move already in progress", containerId, src.getStorageDir().getPath());

// only closed and quasi closed containers should be moved
LOG.debug("Skipping container {} from volume {}: state is {}. Requires {}", containerId, src.getStorageDir().getPath(), containerData.getState(), phase.getEligibilityCriteria());

// skipping container move as its size is more than destination available space.
LOG.debug("Skipping container {} ({}B) from volume {}: exceeds destination {} usable space {}B, containerId, containerSize, src.getStorageDir().getPath(), dst.getStorageDir().getPath(), usableSpace);

// skipping container move as it will make dest more utilised after movement.
LOG.debug("Skipping container {} ({}B) from volume {}: moving to {} would  result in utilization {} exceeding upper threshold {}", containerId, containerSize, src.getStorageDir().getPath(), dst.getStorageDir().getPath(),

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14926

How was this patch tested?

Added unit test.

@Gargi-jais11 Gargi-jais11 changed the title HDDS-14926. Allow QUASI_CLOSED containers in DiskBalancer with improved debug logging for containers HDDS-14926. Allow QUASI_CLOSED containers in DiskBalancer with improved debug logs for containers Apr 1, 2026
@Gargi-jais11 Gargi-jais11 marked this pull request as ready for review April 2, 2026 03:58
@Gargi-jais11
Copy link
Copy Markdown
Contributor Author

@ChenSammi Please review the PR.

@errose28
Copy link
Copy Markdown
Contributor

errose28 commented Apr 6, 2026

Added a config hdds.datanode.disk.balancer.include.non.standard.containers default=false. If true, balancer include non-standard states, i.e, QUASI_CLOSED. So both CLOSED and QUASI_CLOSED state containers are eligible for move. If false (default), balancer only moves CLOSED containers.

I think we should remove the config and have all containers that would otherwise be eligible for replication between datanodes (non-open containers) eligible for disk balancing. Since they can be safely copied between datanodes there is no reason to exclude them when copying between disks as well.

@Gargi-jais11
Copy link
Copy Markdown
Contributor Author

I think we should remove the config and have all containers that would otherwise be eligible for replication between datanodes (non-open containers) eligible for disk balancing. Since they can be safely copied between datanodes there is no reason to exclude them when copying between disks as well.

Thanks for the suggestion — the idea that anything safe to replicate between datanodes should be safe to move between disks on the same node is reasonable.
But the config was added hdds.datanode.disk.balancer.include.non-standard.containers (default false) to stay aligned with container balancer (in this PR): there, non-standard states such as QUASI_CLOSED are only considered when the corresponding include flag is set. Intially, I and @ChenSammi had the opinion of moving QUASI_CLOSED containers regardless of any config setting. But after discussion we decided to align with container Balancer.

@errose28
Copy link
Copy Markdown
Contributor

errose28 commented Apr 7, 2026

#9964 largely applies to over-replicated containers. It is required because the balancer will create more over-replication as it runs if deletes are not processed. The disk balancer does not create over-replication because a datanode can only load one container at a time, and furthermore the datanodes themselves are not aware of whether a container is over-replicated. The only overlap between the two is handling quasi-closed containers. For quasi-closed containers I don't see a reason for either process to exclude them, and I'm not sure #9964 made the right call here. I can't find any documented reason why quasi-closed containers should be excluded from either process, only jira descriptions saying it will be done with no supporting reasoning.

@siddhantsangwan @sarvekshayr @sadanand48 what do you think?

@sadanand48
Copy link
Copy Markdown
Contributor

I can't find any documented reason why quasi-closed containers should be excluded from either process.

Same here, Theoretically I think we could skip excluding quasi-closed containers however my guess based on some discussions is that we didn't want cases where same container is acted upon by both balancer and RM, Say balancer schedules a move for a quasi-closed container and RM sends a command to close the container at the same time.

While balancer moves the QC container, If we don't do the bcsId check it can increase the containers that have lower bcsId by the balancer selecting it and delete not processing in the system. Still should not be a major issue. I think we just wanted to play safe by only moving immutable container state. If we have tested these scenarios I believe we can include these in the default criteria

The origin of the check is in https://github.com/apache/ozone/pull/2441/changes#diff-b0ac1e8086a1da3247489a1a6ad742996052aa17a46544f1f6aaf24bfd92a762R76-R82

@siddhantsangwan please correct me if I am wrong

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants