placement: optimize bitmap range checks with byte-level skipping #17623

Draft
Copilot wants to merge 26 commits into master from copilot/add-interactive-placement-debug-tool

Copilot AI commented Mar 1, 2026

tgt_isset_range, dom_isset_range, and dom_isset_2ranges iterated over every index one by one, even when large runs of the bitmap were fully set. Since the bitmap is a uint8_t array, a single byte covers 8 indices, so when bits[index >> 3] == 0xFF the remaining indices in that byte can be skipped without testing them individually.

Changes

  • Single unified loop replaces the prior three-phase (leading/middle/trailing) approach. Alignment pre-work is unnecessary: if bits[index >> 3] == 0xFF, every bit in that byte is set regardless of where index sits within it, so index = (index | 7) + 1 safely jumps to the next byte.
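  • dom_isset_2ranges uses (bits1[b] | bits2[b]) == 0xFF as the skip predicate, where b is the byte index (index >> 3). When the OR of the two bytes is all-ones, every index i in that byte is set in at least one bitmap, so isclr(bits1,i) && isclr(bits2,i) cannot hold for any i in that byte.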
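
/* Return false as soon as a clear, non-excluded bit is found in [start, end]. */
for (index = start; index <= end; /* index advanced in body */) {
    /* All 8 bits of this byte are set: jump to the first index of the next byte. */
    if (bits[index >> 3] == 0xFF) {
        index = (index | 7) + 1;
        continue;
    }
    if (!is_excluded_comp(...) && isclr(bits, index))
        return false;
    ++index;
}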
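
The skip logic can be tried outside DAOS with a small standalone sketch. The names `range_all_set`, `range_2_covered`, and `bit_is_clr` are hypothetical stand-ins (the last one for the `isclr` macro), and the `is_excluded_comp(...)` check is left out because its arguments are not shown in the description. It also assumes `end` is well below `UINT32_MAX`, so `(index | 7) + 1` cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for isclr(): true when bit `i` of the byte array is 0. */
static bool bit_is_clr(const uint8_t *bits, uint32_t i)
{
    return (bits[i >> 3] & (1u << (i & 7))) == 0;
}

/* True when every bit in [start, end] is set (hypothetical helper name). */
static bool range_all_set(const uint8_t *bits, uint32_t start, uint32_t end)
{
    uint32_t index = start;

    while (index <= end) {
        /* Whole byte set: skip to the first index of the next byte. */
        if (bits[index >> 3] == 0xFF) {
            index = (index | 7) + 1;
            continue;
        }
        if (bit_is_clr(bits, index))
            return false;
        ++index;
    }
    return true;
}

/* True when no index in [start, end] is clear in both bitmaps. */
static bool range_2_covered(const uint8_t *bits1, const uint8_t *bits2,
                            uint32_t start, uint32_t end)
{
    uint32_t index = start;

    while (index <= end) {
        uint32_t b = index >> 3;

        /* OR is all-ones: each index in this byte is set in at least one map. */
        if ((bits1[b] | bits2[b]) == 0xFF) {
            index = (index | 7) + 1;
            continue;
        }
        if (bit_is_clr(bits1, index) && bit_is_clr(bits2, index))
            return false;
        ++index;
    }
    return true;
}
```

Note that the skip is valid even when `start` is not byte-aligned: a byte equal to 0xFF has no clear bit anywhere, so nothing between `index` and the end of that byte needs to be inspected.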

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).
Original prompt

Create a new interactive placement debugging utility pl_debug under src/placement/tests in daos-stack/daos.

Background/goal:
We want a developer tool similar in spirit to jump_map_dist.c but interactive. The tool should construct an in-memory pool map and jump placement map from user-specified topology parameters and allow interactive commands to change pool component status and generate/print object layouts.

Repository/branch:

  • Repo: daos-stack/daos
  • Base branch: master

Binary/tool:

  • Name: pl_debug
  • Location: src/placement/tests/pl_debug.c (or .cc if needed)
  • Ensure it is built by the DAOS build system (CMake) in the src/placement/tests area.

Command-line options:

  • -n <number>: number of nodes
  • -r <number>: number of ranks per node
  • -t <number>: number of targets per rank

After parsing options, create:

  • an in-memory pool map with a hierarchy such that node is the fault domain
  • ranks distributed with r ranks per node across n nodes
  • t targets per rank
  • a jump placement map based on that pool map
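
As a rough illustration of the option handling and the topology arithmetic it implies, the following sketch parses -n, -r, and -t and derives the total target count. `struct topo` and all function names are hypothetical, the 1024 upper bound is an arbitrary cap chosen so the product fits in an `unsigned`, and the contiguous rank-to-node mapping is an assumption rather than something the prompt specifies. A real tool would more likely use `getopt`; the hand-rolled loop only keeps the example in ISO C.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three topology parameters. */
struct topo {
    unsigned nodes;          /* -n */
    unsigned ranks_per_node; /* -r */
    unsigned tgts_per_rank;  /* -t */
};

/* Parse a decimal count in [1, 1024]; the upper bound is an arbitrary cap. */
static bool parse_count(const char *s, unsigned *out)
{
    char *end;
    unsigned long v;

    if (s[0] < '0' || s[0] > '9')
        return false; /* rejects "", "-3", "+3", " 3" */
    v = strtoul(s, &end, 10);
    if (*end != '\0' || v == 0 || v > 1024)
        return false;
    *out = (unsigned)v;
    return true;
}

/* Fill *tp from "-n N -r R -t T" in any order; 0 on success, -1 on error. */
static int parse_topo(int argc, char *const argv[], struct topo *tp)
{
    int i;

    memset(tp, 0, sizeof(*tp));
    for (i = 1; i + 1 < argc; i += 2) {
        unsigned *dst;

        if (strcmp(argv[i], "-n") == 0)
            dst = &tp->nodes;
        else if (strcmp(argv[i], "-r") == 0)
            dst = &tp->ranks_per_node;
        else if (strcmp(argv[i], "-t") == 0)
            dst = &tp->tgts_per_rank;
        else
            return -1; /* unknown option */
        if (!parse_count(argv[i + 1], dst))
            return -1;
    }
    if (i != argc)
        return -1; /* trailing option without a value */
    if (tp->nodes == 0 || tp->ranks_per_node == 0 || tp->tgts_per_rank == 0)
        return -1; /* all three options are required */
    return 0;
}

/* Total targets in the pool map: n * r * t. */
static unsigned topo_total_targets(const struct topo *tp)
{
    return tp->nodes * tp->ranks_per_node * tp->tgts_per_rank;
}

/* Node (fault domain) owning `rank`, assuming contiguous rank numbering. */
static unsigned topo_rank_to_node(const struct topo *tp, unsigned rank)
{
    return rank / tp->ranks_per_node;
}
```

With `-n 4 -r 2 -t 8` this yields 8 ranks and 64 targets, and rank 5 lands on node 2 under the contiguous-numbering assumption.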

Interactive shell:
After setup, enter interactive mode (read-eval-print loop). Support commands:

  1. obj_class <str_name>

    • Convert str_name to an object class ID as defined in src/include/daos_obj_class.h
    • Set the selected class in a global/current variable used by gen_layout
    • Accept enum-style names (e.g. OC_EC_8P3GX); accepting numeric input as well is optional but preferred
    • Print the resolved class ID and name.
  2. gen_layout id=<number>

    • Use <number> as oid.lo
    • Use currently-selected obj class and current pool map/placement map to generate an object layout
    • Print the layout in a human-readable form including (at minimum):
      • shard index
      • target id
      • rank
      • target index
      • fseq
      • rebuilding flag if present
    • Include group/stripe group delineation if applicable.
  3. set_down rank=<number>|node=<number>

    • Set specified rank or node and all targets under it to status DOWN
    • Use ds_pool_map_tgts_update() to apply target state change to the pool map (as requested)
  4. set_downout rank=<number>|node=<number>

    • Set status to DOWNOUT (again via ds_pool_map_tgts_update())
  5. set_up rank=<number>|node=<number>

    • Set status to UP (via ds_pool_map_tgts_update())
  6. set_upin rank=<number>|node=<number>

    • Set status to UPIN (via ds_pool_map_tgts_update())
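
The four status commands share one argument shape, rank=<number>|node=<number>, so a single parser can serve all of them. The sketch below is one possible way to do that; `enum comp_kind`, `cmd_status`, and `parse_comp_arg` are hypothetical names, and the actual call to `ds_pool_map_tgts_update()` is deliberately omitted because its signature is not given here.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Which kind of pool component the argument names (hypothetical enum). */
enum comp_kind { COMP_RANK, COMP_NODE };

/* Map a shell command to its status name; NULL if it is not a status command. */
static const char *cmd_status(const char *cmd)
{
    static const struct {
        const char *cmd;
        const char *status;
    } tbl[] = {
        {"set_down", "DOWN"},
        {"set_downout", "DOWNOUT"},
        {"set_up", "UP"},
        {"set_upin", "UPIN"},
    };
    size_t i;

    for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        if (strcmp(cmd, tbl[i].cmd) == 0)
            return tbl[i].status;
    }
    return NULL;
}

/* Parse "rank=<number>" or "node=<number>"; 0 on success, -1 on error. */
static int parse_comp_arg(const char *arg, enum comp_kind *kind, unsigned *id)
{
    const char *num;
    char *end;
    unsigned long v;

    if (strncmp(arg, "rank=", 5) == 0)
        *kind = COMP_RANK;
    else if (strncmp(arg, "node=", 5) == 0)
        *kind = COMP_NODE;
    else
        return -1; /* unknown key */

    num = arg + 5;
    if (num[0] < '0' || num[0] > '9')
        return -1; /* empty value, sign, or leading junk */
    v = strtoul(num, &end, 10);
    if (*end != '\0' || v > UINT_MAX)
        return -1; /* trailing junk or out of range */
    *id = (unsigned)v;
    return 0;
}
```

A caller would still need to check that the parsed rank or node actually exists in the current pool map before changing its state.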

General shell behavior:

  • Provide help and quit/exit commands.
  • Validate inputs with clear error messages.
  • After any pool map update, ensure placement map sees the updated pool map (recreate placement map or refresh as needed).

Implementation notes:

  • Follow existing patterns in src/placement/tests and/or jump_map_dist.c.
  • Keep the tool self-contained; it should not require a running DAOS system.
  • Use DAOS internal APIs to create pool map and jump placement map.
  • Choose/initialize reasonable defaults for OID hi fields, object metadata, etc.

Testing:

  • Add minimal smoke-test instructions in comments or documentation.
  • Ensure it compiles on master.

Deliverables:

  • New source file(s) under src/placement/tests
  • CMake updates to build pl_debug
  • Any small helper changes required for compilation (avoid large refactors)

This pull request was created from Copilot chat.

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title [WIP] Add interactive placement debugging utility pl_debug Add interactive placement debugging utility pl_debug Mar 1, 2026
@daosbuild3 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/2/execution/node/305/log

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title Add interactive placement debugging utility pl_debug placement/tests: add interactive pl_debug tool with gen_layout mode= support Mar 1, 2026
@daosbuild3 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/3/execution/node/304/log

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title placement/tests: add interactive pl_debug tool with gen_layout mode= support placement/tests: add mode= and ver= options to pl_debug gen_layout command Mar 1, 2026
@daosbuild3 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/4/execution/node/305/log

…debug

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title placement/tests: add mode= and ver= options to pl_debug gen_layout command placement/tests: add ver= to gen_layout and clarify opc naming in pl_debug Mar 1, 2026

…en_layout

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title placement/tests: add class= parameter to gen_oid; fix -Wshadow build errors placement/tests: fix -Wshadow errors and wrong mode flags in pl_debug.c Mar 4, 2026
Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title placement/tests: fix -Wshadow errors and wrong mode flags in pl_debug.c placement/tests: fix wrong mode flags and add timing to gen_layout in pl_debug Mar 4, 2026
Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title placement/tests: fix wrong mode flags and add timing to gen_layout in pl_debug placement/tests: add pl_obj_place timing and output= option to gen_layout Mar 4, 2026
Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
Copilot AI changed the title placement/tests: add pl_obj_place timing and output= option to gen_layout placement/tests: add output= and gr= options to gen_layout in pl_debug Mar 4, 2026
@daosbuild3 (Collaborator)

Test stage Fault injection testing on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/23/execution/node/1000/log

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
@daosbuild3 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/24/execution/node/304/log

…itmap skipping

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
@daosbuild3 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/25/execution/node/304/log

Co-authored-by: gnailzenh <7268050+gnailzenh@users.noreply.github.com>
@daosbuild3 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17623/26/execution/node/304/log

3 participants