Skip to content

vulkan: add pipeline barriers for memcpy read/write operations#23770

Open
0cc4m wants to merge 2 commits into
masterfrom
0cc4m/vulkan-host-pipeline-barriers
Open

vulkan: add pipeline barriers for memcpy read/write operations#23770
0cc4m wants to merge 2 commits into
masterfrom
0cc4m/vulkan-host-pipeline-barriers

Conversation

@0cc4m
Copy link
Copy Markdown
Contributor

@0cc4m 0cc4m commented May 27, 2026

Overview

Insert Vulkan pipeline barriers before/after memcpy operations for host-visible memory. This resolves the Intel read issue in #22930, and I also added the opposite direction barriers for writes. I am not sure if this is the best way. @jeffbolznv @rillomas

Requirements

@0cc4m 0cc4m requested a review from a team as a code owner May 27, 2026 08:46
@github-actions github-actions Bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels May 27, 2026
Comment thread ggml/src/ggml-vulkan/ggml-vulkan.cpp Outdated
std::lock_guard<std::recursive_mutex> guard(src->device->mutex);
vk_context subctx = ggml_vk_create_temporary_context(src->device->compute_queue.cmd_pool);
ggml_vk_ctx_begin(src->device, subctx);
subctx->s->buffer->buf.pipelineBarrier(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this one conditionally like you have with host_write_dirty?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would that work? All write operations and all shaders write to buffers, so unless we track it on a buffer level in some way, it would always trigger, would it not?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to avoid back to back reads causing extra barriers and submits. So we could mark it dirty in graph_compute and then reset it here after we do the barrier.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm not totally sure that would be reliable e.g. in a multithreaded situation. So maybe this is fine. But I think we also need the barrier in the non-UMA path?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a simple dirty flag would be difficult to get right with edge cases. Something like an atomic submission/write counter might work, but add some complexity. To unblock #22930, I think it's best to merge this solution and improve it as a follow-up PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the same barrier in the else path?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to respond to that. I thought we only need pipeline barriers for "mixed access" gpu buffers, in this case due to the memcpy reading from the GPU buffer that shaders wrote to. If we use GPU commands to copy from there into a staging buffer, that's still GPU access. I guess that leaves whether host buffers need pipeline barriers, but I thought not.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may get lucky and transfer engines don't need the barrier in practice, but I think in terms of spec behavior we're supposed to do it. But if we've never seen any corruption in that path, I guess I'm OK with omitting it for now.

@0cc4m 0cc4m force-pushed the 0cc4m/vulkan-host-pipeline-barriers branch from f0e0f66 to 0a1ebc5 Compare May 27, 2026 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ggml changes relating to the ggml tensor library for machine learning Vulkan Issues specific to the Vulkan backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants