fix(linux/xdgportal): Properly support multiple screens by exposing pipewire streams as separate displays#4931
Conversation
1f4fa32 to
58efcab
Compare
|
@psyke83 Sorry to ping you here but can you think of a reason why the keyboard shortcut to switch displays would not work with portalgrab? I've been trying to do so after enumerating them properly for display_names (current state of this PR) but the switch_display_event from video.cpp does not seem to be firing or handled when using portalgrab and I can't figure what I'm missing. |
If you add some debug to video.cpp you'll see that your current code is hitting the |
|
Look: Sunshine/src/platform/linux/kmsgrab.cpp Lines 1221 to 1230 in b172a98 It seems that portalgrab is not checking the push_captured_image_cb callback the same way in the capture loop. The same logic may need to be added. I will look at this further later when I have time, if you don't figure it out. |
That is exactly what was missing to get the keyboard shortcut to work. Thanks! |
5fa6e6f to
6d7942e
Compare
|
I've got the basic functionality working now and can switch multiple displays using keyboard shortcuts. The main issue right now is when the same display currently actively streaming is requested to be switched to again (e.g. currently streaming display 0 and trying to switch to it via shortcut) then the stream will disconnect due to https://github.com/Kishi85/Sunshine/blob/6d7942e0c82aeedfca16b77735f0fc6cf716a4ec/src/platform/linux/portalgrab.cpp#L1294-L1309 which is causing the stream to disconnect with Problem here is that this is necessary to detect if the user stopped the stream via the XDG notification (according to @psyke83 notes in #4914 (comment)). Possible solutions that need more research:
|
6d7942e to
3a22dff
Compare
auto switch_display_event = mail::man->event<int>(mail::switch_display);
if (previous_width.load() == width && previous_height.load() == height && previous_pos_x.load() == pos_x && previous_pos_y.load() == pos_y && switch_display_event->peek()) {Next issue is when changing displays via shortcut too quickly there seems to be a possible race condition which can happen that freezes Sunshine hard until it is restarted and then it is exiting with: |
e097695 to
79773f8
Compare
Scratch that solution as I've had a typo (missing inversion) there and didn't realize that the event is already popped when the display gets reset by video.cpp and the portal re-initialzies. Back to square one on that.
Sunshine hanging when switching too quickly is still an issue. UPDATE: This could be related to chosen encoder, having tried vulkan and vaapi yields different results (hang vs. SEGV). UPDATE 2: More or less consistently reproducible when switching screens back and forth without waiting for the stream to update to the new screen. Does not seem to trigger if waiting for the stream to update to the new screen after switching plus a second or two more. |
36c8067 to
073e73d
Compare
|
The crashing (whether hang or segfault) may be due to the pipewire loop still being active during fake reinit. See this block here: Sunshine/src/platform/linux/portalgrab.cpp Lines 1363 to 1370 in b172a98 And here: Sunshine/src/platform/linux/portalgrab.cpp Lines 1393 to 1399 in b172a98 You may also need to set these same variables when you intend to signal an artificial reinit (i.e., when push_captured_image_cb returns false) to ensure the encoder fully stops. Regarding the race condition that causes a hang, one of the changes in #4875 resolves a hang after disconnect that happens when there are no screen updates happening. Since the on_process callback doesn't fire, the capture loop gets stuck. I worked around that by checking the pull_free_image_cb result during timeout, so you may want to see if you need to do a similar check elsewhere (or perhaps find a better fix). |
|
I'll stop force-pushing most things im doing right now and will be squashing the whole thing in the end after we've got all the kinks ironed out. A few things I've done and learned now:
With all these changes I seemingly have managed to get a stable reproducer for the crash (SEGV) by switching to the same display that is currently streaming. This is working on a single monitor setup as well, allowed by the implementation in video.cpp and should just fully reset the display (+pipewire stream) without segfaulting. The crash only occurs when switching to the same display. Switching to another display and then back to the first one does not seem to crash (as easily) unless you're switching very quickly but that might just be resulting in a switch to the same display internally in the end.
UPDATE: Testing this again and again leads to the same result: Segfault is only happening when trying to switch to the same screen that is already streaming even though the display reset/initialization looks the same in the log as when switching from one screen to another. It is properly reusing the session cache now and connecting to the same pipewire_node for each display. Could this be some quirk when repeatedly streaming the same node from pipewire? Switching to a different node and then back to the first one does not seem to trigger this issue. |
ed38479 to
12c20b7
Compare
|
After a lot of trial and error I've managed to cut the hangs when switching display down to only when very quickly, repeatedly switching (to the same? haven't tried others) display. You have to basically mash the shortcut to get a segfault. My theory for that is that when switching really quickly something in the capture logic combined with pipewire just cannot keep up (some thread sync issue?). This is still an issue but I'm wondering if it can be mitigated in video.cpp as it seems unreasonable to allow queuing another switch_display_event while the previous one's reset_display has not finished: Lines 1478 to 1484 in b172a98 All other issues I've had with this are basically ironed out now:
Things still todo before I'm likely to remove the draft status:
|
aa2ff1f to
b346dd1
Compare
|
Squashed, rebased and description+title updated. This is ready for review. There's still one known issue (that was there before this PR just masked, see description for details) but maybe someone reviewing this has an idea on how to fix or work around that. |
I've managed to fix this issue by invalidating the session cache during
Added a correlation of stream position/resolution to the monitors name by matching to Repeatedly testing both solutions has not shown any errors on my end. |
4ba63a8 to
96bdcf1
Compare
|
Some notes on the latest PR code: I'm testing on latest Arch, Plasma 6.6.3, running headless (DP-2 left, DP-1 set to primary on the right) via an EDID dump extracted from a HDMI dummy dongle inserted into the initramfs and using these kernel parameters to expose them to two DisplayPort outputs on my RX 6600:
Have you looked at the xdg_listener entries? The For context, the monitor name is also detected via wayland.cpp, but the string it prints is actually the description and not the name: Edit: I see that name and description are considered deprecated, so wl_output should by used instead. But it seems that something is not working correctly with the monitor name detection implemented in portalgrab for my configuration. |
From your logs it seems that the correlation with Then it's falling back to position/resolution matching and the already known behaviour. Could you check if in the wayland output just before those lines the viewport information is matching to the screen position/resolution information you see for the pipewire streams found? For comparison here's a working example (with bold/italic highlights) using my config (also Arch-based but CachyOS with Plasma 6.6.3, Fake-EDID is on HDMI-A-1 of my RX9070XT and taken from https://github.com/linuxhw/EDID, DP-1/2 are real monitors with DP-1 mirrored to HDMI-A-1 therefore not in the list): [2026-04-06 07:28:40.719]: Info: [wayland] Offset: 1920x0 The name correlation code (see https://github.com/Kishi85/Sunshine/blob/96bdcf1d1fc893aab4e95415b7402f9f1c2be3b9/src/platform/linux/portalgrab.cpp#L585-L609) is just reusing data from the same code that kmsgrab is during it's monitor correlation: Sunshine/src/platform/linux/wayland.cpp Line 541 in ba4db46 Correlation is done during @psyke83 It's interesting to see that the correlation does not work on your setup and I'm curious to see if matching up wayland log output to the streams found can help identify the issue here. |
c51bf18 to
61d9138
Compare
|
I'll provide some more detailed logs when I have access to the host pc later today. Regarding the default monitor selection, KDE has a dbus call you can use to get the active output display: I can't verify right now, but some Googling shows a similar method for GNOME: https://wiki.gnome.org/Initiatives(2f)Wayland(2f)Gaps(2f)DisplayConfig.html (there's a primary bool in the properties array). I'm not saying that you have to implement these (as it seems quite annoying to tailor detection to specific compositors), but I'm not aware of any generic methods to get the primary (or even just currently active) screen on Wayland. |
|
Perhaps we shouldn't worry about the primary monitor, but instead try to determine the active monitor from the pointer position provided by spa_meta_cursor metadata? From that we may be able to determine the position data and which screen it is currently active on. |
It's not just a bool if you're using more than 2 screens, then there's a whole screen priority list of monitors that you can modify (at least in KDE display settings) and the first in the list is considered primary.
I agree that using screen priorites would be good way to sort streams but is only feasible if we can leverage wayland.h/cpp to get those priorities from a standardized interface. I'll have a look into it when I find some time. But even if possible that should be done in a separate PR. It's big enough as it is now and I'd like to fix any remaining issues first and get this PR merged before improving on functionality further. |
Not sure why you would want to add pointer correlation here? The main problem is that video.cpp is doing the switching based on the display_name selected by index based on the F-key you press from the display_names vector. In case of a capture re-init due to resolution change it'll try to find the last selected display_name from the new display_names vector and if that fails it falls back with the first element of said vector. To make this work on resolution change we need to use a resolution/position independent display_name like the (connector) name from wl::monitors(). EDIT: To clarify what this PR does is that it provides all portal_display_names for video.cpp so that can call portal_display with the name it intends to switch to and initialize the correct stream for what is requested. The whole logic as to what is switched to is done by video.cpp. It also ensures things don't break by destroying the old display before it even enumerates display_names again and then initializing a new display based on what shortcut is pressed. Similar things happen for a capture::re-init due to resolution change. See video.cpp for details. |
…iffer from dbus during display names retrieval Fix capture issues (black- or greenscreen) when changing resolution due problems when doing session cache invalidation during portal display init. The pipewire stream is working properly when session cache check and invalidation are done during portal_display_names while discovering available displays, so do it there. Also make sure pipewire_fd opened by dbus_t are closed by it.
… usage pipewire_screenstream_t -> pipewire_streaminfo_t as it is only containing the metadata necessary to identify the stream streams_to_display_names -> portal_streams_to_display_names so it can be easily identified as portal related
61d9138 to
5aec74c
Compare
|
I've taken the time to cleanup the commits a bit and update a few commit messages with more detail to make them easier to understand. I can do another squash if necessary before this gets merged and/or put 69c05b1 into a separate PR. |
|
Here's a full log taken from the latest PR changes: My comments regarding the pointer location was aimed at figuring out which monitor is to be selected on first connect and persisted on resolution change. Perhaps that won't be needed when the display names are detected correctly? As-is, my setup is still not detecting the name correctly, defaults to the wrong monitor (DP-2), and resolution change on DP-1 still causes it to switch to DP-2. |
…hat as display_name To ensure display_name is independent of postion/resolution so swichting to the same display on resolution change based re-init works properly. On failing correlation fall back to position/resolution matching for a stream to have at least basic display switching working. To have uniquely distinguishable display_names prefix them with 'n' for monitor_name matching and 'p' for position/resolution matching.
…pipewire_streaminfo_t Reduce complexity in other code parts and have the display_name generation/matching done next to each other for better maintainability.
5aec74c to
be75524
Compare
|
The wl::monitors() correlation wasn't using logical_width/logical_height to match against so scaled displays wouldn't work correctly but after seeing @psyke83's first comment I've started looking and in the meantime we both came up with the same solution. So this is fixed now as well. The only remaining feature that would be nice to have is to do stream sorting additionally based on screen priorities (mainly to have the stream start on the primary screen) but that'll need changes to wayland.cpp/h to expose the necessary information (mainly the preferred/primary indicator). I'll try to implement this but might move it to a follow-up PR if it's a larger change as this PR is already doing quite a lot of things (although all are related to and necessary to make this feature work properly). UPDATE: From looking at other capture methods it looks like those have similar sorting issues (e.g. wlgrab takes the displays in the order that wl::monitors() provides them, which seems to be sorted alphabetically by wl_name). There also does not seem to be a unified Wayland interface for screen priorities/primary monitor (at least I've not found one so far). Adding sorting based on additional parameters is trivial but we have to get those parameters somehow and that's IMHO something for a follow-up PR as it will be a bit more involved. |
|
I agree with your assessment that primary/active detection is out of scope. I don't think Wayland has a direct method to probe the primary monitor, hence the suggestion to query the pipewire cursor position. But if it is feasible to do generically via Wayland, it belongs in a separate PR as it that could also be leveraged by kmsgrab. Until then, users can just arrange their displays so that the primary is leftmost. I'm seeing an issue with the xdg stop event. When stopping the stream via the xdg notification icon, it can't resume the session. Log is here; happens on KDE and GNOME. |
This is a tricky one and I might have an idea on how to handle this. I'll try to explain the issue here, maybe you have an idea:
|
|
@psyke83 I've changed the current implementation of XDG stop and made it so that it invalidates the session_cache after the capture errors out, making it possible to connect again without restarting sunshine. It's not the ideal solution I've described but should fix the immediate problem you noted. |
70a8637 to
78020a2
Compare
…validate session_cache This should fix the permanent error after closing the stream by ending it using the XDG portal notification until we can implement the XDG stop event properly.
78020a2 to
a404fd5
Compare
|
❌ The last analysis has failed. |
Description
This PR adds proper multi-monitor support to Sunshine's XDG portal grab by exposing all streams of a capture portal as separate screens (utilizing the multiple capture mode of the screencast portal).
Things this PR does:
Known issues:
Can segfault when trying to switch displays really quickly (by mashing the screen switch shortcut multiple times within a second) and even then it's not fully reproducible. This would have been happening even before this PR when just having a single stream but was masked due to the shortcut handling not being properly implemented in the capture logic (missing return on !push_captured_image_cb).This is a separate issue concerning other capturemethods as well (tested with KMS), see Coredump with SIGSEGV when switching displays too quickly #4943Screenshot
Issues Fixed or Closed
Roadmap Issues
Type of Change
Checklist
AI Usage