[24.04_linux-nvidia-6.17-next] PCI: mirror PI7C9X3G606GPC Port 4 BAR0#442
[24.04_linux-nvidia-6.17-next] PCI: mirror PI7C9X3G606GPC Port 4 BAR0#442nirmoy wants to merge 1 commit into
Conversation
44e1553 to
31881cf
Compare
Boro reviewSummaryNo issues found across the reviewed commits. Findings: no problems found Latest watcher review: open review Kernel deb build: successful (download debs, 4 files) Head: This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review. |
PR Validation ReportPatchscan ✅ No Missing FixesAll cherry-picked commits checked — no missing upstream fixes found. PR Lint ✅ All checks passedDetailsChecking 1 commits... Cherry-pick digest: ┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐ │ Local │ Referenced upstream / Patch subject │ Patch-ID │ Subject │ SoB chain │ ├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤ │ 1670e403ccc6 │ [SAUCE] pci: quirks: mirror pi7c9x3g606gpc port 4 bar0 │ N/A │ N/A │ nirmoyd │ └──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘ Lint: all checks passed. |
|
Does this PR need to be applied to the 6.18 reference kernel as well? |
|
Do you have tests (scripts) which can verify this patch set is applied and working? |
Yes, we need that for 6.18 too. I will create a PR for that as well
|
So if the patch is not applied to the kernel that message will not be in the dmesg. We should have a test that verifies that message is in the dmesg or the test should fail that the patch has not been applied, correct? |
Yes, ACK we should have a test may be a greenlit one to check the dmesg to verify the patch. |
31881cf to
5edb468
Compare
5edb468 to
e562a4a
Compare
|
Codex gave me this suggestion but not sure in this environment: NV-Kernels/drivers/pci/quirks.c Lines 6318 to 6320 in e562a4a pci_fixup_resume_early, and may resume the subordinate bus in pci_pm_resume_noirq() before normal resume fixups run: see pci-driver.c:963-L970 ( NV-Kernels/drivers/pci/pci-driver.c Lines 963 to 970 in 42cad5f pci-driver.c:581-L586 ( NV-Kernels/drivers/pci/pci-driver.c Lines 581 to 586 in 42cad5f still stale/zero. I’d move this to DECLARE_PCI_FIXUP_RESUME_EARLY or add an early resume fixup so the mirror is rewritten immediately after pci_restore_state() and before subordinate devices resume. |
e562a4a to
377a0dc
Compare
|
@nirmoy Can you confirm that this is limited to 32-bit bars? Is that because 64-bit bars are not possible in this configuration? |
It make sense to have this. I will update it |
The erratum describes the WAR as copying only BAR0 at offset 0x10. That is sufficient for a 32-bit BAR, but would not be complete for a 64-bit BAR because BAR1 contains the upper address bits. So I am defensively skipping the WAR for 64-bit BARs. The device can be configured with a 64-bit BAR0/BAR1 pair, but on our platform it is configured as a 32-bit BAR. I confirmed that from the BMC readback of the upstream BAR0 value, where the BAR type bits indicate a 32-bit memory BAR. |
Thanks for clarifying. If we end up sending this upstream we may need to relax the checking a bit. |
377a0dc to
c2c4176
Compare
Some Pericom/Diodes PI7C9X3G606GPC switches require downstream Port 4 BAR0 to mirror BAR0 of the immediate upstream port. Firmware may apply this during boot, but Linux PCI resource assignment can move the upstream BAR0 and leave Port 4 without the required mirror. Diodes confirmed that Tile0/P4 is OS-visible as device 04, function 0 on the bus below the upstream port. Add a final and early resume quirk for that downstream function. The quirk verifies that the immediate upstream bridge is the same switch, then writes Port 4 BAR0 from the upstream BAR0 after resource assignment and during early resume. Port 4 BAR0 may read back as zero even after a successful write, so the write must be validated by platform-specific means. Change-Id: I86222773f9bb321ea3e24df7fb1b4a3f84a008a4 Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
c2c4176 to
1670e40
Compare
|
I looked at the latest version and agree with change to resume early. I see that the 64-bit bar check was removed. We can do that, but then I think the code needs to properly handle 32 and 64-bit bars being fixed up - I should have been clearer when I said "relax the checking" in my prior comment. Here's what Codex has to say:
|
Summary
Validation
1670e403ccc6212fe92c2d678461da87897888b0.scripts/checkpatch.pl --strict --ignore GERRIT_CHANGE_ID --git HEADgit diff --check HEAD~1..HEADmake O=/tmp/nv-kernels-pr442-quirks-build -j$(nproc) drivers/pci/quirks.o6.17.0-1019-nvidia-64kon the Quark DUT: OS172.17.33.143via jumper10.22.18.250; BMC172.17.33.144.6.18.33-pr447-pericom-no64and showed the quirk still firing:journalctl -b -kscan forBTRFS error,I/O error,nvme.*timeout,device inaccessible,read-only,blk_update_request, andBuffer I/O errorreturned no matches.References
Launchpad: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.17/+bug/2154457
NVBug: https://nvbugspro.nvidia.com/bug/6205517
NVBug: https://nvbugspro.nvidia.com/bug/6134331
Test artifacts: http://baseos-internal-tools.nvidia.com:8003/