Syscall rewriter improvements #812

Merged
wdcui merged 8 commits into main from wdcui/pr1-trampoline-format on May 13, 2026

@wdcui
Member

@wdcui wdcui commented Apr 25, 2026

This PR changes how the return address is saved in rcx when syscalls are patched, so that it can be used to calculate the restart address. It also adds an empty trampoline to ELF files that contain no syscalls, so the runtime knows the file has already been patched.

@wdcui wdcui force-pushed the wdcui/pr1-trampoline-format branch from 75ea7ec to 0bcb6ff April 25, 2026 04:06
@wdcui wdcui marked this pull request as ready for review April 25, 2026 18:50
@wdcui wdcui force-pushed the wdcui/pr1-trampoline-format branch from a897889 to fd4fcb7 April 25, 2026 21:31
@wdcui
Member Author

wdcui commented Apr 25, 2026

This PR is ready for review. Ignore the rtld changes, since they will be removed after PR #810 is merged.

@jaybosamiya-ms jaybosamiya-ms added the expmt:shadow-kiln label (tag to quickly find the different PRs as part of the "shadow kiln" experiment) Apr 28, 2026
@wdcui wdcui force-pushed the wdcui/pr1-trampoline-format branch 2 times, most recently from d083228 to 097c515 May 7, 2026 03:25
@wdcui
Member Author

wdcui commented May 7, 2026

@CvvT and @sangho2 this PR is ready for your review. Thanks!

@CvvT
Contributor

CvvT commented May 8, 2026

Similar to how Linux handles syscall restart, we could update pt_regs->ip directly to where we would like to re-execute from.

Trampoline stub (LEA R11 dropped):
  stub_addr:                        ; ← restart target
      LEA  RSP, [RSP - 0x80]        ; 5 bytes
      LEA  RCX, [post_jmp]          ; 7 bytes — points inside trampoline
      JMP  [syscall_entry_slot]     ; 6 bytes
  post_jmp:                         ; offset 18 from stub_addr
      JMP  [guest_next_insn_slot]   

  Restart:
  ctx.rip = ctx.rcx - 18;   // 18 = sizeof(LEA RSP) + sizeof(LEA RCX) + sizeof(JMP)

For stubs that hook the syscall together with its preceding instructions, there is no post_jmp instruction, so we need to update the rewriter as well.
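
The offset arithmetic in the proposal above can be checked with a small model. This is only a sketch: the instruction sizes are the ones quoted in the comment, while `STUB_LAYOUT`, `instruction_offsets`, and `restart_rip` are illustrative names that do not come from the rewriter.

```python
# Model of the proposed stub layout. Sizes are the ones quoted in the comment;
# the names below are illustrative, not taken from the rewriter.
STUB_LAYOUT = [
    ("lea rsp, [rsp - 0x80]", 5),
    ("lea rcx, [post_jmp]", 7),
    ("jmp [syscall_entry_slot]", 6),
]


def instruction_offsets(layout):
    """Return ({mnemonic: offset from stub_addr}, offset of post_jmp)."""
    offsets, cursor = {}, 0
    for mnemonic, size in layout:
        offsets[mnemonic] = cursor
        cursor += size
    return offsets, cursor


def restart_rip(rcx, layout=STUB_LAYOUT):
    """Rewind from post_jmp (the value held in rcx) back to stub_addr."""
    _, post_jmp_offset = instruction_offsets(layout)
    return rcx - post_jmp_offset


OFFSETS, POST_JMP_OFFSET = instruction_offsets(STUB_LAYOUT)
```

Here `POST_JMP_OFFSET` is the sum of the three instruction sizes, which is the constant subtracted from `ctx.rcx` in the restart computation.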

@CvvT
Contributor

CvvT commented May 8, 2026

I don't see why we need the red zone.

@wdcui
Member Author

wdcui commented May 11, 2026

> I don't see why we need the red zone.

Good point. It's removed now.

@wdcui
Member Author

wdcui commented May 11, 2026

> Similar to how Linux handles syscall restart, we could update pt_regs->ip directly to where we would like to re-execute from.
>
> Trampoline stub (LEA R11 dropped):
>   stub_addr:                        ; ← restart target
>       LEA  RSP, [RSP - 0x80]        ; 5 bytes
>       LEA  RCX, [post_jmp]          ; 7 bytes — points inside trampoline
>       JMP  [syscall_entry_slot]     ; 6 bytes
>   post_jmp:                         ; offset 18 from stub_addr
>       JMP  [guest_next_insn_slot]
>
>   Restart:
>   ctx.rip = ctx.rcx - 18;   // 18 = sizeof(LEA RSP) + sizeof(LEA RCX) + sizeof(JMP)
>
> For hook syscall and its previous instructions, there is no post_jmp instruction; we need to update rewriter as well.

The restart computation is now ctx.rip = ctx.rcx - 6, since we don't need to rerun the LEA that loads RCX (and the LEA RSP is removed now).
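
A minimal sketch of that rewind, assuming the indirect jump is the 6-byte `jmp [rip+disp32]` form. The `Context` type and `rewind_for_restart` are hypothetical stand-ins for whatever register snapshot the restart handler actually receives.

```python
from dataclasses import dataclass

# jmp qword ptr [rip + disp32] encodes as FF 25 disp32, i.e. 6 bytes.
JMP_INDIRECT_SIZE = 6


@dataclass
class Context:
    """Hypothetical register snapshot; only the fields used here."""
    rip: int
    rcx: int


def rewind_for_restart(ctx: Context) -> Context:
    """Point rip at JMP [syscall_entry_slot] so the syscall is re-entered.

    rcx still holds post_jmp, so the LEA that loaded it is not rerun.
    """
    return Context(rip=ctx.rcx - JMP_INDIRECT_SIZE, rcx=ctx.rcx)
```

Because rcx is left untouched, the re-executed jump enters the callback with the same return target as the first attempt.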

@wdcui wdcui changed the title from "Add trampoline preamble with red zone reservation and R11 restart address" to "Syscall rewriter improvements" May 12, 2026
Contributor

@CvvT CvvT left a comment

LGTM, thanks!

wdcui and others added 8 commits May 12, 2026 20:10
…ress

The syscall rewriter now emits a 12-byte preamble before each trampoline
stub: LEA RSP,[RSP-0x80] to reserve the SysV 128-byte red zone, and
LEA R11,[RIP+disp32] to load the call-site restart address into R11 for
future SA_RESTART support.
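
For illustration, the 12-byte preamble from this first commit (dropped again by a later commit in this PR) could be encoded as follows. This is a sketch assuming the standard x86-64 encodings of the two LEA forms; the function names are illustrative and not taken from the rewriter.

```python
import struct


def encode_lea_rsp_redzone() -> bytes:
    """lea rsp, [rsp - 0x80]: REX.W, opcode 8D, ModRM 0x64, SIB 0x24, disp8 0x80."""
    return bytes([0x48, 0x8D, 0x64, 0x24, 0x80])


def encode_lea_r11_rip(insn_addr: int, target: int) -> bytes:
    """lea r11, [rip + disp32]; disp32 is relative to the end of the instruction."""
    size = 7
    disp = target - (insn_addr + size)
    return bytes([0x4C, 0x8D, 0x1D]) + struct.pack("<i", disp)


def encode_preamble(preamble_addr: int, restart_addr: int) -> bytes:
    """Concatenate the red zone reservation and the restart-address load."""
    lea_rsp = encode_lea_rsp_redzone()
    lea_r11 = encode_lea_r11_rip(preamble_addr + len(lea_rsp), restart_addr)
    return lea_rsp + lea_r11
```

The two instructions are 5 and 7 bytes long, which accounts for the 12 bytes mentioned above.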

Platform callbacks are renamed to syscall_callback_redzone to reflect
the new calling convention. The callback recovers the architectural RSP
with LEA R11,[RSP+128] and saves the restart address to TLS
(saved_restart_addr) before clobbering R11.

The rewriter also emits a size=0 header sentinel for binaries with no
syscall instructions, allowing the loader to distinguish 'checked, no
syscalls' from 'never processed.' The is_already_hooked check treats
size=0 as already-hooked. The rtld_audit library is updated to handle
both the new calling convention (red zone + R11) and size=0 binaries
(via mincore probe before reading trampoline memory).
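
The three states the loader needs to tell apart can be sketched as below. The header layout here (an 8-byte magic followed by a little-endian u64 size) is purely an assumption for illustration, since the real on-disk format is not described in this conversation; only the name `is_already_hooked` and the size=0 rule come from the commit message.

```python
import struct
from enum import Enum

# ASSUMPTION: illustrative header layout (8-byte magic + u64 LE trampoline size).
# The real trampoline header format is not shown in this PR conversation.
HYPOTHETICAL_MAGIC = b"TRAMPHDR"
HEADER = struct.Struct("<8sQ")


class HookState(Enum):
    NEVER_PROCESSED = "never processed"
    CHECKED_NO_SYSCALLS = "checked, no syscalls"
    HOOKED = "hooked"


def classify(header_bytes: bytes) -> HookState:
    """Classify a binary from the bytes where its header would live."""
    if len(header_bytes) < HEADER.size:
        return HookState.NEVER_PROCESSED
    magic, size = HEADER.unpack_from(header_bytes)
    if magic != HYPOTHETICAL_MAGIC:
        return HookState.NEVER_PROCESSED
    return HookState.CHECKED_NO_SYSCALLS if size == 0 else HookState.HOOKED


def is_already_hooked(header_bytes: bytes) -> bool:
    """size=0 counts as already hooked, so the binary is not processed again."""
    return classify(header_bytes) is not HookState.NEVER_PROCESSED
```

With a size=0 sentinel there is no trampoline body to read, which is consistent with the commit's note about probing before touching trampoline memory.
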
The pre-built binary had old-format trampolines that jumped to
syscall_callback_redzone without reserving the red zone, causing
an access violation (0xc0000005) on Windows.
…oline anchor

Per CvvT's review: the R11 restart-address load and the SysV red zone
preamble exist to support each other (the Windows callback's R11 shuttle
was the only thing that wrote to the guest stack pre-stack-switch, and
the red zone reservation existed to protect against that shuttle).

Replace the scheme so every stub satisfies:
  pt_regs.rcx - 6 == address of JMP [syscall_entry_slot]

RCX uniformly points at the in-trampoline `post_jmp`, which is now a
5-byte `JMP rel32 -> guest_text` tail in every stub. The callback's
`jmp rcx` return path lands at post_jmp and falls through to guest text.
The SA_RESTART handler can rewind ctx.rip with a single subtract, with
no per-variant layout knowledge.

Drops, both platforms:
- saved_restart_addr TLS slot
- Windows TEB-slot shuttle (mov gs:[0x28], r11; etc.)
- lea r11, [rsp+128] architectural-RSP recovery
- LEA RSP, [RSP-0x80] red zone preamble
- LEA R11, [RIP+disp32] restart-address load

The post_jmp tail adds 5 bytes per stub. Net: -7 bytes per stub vs.
the prior approach, with the SA_RESTART invariant baked in for free.

Snapshot regenerated to reflect the new 18-byte-per-stub layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
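
The final 18-byte layout and its invariant can be modelled as follows. This is a sketch under stated assumptions: the stub is taken to be exactly `lea rcx, [rip+disp32]` (7 bytes), `jmp [rip+disp32]` (6 bytes), and `jmp rel32` (5 bytes) in that order, using the standard x86-64 encodings; `encode_stub` and the constant names are illustrative, not the rewriter's own.

```python
import struct

LEA_RCX_SIZE, JMP_SLOT_SIZE, JMP_REL32_SIZE = 7, 6, 5
STUB_SIZE = LEA_RCX_SIZE + JMP_SLOT_SIZE + JMP_REL32_SIZE

# The dropped preamble was 12 bytes and the new tail adds 5.
OLD_PREAMBLE_SIZE = 12
NET_DELTA = JMP_REL32_SIZE - OLD_PREAMBLE_SIZE


def encode_stub(stub_addr: int, entry_slot: int, guest_next: int) -> bytes:
    """Encode one stub; every disp32 is relative to the end of its instruction."""
    jmp_slot_addr = stub_addr + LEA_RCX_SIZE
    post_jmp = jmp_slot_addr + JMP_SLOT_SIZE
    stub_end = post_jmp + JMP_REL32_SIZE
    # lea rcx, [rip + disp32]: loads the address of post_jmp into rcx
    lea_rcx = bytes([0x48, 0x8D, 0x0D]) + struct.pack("<i", post_jmp - jmp_slot_addr)
    # jmp qword ptr [rip + disp32]: jumps through syscall_entry_slot
    jmp_slot = bytes([0xFF, 0x25]) + struct.pack("<i", entry_slot - post_jmp)
    # jmp rel32: the post_jmp tail back into guest text
    jmp_tail = bytes([0xE9]) + struct.pack("<i", guest_next - stub_end)
    return lea_rcx + jmp_slot + jmp_tail
```

Decoding the LEA displacement from the emitted bytes gives the rcx value the callback would see, and subtracting 6 from it lands on the indirect jump regardless of where the stub is placed.
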
@wdcui wdcui force-pushed the wdcui/pr1-trampoline-format branch from 77b4540 to 9faf56b May 13, 2026 03:10
@wdcui wdcui enabled auto-merge May 13, 2026 03:10
@github-actions

🤖 SemverChecks 🤖 No breaking API changes detected

Note: this does not mean API is unchanged, or even that there are no breaking changes; simply, none of the detections triggered.

@wdcui wdcui added this pull request to the merge queue May 13, 2026
Merged via the queue into main with commit e728793 May 13, 2026
13 checks passed
@wdcui wdcui deleted the wdcui/pr1-trampoline-format branch May 13, 2026 03:27

Labels

expmt:shadow-kiln Tag to quickly find the different PRs as part of the "shadow kiln" experiment.

3 participants