fix sudo TTY race and terminal state corruption#4052
fix sudo TTY race and terminal state corruption#4052niltonperimneto wants to merge 4 commits intomicro-editor:masterfrom
Conversation
|
|
||
| // Init creates and initializes the tcell screen | ||
| func Init() error { | ||
| capturePristineState() |
There was a problem hiding this comment.
All this capture/restore seems to address a separate issue, different from the issue of the race between concurrent sudo processes (although triggered by it, among other things)? So it would be better to do this in a separate commit, to not mix both issues together?
And while we're at it, the commit message of that separate commit could explain why exactly this separate issue is actually an issue and needs to be fixed?
There was a problem hiding this comment.
Fair enough. I just don't actually know how to split commits
There was a problem hiding this comment.
Just implement both things in separate commits and use git push --force to update the history of commits in the PR.
There was a problem hiding this comment.
Otherwise (i.e. if you just keep adding new commits on top of existing ones), the PR will be just unreviewable.
There was a problem hiding this comment.
I think it is done. I rebased into two separate commits and force pushed.
There was a problem hiding this comment.
As said, this capturePristineState() and restorePristineTermState() is nothing we do in micro, because it is done by our dependency tcell and usually this works as long as micro doesn't crash before it can restore the terminal with the help of tcell.
So from my point of view we should drop ab06490.
cmd/micro/micro.go
Outdated
| if screen.Screen != nil { | ||
| screen.Screen.Fini() | ||
| } | ||
| screen.RestorePristineState() |
There was a problem hiding this comment.
Makes sense to move this into the above if screen.Screen != nil?
And then, we might introduce a nice helper screen.Fini() doing both screen.Screen.Fini() and this thing?
Hmm... BTW why restore pristine state when restarting the screen in TempStart() -> screen.Init(), not when stopping the screen in TempFini() already?
And then we won't need a special case for screen.Init() called first time, we can just capture new pristine state in screen.Init() every time? That is both simpler and more correct?
There was a problem hiding this comment.
Hmm... Actually all this capture/restore is already done by tcell, inside screen.Screen.Init() and screen.Screen.Fini(), isn't it?
In upstream tcell:
capture: https://github.com/gdamore/tcell/blob/e62057809ee3eba9cb78a56c76d0612d8bb89fc2/tty/tty_unix.go#L83
restore: https://github.com/gdamore/tcell/blob/e62057809ee3eba9cb78a56c76d0612d8bb89fc2/tty/tty_unix.go#L108
Micro happens to use a very outdated fork of tcell, but it does that too:
capture: https://github.com/micro-editor/tcell/blob/afe1210c81d891e8e27d166e767f69df56681f47/tscreen_unix.go#L37
restore: https://github.com/micro-editor/tcell/blob/afe1210c81d891e8e27d166e767f69df56681f47/tscreen_unix.go#L69
So, this custom implementation of capture/restore seems redundant (it solves an already solved problem)?
There was a problem hiding this comment.
This is the key point worth clarifying: tcell calls term.MakeRaw() on every Init() call, which snapshots whatever terminal state is current at that moment. After a failed child process corrupts the TTY, tcell's subsequent Init() faithfully snapshots the corrupted state and uses it as the baseline for Fini(). The fix captures the state once, before the first MakeRaw(), and restores it before each re-init so that tcell always starts from a clean baseline. At least it is what I was thinking when I wrote it
There was a problem hiding this comment.
Got it. Yeah, seems reasonable.
I also see that, for instance, vim also restores the initial "pristine" state, not the one left by the last shell command it ran.
More precisely, in the case when I "suspend" vim via Ctrl-Z (SIGSTOP) and then resume it via fg (SIGSTOP), I see it respects the tty state left by this shell while it was suspended. For example, if I run stty -echo to disable echo in this shell before resuming vim, then after I exit vim, I see echo is still disabled.
But, in the case when I let vim itself execute some shell command, then vim ignores the tty state left by this command, and restores the original state instead. For example, if I run :!(stty -echo; cat) command in vim, then echo is disabled in this sub-shell, but after exiting vim, echo is enabled.
And that makes sense.
So, it seems your fix is doing the right thing overall.
But it seems this fix belongs in the tcell library, rather than in micro itself. It seems ugly, and maybe potentially fragile, to let micro itself mess with the tty, without tcell knowing about it?
I see the upstream tcell has these APIs:
// Suspend pauses input and output processing. It also restores the
// terminal settings to what they were when the application started.
// This can be used to, for example, run a sub-shell.
Suspend() error
// Resume resumes after Suspend().
Resume() errorUnfortunately, at least from the code it looks like Resume() doesn't really restore the initial state captured in Init(), it just re-captures it once again. Maybe that is not the intended behavior, and should be fixed in tcell?
So maybe what we should do is: backport Suspend() and Resume() from upstream to our fork of tcell, and fix them to make them work as we want them to? (And then probably also upstream the fix to tcell, since, I guess, it would be useful not just for micro.)
And then in micro we would be just using Suspend() and Resume() (instead of those tricky repeated Fini()/Init() calls with a bunch of redundant re-initialization in Init() every time).
There was a problem hiding this comment.
So maybe what we should do is: backport
Suspend()andResume()from upstream to our fork of tcell, and fix them to make them work as we want them to?
Exactly! This is something, which definitely belongs to tcell instead of micro.
Therefore I vote to drop ab06490.
link: micro-editor#4052 (comment) Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com>
When saving a protected file, micro spawned two sudo dd subprocesses concurrently: a truncation dd and a write dd. Both competed for the TTY to prompt for the sudo password, racing on TCGETS2/TCSETS2 ioctls and deadlocking stdin routing. This strictly fails with sudo-rs, which enforces auditable escalation boundaries. Fix by removing notrunc and the separate truncation subprocess entirely. A single dd subprocess now handles truncation and write in one shot, making the race impossible by construction. The conv=fsync guarantee is preserved on platforms that support it. Move writeBackup() before openFile() in safeWrite() for the sudo path so the backup always exists before dd truncates the file. If openFile() fails after the backup is created, return OverwriteError so the user knows their data is safe in the backup. The non-sudo path is unchanged. Link: micro-editor#4050 Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com>
After a failed sudo subprocess, TempStart() called tcell Init() directly on top of the corrupted TTY state left by the dead child. Since tcell captures the current terminal state on every Init() call via MakeRaw(), it would snapshot the corrupted state and use it as the baseline for Fini(). On exit, this poisoned configuration was written back to the parent shell. Fix by capturing the canonical terminal state once at startup, before tcell's first MakeRaw() call. In TempStart(), restore that pristine state before calling Init() so tcell always starts from a clean baseline regardless of what a failed child process left behind. Also call RestorePristineState() in exit() and the panic handler as a defense in depth measure. Add a screen.Fini() helper that combines Screen.Fini() and RestorePristineState() to remove duplication in exit paths. Align build constraints with tcell's Unix platform list to cover aix and zos, which were previously falling through to no-op stubs. Link: micro-editor#4050 Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com>
internal/buffer/save.go
Outdated
| backupName, resolveName, err := b.writeBackup(path) | ||
| // When using sudo, create the backup before opening the file because | ||
| // openFile() truncates the target immediately. For non-sudo saves, | ||
| // openFile() does not truncate on open, so the existing order is kept. |
There was a problem hiding this comment.
"The existing order" is gonna sound confusing. It refers to the old behavior before this change, but this is a comment in the code, not a changelog. So someone reading this comment will not know what "the existing order" refers to.
So e.g. "...For non-sudo saves, openFile() does not truncate on open, i.e. open is non-destructive, so only create the backup after open succeeds." ?
| } | ||
|
|
||
| // Backup saved, so cancel pending periodic backup, if any | ||
| delete(requestedBackups, b) |
There was a problem hiding this comment.
What about this? We should cancel pending backups right after saving the backup in the sudo case as well?
Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com>
moved the delete(requestedBackups, b) right after writeBackup() in the sudo path so it runs before openFile() can fail. As suggested by dmaluka Link: micro-editor#4052 Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com>
This was mentioned in #4050
If you want to know how I came to this bug and why I'm making this commit.
In short: two sudo dd subprocesses (truncate and write) were spawned
concurrently, racing on TCGETS2/TCSETS2 ioctls for the password
prompt. This strictly fails with sudo-rs, which enforces auditable
escalation boundaries. Serialize them by running truncation to
completion before starting the write process.
After a failed subprocess, TempStart() re-initialized tcell on top
of the corrupted TTY state left by the dead child, poisoning the
parent shell on exit. Fix this by capturing the canonical terminal
state once at startup and restoring it in TempStart() before
tcell's Init() call.
Also close pipe fds on error paths to prevent fd leaks, wire
truncation stderr so sudo errors are visible, and align build
constraints with tcell's Unix platform list (aix, zos).
Signed-off-by: Nilton Perim Neto niltonperimneto@gmail.com