Update diagnostics docs for dotnet-trace collect-linux #52273
mdh1418 wants to merge 1 commit into dotnet:main from
Update 3 docs to reflect dotnet-trace collect-linux capabilities:

- dotnet-trace.md: Add symbol resolution section for collect-linux
- eventpipe.md: Add EventPipe (user_events) column to comparison table, document how EventPipe can emit events as user_events for unified managed + native trace collection on Linux
- debug-highcpu.md: Integrate collect-linux as Linux alternative, clarify that safe-point bias and managed-only callstacks apply on all platforms

Key points documented:

- Native debug symbols must be on disk for symbol resolution
- No environment variables or process restarts needed
- EventPipe (user_events) enables unified tracing on Linux (.NET 10+)
- Still framed as preview feature

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Updates .NET diagnostics documentation to reflect dotnet-trace collect-linux capabilities, including unified managed + native tracing on Linux via EventPipe user_events, and guidance for symbol resolution.
Changes:
- Add guidance on native symbol resolution when using `dotnet-trace collect-linux`.
- Update EventPipe vs. ETW/perf_events comparison to include EventPipe `user_events` (Linux/.NET 10+).
- Update the high-CPU troubleshooting doc to position `collect-linux` as the Linux alternative for kernel-aware profiling.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| docs/core/diagnostics/eventpipe.md | Adds user_events mode explanation and updates the comparison table. |
| docs/core/diagnostics/dotnet-trace.md | Documents native runtime symbol resolution workflow for collect-linux. |
| docs/core/diagnostics/debug-highcpu.md | Mentions collect-linux as a Linux option alongside existing profiling guidance. |
Comments suppressed due to low confidence (1)
docs/core/diagnostics/debug-highcpu.md:180
- This section introduces `dotnet-trace collect-linux` as the Linux option that removes the limitations, but the Linux walkthrough that follows still only documents using `perf` + `DOTNET_PerfMapEnabled`. If the intent is to integrate `collect-linux` as an alternative, consider adding an explicit alternative set of steps (or a note) in the Linux tab that points to `collect-linux`.
When analyzing an app with high CPU usage, use a profiler to understand what the code is doing. `dotnet-trace` works on all operating systems, but safe-point bias and managed-only callstacks limit it to more general information than a kernel-aware profiler like ETW for Windows or `perf` for Linux. On Linux, [`dotnet-trace collect-linux`](dotnet-trace.md#dotnet-trace-collect-linux) eliminates these limitations by combining EventPipe with OS-level perf_events in a single unified trace. If your performance investigation involves only managed code, `dotnet-trace collect` is generally sufficient.
### [Linux](#tab/linux)
The `perf` tool can be used to generate .NET Core app profiles. We will demonstrate this tool, although dotnet-trace could be used as well. Exit the previous instance of the [sample debug target](/samples/dotnet/samples/diagnostic-scenarios).
| `collect-linux` captures native frames in callstacks. To resolve native method names for runtime libraries (such as `libcoreclr.so`), place the corresponding debug symbol files on disk beside the libraries. Without these symbols, native frames appear as unresolved addresses in the trace. | ||
| Unlike [`perfcollect`](./trace-perfcollect-lttng.md), `collect-linux` doesn't require you to set environment variables like `DOTNET_PerfMapEnabled` or `DOTNET_EnableEventLog` before starting your application. `collect-linux` dynamically enables perfmap generation for JIT-compiled code when the trace begins, so you don't need to restart any .NET processes. |
"perfmap" should be "perf map" (or "perf maps") to match the terminology used elsewhere (for example, "Export perf maps and jit dumps").
| dotnet-symbol --symbols /usr/share/dotnet/shared/Microsoft.NETCore.App/10.0.0/lib*.so | ||
| ``` | ||
| 1. Place the downloaded `.so.dbg` files beside the runtime libraries they correspond to (for example, `libcoreclr.so.dbg` next to `libcoreclr.so`). If you run `dotnet-symbol` from the runtime directory, it places the symbols there automatically. |
The sentence about needing to run dotnet-symbol from the runtime directory is misleading. By default, dotnet-symbol writes next to the input file (or you can use -o/--output), and placing .so.dbg files under /usr/share/dotnet/... typically requires elevated permissions. Consider rewording to describe the default output behavior and mention using sudo or --output + copy if needed.
| 1. Place the downloaded `.so.dbg` files beside the runtime libraries they correspond to (for example, `libcoreclr.so.dbg` next to `libcoreclr.so`). If you run `dotnet-symbol` from the runtime directory, it places the symbols there automatically. | |
| 1. Place the downloaded `.so.dbg` files beside the runtime libraries they correspond to (for example, `libcoreclr.so.dbg` next to `libcoreclr.so`). By default, `dotnet-symbol` writes symbol files next to each input file. If your runtime libraries live under a protected path such as `/usr/share/dotnet/...`, run `dotnet-symbol` with elevated permissions (for example, by using `sudo`), or use the `-o`/`--output` option to write to a writable directory, then copy the `.so.dbg` files beside the runtime libraries. |
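As a way to verify the placement this suggestion describes, the following minimal sketch lists runtime libraries that have no `.so.dbg` file beside them. It is not part of the PR. The function name `missing_symbols` is hypothetical, and the sketch assumes symbol files are named by appending `.dbg` to the library file name, as in the `libcoreclr.so.dbg` example.

```bash
#!/usr/bin/env bash
# Print the name of every lib*.so in a directory that has no matching
# <library>.dbg file beside it.
# Assumption: symbol files are named <library>.dbg (for example,
# libcoreclr.so.dbg next to libcoreclr.so).
missing_symbols() {
  local dir="$1" lib
  for lib in "$dir"/lib*.so; do
    [ -e "$lib" ] || continue            # the glob matched nothing
    [ -e "$lib.dbg" ] || basename "$lib" # report libraries without symbols
  done
}

# Example call (path taken from the PR; adjust to the installed runtime version):
# missing_symbols /usr/share/dotnet/shared/Microsoft.NETCore.App/10.0.0
```

An empty result suggests every matched library has a symbol file next to it. The check only looks at file names, so it cannot confirm that a symbol file matches the library build.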
| EventPipe is part of the .NET runtime and is designed to work the same way across all the platforms .NET Core supports. This allows tracing tools based on EventPipe, such as `dotnet-counters`, `dotnet-gcdump`, and `dotnet-trace`, to work seamlessly across platforms. | ||
| However, because EventPipe is a runtime built-in component, its scope is limited to managed code and the runtime itself. EventPipe events include stacktraces with managed code frame information only. If you want events generated from other unmanaged user-mode libraries, CPU sampling for native code, or kernel events you should use OS-specific tracing tools such as ETW or perf_events. On Linux the [perfcollect tool](./trace-perfcollect-lttng.md) helps automate using perf_events and [LTTng](https://en.wikipedia.org/wiki/LTTng). | ||
| However, because EventPipe is a runtime built-in component, its scope is limited to managed code and the runtime itself. Without other tracing tools, EventPipe events include stacktraces with managed code frame information only. To get events from other unmanaged user-mode libraries, CPU sampling for native code, or kernel events, use OS-specific tracing tools such as ETW or perf_events. On Linux, the [perfcollect tool](./trace-perfcollect-lttng.md) helps automate using perf_events and [LTTng](https://en.wikipedia.org/wiki/LTTng). |
Use "stack traces" (two words) instead of "stacktraces" for consistency with other diagnostics docs.
| However, because EventPipe is a runtime built-in component, its scope is limited to managed code and the runtime itself. Without other tracing tools, EventPipe events include stacktraces with managed code frame information only. To get events from other unmanaged user-mode libraries, CPU sampling for native code, or kernel events, use OS-specific tracing tools such as ETW or perf_events. On Linux, the [perfcollect tool](./trace-perfcollect-lttng.md) helps automate using perf_events and [LTTng](https://en.wikipedia.org/wiki/LTTng). | |
| However, because EventPipe is a runtime built-in component, its scope is limited to managed code and the runtime itself. Without other tracing tools, EventPipe events include stack traces with managed code frame information only. To get events from other unmanaged user-mode libraries, CPU sampling for native code, or kernel events, use OS-specific tracing tools such as ETW or perf_events. On Linux, the [perfcollect tool](./trace-perfcollect-lttng.md) helps automate using perf_events and [LTTng](https://en.wikipedia.org/wiki/LTTng). |
| Starting in .NET 10, EventPipe on Linux can emit events as [user_events](https://docs.kernel.org/trace/user_events.html), enabling collection of managed events, OS/kernel events, and native callstacks in a single unified trace. This mode requires admin/root privileges and Linux kernel 6.4+. For more information, see [`dotnet-trace collect-linux`](./dotnet-trace.md#dotnet-trace-collect-linux). | ||
| Another major difference between EventPipe and ETW/perf_events is admin/root privilege requirement. To trace an application using ETW or perf_events you need to be an admin/root. Using EventPipe, you can trace applications as long as the tracer (for example, `dotnet-trace`) is run as the same user as the user that launched the application. |
The privilege requirement paragraph now reads like EventPipe never requires admin/root, but the new user_events mode does. Consider clarifying that the "same user" rule applies to the default EventPipe session, while EventPipe (user_events) requires admin/root.
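The two requirements stated in the diff for the user_events mode (admin/root privileges and Linux kernel 6.4+) could be checked before starting a trace. The following is a minimal sketch and not part of the PR. `kernel_at_least` and `preflight` are hypothetical helper names, and the parsing assumes that `uname -r` output begins with `major.minor`.

```bash
#!/usr/bin/env bash
# Succeed if a kernel release string (for example, "6.8.0-45-generic")
# is at least the requested major.minor version.
kernel_at_least() {
  local release="$1" want_major="$2" want_minor="$3"
  local major minor
  major="${release%%.*}"        # text before the first dot
  minor="${release#*.}"         # text after the first dot
  minor="${minor%%[!0-9]*}"     # keep only the leading digits
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Print a message for each requirement that the current session doesn't meet.
preflight() {
  kernel_at_least "$(uname -r)" 6 4 ||
    echo "Kernel is older than 6.4: the user_events mode isn't available."
  [ "$(id -u)" -eq 0 ] ||
    echo "Not running as root: collect-linux requires admin/root privileges."
}
```

Calling `preflight` prints nothing when both conditions hold. The default EventPipe session used by `dotnet-trace collect` doesn't need either check, which is the distinction the comment above asks the doc to make explicit.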
| ```bash | ||
| dotnet tool install -g dotnet-symbol | ||
| ``` | ||
| 1. Download the debug symbols for your runtime version. For example, if your runtime is installed at `/usr/share/dotnet/shared/Microsoft.NETCore.App/10.0.0`: | ||
| ```bash |
These code blocks run dotnet commands; this article generally uses dotnetcli fenced blocks for CLI examples. Consider switching the fence info strings from bash to dotnetcli here for consistency and proper styling.
| ```bash | |
| dotnet tool install -g dotnet-symbol | |
| ``` | |
| 1. Download the debug symbols for your runtime version. For example, if your runtime is installed at `/usr/share/dotnet/shared/Microsoft.NETCore.App/10.0.0`: | |
| ```bash | |
| ```dotnetcli | |
| dotnet tool install -g dotnet-symbol |
- Download the debug symbols for your runtime version. For example, if your runtime is installed at `/usr/share/dotnet/shared/Microsoft.NETCore.App/10.0.0`:
brianrob left a comment
LGTM. A couple of comments below.
| When analyzing an app with high CPU usage, you need a diagnostics tool that can provide insights into what the code is doing. The usual choice is a profiler, and there are different profiler options to choose from. `dotnet-trace` can be used on all operating systems, however, its limitations of safe-point bias and managed-only callstacks result in more general information compared to a kernel-aware profiler like 'perf' for Linux or ETW for Windows. If your performance investigation involves only managed code, generally `dotnet-trace` will be sufficient. | ||
| When analyzing an app with high CPU usage, use a profiler to understand what the code is doing. `dotnet-trace` works on all operating systems, but safe-point bias and managed-only callstacks limit it to more general information than a kernel-aware profiler like ETW for Windows or `perf` for Linux. On Linux, [`dotnet-trace collect-linux`](dotnet-trace.md#dotnet-trace-collect-linux) eliminates these limitations by combining EventPipe with OS-level perf_events in a single unified trace. If your performance investigation involves only managed code, `dotnet-trace collect` is generally sufficient. | ||
I would say that dotnet-trace collect is often sufficient for managed code investigations within a single process. But often not knowing what native runtime frames are on the stack makes the investigations difficult. I would recommend dotnet-trace collect-linux on Linux for any .NET 10+ investigations. The set of tools is the same, so it seems ideal to bias towards a higher fidelity trace.
Fully agree on the recommendation. In terms of how we phrase it in this doc I'd suggest we keep the initial blurb above very simple, just mentioning that 'dotnet-trace collect' is a cross-platform option and that depending on OS and .NET version improved capabilities may be available. Then in the tabs below we can give more refined guidance that is Windows or Linux specific.
In order to make the doc self-consistent if we mention dotnet-trace collect-linux as a recommended option then we need to provide a guide that tells people how to use it. Right now the section below only tells users about 'perf'.
| 1. Place the downloaded `.so.dbg` files beside the runtime libraries they correspond to (for example, `libcoreclr.so.dbg` next to `libcoreclr.so`). If you run `dotnet-symbol` from the runtime directory, it places the symbols there automatically. | ||
| After you place the symbols, `collect-linux` resolves native method names when it processes the trace. |
| After you place the symbols, `collect-linux` resolves native method names when it processes the trace. | |
| After you place the symbols, `collect-linux` resolves native method names when it collects the trace. |
| ## (Linux-only) Collect a machine-wide trace using dotnet-trace | ||
| ### Get symbols for native runtime frames |
I thought Beau told us earlier that acquiring debug binaries was unnecessary and has no bearing on the collect-linux stackwalking results? Did that prove to be untrue?
I'll need to investigate that, I might still be hitting the non-deterministic symbol resolution issue.
| |Can resolve native callstacks|No|Yes|Yes| | ||
| |Feature|EventPipe|EventPipe (user_events)|ETW|perf_events| | ||
| |-------|---------|----------------------|---|-----------| | ||
| |Cross-platform|Yes|No (Linux only)|No (only on Windows)|No (only on supported Linux distros)| |
| |Cross-platform|Yes|No (Linux only)|No (only on Windows)|No (only on supported Linux distros)| | |
| |Cross-platform|Yes|No (only on supported Linux distros)|No (only on Windows)|No (only on supported Linux distros)| |