Add dotnet-trace to Debug ThreadPool Starvation #46251

Open. 1 commit to be merged into base: main

verdie-g (Contributor) commented May 18, 2025

Summary

In .NET 9, a new event was added (dotnet/runtime#94737) to help diagnose thread pool starvation. It is documented in dotnet-trace #40852.

In this PR, I'm adding a section to Debug ThreadPool Starvation to show how to collect this event and analyze it using both PerfView and a cross-platform tool that I made in Blazor.

cc @kouvel @noahfalk who helped implement this event


Internal previews

| 📄 File | 🔗 Preview link |
| --- | --- |
| docs/core/diagnostics/debug-threadpool-starvation.md | Debug ThreadPool Starvation |

verdie-g requested review from tommcdon and a team as code owners on May 18, 2025 21:22
dotnetrepoman bot added this to the May 2025 milestone on May 18, 2025
dotnet-policy-service bot added the dotnet-fundamentals/svc and community-contribution labels on May 18, 2025
tommcdon requested a review from noahfalk on May 19, 2025 00:50
noahfalk (Member) left a comment

Thanks for putting these docs together! I made some suggestions inline and probably the docs team (@gewarren or @CamSoper) will have suggestions as well.

@@ -7,7 +7,7 @@ ms.date: 04/19/2022

# Debug ThreadPool Starvation

-**This article applies to: ✔️** .NET Core 3.1 and later versions
+**This article applies to: ✔️** .NET 9.0 and later versions
Member

The metrics shown by default in dotnet-counters changed in .NET 9.0, but the article still refers to pre-9.0 metrics. I know it wasn't the focus of your change, but if we are changing this to a 9.0 guide we should change that part to make it consistent.

- [Sample web app](/samples/dotnet/samples/diagnostic-scenarios) to demonstrate ThreadPool starvation behavior
- [Bombardier](https://github.com/codesenberg/bombardier/releases) to generate load for the sample web app
- [dotnet-counters](dotnet-counters.md) to observe performance counters
- [dotnet-stack](dotnet-stack.md) to examine thread stacks
- [dotnet-trace](dotnet-trace.md) to collect wait events
- Optional: [PerfView](https://github.com/microsoft/perfview/releases) to analyze .nettrace files
Member

Suggested change
-- Optional: [PerfView](https://github.com/microsoft/perfview/releases) to analyze .nettrace files
+- Optional: [PerfView](https://github.com/microsoft/perfview/releases) to analyze the wait events

It's true that it analyzes .nettrace files, of course, but this way it's clearer that the output of dotnet-trace will be the input of PerfView.

@@ -192,7 +177,9 @@ On .NET 6 you should observe the pool increase the thread count more quickly tha

## Resolving ThreadPool starvation

-To eliminate ThreadPool starvation, ThreadPool threads need to remain unblocked so that they're available to handle incoming work items. There are two ways to determine what each thread was doing, either using the [dotnet-stack](dotnet-stack.md) tool or capturing a dump with [dotnet-dump](dotnet-dump.md) that can be viewed in [Visual Studio](/visualstudio/debugger/using-dump-files). dotnet-stack can be faster because it shows the thread stacks immediately on the console, but Visual Studio dump debugging offers better visualizations that map frames to source, Just My Code can filter out runtime implementation frames, and the Parallel Stacks feature can help group large numbers of threads with similar stacks. This tutorial shows the dotnet-stack option. See the [diagnosing ThreadPool starvation tutorial video](/shows/on-net/diagnosing-thread-pool-exhaustion-issues-in-net-core-apps) for an example of investigating the thread stacks using Visual Studio.
+To eliminate ThreadPool starvation, ThreadPool threads need to remain unblocked so that they're available to handle incoming work items. There are multiple ways to determine what each thread was doing: using the [dotnet-stack](dotnet-stack.md) tool, capturing a dump with [dotnet-dump](dotnet-dump.md) that can be viewed in [Visual Studio](/visualstudio/debugger/using-dump-files), or collecting wait events with [dotnet-trace](dotnet-trace.md). dotnet-stack can be faster because it shows the thread stacks immediately on the console, but Visual Studio dump debugging offers better visualizations that map frames to source, Just My Code can filter out runtime implementation frames, and the Parallel Stacks feature can help group large numbers of threads with similar stacks. This tutorial first shows the dotnet-stack option, then demonstrates how to use dotnet-trace. See the [diagnosing ThreadPool starvation tutorial video](/shows/on-net/diagnosing-thread-pool-exhaustion-issues-in-net-core-apps) for an example of investigating the thread stacks using Visual Studio.
Member

Suggested change
-To eliminate ThreadPool starvation, ThreadPool threads need to remain unblocked so that they're available to handle incoming work items. There are multiple ways to determine what each thread was doing: using the [dotnet-stack](dotnet-stack.md) tool, capturing a dump with [dotnet-dump](dotnet-dump.md) that can be viewed in [Visual Studio](/visualstudio/debugger/using-dump-files), or collecting wait events with [dotnet-trace](dotnet-trace.md). dotnet-stack can be faster because it shows the thread stacks immediately on the console, but Visual Studio dump debugging offers better visualizations that map frames to source, Just My Code can filter out runtime implementation frames, and the Parallel Stacks feature can help group large numbers of threads with similar stacks. This tutorial first shows the dotnet-stack option, then demonstrates how to use dotnet-trace. See the [diagnosing ThreadPool starvation tutorial video](/shows/on-net/diagnosing-thread-pool-exhaustion-issues-in-net-core-apps) for an example of investigating the thread stacks using Visual Studio.
+To eliminate ThreadPool starvation, ThreadPool threads need to remain unblocked so that they're available to handle incoming work items. There are multiple ways to determine what each thread was doing. If the issue occurs only occasionally then collecting a trace with [dotnet-trace](dotnet-trace.md) is best to record application behavior over a period of time. If the issue is constantly occurring then you can use the [dotnet-stack](dotnet-stack.md) tool or capture a dump with [dotnet-dump](dotnet-dump.md) that can be viewed in [Visual Studio](/visualstudio/debugger/using-dump-files). dotnet-stack can be faster because it shows the thread stacks immediately on the console, but Visual Studio dump debugging offers better visualizations that map frames to source, Just My Code can filter out runtime implementation frames, and the Parallel Stacks feature can help group large numbers of threads with similar stacks. This tutorial shows the dotnet-stack and dotnet-trace options. See the [diagnosing ThreadPool starvation tutorial video](/shows/on-net/diagnosing-thread-pool-exhaustion-issues-in-net-core-apps) for an example of investigating the thread stacks using Visual Studio.

### dotnet-stack
Member

Suggested change
-### dotnet-stack
+### Diagnose a continuous issue with dotnet-stack

@@ -275,6 +262,67 @@ Thread (0x25968):
DiagnosticScenarios!testwebapi.Controllers.DiagScenarioController.TaskWait()
```

### dotnet-trace
Member

Suggested change
-### dotnet-trace
+### Diagnose an intermittent issue with dotnet-trace


The dotnet-stack approach is effective only for obvious, consistent blocking operations that occur in every request. In some scenarios, the blocking could happen sporadically only every few minutes, making dotnet-stack less useful for diagnosing the issue. dotnet-trace can be used to collect events over a period of time and save them in a nettrace file that can be analyzed later.

There is one particular event that helps diagnosing thread pool starvation that was introduced in .NET 9, it's the WaitHandleWait event. It's emitted when a thread pool thread becomes blocked by operations such as sync-over-async calls (e.g. `Task.Result`, `Task.Wait`, `Task.GetAwaiter().GetResult()`) or by other locking operations like `lock`, `Monitor.Enter`, `ManualResetEventSlim.Wait`, or `SemaphoreSlim.Wait`.
Member

Suggested change
-There is one particular event that helps diagnosing thread pool starvation that was introduced in .NET 9, it's the WaitHandleWait event. It's emitted when a thread pool thread becomes blocked by operations such as sync-over-async calls (e.g. `Task.Result`, `Task.Wait`, `Task.GetAwaiter().GetResult()`) or by other locking operations like `lock`, `Monitor.Enter`, `ManualResetEventSlim.Wait`, or `SemaphoreSlim.Wait`.
+There is one particular event that helps diagnosing thread pool starvation that was introduced in .NET 9, the WaitHandleWait event. It's emitted when a thread becomes blocked by operations such as sync-over-async calls (e.g. `Task.Result`, `Task.Wait`, `Task.GetAwaiter().GetResult()`) or by other locking operations like `lock`, `Monitor.Enter`, `ManualResetEventSlim.Wait`, or `SemaphoreSlim.Wait`.
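
For readers following this thread, here is a minimal sketch of how the wait events described above might be collected. It assumes the sample app's process is named `DiagnosticScenarios` (taken from the stack frames quoted earlier), that the `waithandle` keyword of `--clrevents` is the one documented for dotnet-trace alongside the .NET 9 event, and that a 30-second window is long enough to catch the blocking. The variable names are illustrative and not part of the PR.

```shell
#!/usr/bin/env sh
# Assumed values: process name of the sample app and a 30-second capture window.
PROCESS_NAME="DiagnosticScenarios"
DURATION="00:00:30"

# Build the command as a string so it can be reviewed before it runs.
# --clrevents waithandle enables the wait events (assumed keyword name);
# --clreventlevel verbose is assumed to be required for the event to fire.
TRACE_CMD="dotnet-trace collect -n $PROCESS_NAME --clrevents waithandle --clreventlevel verbose --duration $DURATION"
echo "$TRACE_CMD"

# Run the collection only when dotnet-trace is installed on this machine.
if command -v dotnet-trace >/dev/null 2>&1; then
  $TRACE_CMD
fi
```

The resulting .nettrace file is what PerfView or .NET Events Viewer would then open.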


Let's see how to use each tool to read the nettrace file.

### Analyze a nettrace with Perfview
Member

Suggested change
-### Analyze a nettrace with Perfview
+#### Analyze a nettrace with Perfview

![Screenshot of the expanded any stacks view for wait events in PerfView](./media/perfview-any-stacks-waithandle-expanded.png)
7. This stack trace can be read as: `Task<T>.Result` emitted a WaitHandleWait event with a WaitSource MonitorWait (`Task<T>.Result` uses `Monitor.Wait` to perform a wait). It was called by `DiagScenarioController.TaskWait`, which was called by some lambda, which was called by some ASP.NET code

### Analyze a nettrace with .NET Events Viewer
Member

Suggested change
-### Analyze a nettrace with .NET Events Viewer
+#### Analyze a nettrace with .NET Events Viewer


### Analyze a nettrace with .NET Events Viewer

1. Go to [verdie-g.github.io/dotnet-events-viewer](https://verdie-g.github.io/dotnet-events-viewer)
Member

Neat tool!
