Introduction and Concepts
C# supports parallel execution of code through multithreading. A C# program starts in a single thread created automatically by the CLR and operating system (the “main” thread), and is made multithreaded by creating additional threads. Here’s a simple example:

using System;
using System.Threading;

class ThreadTest
{
  static void Main()
  {
    Thread t = new Thread (WriteY);          // Kick off a new thread
    t.Start();                               // running WriteY()

    // Simultaneously, do something on the main thread.
    for (int i = 0; i < 1000; i++) Console.Write ("x");
  }

  static void WriteY()
  {
    for (int i = 0; i < 1000; i++) Console.Write ("y");
  }
}
The CLR assigns each thread its own memory stack so that local variables are kept separate. In the next example, we define a method with a local variable, then call the method simultaneously on the main thread and a newly created thread:

static void Main()
{
  new Thread (Go).Start();      // Call Go() on a new thread
  Go();                         // Call Go() on the main thread
}

static void Go()
{
  // Declare and use a local variable - 'cycles'
  for (int cycles = 0; cycles < 5; cycles++) Console.Write ('?');
}
??????????

A separate copy of the cycles variable is created on each thread’s memory stack, so the output is, predictably, ten question marks. Threads share data if they have a common reference to the same object instance:
class ThreadTest
{
  bool done;

  static void Main()
  {
    ThreadTest tt = new ThreadTest();   // Create a common instance
    new Thread (tt.Go).Start();
    tt.Go();
  }

  // Note that Go is now an instance method
  void Go()
  {
    if (!done) { done = true; Console.WriteLine ("Done"); }
  }
}
Because both threads call Go() on the same ThreadTest instance, they share the done field. This results in "Done" being printed once instead of twice.

Static fields offer another way of sharing data between threads. Here’s the same example with done as a static field:

class ThreadTest
{
  static bool done;    // Static fields are shared between all threads

  static void Main()
  {
    new Thread (Go).Start();
    Go();
  }

  static void Go()
  {
    if (!done) { done = true; Console.WriteLine ("Done"); }
  }
}
Both of these examples illustrate another key concept: that of thread safety (or rather, lack of it!). The output is actually indeterminate: it’s possible (though unlikely) that “Done” could be printed twice. If, however, we swap the order of statements in the Go method, the odds of “Done” being printed twice go up dramatically:

static void Go()
{
  if (!done) { Console.WriteLine ("Done"); done = true; }
}

The problem is that one thread can be evaluating the if statement right as the other thread is executing the WriteLine statement — before it’s had a chance to set done to true.
The remedy is to obtain an exclusive lock while reading and writing to the common field. C# provides the lock statement for just this purpose:
class ThreadSafe
{
  static bool done;
  static readonly object locker = new object();

  static void Main()
  {
    new Thread (Go).Start();
    Go();
  }

  static void Go()
  {
    lock (locker)
    {
      if (!done) { Console.WriteLine ("Done"); done = true; }
    }
  }
}
When two threads simultaneously contend a lock (in this case, locker), one thread waits, or blocks, until the lock becomes available. In this case, it ensures only one thread can enter the critical section of code at a time, and “Done” will be printed just once. Code that’s protected in such a manner — from indeterminacy in a multithreading context — is called thread-safe.

Join and Sleep

You can wait for another thread to end by calling its Join method. For example:
static void Main()
{
  Thread t = new Thread (Go);
  t.Start();
  t.Join();
  Console.WriteLine ("Thread t has ended!");   // Runs only after Go() completes
}

static void Go()
{
  for (int i = 0; i < 1000; i++) Console.Write ("y");
}

This prints “y” 1,000 times, followed by “Thread t has ended!” immediately afterward.
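Thread.Sleep, by contrast, pauses the current thread for a specified period. A quick sketch of the standard overloads (the durations are arbitrary):

Thread.Sleep (TimeSpan.FromHours (1));   // sleep for 1 hour
Thread.Sleep (500);                      // sleep for 500 milliseconds
Thread.Sleep (0);                        // relinquish the current time slice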
While waiting on a Sleep or Join, a thread is blocked and so does not consume CPU resources.

Sleep(0) or Yield is occasionally useful in production code for advanced performance tweaks. It’s also an excellent diagnostic tool for helping to uncover thread safety issues: if inserting Thread.Yield() anywhere in your code makes or breaks the program, you almost certainly have a bug.

How Threading Works
Threads vs Processes
Threading’s Uses and Misuses
- Maintaining a responsive user interface: By running time-consuming tasks on a parallel “worker” thread, the main UI thread is free to continue processing keyboard and mouse events.
- Making efficient use of an otherwise blocked CPU: Multithreading is useful when a thread is awaiting a response from another computer or piece of hardware. While one thread is blocked while performing the task, other threads can take advantage of the otherwise unburdened computer.
- Parallel programming: Code that performs intensive calculations can execute faster on multicore or multiprocessor computers if the workload is shared among multiple threads in a “divide-and-conquer” strategy (see Part 5).
- Speculative execution: On multicore machines, you can sometimes improve performance by predicting something that might need to be done, and then doing it ahead of time. LINQPad uses this technique to speed up the creation of new queries. A variation is to run a number of different algorithms in parallel that all solve the same task. Whichever one finishes first “wins” — this is effective when you can’t know ahead of time which algorithm will execute fastest.
- Allowing requests to be processed simultaneously: On a server, client requests can arrive concurrently and so need to be handled in parallel (the .NET Framework creates threads for this automatically if you use ASP.NET, WCF, Web Services, or Remoting). This can also be useful on a client (e.g., handling peer-to-peer networking — or even multiple requests from the user).
With technologies such as ASP.NET and WCF, you may be unaware that multithreading is even taking place — unless you access shared data (perhaps via static fields) without appropriate locking, running afoul of thread safety.
Threads also come with strings attached. The biggest is that multithreading can increase complexity. Having lots of threads does not in and of itself create much complexity; it’s the interaction between threads (typically via shared data) that does. This applies whether or not the interaction is intentional, and can cause long development cycles and an ongoing susceptibility to intermittent and nonreproducible bugs. For this reason, it pays to keep interaction to a minimum, and to stick to simple and proven designs wherever possible. This article focuses largely on dealing with just these complexities; remove the interaction and there’s much less to say!
A good strategy is to encapsulate multithreading logic into reusable classes that can be independently examined and tested. The Framework itself offers many higher-level threading constructs, which we cover later.
Threading also incurs a resource and CPU cost in scheduling and switching threads (when there are more active threads than CPU cores) — and there’s also a creation/tear-down cost. Multithreading will not always speed up your application — it can even slow it down if used excessively or inappropriately. For example, when heavy disk I/O is involved, it can be faster to have a couple of worker threads run tasks in sequence than to have 10 threads executing at once. (In Signaling with Wait and Pulse, we describe how to implement a producer/consumer queue, which provides just this functionality.)
Creating and Starting Threads
Threads are created using the Thread class’s constructor, passing in a ThreadStart delegate, which indicates the method where execution should begin. Here’s how the ThreadStart delegate is defined:

public delegate void ThreadStart();

Calling Start on the thread then sets it running. The thread continues until its method returns, at which point the thread ends. Here’s an example:
class ThreadTest
{
  static void Main()
  {
    Thread t = new Thread (new ThreadStart (Go));

    t.Start();   // Run Go() on the new thread.
    Go();        // Simultaneously run Go() in the main thread.
  }

  static void Go()
  {
    Console.WriteLine ("hello!");
  }
}
A thread can be created more conveniently by specifying just a method group, allowing C# to infer the ThreadStart delegate:

Thread t = new Thread (Go);    // No need to explicitly use ThreadStart

Another shortcut is to use a lambda expression:
static void Main()
{
  Thread t = new Thread ( () => Console.WriteLine ("Hello!") );
  t.Start();
}
Passing Data to a Thread
The easiest way to pass arguments to a thread’s target method is to execute a lambda expression that calls the method with the desired arguments:

static void Main()
{
  Thread t = new Thread ( () => Print ("Hello from t!") );
  t.Start();
}

static void Print (string message)
{
  Console.WriteLine (message);
}
{ Console.WriteLine ("This is so easy!");}).Start();
An alternative is to pass an argument into Thread’s Start method, using a target method that accepts a single object parameter:

static void Main()
{
  Thread t = new Thread (Print);
  t.Start ("Hello from t!");
}

static void Print (object messageObj)
{
  string message = (string) messageObj;   // We need to cast here
  Console.WriteLine (message);
}
Lambda expressions and captured variables
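A lambda expression is the most powerful way to pass data to a thread, but you must be careful about accidentally modifying captured variables after starting the thread. Consider the following sketch of the classic pitfall (the output varies from run to run; a typical result might look like 0223557799):

for (int i = 0; i < 10; i++)
  new Thread (() => Console.Write (i)).Start();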
The problem is that the i variable refers to the same memory location throughout the loop’s lifetime. Therefore, each thread calls Console.Write on a variable whose value may change as it is running!

This is analogous to the problem we describe in “Captured Variables” in Chapter 8 of C# 4.0 in a Nutshell. The problem is less about multithreading and more about C#'s rules for capturing variables (which are somewhat undesirable in the case of for and foreach loops).

The solution is to use a temporary variable:

for (int i = 0; i < 10; i++)
{
  int temp = i;
  new Thread (() => Console.Write (temp)).Start();
}

Variable temp is now local to each loop iteration, so each thread captures a different memory location and there’s no problem. We can illustrate the problem more simply with the following example:
text = "t2";Thread t2 = ne
Naming Threads
Each thread has a Name property that you can set for the benefit of debugging. In the following example, we name both the main thread and a worker:

class ThreadNaming
{
  static void Main()
  {
    Thread.CurrentThread.Name = "main";
    Thread worker = new Thread (Go);
    worker.Name = "worker";
    worker.Start();
    Go();
  }

  static void Go()
  {
    Console.WriteLine ("Hello from " + Thread.CurrentThread.Name);
  }
}
Foreground and Background Threads
By default, threads you create explicitly are foreground threads. Foreground threads keep the application alive for as long as any one of them is running, whereas background threads do not. Once all foreground threads finish, the application ends, and any background threads still running abruptly terminate.
A thread’s foreground/background status has no relation to its priority or allocation of execution time.
You can query or change a thread’s background status using its IsBackground property. Here’s an example:

class PriorityTest
{
  static void Main (string[] args)
  {
    Thread worker = new Thread ( () => Console.ReadLine() );
    if (args.Length > 0) worker.IsBackground = true;
    worker.Start();
  }
}
If this program is called with no arguments, the worker thread assumes foreground status and will wait on the ReadLine statement for the user to press Enter. Meanwhile, the main thread exits, but the application keeps running because a foreground thread is still alive.

On the other hand, if an argument is passed to Main(), the worker is assigned background status, and the program exits almost immediately as the main thread ends (terminating the ReadLine).

When a process terminates in this manner, any finally blocks in the execution stack of background threads are circumvented. This is a problem if your program employs finally (or using) blocks to perform cleanup work such as releasing resources or deleting temporary files. To avoid this, you can explicitly wait out such background threads upon exiting an application. There are two ways to accomplish this:

- If you’ve created the thread yourself, call Join on the thread.
- If you’re on a pooled thread, use an event wait handle.
In either case, you should specify a timeout, so you can abandon a renegade thread should it refuse to finish for some reason. This is your backup exit strategy: in the end, you want your application to close — without the user having to enlist help from the Task Manager!
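A minimal sketch of the first approach, assuming worker is a background thread you created earlier (the five-second timeout is arbitrary):

if (!worker.Join (TimeSpan.FromSeconds (5)))
{
  // The thread didn't end in time: log the problem and exit anyway,
  // accepting that its finally blocks will be circumvented.
}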
If a user uses the Task Manager to forcibly end a .NET process, all threads “drop dead” as though they were background threads. This is observed rather than documented behavior, and it could vary depending on the CLR and operating system version.
Foreground threads don’t require this treatment, but you must take care to avoid bugs that could cause the thread not to end. A common cause for applications failing to exit properly is the presence of active foreground threads.
Thread Priority
A thread’s Priority property determines how much execution time it gets relative to other active threads in the operating system, on the following scale:

enum ThreadPriority { Lowest, BelowNormal, Normal, AboveNormal, Highest }
This becomes relevant only when multiple threads are simultaneously active.
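For instance, a minimal sketch of deprioritizing a worker (the chosen level is illustrative):

Thread worker = new Thread (Go);                // Go is any thread entry method
worker.Priority = ThreadPriority.BelowNormal;   // Run below Normal-priority threads
worker.Start();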
Think carefully before elevating a thread’s priority — it can lead to problems such as resource starvation for other threads.
Elevating a thread’s priority doesn’t make it capable of performing real-time work, because it’s still throttled by the application’s process priority. To perform real-time work, you must also elevate the process priority using the Process class in System.Diagnostics (we didn’t tell you how to do this):

using (Process p = Process.GetCurrentProcess())
  p.PriorityClass = ProcessPriorityClass.High;
ProcessPriorityClass.High is actually one notch short of the highest priority: Realtime. Setting a process priority to Realtime instructs the OS that you never want the process to yield CPU time to another process. If your program enters an accidental infinite loop, you might find even the operating system locked out, with nothing short of the power button left to rescue you! For this reason, High is usually the best choice for real-time applications.

If your real-time application has a user interface, elevating the process priority gives screen updates excessive CPU time, slowing down the entire computer (particularly if the UI is complex). Lowering the main thread’s priority in conjunction with raising the process’s priority ensures that the real-time thread doesn’t get preempted by screen redraws, but doesn’t solve the problem of starving other applications of CPU time, because the operating system will still allocate disproportionate resources to the process as a whole. An ideal solution is to have the real-time worker and user interface run as separate applications with different process priorities, communicating via Remoting or memory-mapped files. Memory-mapped files are ideally suited to this task; we explain how they work in Chapters 14 and 25 of C# 4.0 in a Nutshell.
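A rough sketch of the in-process arrangement just described, assuming the real-time work runs on a dedicated worker thread (the names are illustrative):

using System.Diagnostics;
using System.Threading;

class RealTimeSetup
{
  static void Main()
  {
    Thread worker = new Thread (RunRealTimeLoop);   // hypothetical worker
    worker.Priority = ThreadPriority.Highest;
    worker.Start();

    using (Process p = Process.GetCurrentProcess())
      p.PriorityClass = ProcessPriorityClass.High;  // elevate the whole process

    Thread.CurrentThread.Priority = ThreadPriority.BelowNormal;  // demote the UI thread
  }

  static void RunRealTimeLoop() { /* placeholder for the real-time work */ }
}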
Even with an elevated process priority, there’s a limit to the suitability of the managed environment in handling hard real-time requirements. In addition to the issues of latency introduced by automatic garbage collection, the operating system may present additional challenges — even for unmanaged applications — that are best solved with dedicated hardware or a specialized real-time platform.
Exception Handling
Any try/catch/finally blocks in scope when a thread is created are of no relevance to the thread when it starts executing. Consider the following program:

public static void Main()
{
  try
  {
    new Thread (Go).Start();
  }
  catch (Exception ex)
  {
    // We'll never get here!
    Console.WriteLine ("Exception!");
  }
}

static void Go() { throw null; }   // Throws a NullReferenceException
The try/catch statement in this example is ineffective, and the newly created thread will be encumbered with an unhandled NullReferenceException. This behavior makes sense when you consider that each thread has an independent execution path.

The remedy is to move the exception handler into the Go method:

public static void Main()
{
  new Thread (Go).Start();
}

static void Go()
{
  try
  {
    // ...
    throw null;    // The NullReferenceException will get caught below
    // ...
  }
  catch (Exception ex)
  {
    // Typically log the exception, and/or signal another thread
    // that we've come unstuck
    // ...
  }
}
You need an exception handler on all thread entry methods in production applications — just as you do (usually at a higher level, in the execution stack) on your main thread. An unhandled exception causes the whole application to shut down. With an ugly dialog!
In writing such exception handling blocks, rarely would you ignore the error: typically, you’d log the details of the exception, and then perhaps display a dialog allowing the user to automatically submit those details to your web server. You then might shut down the application — because it’s possible that the error corrupted the program’s state. However, the cost of doing so is that the user will lose his recent work — open documents, for instance.
The “global” exception handling events for WPF and Windows Forms applications (Application.DispatcherUnhandledException and Application.ThreadException) fire only for exceptions thrown on the main UI thread. You still must handle exceptions on worker threads manually.

AppDomain.CurrentDomain.UnhandledException fires on any unhandled exception, but provides no means of preventing the application from shutting down afterward.

There are, however, some cases where you don’t need to handle exceptions on a worker thread, because the .NET Framework does it for you. These are covered in upcoming sections, and are:

- Asynchronous delegates
- BackgroundWorker
- The Task Parallel Library (conditions apply)

Thread Pooling
Whenever you start a thread, a few hundred microseconds are spent organizing such things as a fresh private local variable stack. Each thread also consumes (by default) around 1 MB of memory. The thread pool cuts these overheads by sharing and recycling threads, allowing multithreading to be applied at a very granular level without a performance penalty. This is useful when leveraging multicore processors to execute computationally intensive code in parallel in “divide-and-conquer” style.
The thread pool also keeps a lid on the total number of worker threads it will run simultaneously. Too many active threads throttle the operating system with administrative burden and render CPU caches ineffective. Once a limit is reached, jobs queue up and start only when another finishes. This makes arbitrarily concurrent applications possible, such as a web server. (The asynchronous method pattern is an advanced technique that takes this further by making highly efficient use of the pooled threads; we describe this in Chapter 23 of C# 4.0 in a Nutshell).
There are a number of ways to enter the thread pool:

- Via the Task Parallel Library (from Framework 4.0)
- By calling ThreadPool.QueueUserWorkItem
- Via asynchronous delegates
- Via BackgroundWorker
The following constructs use the thread pool indirectly:

- WCF, Remoting, ASP.NET, and ASMX Web Services application servers
- System.Timers.Timer and System.Threading.Timer
- Framework methods that end in Async, such as those on WebClient (the event-based asynchronous pattern), and most BeginXXX methods (the asynchronous programming model pattern)
- PLINQ
The Task Parallel Library (TPL) and PLINQ are sufficiently powerful and high-level that you’ll want to use them to assist in multithreading even when thread pooling is unimportant. We discuss these in detail in Part 5; right now, we'll look briefly at how you can use the Task class as a simple means of running a delegate on a pooled thread.

There are a few things to be wary of when using pooled threads:

- You cannot set the Name of a pooled thread, making debugging more difficult (although you can attach a description when debugging in Visual Studio’s Threads window).
- Pooled threads are always background threads (this is usually not a problem).
- Blocking a pooled thread may trigger additional latency in the early life of an application unless you call ThreadPool.SetMinThreads (see Optimizing the Thread Pool).
You can query if you’re currently executing on a pooled thread via the property Thread.CurrentThread.IsThreadPoolThread.
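For instance (a trivial sketch; which value prints depends on which thread runs the code):

static void Main()
{
  Console.WriteLine (Thread.CurrentThread.IsThreadPoolThread);     // False
  ThreadPool.QueueUserWorkItem (_ =>
    Console.WriteLine (Thread.CurrentThread.IsThreadPoolThread));  // True
  Console.ReadLine();   // Keep the process alive for the pooled thread
}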
Entering the Thread Pool via TPL
You can enter the thread pool easily using the Task classes in the Task Parallel Library. The Task classes were introduced in Framework 4.0: if you’re familiar with the older constructs, consider the nongeneric Task class a replacement for ThreadPool.QueueUserWorkItem, and the generic Task<TResult> a replacement for asynchronous delegates. The newer constructs are faster, more convenient, and more flexible than the old.

To use the nongeneric Task class, call Task.Factory.StartNew, passing in a delegate of the target method:

static void Main()    // The Task class is in System.Threading.Tasks
{
  Task.Factory.StartNew (Go);
}

static void Go()
{
  Console.WriteLine ("Hello from the thread pool!");
}
Task.Factory.StartNew returns a Task object, which you can then use to monitor the task — for instance, you can wait for it to complete by calling its Wait method.

Any unhandled exceptions are conveniently rethrown onto the host thread when you call a task's Wait method. (If you don’t call Wait and abandon the task, an unhandled exception will shut down the process as with an ordinary thread.)
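A quick sketch of this behavior (the rethrown exception arrives wrapped in an AggregateException):

Task task = Task.Factory.StartNew (() => { throw null; });
try
{
  task.Wait();
}
catch (AggregateException ex)
{
  Console.WriteLine (ex.InnerException is NullReferenceException);   // True
}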
The generic Task<TResult> class is a subclass of the nongeneric Task. It lets you get a return value back from the task after it finishes executing. In the following example, we download a web page using Task<string>:

static void Main()
{
  // Start the task executing:
  Task<string> task = Task.Factory.StartNew<string>
    ( () => DownloadString ("http://www.linqpad.net") );

  // We can do other work here and it will execute in parallel:
  RunSomeOtherMethod();

  // When we need the task's return value, we query its Result property:
  // If it's still executing, the current thread will now block (wait)
  // until the task finishes:
  string result = task.Result;
}

static string DownloadString (string uri)
{
  using (var wc = new System.Net.WebClient())
    return wc.DownloadString (uri);
}
(The <string> type argument is shown for clarity: it would be inferred if we omitted it.)

Any unhandled exceptions are automatically rethrown when you query the task's Result property, wrapped in an AggregateException. However, if you fail to query its Result property (and don’t call Wait), any unhandled exception will take the process down.

The Task Parallel Library has many more features, and is particularly well suited to leveraging multicore processors. We’ll resume our discussion of TPL in Part 5.
Entering the Thread Pool Without TPL
You can't use the Task Parallel Library if you're targeting an earlier version of the .NET Framework (prior to 4.0). Instead, you must use one of the older constructs for entering the thread pool: ThreadPool.QueueUserWorkItem and asynchronous delegates. The difference between the two is that asynchronous delegates let you return data from the thread. Asynchronous delegates also marshal any exception back to the caller.

QueueUserWorkItem
To use QueueUserWorkItem, simply call this method with a delegate that you want to run on a pooled thread:

static void Main()
{
  ThreadPool.QueueUserWorkItem (Go);
  ThreadPool.QueueUserWorkItem (Go, 123);
  Console.ReadLine();
}

static void Go (object data)   // data will be null with the first call.
{
  Console.WriteLine ("Hello from the thread pool! " + data);
}

Here’s the output:

Hello from the thread pool!
Hello from the thread pool! 123
Our target method, Go, must accept a single object argument (to satisfy the WaitCallback delegate). This provides a convenient way of passing data to the method, just like with ParameterizedThreadStart. Unlike with Task, QueueUserWorkItem doesn't return an object to help you subsequently manage execution. Also, you must explicitly deal with exceptions in the target code — unhandled exceptions will take down the program.

Asynchronous delegates
ThreadPool.QueueUserWorkItem doesn’t provide an easy mechanism for getting return values back from a thread after it has finished executing. Asynchronous delegate invocations (asynchronous delegates for short) solve this, allowing any number of typed arguments to be passed in both directions. Furthermore, unhandled exceptions on asynchronous delegates are conveniently rethrown on the original thread (or more accurately, the thread that calls EndInvoke), and so they don’t need explicit handling.

Don’t confuse asynchronous delegates with asynchronous methods (methods starting with Begin or End, such as File.BeginRead/File.EndRead). Asynchronous methods follow a similar protocol outwardly, but they exist to solve a much harder problem, which we describe in Chapter 23 of C# 4.0 in a Nutshell.

Here’s how you start a worker task via an asynchronous delegate:
- Instantiate a delegate targeting the method you want to run in parallel (typically one of the predefined Func delegates).
- Call BeginInvoke on the delegate, saving its IAsyncResult return value. BeginInvoke returns immediately to the caller. You can then perform other activities while the pooled thread is working.
- When you need the results, call EndInvoke on the delegate, passing in the saved IAsyncResult object.
In the following example, we use an asynchronous delegate invocation to execute concurrently with the main thread, a simple method that returns a string’s length:
static void Main()
{
  Func<string, int> method = Work;
  IAsyncResult cookie = method.BeginInvoke ("test", null, null);
  //
  // ... here's where we can do other work in parallel...
  //
  int result = method.EndInvoke (cookie);
  Console.WriteLine ("String length is: " + result);
}

static int Work (string s) { return s.Length; }
EndInvoke does three things. First, it waits for the asynchronous delegate to finish executing, if it hasn’t already. Second, it receives the return value (as well as any ref or out parameters). Third, it throws any unhandled worker exception back to the calling thread.

If the method you’re calling with an asynchronous delegate has no return value, you are still (technically) obliged to call EndInvoke. In practice, this is open to debate; there are no EndInvoke police to administer punishment to noncompliers! If you choose not to call EndInvoke, however, you’ll need to consider exception handling on the worker method to avoid silent failures.

You can also specify a callback delegate when calling BeginInvoke — a method accepting an IAsyncResult object that’s automatically called upon completion. This allows the instigating thread to “forget” about the asynchronous delegate, but it requires a bit of extra work at the callback end:

static void Main()
{
  Func<string, int> method = Work;
  method.BeginInvoke ("test", Done, method);
  // ...
  //
}

static int Work (string s) { return s.Length; }

static void Done (IAsyncResult cookie)
{
  var target = (Func<string, int>) cookie.AsyncState;
  int result = target.EndInvoke (cookie);
  Console.WriteLine ("String length is: " + result);
}
The final argument to BeginInvoke is a user state object that populates the AsyncState property of IAsyncResult. It can contain anything you like; in this case, we’re using it to pass the method delegate to the completion callback, so we can call EndInvoke on it.

Optimizing the Thread Pool
The thread pool starts out with one thread in its pool. As tasks are assigned, the pool manager “injects” new threads to cope with the extra concurrent workload, up to a maximum limit. After a sufficient period of inactivity, the pool manager may “retire” threads if it suspects that doing so will lead to better throughput.
You can set the upper limit of threads that the pool will create by calling ThreadPool.SetMaxThreads; the defaults are:

- 1023 in Framework 4.0 in a 32-bit environment
- 32768 in Framework 4.0 in a 64-bit environment
- 250 per core in Framework 3.5
- 25 per core in Framework 2.0
(These figures may vary according to the hardware and operating system.) The reason there are that many is to ensure progress should some threads be blocked (idling while awaiting some condition, such as a response from a remote computer).
You can also set a lower limit by calling ThreadPool.SetMinThreads. The role of the lower limit is subtler: it’s an advanced optimization technique that instructs the pool manager not to delay in the allocation of threads until reaching the lower limit. Raising the minimum thread count improves concurrency when there are blocked threads (see sidebar).

The default lower limit is one thread per processor core — the minimum that allows full CPU utilization. On server environments, though (such as ASP.NET under IIS), the lower limit is typically much higher — as much as 50 or more.
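For instance, a minimal sketch of raising the lower limit (50 is illustrative, not a recommendation):

int workerMin, ioMin;
ThreadPool.GetMinThreads (out workerMin, out ioMin);
ThreadPool.SetMinThreads (50, ioMin);   // raise worker threads; keep I/O-completion threads as-is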