Sunday, December 12, 2010

Threading in C#, Part 1

Introduction and Concepts

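class ThreadTest                             // illustrative class name
{
  static void Main()
  {
    Thread t = new Thread (WriteY);          // Kick off a new thread
    t.Start();                               // running WriteY()

    // Simultaneously, do something on the main thread.
    for (int i = 0; i < 1000; i++) Console.Write ("x");
  }

  static void WriteY()
  {
    for (int i = 0; i < 1000; i++) Console.Write ("y");
  }
}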
Starting a new Thread
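static void Main()
{
  new Thread (Go).Start();      // Call Go() on a new thread
  Go();                         // Call Go() on the main thread
}

static void Go()
{
  // Declare and use a local variable, 'cycles'
  for (int cycles = 0; cycles < 5; cycles++) Console.Write ('?');
}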
??????????
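class ThreadTest                        // illustrative class name
{
  bool done;

  static void Main()
  {
    ThreadTest tt = new ThreadTest();   // Create a common instance
    new Thread (tt.Go).Start();
    tt.Go();
  }

  // Note that Go is now an instance method
  void Go()
  {
    if (!done) { done = true; Console.WriteLine ("Done"); }
  }
}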
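class ThreadTest                        // illustrative class name
{
  static bool done;    // Static fields are shared between all threads

  static void Main()
  {
    new Thread (Go).Start();
    Go();
  }

  static void Go()
  {
    if (!done) { done = true; Console.WriteLine ("Done"); }
  }
}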
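class ThreadSafe                        // illustrative class name
{
  static bool done;
  static readonly object locker = new object();
  static void Main()
  {
    new Thread (Go).Start();
    Go();
  }
  static void Go()
  {
    lock (locker)
    {
      if (!done) { Console.WriteLine ("Done"); done = true; }
    }
  }
}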

Join and Sleep

static void Main()
{
  Thread t = new Thread (Go);
  t.Start();
  t.Join();
  Console.WriteLine ("Thread t has ended!");
}

static void Go() { for (int i = 0; i < 1000; i++) Console.Write ("y"); }

How Threading Works

Threads vs Processes

Threading’s Uses and Misuses

Maintaining a responsive user interface
By running time-consuming tasks on a parallel “worker” thread, the main UI thread is free to continue processing keyboard and mouse events.
Making efficient use of an otherwise blocked CPU
Multithreading is useful when a thread is awaiting a response from another computer or piece of hardware. While one thread is blocked while performing the task, other threads can take advantage of the otherwise unburdened computer.
Parallel programming
Code that performs intensive calculations can execute faster on multicore or multiprocessor computers if the workload is shared among multiple threads in a “divide-and-conquer” strategy (see Part 5).
Speculative execution
On multicore machines, you can sometimes improve performance by predicting something that might need to be done, and then doing it ahead of time. LINQPad uses this technique to speed up the creation of new queries. A variation is to run a number of different algorithms in parallel that all solve the same task. Whichever one finishes first “wins” — this is effective when you can’t know ahead of time which algorithm will execute fastest.
Allowing requests to be processed simultaneously
On a server, client requests can arrive concurrently and so need to be handled in parallel (the .NET Framework creates threads for this automatically if you use ASP.NET, WCF, Web Services, or Remoting). This can also be useful on a client (e.g., handling peer-to-peer networking — or even multiple requests from the user).
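The "whichever one finishes first wins" variation of speculative execution can be sketched with Task.WaitAny. This is only an illustration under stated assumptions: the two strategies (SumByLoop and SumByFormula) and the class name are hypothetical stand-ins for real competing algorithms, and both are assumed to produce the same answer.

```csharp
using System;
using System.Threading.Tasks;

static class FirstToFinish
{
  // Two hypothetical strategies that compute the same answer.
  static long SumByLoop (int n)
  {
    long total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
  }

  static long SumByFormula (int n) { return (long) n * (n + 1) / 2; }

  public static long Race (int n)
  {
    Task<long>[] contenders =
    {
      Task.Factory.StartNew (() => SumByLoop (n)),
      Task.Factory.StartNew (() => SumByFormula (n))
    };

    // WaitAny blocks until one task completes and returns its index.
    int winner = Task.WaitAny (contenders);
    return contenders [winner].Result;
  }

  static void Main()
  {
    Console.WriteLine (Race (1000));   // 500500, whichever task wins
  }
}
```

The losing task keeps running to completion in this sketch; a real implementation would probably want a way to cancel it.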
Threads also come with strings attached. The biggest is that multithreading can increase complexity. Having lots of threads does not in and of itself create much complexity; it’s the interaction between threads (typically via shared data) that does. This applies whether or not the interaction is intentional, and can cause long development cycles and an ongoing susceptibility to intermittent and nonreproducible bugs. For this reason, it pays to keep interaction to a minimum, and to stick to simple and proven designs wherever possible. This article focuses largely on dealing with just these complexities; remove the interaction and there’s much less to say!
A good strategy is to encapsulate multithreading logic into reusable classes that can be independently examined and tested. The Framework itself offers many higher-level threading constructs, which we cover later.

Creating and Starting Threads

public delegate void ThreadStart();
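class ThreadTest                        // illustrative class name
{
  static void Main()
  {
    Thread t = new Thread (new ThreadStart (Go));
    t.Start();   // Run Go() on the new thread.
    Go();        // Simultaneously run Go() in the main thread.
  }

  static void Go() { Console.WriteLine ("hello!"); }
}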
static void Main()
{
  Thread t = new Thread (Go);    // No need to explicitly use ThreadStart
  t.Start();
}

Passing Data to a Thread

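static void Print (string message)
{
  Console.WriteLine (message);
}

new Thread (() =>
{
  Console.WriteLine ("This is so easy!");
}).Start();

static void Print (object messageObj)
{
  string message = (string) messageObj;   // cast back from object
  Console.WriteLine (message);
}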
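As a rough sketch of the two usual ways to hand data to a thread, the following passes an argument once through a lambda and once through Thread.Start with a ParameterizedThreadStart target. The class, field, and method names are illustrative assumptions, not taken from the original listings.

```csharp
using System;
using System.Threading;

static class PassingData
{
  static string viaLambda, viaStartArgument;

  public static string Run()
  {
    // Option 1: a lambda captures the argument directly.
    Thread t1 = new Thread (() => viaLambda = Format ("lambda"));

    // Option 2: ParameterizedThreadStart takes a single object argument,
    // supplied to Start and cast back inside the target method.
    Thread t2 = new Thread (StoreFromObject);

    t1.Start();
    t2.Start ("Start argument");
    t1.Join();
    t2.Join();
    return viaLambda + "; " + viaStartArgument;
  }

  static string Format (string source) { return "Hello via " + source; }

  static void StoreFromObject (object messageObj)
  {
    string source = (string) messageObj;   // We need to cast here
    viaStartArgument = Format (source);
  }

  static void Main()
  {
    Console.WriteLine (Run());   // Hello via lambda; Hello via Start argument
  }
}
```

The lambda form accepts any number of typed arguments; the Start form is limited to one object and needs a cast.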

Lambda expressions and captured variables

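for (int i = 0; i < 10; i++)
{
  int temp = i;
  new Thread (() => Console.Write (temp)).Start();
}

string text = "t1";
Thread t1 = new Thread (() => Console.WriteLine (text));

text = "t2";
Thread t2 = new Thread (() => Console.WriteLine (text));

t1.Start();
t2.Start();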

Naming Threads

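class ThreadNaming                      // illustrative class name
{
  static void Main()
  {
    Thread.CurrentThread.Name = "main";
    Thread worker = new Thread (Go);
    worker.Name = "worker";
    worker.Start();
    Go();
  }

  static void Go()
  {
    Console.WriteLine ("Hello from " + Thread.CurrentThread.Name);
  }
}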

Foreground and Background Threads

By default, threads you create explicitly are foreground threads. Foreground threads keep the application alive for as long as any one of them is running, whereas background threads do not. Once all foreground threads finish, the application ends, and any background threads still running abruptly terminate.
A thread’s foreground/background status has no relation to its priority or allocation of execution time.
You can query or change a thread’s background status using its IsBackground property. Here’s an example:
class PriorityTest
{
  static void Main (string[] args)
  {
    Thread worker = new Thread (() => Console.ReadLine());
    if (args.Length > 0) worker.IsBackground = true;
    worker.Start();
  }
}
If this program is called with no arguments, the worker thread assumes foreground status and will wait on the ReadLine statement for the user to press Enter. Meanwhile, the main thread exits, but the application keeps running because a foreground thread is still alive.
On the other hand, if an argument is passed to Main(), the worker is assigned background status, and the program exits almost immediately as the main thread ends (terminating the ReadLine).
When a process terminates in this manner, any finally blocks in the execution stack of background threads are circumvented. This is a problem if your program employs finally (or using) blocks to perform cleanup work such as releasing resources or deleting temporary files. To avoid this, you can explicitly wait out such background threads upon exiting an application. There are two ways to accomplish this:
  • If you’ve created the thread yourself, call Join on the thread.
  • If you’re on a pooled thread, use an event wait handle.
In either case, you should specify a timeout, so you can abandon a renegade thread should it refuse to finish for some reason. This is your backup exit strategy: in the end, you want your application to close — without the user having to enlist help from the Task Manager!
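A minimal sketch of the first approach, waiting out a background thread with a timed Join so that its finally block gets a chance to run. The class name, the cleanedUp flag, and the timings are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Threading;

static class ShutdownDemo
{
  public static bool cleanedUp;

  public static bool RunAndWait (int workMs, int timeoutMs)
  {
    Thread worker = new Thread (() =>
    {
      try { Thread.Sleep (workMs); }       // stand-in for real work
      finally { cleanedUp = true; }        // cleanup we want to see run
    });
    worker.IsBackground = true;
    worker.Start();

    // Join returns true if the thread ended within the timeout,
    // false if we gave up waiting.
    return worker.Join (timeoutMs);
  }

  static void Main()
  {
    bool finished = RunAndWait (50, 2000);
    Console.WriteLine (finished ? "Worker finished" : "Gave up waiting");
  }
}
```

If Join returns false, the application can still exit; the background thread is then terminated abruptly, as described above.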
If a user uses the Task Manager to forcibly end a .NET process, all threads “drop dead” as though they were background threads. This is observed rather than documented behavior, and it could vary depending on the CLR and operating system version.
Foreground threads don’t require this treatment, but you must take care to avoid bugs that could cause the thread not to end. A common cause for applications failing to exit properly is the presence of active foreground threads.

Thread Priority

A thread’s Priority property determines how much execution time it gets relative to other active threads in the operating system, on the following scale:
enum ThreadPriority { Lowest, BelowNormal, Normal, AboveNormal, Highest }
This becomes relevant only when multiple threads are simultaneously active.
Think carefully before elevating a thread’s priority: it can lead to problems such as resource starvation for other threads.
Elevating a thread’s priority doesn’t make it capable of performing real-time work, because it’s still throttled by the application’s process priority. To perform real-time work, you must also elevate the process priority using the Process class in System.Diagnostics (we didn’t tell you how to do this):
using (Process p = Process.GetCurrentProcess())
  p.PriorityClass = ProcessPriorityClass.High;
ProcessPriorityClass.High is actually one notch short of the highest priority: Realtime. Setting a process priority to Realtime instructs the OS that you never want the process to yield CPU time to another process. If your program enters an accidental infinite loop, you might find even the operating system locked out, with nothing short of the power button left to rescue you! For this reason, High is usually the best choice for real-time applications.
If your real-time application has a user interface, elevating the process priority gives screen updates excessive CPU time, slowing down the entire computer (particularly if the UI is complex). Lowering the main thread’s priority in conjunction with raising the process’s priority ensures that the real-time thread doesn’t get preempted by screen redraws, but doesn’t solve the problem of starving other applications of CPU time, because the operating system will still allocate disproportionate resources to the process as a whole. An ideal solution is to have the real-time worker and user interface run as separate applications with different process priorities, communicating via Remoting or memory-mapped files. Memory-mapped files are ideally suited to this task; we explain how they work in Chapters 14 and 25 of C# 4.0 in a Nutshell.
Even with an elevated process priority, there’s a limit to the suitability of the managed environment in handling hard real-time requirements. In addition to the issues of latency introduced by automatic garbage collection, the operating system may present additional challenges — even for unmanaged applications — that are best solved with dedicated hardware or a specialized real-time platform.

Exception Handling

Any try/catch/finally blocks in scope when a thread is created are of no relevance to the thread when it starts executing. Consider the following program:
public static void Main()
{
  try
  {
    new Thread (Go).Start();
  }
  catch (Exception ex)
  {
    // We'll never get here!
    Console.WriteLine ("Exception!");
  }
}

static void Go() { throw null; }   // Throws a NullReferenceException
The try/catch statement in this example is ineffective, and the newly created thread will be encumbered with an unhandled NullReferenceException. This behavior makes sense when you consider that each thread has an independent execution path.
The remedy is to move the exception handler into the Go method:
public static void Main()
{
  new Thread (Go).Start();
}

static void Go()
{
  try
  {
    // ...
    throw null;    // The NullReferenceException will get caught below
    // ...
  }
  catch (Exception ex)
  {
    // Typically log the exception, and/or signal another thread
    // that we've come unstuck
    // ...
  }
}
You need an exception handler on all thread entry methods in production applications — just as you do (usually at a higher level, in the execution stack) on your main thread. An unhandled exception causes the whole application to shut down. With an ugly dialog!
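One way to make sure every thread entry method has a handler is to route thread creation through a small wrapper. The following is a sketch only: SafeThread, its Start method, and the onError callback are hypothetical names, not a Framework API.

```csharp
using System;
using System.Threading;

static class SafeThread
{
  // Hypothetical helper: wraps the work in try/catch so that the handler
  // lives on the new thread's own execution path.
  public static Thread Start (Action work, Action<Exception> onError)
  {
    Thread t = new Thread (() =>
    {
      try { work(); }
      catch (Exception ex) { onError (ex); }   // e.g., log and signal
    });
    t.Start();
    return t;
  }

  public static string Demo()
  {
    string logged = null;
    Thread t = Start (() => { throw new InvalidOperationException ("boom"); },
                      ex => logged = ex.GetType().Name + ": " + ex.Message);
    t.Join();
    return logged;
  }

  static void Main()
  {
    Console.WriteLine (Demo());   // InvalidOperationException: boom
  }
}
```

What the onError callback should do (log, notify the user, shut down) remains an application decision; the wrapper only guarantees that the exception does not escape unhandled.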
In writing such exception handling blocks, rarely would you ignore the error: typically, you’d log the details of the exception, and then perhaps display a dialog allowing the user to automatically submit those details to your web server. You then might shut down the application, because it’s possible that the error corrupted the program’s state. However, the cost of doing so is that the user will lose their recent work (open documents, for instance).
The “global” exception handling events for WPF and Windows Forms applications (Application.DispatcherUnhandledException and Application.ThreadException) fire only for exceptions thrown on the main UI thread. You still must handle exceptions on worker threads manually.
AppDomain.CurrentDomain.UnhandledException fires on any unhandled exception, but provides no means of preventing the application from shutting down afterward.
There are, however, some cases where you don’t need to handle exceptions on a worker thread, because the .NET Framework does it for you. These are covered in upcoming sections, and are:
  • Asynchronous delegates
  • BackgroundWorker
  • The Task Parallel Library (conditions apply)

Thread Pooling

Whenever you start a thread, a few hundred microseconds are spent organizing such things as a fresh private local variable stack. Each thread also consumes (by default) around 1 MB of memory. The thread pool cuts these overheads by sharing and recycling threads, allowing multithreading to be applied at a very granular level without a performance penalty. This is useful when leveraging multicore processors to execute computationally intensive code in parallel in “divide-and-conquer” style.
The thread pool also keeps a lid on the total number of worker threads it will run simultaneously. Too many active threads throttle the operating system with administrative burden and render CPU caches ineffective. Once a limit is reached, jobs queue up and start only when another finishes. This makes arbitrarily concurrent applications possible, such as a web server. (The asynchronous method pattern is an advanced technique that takes this further by making highly efficient use of the pooled threads; we describe this in Chapter 23 of C# 4.0 in a Nutshell).
There are a number of ways to enter the thread pool:
  • Via the Task Parallel Library (from Framework 4.0)
  • By calling ThreadPool.QueueUserWorkItem
  • Via asynchronous delegates
  • Via BackgroundWorker
The following constructs use the thread pool indirectly:
  • WCF, Remoting, ASP.NET, and ASMX Web Services application servers
  • System.Timers.Timer and System.Threading.Timer
  • Framework methods that end in Async, such as those on WebClient (the event-based asynchronous pattern), and most BeginXXX methods (the asynchronous programming model pattern)
  • PLINQ
The Task Parallel Library (TPL) and PLINQ are sufficiently powerful and high-level that you’ll want to use them to assist in multithreading even when thread pooling is unimportant. We discuss these in detail in Part 5; right now, we'll look briefly at how you can use the Task class as a simple means of running a delegate on a pooled thread.
There are a few things to be wary of when using pooled threads:
  • You cannot set the Name of a pooled thread, making debugging more difficult (although you can attach a description when debugging in Visual Studio’s Threads window).
  • Pooled threads are always background threads (this is usually not a problem).
  • Blocking a pooled thread may trigger additional latency in the early life of an application unless you call ThreadPool.SetMinThreads (see Optimizing the Thread Pool).
You are free to change the priority of a pooled thread — it will be restored to normal when released back to the pool.
You can query whether you’re currently executing on a pooled thread via the property Thread.CurrentThread.IsThreadPoolThread.
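As a quick illustration of that property, the sketch below reads IsThreadPoolThread once from a work item queued to the pool and once from a thread created explicitly. The class and method names are illustrative assumptions.

```csharp
using System;
using System.Threading;

static class PoolCheck
{
  // Reads the property from inside a pooled work item.
  public static bool OnPooledThread()
  {
    bool pooled = false;
    using (var done = new ManualResetEvent (false))
    {
      ThreadPool.QueueUserWorkItem (_ =>
      {
        pooled = Thread.CurrentThread.IsThreadPoolThread;
        done.Set();
      });
      done.WaitOne();
    }
    return pooled;
  }

  // Reads the property from a thread we created ourselves.
  public static bool OnDedicatedThread()
  {
    bool pooled = true;
    Thread t = new Thread (() => pooled = Thread.CurrentThread.IsThreadPoolThread);
    t.Start();
    t.Join();
    return pooled;
  }

  static void Main()
  {
    Console.WriteLine (OnPooledThread());      // True
    Console.WriteLine (OnDedicatedThread());   // False
  }
}
```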

Entering the Thread Pool via TPL

You can enter the thread pool easily using the Task classes in the Task Parallel Library. The Task classes were introduced in Framework 4.0: if you’re familiar with the older constructs, consider the nongeneric Task class a replacement for ThreadPool.QueueUserWorkItem, and the generic Task<TResult> a replacement for asynchronous delegates. The newer constructs are faster, more convenient, and more flexible than the old.
To use the nongeneric Task class, call Task.Factory.StartNew, passing in a delegate of the target method:
static void Main()    // The Task class is in System.Threading.Tasks
{
  Task.Factory.StartNew (Go);
}

static void Go()
{
  Console.WriteLine ("Hello from the thread pool!");
}
Task.Factory.StartNew returns a Task object, which you can then use to monitor the task — for instance, you can wait for it to complete by calling its Wait method.
Any unhandled exceptions are conveniently rethrown onto the host thread when you call a task's Wait method. (If you don’t call Wait and abandon the task, an unhandled exception will shut down the process as with an ordinary thread.)
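A small sketch of that rethrow behavior: the worker method below (a hypothetical Fail method) throws, and the exception resurfaces on the thread that calls Wait, wrapped in an AggregateException.

```csharp
using System;
using System.Threading.Tasks;

static class TaskWaitDemo
{
  static void Fail()
  {
    throw new InvalidOperationException ("worker failed");
  }

  public static string Run()
  {
    Task task = Task.Factory.StartNew (Fail);
    try
    {
      task.Wait();            // the worker's exception resurfaces here
      return "no error";
    }
    catch (AggregateException ae)
    {
      // The original exception is available as the inner exception.
      Exception inner = ae.InnerException;
      return inner.GetType().Name + ": " + inner.Message;
    }
  }

  static void Main()
  {
    Console.WriteLine (Run());   // InvalidOperationException: worker failed
  }
}
```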
The generic Task<TResult> class is a subclass of the nongeneric Task. It lets you get a return value back from the task after it finishes executing. In the following example, we download a web page using Task<TResult>:
static void Main()
{
  // Start the task executing:
  Task<string> task = Task.Factory.StartNew<string>
    ( () => DownloadString ("http://www.linqpad.net") );

  // We can do other work here and it will execute in parallel:
  RunSomeOtherMethod();

  // When we need the task's return value, we query its Result property:
  // If it's still executing, the current thread will now block (wait)
  // until the task finishes:
  string result = task.Result;
}

static string DownloadString (string uri)
{
  using (var wc = new System.Net.WebClient())
    return wc.DownloadString (uri);
}
(The <string> type argument is for clarity: it would be inferred if we omitted it.)
Any unhandled exceptions are automatically rethrown when you query the task's Result property, wrapped in an AggregateException. However, if you fail to query its Result property (and don’t call Wait) any unhandled exception will take the process down.
The Task Parallel Library has many more features, and is particularly well suited to leveraging multicore processors. We’ll resume our discussion of TPL in Part 5.

Entering the Thread Pool Without TPL

You can't use the Task Parallel Library if you're targeting an earlier version of the .NET Framework (prior to 4.0). Instead, you must use one of the older constructs for entering the thread pool: ThreadPool.QueueUserWorkItem and asynchronous delegates. The difference between the two is that asynchronous delegates let you return data from the thread. Asynchronous delegates also marshal any exception back to the caller.

QueueUserWorkItem

To use QueueUserWorkItem, simply call this method with a delegate that you want to run on a pooled thread:
static void Main()
{
  ThreadPool.QueueUserWorkItem (Go);
  ThreadPool.QueueUserWorkItem (Go, 123);
  Console.ReadLine();
}

static void Go (object data)   // data will be null with the first call.
{
  Console.WriteLine ("Hello from the thread pool! " + data);
}
Hello from the thread pool!
Hello from the thread pool! 123
Our target method, Go, must accept a single object argument (to satisfy the WaitCallback delegate). This provides a convenient way of passing data to the method, just like with ParameterizedThreadStart. Unlike with Task, QueueUserWorkItem doesn't return an object to help you subsequently manage execution. Also, you must explicitly deal with exceptions in the target code; unhandled exceptions will take down the program.
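Because QueueUserWorkItem hands back nothing to wait on, the caller has to arrange its own completion signal and its own exception handling. The sketch below shows one possible arrangement using a CountdownEvent (available from Framework 4.0; on earlier versions a ManualResetEvent plus a counter would be needed instead). The class name and the summing workload are hypothetical.

```csharp
using System;
using System.Threading;

static class QueueAndWait
{
  // Queues one work item per job and returns the sum of the state values
  // (0 + 1 + ... + jobs-1) once every item has signaled completion.
  public static int Run (int jobs)
  {
    int total = 0;
    using (var allDone = new CountdownEvent (jobs))
    {
      for (int i = 0; i < jobs; i++)
      {
        ThreadPool.QueueUserWorkItem (state =>
        {
          try { Interlocked.Add (ref total, (int) state); }
          catch (Exception ex) { Console.WriteLine ("Job failed: " + ex.Message); }
          finally { allDone.Signal(); }   // always signal, even on failure
        }, i);
      }
      allDone.Wait();   // block until every work item has signaled
    }
    return total;
  }

  static void Main()
  {
    Console.WriteLine (Run (5));   // 0+1+2+3+4 = 10
  }
}
```

Signaling in a finally block matters here: if a work item skipped its Signal call, the Wait would never return.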

Asynchronous delegates

ThreadPool.QueueUserWorkItem doesn’t provide an easy mechanism for getting return values back from a thread after it has finished executing. Asynchronous delegate invocations (asynchronous delegates for short) solve this, allowing any number of typed arguments to be passed in both directions. Furthermore, unhandled exceptions on asynchronous delegates are conveniently rethrown on the original thread (or more accurately, the thread that calls EndInvoke), and so they don’t need explicit handling.
Don’t confuse asynchronous delegates with asynchronous methods (methods starting with Begin or End, such as FileStream.BeginRead/FileStream.EndRead). Asynchronous methods follow a similar protocol outwardly, but they exist to solve a much harder problem, which we describe in Chapter 23 of C# 4.0 in a Nutshell.
Here’s how you start a worker task via an asynchronous delegate:
  1. Instantiate a delegate targeting the method you want to run in parallel (typically one of the predefined Func delegates).
  2. Call BeginInvoke on the delegate, saving its IAsyncResult return value.
    BeginInvoke returns immediately to the caller. You can then perform other activities while the pooled thread is working.
  3. When you need the results, call EndInvoke on the delegate, passing in the saved IAsyncResult object.
In the following example, we use an asynchronous delegate invocation to execute concurrently with the main thread, a simple method that returns a string’s length:
static void Main()
{
  Func<string, int> method = Work;
  IAsyncResult cookie = method.BeginInvoke ("test", null, null);
  //
  // ... here's where we can do other work in parallel...
  //
  int result = method.EndInvoke (cookie);
  Console.WriteLine ("String length is: " + result);
}

static int Work (string s) { return s.Length; }
EndInvoke does three things. First, it waits for the asynchronous delegate to finish executing, if it hasn’t already. Second, it receives the return value (as well as any ref or out parameters). Third, it throws any unhandled worker exception back to the calling thread.
If the method you’re calling with an asynchronous delegate has no return value, you are still (technically) obliged to call EndInvoke. In practice, this is open to debate; there are no EndInvoke police to administer punishment to noncompliers! If you choose not to call EndInvoke, however, you’ll need to consider exception handling on the worker method to avoid silent failures.
You can also specify a callback delegate when calling BeginInvoke — a method accepting an IAsyncResult object that’s automatically called upon completion. This allows the instigating thread to “forget” about the asynchronous delegate, but it requires a bit of extra work at the callback end:
static void Main()
{
  Func<string, int> method = Work;
  method.BeginInvoke ("test", Done, method);
  // ...
  //
}

static int Work (string s) { return s.Length; }

static void Done (IAsyncResult cookie)
{
  var target = (Func<string, int>) cookie.AsyncState;
  int result = target.EndInvoke (cookie);
  Console.WriteLine ("String length is: " + result);
}
The final argument to BeginInvoke is a user state object that populates the AsyncState property of IAsyncResult. It can contain anything you like; in this case, we’re using it to pass the method delegate to the completion callback, so we can call EndInvoke on it.

Optimizing the Thread Pool

The thread pool starts out with one thread in its pool. As tasks are assigned, the pool manager “injects” new threads to cope with the extra concurrent workload, up to a maximum limit. After a sufficient period of inactivity, the pool manager may “retire” threads if it suspects that doing so will lead to better throughput.
You can set the upper limit of threads that the pool will create by calling ThreadPool.SetMaxThreads; the defaults are:
  • 1023 in Framework 4.0 in a 32-bit environment
  • 32768 in Framework 4.0 in a 64-bit environment
  • 250 per core in Framework 3.5
  • 25 per core in Framework 2.0
(These figures may vary according to the hardware and operating system.) The reason there are that many is to ensure progress should some threads be blocked (idling while awaiting some condition, such as a response from a remote computer).
You can also set a lower limit by calling ThreadPool.SetMinThreads. The role of the lower limit is subtler: it’s an advanced optimization technique that instructs the pool manager not to delay in the allocation of threads until reaching the lower limit. Raising the minimum thread count improves concurrency when there are blocked threads (see sidebar).
The default lower limit is one thread per processor core, the minimum that allows full CPU utilization. In server environments, though (such as ASP.NET under IIS), the lower limit is typically much higher: as much as 50 or more.
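To see the limits on a given machine and raise the lower one, something along these lines should work. It is a sketch under assumptions: the class name and the choice of how many extra threads to request are illustrative, and the printed figures depend on the runtime and hardware.

```csharp
using System;
using System.Threading;

static class PoolLimits
{
  // Raises the minimum worker-thread count by 'extra' (capped at the
  // current maximum) and reports whether the pool accepted the change.
  public static bool RaiseMinWorkers (int extra)
  {
    int minWorker, minIo, maxWorker, maxIo;
    ThreadPool.GetMinThreads (out minWorker, out minIo);
    ThreadPool.GetMaxThreads (out maxWorker, out maxIo);
    Console.WriteLine ("Worker threads: min " + minWorker + ", max " + maxWorker);

    int wanted = Math.Min (minWorker + extra, maxWorker);

    // SetMinThreads returns false if the requested values are rejected,
    // for example when they exceed the maximum.
    return ThreadPool.SetMinThreads (wanted, minIo);
  }

  static void Main()
  {
    Console.WriteLine (RaiseMinWorkers (4) ? "Minimum raised" : "Request rejected");
  }
}
```

The setting applies to the whole process, so it is usually made once at startup rather than adjusted repeatedly.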