Blog.Rees.Biz: Parallel Extensions

Tuesday, August 31, 2010

Parallel Extensions

The new Task Parallel Library included in .NET 4 is an incredibly easy way to parallelise processing that would otherwise have to be done with such devices as SpinLock, Semaphore, Monitor and others. Unfortunately those older devices were crazy easy to get wrong, and it was hard to remember how the code worked once six months had elapsed.

One of my favourite sessions at TechEd this year showed in some detail how to parallelise various code fragments using the new constructs.

Parallel Loops

Loops are the easiest place to start (followed shortly by reconsidering all your LINQ statements).

namespace TPLTest1
{
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    public static class Program
    {
        public static void Main(string[] args)
        {
            Action a1 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action1");
            };
            Action a2 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action2");
            };
            Action a3 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action3");
            };
            Action a4 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action4");
            };
            Parallel.Invoke(a1, a2, a3, a4);
            Console.WriteLine("Finished");

            Parallel.For(0, 4, index =>
                {
                    Thread.Sleep(500);
                    Console.WriteLine("Enumerator " + index);
                });
            Console.WriteLine("Finished For");
        }
    }
}


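The handy thing about these new constructs is that Parallel.Invoke and Parallel.For don't return until all of their work has finished, so the calling thread only carries on once the loop is complete. Here's the output (the order of the lines within each group can change from run to run):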







Action1
Action2
Action3
Action4
Finished
Enumerator 2
Enumerator 1
Enumerator 0
Enumerator 3
Finished For
Press any key to continue . . .
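
Because the loop blocks until every iteration is done, anything the iterations wrote is safe to read straight afterwards. Here's a minimal sketch of that idea; the ParallelForSketch class and ComputeSquares helper are names I've made up for illustration, not something from the session.

```csharp
namespace TPLTest1
{
    using System;
    using System.Linq;
    using System.Threading.Tasks;

    public static class ParallelForSketch
    {
        // Hypothetical helper: squares the numbers 0..count-1 in parallel.
        public static long[] ComputeSquares(int count)
        {
            var squares = new long[count];

            // Each iteration writes only to its own slot, so no locking is needed.
            Parallel.For(0, count, index =>
            {
                squares[index] = (long)index * index;
            });

            // Parallel.For has returned, so every slot has been filled in.
            return squares;
        }

        public static void Main(string[] args)
        {
            long[] squares = ComputeSquares(10);
            Console.WriteLine(squares.Sum()); // 0 + 1 + 4 + ... + 81 = 285
        }
    }
}
```

Writing to a separate array slot per index is what keeps this lock-free. If the iterations shared one running total instead, you'd be back to needing Interlocked or a lock.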







PLINQ

To filter a collection in parallel and safely copy the results into a new collection, you used to have to do something like this:







IEnumerable<RaceCarDriver> drivers = …;
var results = new List<RaceCarDriver>();
// One worker per core; remainingCount tracks how many are still running.
int partitionsCount = Environment.ProcessorCount;
int remainingCount = partitionsCount;
var enumerator = drivers.GetEnumerator();
try {
    using (var done = new ManualResetEvent(false)) {
        for(int i = 0; i < partitionsCount; i++) {
            ThreadPool.QueueUserWorkItem(delegate {
                while(true) {
                    RaceCarDriver driver;
                    // The enumerator is shared, so take one item at a time under a lock.
                    lock (enumerator) {
                        if (!enumerator.MoveNext()) break;
                        driver = enumerator.Current;
                    }
                    if (driver.Name == queryName &&
                        driver.Wins.Count >= queryWinCount) {
                            // List<T> is not thread-safe, so guard the shared results list.
                            lock(results) results.Add(driver);
                    }
                }
                // The last worker to finish signals the waiting thread.
                if (Interlocked.Decrement(ref remainingCount) == 0) done.Set();
            });
        }
        // Wait for every worker, then sort on the calling thread.
        done.WaitOne();
        results.Sort((b1, b2) => b1.Age.CompareTo(b2.Age));
    }
}
finally { if (enumerator is IDisposable) ((IDisposable)enumerator).Dispose(); }





Now you do this:





var results = from driver in drivers.AsParallel()
              where driver.Name == queryName &&
                    driver.Wins.Count >= queryWinCount
              orderby driver.Age ascending
              select driver;


Crazy easy.
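
If you want to experiment with it, here's a self-contained sketch. The call to AsParallel() is what turns an ordinary LINQ query into a PLINQ one. The RaceCarDriver class below is a hypothetical stand-in (I've only given it the three members the query touches), and FindDrivers and the sample data are made up for illustration.

```csharp
namespace TPLTest1
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical stand-in for the RaceCarDriver type used in the query.
    public class RaceCarDriver
    {
        public string Name { get; set; }
        public int Age { get; set; }
        public List<string> Wins { get; set; }
    }

    public static class PlinqSketch
    {
        public static List<RaceCarDriver> FindDrivers(
            IEnumerable<RaceCarDriver> drivers, string queryName, int queryWinCount)
        {
            // AsParallel() opts the rest of the query into PLINQ.
            var results = from driver in drivers.AsParallel()
                          where driver.Name == queryName &&
                                driver.Wins.Count >= queryWinCount
                          orderby driver.Age ascending
                          select driver;

            // The query is lazy; ToList() runs it and gathers the results.
            return results.ToList();
        }

        public static void Main(string[] args)
        {
            // Made-up sample data.
            var drivers = new List<RaceCarDriver>
            {
                new RaceCarDriver { Name = "Smith", Age = 41, Wins = new List<string> { "A", "B" } },
                new RaceCarDriver { Name = "Smith", Age = 29, Wins = new List<string> { "C", "D", "E" } },
                new RaceCarDriver { Name = "Smith", Age = 35, Wins = new List<string> { "F" } },
                new RaceCarDriver { Name = "Jones", Age = 22, Wins = new List<string> { "G", "H" } },
            };

            foreach (var driver in FindDrivers(drivers, "Smith", 2))
            {
                Console.WriteLine(driver.Name + " " + driver.Age);
            }
        }
    }
}
```

With this sample data the two Smiths with at least two wins come back youngest first. For a list this small the parallel version won't be any faster; the point is only that the query reads the same as the sequential one.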



One of the few options you might need to consider when using PLINQ is the partitioning algorithm used. Here's a great slide from Ivan Towlson's session at TechEd NZ showing the different algorithms in action:





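I believe chunk partitioning is the default for a source that is only an IEnumerable<T> (arrays and lists, which can be indexed, get range partitioning instead), and it is usually an OK choice for most things. The exception is when you are looking for a certain grouping of data, for example searching a list of people and processing the first one of a series of duplicates. That kind of work needs hash partitioning, which as far as I know PLINQ uses for operators that compare elements, such as GroupBy, Join and Distinct.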
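
PLINQ normally picks the partitioning for you, but you can nudge it by handing it a partitioner. Here's a rough sketch using Partitioner.Create with its loadBalance flag to ask for load-balanced partitioning over an array. The PartitioningSketch class and CountEvens helper are my own made-up names, and the work is deliberately trivial; treat it as a sketch of the mechanics, not a recommendation of when to use it.

```csharp
namespace TPLTest1
{
    using System;
    using System.Collections.Concurrent;
    using System.Linq;

    public static class PartitioningSketch
    {
        // Hypothetical helper: counts the even numbers, with or without
        // an explicit load-balancing partitioner.
        public static int CountEvens(int[] numbers, bool loadBalance)
        {
            if (!loadBalance)
            {
                // Leave the partitioning choice entirely to PLINQ.
                return numbers.AsParallel().Count(n => n % 2 == 0);
            }

            // Ask for a partitioner that hands out work dynamically,
            // which can help when some items take longer than others.
            var partitioner = Partitioner.Create(numbers, true);
            return partitioner.AsParallel().Count(n => n % 2 == 0);
        }

        public static void Main(string[] args)
        {
            int[] numbers = Enumerable.Range(0, 1000).ToArray();
            Console.WriteLine(CountEvens(numbers, false)); // 500
            Console.WriteLine(CountEvens(numbers, true));  // 500
        }
    }
}
```

Both calls give the same answer; the partitioner only changes how the items are shared out between the worker threads.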



Tasks

Best described by two more slides from the same session:
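
To give a flavour of the Task API in code as well, here's a minimal sketch that starts two tasks, waits for both, and chains a continuation onto the result. TaskSketch and SumOfSquares are names I've invented for the example, and I've stuck to Task.Factory.StartNew since that's what ships in .NET 4.

```csharp
namespace TPLTest1
{
    using System;
    using System.Threading.Tasks;

    public static class TaskSketch
    {
        // Hypothetical helper: squares two numbers on separate tasks.
        public static int SumOfSquares(int a, int b)
        {
            // Start two independent pieces of work.
            Task<int> first = Task.Factory.StartNew(() => a * a);
            Task<int> second = Task.Factory.StartNew(() => b * b);

            // Block until both tasks have completed.
            Task.WaitAll(first, second);

            return first.Result + second.Result;
        }

        public static void Main(string[] args)
        {
            Task<int> sum = Task.Factory.StartNew(() => SumOfSquares(3, 4));

            // ContinueWith schedules a follow-up once the first task finishes.
            Task<string> report = sum.ContinueWith(t => "Sum of squares: " + t.Result);

            // Reading Result waits for the continuation to complete.
            Console.WriteLine(report.Result); // Sum of squares: 25
        }
    }
}
```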









Awesome.