The new Task Parallel Library included in .NET 4 is an incredibly easy way to parallelise processing that would otherwise have to be done with devices such as SpinLock, semaphores, monitors and others. Unfortunately those older devices were crazy easy to get wrong, and it was hard to remember how they worked after six months had elapsed.
One of my favourite sessions at TechEd this year showed in some detail how to parallelise various code fragments using the new constructs.
Parallel Loops
Loops are the easiest place to start (followed shortly by reconsidering all your LINQ statements).
namespace TPLTest1
{
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    public static class Program
    {
        public static void Main(string[] args)
        {
            Action a1 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action1");
            };
            Action a2 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action2");
            };
            Action a3 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action3");
            };
            Action a4 = () =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Action4");
            };

            // Runs the four actions, potentially in parallel, and waits for all of them.
            Parallel.Invoke(a1, a2, a3, a4);
            Console.WriteLine("Finished");

            // Iterations 0..3 may run concurrently and in any order.
            Parallel.For(0, 4, index =>
            {
                Thread.Sleep(500);
                Console.WriteLine("Enumerator " + index);
            });
            Console.WriteLine("Finished For");
        }
    }
}
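The handy thing about Parallel.Invoke and Parallel.For is that they block the calling thread until all of the parallel work has finished, so the code after the call can assume everything is done. Here's the output from one run (the order of the lines within each group can vary between runs):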
Action1
Action2
Action3
Action4
Finished
Enumerator 2
Enumerator 1
Enumerator 0
Enumerator 3
Finished For
Press any key to continue . . .
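Because the loop only returns once every iteration is done, it is safe to read shared results straight after it. Here is a minimal sketch of that idea, assuming a simple sum is the work being done; the namespace and the variable name total are made up for illustration. The iterations run concurrently, so the shared total is updated with Interlocked rather than a plain +=.

```csharp
namespace TPLTest2
{
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    public static class Program
    {
        public static void Main(string[] args)
        {
            // Shared state written to by many iterations at once.
            long total = 0;

            // Parallel.For blocks the calling thread until every iteration has run.
            Parallel.For(0, 1000, index =>
            {
                // Atomic add: a plain "total += index" could lose updates.
                Interlocked.Add(ref total, index);
            });

            // Safe to read here because the loop has fully completed.
            Console.WriteLine("Sum 0..999 = " + total); // prints "Sum 0..999 = 499500"
        }
    }
}
```

For heavier aggregation the overloads of Parallel.For that take thread-local state are probably a better fit, since they avoid contending on one variable.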
PLINQ
To filter a collection safely and efficiently in parallel and copy the results into a new collection, you used to have to do something like this:
IEnumerable<RaceCarDriver> drivers = …;
var results = new List<RaceCarDriver>();
int partitionsCount = Environment.ProcessorCount;
int remainingCount = partitionsCount;
var enumerator = drivers.GetEnumerator();
try {
    using (var done = new ManualResetEvent(false)) {
        for(int i = 0; i < partitionsCount; i++) {
            ThreadPool.QueueUserWorkItem(delegate {
                while(true) {
                    RaceCarDriver driver;
                    lock (enumerator) {
                        if (!enumerator.MoveNext()) break;
                        driver = enumerator.Current;
                    }
                    if (driver.Name == queryName &&
                        driver.Wins.Count >= queryWinCount) {
                        lock(results) results.Add(driver);
                    }
                }
                if (Interlocked.Decrement(ref remainingCount) == 0) done.Set();
            });
        }
        done.WaitOne();
        results.Sort((b1, b2) => b1.Age.CompareTo(b2.Age));
    }
}
finally { if (enumerator is IDisposable) ((IDisposable)enumerator).Dispose(); }
Now you do this:
var results = from driver in drivers.AsParallel()
              where driver.Name == queryName &&
                    driver.Wins.Count >= queryWinCount
              orderby driver.Age ascending
              select driver;
Crazy easy.
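To try the query end to end, here is a self-contained sketch. The RaceCarDriver class and the sample data are my own assumptions (the type's definition is not shown above), and Wins is modelled as a simple list of strings. The important part is the call to AsParallel() on the source, which is what opts the query into PLINQ.

```csharp
namespace PlinqSketch
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical type: only Name, Age and Wins.Count are needed by the query.
    public class RaceCarDriver
    {
        public string Name { get; set; }
        public int Age { get; set; }
        public List<string> Wins { get; set; }
    }

    public static class Program
    {
        public static void Main(string[] args)
        {
            // Made-up sample data, for illustration only.
            var drivers = new List<RaceCarDriver>
            {
                new RaceCarDriver { Name = "Sam", Age = 41, Wins = new List<string> { "A", "B", "C" } },
                new RaceCarDriver { Name = "Sam", Age = 29, Wins = new List<string> { "A", "B" } },
                new RaceCarDriver { Name = "Alex", Age = 35, Wins = new List<string> { "A", "B", "C" } },
                new RaceCarDriver { Name = "Sam", Age = 33, Wins = new List<string> { "A" } }
            };
            string queryName = "Sam";
            int queryWinCount = 2;

            // AsParallel() turns the ordinary LINQ query into a PLINQ query.
            var results = from driver in drivers.AsParallel()
                          where driver.Name == queryName &&
                                driver.Wins.Count >= queryWinCount
                          orderby driver.Age ascending
                          select driver;

            // The orderby clause fixes the order of the results, so this prints
            // "Sam 29" followed by "Sam 41".
            foreach (var driver in results)
            {
                Console.WriteLine(driver.Name + " " + driver.Age);
            }
        }
    }
}
```

On a list this small the parallel version will almost certainly be slower than plain LINQ; the payoff only shows up with larger sources or more expensive predicates.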
One of the few options you might need to consider when using PLINQ is the partitioning algorithm used. Here's a great slide from Ivan Towlson's session at TechEd NZ showing the different algorithms in action:
I believe chunk partitioning is the default for sources that can't be indexed (arrays and lists are split into ranges instead), and it's usually an OK choice for most things. The exception is when related items must be handled together, for example searching a list of people and processing the first one of a series of duplicates. That kind of query relies on hash partitioning, which PLINQ applies for operators that compare elements, such as GroupBy, Join and Distinct.
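If you do want to influence the partitioning yourself, one option I'm aware of is the Partitioner class in System.Collections.Concurrent. The sketch below is only an illustration under my own assumptions (the array of numbers and the even-number filter are made up): it asks for load-balanced partitioning over an array instead of the static range split.

```csharp
namespace PartitionSketch
{
    using System;
    using System.Collections.Concurrent;
    using System.Linq;

    public static class Program
    {
        public static void Main(string[] args)
        {
            // Made-up input: the numbers 1 to 100.
            int[] numbers = Enumerable.Range(1, 100).ToArray();

            // loadBalance: true requests dynamic, chunk-style partitioning,
            // where worker threads pull more elements as they finish,
            // rather than a fixed range per thread.
            var partitioner = Partitioner.Create(numbers, true);

            int evenCount = partitioner.AsParallel()
                                       .Where(n => n % 2 == 0)
                                       .Count();

            Console.WriteLine("Even numbers: " + evenCount); // prints "Even numbers: 50"
        }
    }
}
```

Load balancing tends to help when some elements take much longer to process than others; for uniform work the default split is likely fine.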
Tasks
Best described by two more slides from the same session:
Awesome.
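For anyone who wasn't at the session, here is a rough sketch of what working with tasks looks like in .NET 4. It is my own example rather than anything from the slides, and SumRange is a hypothetical helper standing in for real work. It starts two tasks, chains a continuation onto one of them, and waits for everything before combining the results.

```csharp
namespace TaskSketch
{
    using System;
    using System.Threading.Tasks;

    public static class Program
    {
        public static void Main(string[] args)
        {
            // Start two independent pieces of work on the thread pool.
            Task<int> left = Task.Factory.StartNew(() => SumRange(1, 500));
            Task<int> right = Task.Factory.StartNew(() => SumRange(501, 1000));

            // A continuation runs after its antecedent task has completed.
            Task<string> report = left.ContinueWith(t => "Left half: " + t.Result);

            // Block until all three tasks have finished.
            Task.WaitAll(left, right, report);

            Console.WriteLine(report.Result);                            // prints "Left half: 125250"
            Console.WriteLine("Total: " + (left.Result + right.Result)); // prints "Total: 500500"
        }

        // Hypothetical helper: adds up the integers from first to last inclusive.
        private static int SumRange(int first, int last)
        {
            int sum = 0;
            for (int i = first; i <= last; i++)
            {
                sum += i;
            }
            return sum;
        }
    }
}
```

Reading Result on an unfinished task would also block until it completes, so the WaitAll call is there mainly to make the synchronisation point explicit.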
http://msdn.microsoft.com/en-us/library/dd992041.aspx