Monday, January 18, 2010

Currently I am working on upgrading an existing complicated spreadsheet which makes extensive use of VBA.  It is used for formulae prototyping and then the code is translated (manually) into C#.  This is far from ideal not to mention slow so I am converting it to use VSTO.

Personally, I am not a big fan of VSTO it feels like a hack to allow .Net developers to extend Office applications (usually Excel).  But under the hood it is basically wrapping the code into a COM wrapper to allow it to work.  Can you believe that after ten years of .Net, there is still no native support for .Net inside Office applications. Ok, thats my rant for the day.

The process is pretty straight forward, using the templates provided in Vs2008, however there are a number of "gotchas" I thought I would record for when I come back to this project some months later and wonder why.

  1. TURN OFF RESHARPER!!
    Currently I am using Resharper for C# v4.5, and although this is one of my favourite tools of all time, it doesn't play nice with Excel and VSTO. Turn it off using Tools-Addin Manager and untick the box in Vs2008.  With it on it will consistently crash (hang more precisely) Vs and Excel about every 2-4 minutes.

  2. When writing data to an Excel sheet, do not loop through many cells writing data this will be very slow.  Obviously there is a lot of overhead writing data to a cell.  Instead write it to a jagged array and write the array in one chunk to a range. Using a named range will probably be tidiest.

              
                var lTable = new object[calc.Term + 1, 10];
                foreach (var lLine in calc.CashflowSolution) {
                    // Get Data - details not important
                    int lPeriodNumber = ++lLineNumber;
                    DateTime lPeriodDate = GetDate();
                    var lPrincipal = GetPrincipal();
                    var lPayment = GetPayment();
                    var lClosingBalance = GetClosingBalance();
                    var lInterest = GetInterest();
                    var lAdminFee = GetAdminFee();
                    var lTotalInstalment = lPeriodNumber == 0 ? 0 : GetInstalment();

                    lTable[lRow, PeriodColumn] = lPeriodNumber;
                    lTable[lRow, DateColumn] = lPeriodDate;
                    lTable[lRow, PrincipalColumn] = lPrincipal;
                    lTable[lRow, InterestColumn] = lInterest;
                    lTable[lRow, PaymentColumn] = lPayment;
                    lTable[lRow, ClosingBalanceColumn] = lClosingBalance;
                    lTable[lRow, AdminFeeColumn] = lAdminFee;
                    lTable[lRow, TotalInstalmentColumn] = lTotalInstalment;
                    lRow++;
                }

                Range lRange = this.get_Range(string.Format(CultureInfo.InvariantCulture, "B40:I{0}", lTable.GetUpperBound(0) + 40), Missing.Value);
  3. When using an object array to "paste" data in one big hit into a sheet, be sure the clear the area first, otherwise the second invocation could produce a smaller array than the first leaving some bad data!
  4. There are at least two commonly used overloads for getting a Range.
    this.InterestAccrualTable.ClearContents(); // A named ranged defined in the sheet as B2:C10000
    int lNumberOfDays = 1082;
    var lTable = new object[lNumberOfDays + 1, 2];
    // Overload 1
    Range lRange = this.get_Range("B2", "C" + lNumberOfDays + 2);
    // Overload 2Range lRange = this.get_Range(string.Format(CultureInfo.InvariantCulture, "B2:C{0}", lTable.GetUpperBound(0) + 2), Missing.Value);

    Overload 1
    The first overload seem to produce "#N/A"'s underneath the range specified in the method.

    Overload 2
    Works much better:

  5. I'm currently having deployment problems as well. This is the error message I'm getting after installing on a target machine (with Office 2007 SP2).
    "The common language runtime could not be loaded by ..."

    After dealing with various issues (ensuring .Net runtime, Office 2007 Sp2 etc), I am starting to see VSTO is shakey at best to deploy and use.  I read about a known issue with Office apps not being able to reliably select the appropriate .Net runtime.  The recommendation in a forum was to create an Excel.Exe.Config file...
    <?xml version="1.0"?> 
    <configuration> 
        <startup>
            <supportedRuntime version="v2.0.50727"/>
        </startup>
    </configuration>
    This, however does not work in my case.

    I found another article recommending not using a Setup Project (which is what I was doing here) but rather use the Publish feature by right clicking the VSTO Project. This creates a folder containing various files, most importantly a Setup.Exe.

    This fixes the issue of being able to install on a target machine, however the install process is not customisable! So you cannot add a shortcut etc.

Friday, January 15, 2010

PLINQ

Here's a summary of some great new (and some improved) features in LINQ and most importantly PLINQ.


PLINQ will actually be slower than LINQ to objects if there is no expensive filtering, or projection process.  Here are some examples when using PLINQ is a good idea.

Where selecting has a time consuming process:

IEnumerable<int> src = ...
var query = 
      src.AsParallel()
      .Where(x => x % 2 == 0)
      .Select(x => Foo(x));
Using ForAll with an expensive projection function and possibly an expensive loop:

int[] src = Enumerable.Range(0, 100).ToArray();
var query = src.AsParallel()
             .Select(x => ExpensiveFunc(x));

int resultSum = query.ForAll(
      x => Console.WriteLine(x)
);
Using a Partitioner to ensure load balancing:

int[] src = Enumerable.Range(0, 100).ToArray();
var query = Partitioner.Create(src, true).AsParallel()
             .Select(x => ExpensiveFunc(x));

foreach(var x in query)
{
      Console.WriteLine(x);
}

An expensive filter:

int[] src = Enumerable.Range(0, 100).ToArray();
var query = src.AsParallel()
             .Where(x => ExpensiveFilter(x));

foreach(var x in query)
{
      Console.WriteLine(x);
}
Sequence zipping (takes two elements from two input arrays and outputs one resulting element):

int[] arr1 = ..., arr2 = ...;
int[] results =
      arr1.AsParallel().AsOrdered()
      .Zip(
           arr2.AsParallel().AsOrdered(),
           (arr1Elem, arr2Elem) => ExpensiveFunc(arr1Elem, arr2Elem))
      .ToArray();
A reduction extension method:

public static double Average(this IEnumerable<int> source)
{
      return source.Aggregate(
             () => new double[2],
             (acc, elem) => {
                   acc[0] += elem; acc[1]++; return acc;
             },
             (acc1, acc2) => {
                   acc1[0] += acc2[0]; acc1[1] += acc2[1]; return acc1;
             },
             acc => acc[0] / acc[1]);
}

Wednesday, January 6, 2010

Accessing Internal Members

5-January-2010
Happy new year!  Just had to document this quickly. Accessing internal members is necessary for testing, but also sometimes for production code.  I found this to be the case optimising a day-end process which read some database records copied them elsewhere and needed to save. Unfortunately by doing so resets the connection object inside the persistence layer (an old proprietary ORM unchangeable right now). This meant that the records then needed to be re-read, to prevent access to stale objects that would not have future changes updated into the database.

This was fixed by calling an internal method on the connection object that registers database entities into an internal collection.  A bit of a hack, but fixes the performance issue without getting caught up in fixing a persistence layer that will be replaced soon.

But there were two possible approaches.  
1) You can grant another library permission to access its internal (or friend in vb) members; 
2)  Or you can allow access to a internal member with an extension method.

Grant access to all internals
Lets start with granting access to another library to access the internal members. This will give the other library full access to all internal members in the source library. This will only work if both libraries are signed with a strong name. To find out how to do that go here.

Open the library (DLL) that contains the internal members you would like to access externally.  Find the assemblyinfo.vb (or .cs in c#) and add the following line at the bottom:



[assembly: InternalsVisibleTo("MyUnitTestLibrary, PublicKey=002400000480000094000000060200000024000052534131000400000100010015b2942f80eea6e7c6ee614974fefc2820236c27c4c0dd221c6469178e7e34884466e5dfb8f1958f0feff6dc71f00176927ce3a3d360b50c08880efa9de0a8cb33a80d4cf3ed6849a5f1d0699b64346f94b505c2b6c585b1e0dd3929640ad3f23ce35a0f79e6539dd74f1ab9f25c9a366124f1d84117126b7c5d83e37fbc6ddb")]

You will need to import "Imports  System.Runtime.CompilerServices".

Replace the "MyUnitTestLibrary" with the name of the other library where you need to call the internal methods of this library.  

The next thing to do, is to obtain the public key signature data.  This is done by using the Sn.exe command line utility.
Use the following command line (in which the parameters are case sensitive):
sn -Tp YourDllNameHere.dll

Output should look something like this:

Microsoft (R) .NET Framework Strong Name Utility  Version 3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Public key is
002400000480000094000000060200000024000052534131000400000100010015b2942f80eea6
e7c6ee614974fefc2820236c27c4c0dd221c6469178e7e34884466e5dfb8f1958f0feff6dc71f0
0176927ce3a3d360b50c08880efa9de0a8cb33a80d4cf3ed6849a5f1d0699b64346f94b505c2b6
c585b1e0dd3929640ad3f23ce35a0f79e6539dd74f1ab9f25c9a366124f1d84117126b7c5d83e3
7fbc6ddb

Public key token is ee02eaa036161368
Copy and past the public key data into the assemblyinfo file, obviously removing the carriage returns.

Build both libraries and you can now access the internals.  This is normally fine for testing but to expose all internals for only one use case feels dirty for production code.

Grant access to one method using an extension

Heres the class that owns the internal member.

Public Class Connection
   Friend Sub RegisterObject(newObject As BaseEntity)
        ' Inner workings not important
    End Sub
End Class

And the use case from the other library...

Dim objects As IEnumerable(Of BaseEntity) = connection.GetObjects(...) 'How to get objects is unimportant...
Me.PerformSomeOperation(objects)
connection.Commit() ' This resets the internal collection and the above array of objects is now stale and unusable, all changes to them will not be saved.
For Each(Dim x As BaseEntity In objects)
    connection.RegisterObject(x)  ' Need to be able to do something like this, but can't right now.
Next

The plan is to add an extension to the library that contains the extension.

Imports System.Runtime.CompilerServices

Namespace Extensions
    Public Module ContainerJobExtensions
        <Extension()> _
        Public Sub RegisterObjects(ByVal instance As Connection, ByVal objects As BaseEntity())
            Array.ForEach(Of BaseEntity)(objects, AddressOf instance.RegisterObject)
        End Sub
    End Module
End Namespace

Now with that I can re-register the objects with the connection like so...

Dim objects As IEnumerable(Of BaseEntity) = connection.GetObjects(...) 'How to get objects is unimportant...
Me.PerformSomeOperation(objects)
connection.Commit() ' This resets the internal collection and the above array of objects is now stale and unusable, all changes to them will not be saved.
connection.RegisterObjects(objects)

By putting the extension in another namespace you can selectively import this only into the use case where you need to use it.

Done!