Saturday, September 18, 2010

How to build a custom StyleCop Rule

18-September-2010

How to build a StyleCop Rule to enforce 'this' keyword use on fields.

First make sure you have the latest version of StyleCop.  I wrote my first rule against 4.4.  In my opinion, using StyleCop without Resharper and the StyleCop-For-Resharper plug-in is just plain silly.  These tools make it so easy to write code in accordance to established rules its trivial. Next you might find it very useful to download the full source for StyleCop, not necessary but I found it interesting.

Next you need to figure out how you are going to find the fragments not in accordance with your rule.  But first here's how StyleCop breaks down a file of code.

StyleCop Code File Structure
StyleCop breaks a code file down into Elements, statements and expressions.
Elements are large chunks of code, the red blocks below are elements. Basically, namespaces, types, and members.
Each statement line is obviously included in the statement category.  I never tested this by having two statements on one line, but having two statements one a line is an incredibly bad idea anyway, so you're on your own there. All lines above are included in this category except whitespace and braces.

Lastly each expression is then every granular item thereafter.  For example "this.apple = banana" is an expression and so is "this", ".", "=", and "\"banana\"".  The expression list therefore is significantly larger than the previous two categories.  Each expression also offers a list of child "tokens" which are the subordinate expressions to the current one.  The subordinate expressions will still be visited by the main loop despite being a child of some other expression.

StyleCop's Process 
StyleCop accepts three delegates from your custom rule, each one walking each of the above code element categories.  Ideally, if you can discover the fault with the line you can raise a Violation right away.  If not you will need to analyse the entire file then process the collected data afterwards.  (This is the case forJSL StyleCop rules - see below). Each of these expression tree walkers run on its own thread, so cannot easily share state during the process.  

Debugging a Rule
Ok, how to debug a rule?  I personally looked into writing unit tests against my rule, but found this far too tricky and time consuming.  The other option is to have two instances of Visual Studio open one to load some sample code, and the other with your rule code loaded and attach to the other's process.  The only downside to this, is when you make changes you will need to close both instances to update your DLL.  Suck it up, getting it done will be quicker than fiddling around with unit tests in this case. StyleCop isn't really designed for unit testing.  

More specifically you need to copy your built DLL than contains your rule into the StyleCop folder inside your Program Files (or Program Files (x86)) folder, just in the root StyleCop folder is fine (although sub-folders will be searched).  Your StyleCop assembly must also contain an XML file describing your rule.  This is the file StyleCop will use to add it to the Configuration tree.

<?xml version="1.0" encoding="utf-8" ?>
<SourceAnalyzer Name="Extensions">
    <Description>
        These custom rules provide extensions to the ones provided with StyleCop.
    </Description>
    <Rules>
        <Rule Name="InstanceVariablesThisPrefix" CheckId="EX1001">
            <Context>Instance fields should be prefixed with the 'this' keyword.</Context>
            <Description>Instance variables (fields) are easier to distinguish and scope when prefixed with 'this'.</Description>
        </Rule>
    </Rules>
</SourceAnalyzer>


Build a 'This' Rule
In my case I am building a rule to enforce only the prefixing of fields with the 'this' keyword. There is a default StyleCop rule for this, but it enforces it in all instance cases (basically where it can be used it should).
I have achieved this by using the Expression delegate of the Analyse Document Walker.  I can't use the element part because this is not focusing on individual statements.  I have chosen not to use the statement walker delegate because I found it easier to discover the fields and check for violations using the expressions.

There's  some code clean up required on the code, but here it is:


namespace CustomStyleCopRules
{
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using Microsoft.StyleCop;
    using Microsoft.StyleCop.CSharp;

    /// <summary>
    /// This StyleCop Rule makes sure that instance variables are prefixed with this.
    /// </summary>
    [SourceAnalyzer(typeof(CsParser))]
    public class InstanceVariablesThisPrefix : SourceAnalyzer
    {
        private static readonly string RuleName = typeof(InstanceVariablesThisPrefix).Name;

        public override void AnalyzeDocument(CodeDocument document)
        {
            IDictionary<string, Field> fields = new Dictionary<string, Field>();

            var csdocument = (CsDocument)document;
            if (csdocument.RootElement == null || csdocument.RootElement.Generated)
            {
                return;
            }

            csdocument.WalkDocument(VisitElement, StatementWalker, ExpressionWalker, fields);
        }

        private static bool StatementWalker(Statement statement, Expression parentExpression, Statement parentStatement, CsElement parentElement, object context)
        {
            return true;
        }

        private static bool VisitElement(CsElement element, CsElement parentElement, object context)
        {
            return true;
        }

        private static bool ExpressionWalker(Expression expression, Expression parentExpression, Statement parentStatement, CsElement parentElement, object context)
        {
            try
            {
                if (parentElement != null && parentElement.Generated)
                {
                    return true;
                }

                var fields = context as IDictionary<string, Field>;
                if (fields == null)
                {
                    return true;
                }

                if (parentElement is Field)
                {
                    // We have found a field -  add it to the field collection. This assumes that fields are declared at the top
                    // of your class, which the of course they should be.
                    var field = parentElement as Field;
                    string name = field.VariableDeclarationStatement.Declarators.First().Identifier.Text;
                    if (field.Declaration.Tokens.Any(x => x.Text == "static") || field.Const)
                    {
                        // Ignore static fields. These do not require the 'this' prefix. So dont add them to the collection.
                        return true;
                    }

                    fields[name] = field;
                    return true;  // No need to check this expression further.
                }

                var tokens = expression.Tokens.ToArray();
                var keys = fields.Keys.ToArray();
                if ((expression.ExpressionType == ExpressionType.Literal && parentStatement.StatementType == StatementType.Return)
                    || (expression.ExpressionType == ExpressionType.Literal && parentStatement.StatementType == StatementType.Yield))
                {
                    // Check for already prefixed.
                    if (parentExpression == null || parentExpression.ExpressionType != ExpressionType.MemberAccess)
                    {
                        // short statements with no other tokens slip thru the cracks. Want to ignore singular literals,
                        // they have no children to figure out acceptability.
                        // for example return field; is one expression that has no child tokens that is an exception to this rule.
                        CheckForFieldUsingTokens(parentElement, expression, tokens, keys);
                    }
                }

                // Expressions that will never contain field references, an absence of tokens normally means a literal. We are unable to check
                // for compliance with the rule at this level because the 'this' part will be a sibling expression.  We need to check the parent
                // expression compliance by looking at it as a whole then looking at its tokens.
                if (expression.ExpressionType == ExpressionType.Literal || tokens.Length == 0)
                {
                    return true;
                }

                if (CheckForFieldUsingTokens(parentElement, expression, tokens, keys))
                {
                    return true;
                }

                return true;
            } catch (Exception ex)
            {
                Debug.WriteLine(ex.Message);
                throw;
            }
        }

        private bool CheckForFieldUsingTokens(CsElement parentElement, Expression expression, CsToken[] tokens, string[] keys)
        {
            if (parentElement is Constructor)
            {
                // If its a constructor and there are duplicate names in the tokens this indicates a syntax like this
                // this.apple = apple; where apple is a constructor argument.  Ignore these as they will be handled by another
                // rule.
                // Find duplicate tokens and they are also field names
                var query = from token in tokens
                            where keys.Contains(token.Text) && tokens.Count(t => t.Text == token.Text) > 1
                            select token;
                if (query.Any())
                {
                    return true;
                }
            }

            for (int i = 0; i < tokens.Length; i++)
            {
                var token = tokens[i];
                if (keys.Contains(token.Text))
                {
                    if (i < 2)
                    {
                        this.AddViolation(parentElement, expression.LineNumber, RuleName, "Fields must be prefixed with the 'this' keyword.");
                        return true;
                    }

                    if (tokens[i - 1].Text != "." && tokens[i - 2].Text != "this")
                    {
                        this.AddViolation(parentElement, expression.LineNumber, RuleName, "Fields must be prefixed with the 'this' keyword.");
                        return true;
                    }
                }
            }

            return true;
        }
    }
}



Resources:

No comments:

Post a Comment