Tuesday, September 17, 2013

Documenting Technical Design - Ideas Smorgasbord

I get asked a lot what Technical Design is, sometimes what it isn't, and what to include in a technical document. As always, the answer is: it depends. It depends on your audience, their skill set and level, and the environment.

Here are a few ideas that I have found work. This isn't a comprehensive list of sections that should always be included; again, think about who your audience is and whether each section would be relevant to them. Consider the analogy of building a car: the person who will drive it doesn't care about the tensile strength of the second and fourth pistons. They care about how to drive it, maybe how to maintain it, and what to do when something abnormal happens. But remember, it depends on the audience: the mechanic will want different information, but still won't necessarily care about the technology used to build the piston.

Tech Design is not:
  • A complete War and Peace of how every line works. (Tech Design should be as short as possible, using as many diagrams as possible in place of text.)
  • Sequence diagrams or class diagrams based exactly on the code. (These go stale too quickly and create a maintenance nightmare.)
  • An install guide.
  • XML code comments / Sandcastle output.
  • Any documentation generated from code.
  • An API guide.
  • A test plan.
  • A how-to-use / user guide (this should be a separate document).

Technical Design should consider the following topics/sections (in no particular order):

A See Also Section.
    List links to other relevant or related information.
Overview Section.
    No more than two paragraphs. This is for business readers or non-techies who just want to know something about how the system works and how it fits into
    the bigger picture with other systems. What is it used for? What problem does it solve?
System Context.
    A single diagram showing this system as a single box and the other systems it integrates with.
Main Sequences and Flows.
    These are the main or significant workflows through the system. Use abstract, high-level sequence diagrams. Each lifeline should represent not a class but
    a major component / library / namespace within the system. Do not get into class and object detail; it will get out of date too quickly.
Hosting and Deployment.
    List the different options for hosting in a production environment. Also state the recommended or preferred hosting model, and explain why there are
    multiple models. Use deployment diagrams, not text. Try to show one box per deployed application, not one box per library. Draw lines only where
    external calls are made between applications. Label each line with the protocol (HTTP/TCP), authentication, authorisation, payload format (JSON/SOAP), etc.
    Could include details of the configuration options and environment settings required, if necessary. This is not an installation guide.
How to integrate.
    Link to a document (best kept separate, as it will not be relevant to all readers and could be long) describing how a developer takes the
    system as a framework and uses it. How do they install it and use it, what do they need to do, and how long will it take to get up and
    running? Obviously this only applies if the system is a framework for developers rather than for consumers.
    Is the system multi-tenanted? If so, how have you achieved this, and what trade-offs and limitations apply? How many tenants can it handle, how few?
    Multi-threading? STA? UI threading? ASP.NET async? Task Factory - how?
Code Organisation.
    How is the code organised, and why? Possibly include code metrics too, so another party can get a sense of the size and complexity of the application.
Service Contracts, Endpoints and Data Contracts
    Describe all inbound services and their contracts. Describe all outbound service calls, their purpose, and when they happen. Also define or link to a
    data dictionary that describes all SOAP / JSON payloads and fields.
Diagnostic Logging, Monitoring and Auditing
    Database tables involved. What logging framework? Links on how to configure it. Default config (debug/release builds). What is auditing vs logging?
    Why is auditing required? Performance counters and their definitions. Other standard recommended performance counters. How do the Ops team support
    the system? Monitor it? Can they tell when it is failing before it fails? Windows Event Log. Health / support service endpoints that might be available
    (ping, heartbeat, health check services).
    Estimated usage stats (years 1, 2, 3). Estimate DB size, requests per second, tenants, 24-hour (or other time period) load patterns.
    Where are the fail points from a business perspective? Is 1,000 ms per request OK? 3,000 ms? 10,000 ms? 2 hours to process the last item in a queue behind a
    large batch?
    Exceptions: different scenarios (DB is down, external third-party service is down, one server in the farm is down, entry point service endpoint is down,
    validation issues with inbound service data, in-flight data cannot be written back to the DB after reading it, system crashes during processing: how to
    recover? etc.). General exception strategy within the app. Where are exceptions logged, and where are they not?
    What security is applied to all endpoints? User security. Pen testing considerations. OWASP. STRIDE. Known security concerns and how they
    are addressed.
    Trusted subsystem?
    Required infrastructure security.
    Security config required at deployment time that is not covered by the installer.
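
To put a number on the "last item in a queue behind a large batch" question above, a back-of-envelope calculation is usually enough for the document. Here is a minimal sketch in Python. The function name and the figures (20,000 items, 1.5 seconds each, 4 workers) are made up for illustration, and it assumes identical workers and a constant per-item processing time, which real systems rarely have.

```python
import math


def time_to_last_item_seconds(queued_items: int, seconds_per_item: float, workers: int = 1) -> float:
    """Estimate how long until the last queued item has been processed.

    Simplifying assumptions (not from any real system): every worker handles
    one item at a time, and every item takes the same time to process.
    """
    if queued_items < 0 or seconds_per_item < 0 or workers < 1:
        raise ValueError("queued_items and seconds_per_item must be >= 0, workers >= 1")
    # Items are processed in "rounds" of up to `workers` items each.
    rounds = math.ceil(queued_items / workers)
    return rounds * seconds_per_item


# Hypothetical batch: 20,000 items at 1.5 s each, shared across 4 workers.
estimate = time_to_last_item_seconds(20_000, 1.5, workers=4)
print(f"{estimate / 3600:.2f} hours")  # prints "2.08 hours"
```

If the business says 2 hours is the limit, this batch size is already over it, and that is exactly the kind of fail point worth writing down.
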
Configuration Options for Runtime System.
    Are there runtime settings that are applied immediately without a restart? App.config only? Other config? How are server farms updated?
    List and describe all config options, or link to another document that does.
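
As an illustration of "settings applied immediately without restart", here is one possible approach, sketched in Python rather than .NET: re-read the settings file whenever its modification time changes. The class name, the JSON file format, and the `timeout_ms` key are all assumptions made for the example; this is not how App.config behaves.

```python
import json
import os


class ReloadingConfig:
    """Re-reads a JSON settings file whenever its modification time changes.

    Hypothetical helper for illustration only; a production version would
    also need error handling for missing or half-written files.
    """

    def __init__(self, path: str) -> None:
        self._path = path
        self._mtime_ns = None  # modification time of the version currently loaded
        self._values = {}

    def get(self, key, default=None):
        self._reload_if_changed()
        return self._values.get(key, default)

    def _reload_if_changed(self) -> None:
        mtime_ns = os.stat(self._path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self._path, encoding="utf-8") as handle:
                self._values = json.load(handle)
            self._mtime_ns = mtime_ns
```

A caller would create `ReloadingConfig("settings.json")` once and call `get("timeout_ms", 1000)` on each request. Whichever mechanism the real system uses, the document should say which settings behave this way and which need a restart.
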
Database Schema
    Data dictionary, or a link to one.
    Describe the data access strategy: stored procs, ORM, both, etc.

Backwards Compatibility
    Are there existing devices out there running a version of the application? I.e. are you upgrading backend services while existing UIs are still in use?
    Forwards compatibility: how will future versions be compatible with this one?
    Are your service contracts / JSON payloads versioned adequately?
    Are you recommending to clients that the URL contain a version number?
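
To make the "version number in the URL" question concrete, here is a small Python sketch of routing by a version segment so that old clients keep working while new ones get a changed payload. The `/api/v1/orders/...` path shape, the handler names, and the payload fields are all hypothetical.

```python
import re


# Hypothetical handlers for two versions of the same "orders" contract.
def get_order_v1(order_id: str) -> dict:
    return {"id": order_id, "total": "12.50"}


def get_order_v2(order_id: str) -> dict:
    # v2 changes the payload shape, so it is published as a new version
    # instead of breaking clients still calling v1.
    return {"id": order_id, "total": {"amount": "12.50", "currency": "USD"}}


HANDLERS = {"v1": get_order_v1, "v2": get_order_v2}
_ROUTE = re.compile(r"^/api/(v\d+)/orders/([^/]+)$")


def dispatch(path: str) -> tuple:
    """Return (status_code, body) for a request path such as /api/v1/orders/42."""
    match = _ROUTE.match(path)
    if match is None:
        return 404, {"error": "not found"}
    version, order_id = match.groups()
    handler = HANDLERS.get(version)
    if handler is None:
        return 404, {"error": f"unsupported API version {version}"}
    return 200, handler(order_id)
```

Here `dispatch("/api/v1/orders/42")` still returns the old flat payload after v2 ships. The document should state how long each version will be supported.
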
Other important / architecturally significant items
    Queueing / ServiceBus / Asynchronous design
    Installation issues (not an install guide)
    Availability (do you need a draining shutdown? How much downtime can you get, and how much do you need?)
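
For readers unfamiliar with the term, a "draining shutdown" means the system stops accepting new work but finishes what it has already accepted before exiting. A minimal single-worker sketch in Python follows; the class and method names are invented for the example, and a real service would also need timeouts and error handling around the handler.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the worker thread to exit


class DrainingWorker:
    """Processes queued items on a background thread and drains on shutdown."""

    def __init__(self, handler) -> None:
        self._handler = handler
        self._queue = queue.Queue()
        self._accepting = True
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def submit(self, item) -> bool:
        """Queue an item; returns False once shutdown has started."""
        with self._lock:
            if not self._accepting:
                return False
            self._queue.put(item)
            return True

    def shutdown(self) -> None:
        """Refuse new work, then block until already-accepted work is done."""
        with self._lock:
            self._accepting = False
            # The sentinel is queued behind every accepted item, so the
            # worker only sees it after draining the rest.
            self._queue.put(_STOP)
        self._thread.join()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is _STOP:
                return
            self._handler(item)
```

How long `shutdown()` may block is the downtime question in disguise, so it is worth stating the expected drain time in the document.
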

