
2011-07-20

What should you deliver?

Writing code is all very well, but what should you deliver - and when? I've seen more projects come a cropper over delivery than from any other cause. The classic failure is that the project spends week after week in development mode, has a first tentative delivery which isn't good enough, then spends many more weeks in test-fix-test-fix mode. Finally the sponsor has had enough and pulls the plug, as she realises there is no way of showing with any confidence that the system will be delivered in a reasonable working state in any acceptable timeframe.

The alternative is that the sponsor gets a delivery each week or two but has absolutely no control over what is actually delivered by her team; it may represent a step towards the functionality she requires, or may just be the Nth refactoring of a module that a developer is honing towards perfection. This could be considered the Royal Mail approach.

After whatever prototyping and requirements work is appropriate, your team's first priority should be to deliver something that represents the system running end-to-end, accompanied by a testing / QA toolset that allows you to test the functionality of the key parts of the system. This gives you an immediate basis for deciding whether any new delivery is acceptable - does it represent a strict improvement in functionality / reliability / performance over the system as it stands?
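As a minimal sketch of the kind of end-to-end check this implies - the orders_system module and its submit_order / fetch_order functions are hypothetical stand-ins, not anything from a real project - a smoke test might look like:

#!/usr/bin/python
import unittest

# Hypothetical end-to-end interface to the system under test.
from orders_system import submit_order, fetch_order

class SmokeTest(unittest.TestCase):
    """A delivery is only acceptable if this end-to-end check stays green."""

    def test_order_round_trip(self):
        # Push one record through the whole system...
        order_id = submit_order(customer="ACME", quantity=3)
        # ...and confirm it comes out the other end intact.
        order = fetch_order(order_id)
        self.assertEqual(order.quantity, 3)

if __name__ == "__main__":
    unittest.main()

Run this against each candidate delivery; any regression here means the delivery is rejected outright.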

From that point, you should have a fairly clear idea of the major points of functionality that are deficient compared to the requirements. That gives you your first category of deliveries - those that represent a single distinct feature. You can easily verify whether or not the functionality is present, although verifying that it is complete is likely to require substantial manual QA effort.

Bugfixes are another category of delivery, and one that is crucial to success. The prerequisite for efficient bugfixes is an effective bug tracking system - the delivery should specify exactly which bug it fixes, and there should already be tests that verify the bug's presence or absence.
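As an illustration of such a test - the bug number, module and function here are all invented - each tracker entry gets a regression test that fails while the bug is present and passes once it is fixed:

#!/usr/bin/python
import unittest

# Hypothetical function under test; total_price is not from any real system.
from orders_system import total_price

class TestBug1234(unittest.TestCase):
    """Regression test for (illustrative) tracker bug #1234: total_price()
    returned a negative total for zero-quantity orders."""

    def test_zero_quantity_total(self):
        self.assertEqual(total_price(unit_price=10, quantity=0), 0)

if __name__ == "__main__":
    unittest.main()

A bugfix delivery then names bug #1234, and acceptance is simply this test going from red to green.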

It is possible that your delivery / release process is not automated, so that each release involves substantial manual effort and risk. The obvious solution is to roll up many individual changes into one delivery. This is a recipe for disaster. If something breaks, how do you determine what the cause was? If the changes come from several developers, who bears responsibility for the release? If your answer is "everyone who contributed", I admire your optimism.

2011-07-12

Of FADECs and Failures

I've been in the software game a while, but it's quite telling that I remain mildly astonished when any program runs through to completion without raising any errors. Note that errors are distinct from crashes; it is nearly always possible to write a program which is crash-free, but error-free is a little trickier. See for instance this snippet of Python:
#!/usr/bin/python
from time import sleep

# errorprone_code is the author's stand-in for any module whose
# main_program() may raise.
from errorprone_code import main_program

complete = False
while not complete:
    try:
        main_program()
        complete = True          # only reached if no exception was raised
    except Exception as err:     # catch everything: crash-free, not error-free
        print "Strewth! %s" % err
        sleep(1)                 # pause, then try the whole thing again
which should be crash-free, but clearly does nothing to make main_program() itself run any more free from errors.

The ongoing furore about the Chinook helicopter crash into the Mull of Kintyre in 1994 is primarily focused on the FADEC (full-authority digital engine controller) and whether it is reasonably possible that a FADEC failure could have induced the crash, or at least contributed to it. The best write-up I've found so far on the topic is from the House of Lords inquiry in 2002. I'm wary of any inquiry conducted by the Air Force itself (the original Board of Inquiry by two Air Marshals, for instance) due to the incentives to cover up procurement or operational screw-ups. I'm equally wary of any study by outside "experts" commissioned by politicians as they are incentivised to produce the result that the commissioning politicians would like. The Lords seem to be the least amenable to influence, and are generally diligent and relatively impartial.

The essential problem with the FADEC code that Boeing wrote for the Chinook HC2 and that Boscombe Down disliked so much was that it was unverifiable. EDS-Scicon reviewed the code and found "486 anomalies" in the first 18% of the code they checked. The problem here is that we don't know what those 'anomalies' were. I've done any amount of code review under a wide range of analysis criteria, and 'anomaly' can mean practically anything. It can mean an uninitialised variable value being used (bad, definitely needs fixing), an unreachable code path (generally safe but needs explaining), an inconsistency between comments and code (potentially dangerous if the code was incorrect, just annoying if the comment is incorrect) or just a violation of coding guidelines (e.g. a variable name in StudlyCaps instead of underscore_separated style). Boscombe Down's main concern was that the code was structured in such a way that it was not amenable to any useful form of analysis. In other words, they couldn't tell with any degree of certainty where it might be incorrect or unsafe.

There is a very large gap between "unverifiable" and "incorrect". Tony Hoare's quote from his Turing Award lecture comes to mind:
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. It demands the same skill, devotion, insight, and even inspiration as the discovery of the simple physical laws which underlie the complex phenomena of nature.
Unverifiable code in a safety-critical system is clearly bad. That doesn't mean that it's actually wrong, nor that it caused the crash. You certainly wouldn't want to let an aircraft with unverifiable engine code into service, but Boscombe Down was overruled by the MoD (no doubt a conversation along the lines of "we've already bought the damn things, we'd look pretty stupid if we didn't let them fly"). There did appear to be real problems with the FADEC, including uncommanded engine run-ups experienced on the Chinook HC2, which doesn't surprise me in the least. But as long as the Chinooks flew in regular flight regimes, with standard power settings, they'd be running through the best-tested parts of the FADEC code, which would therefore be the least prone to error. There's nothing in the crash which indicates any abnormal engine operation, commanded or uncommanded.

(For the record, here's what I believe. I do not believe that the FADEC failed in any significant way around the time of the crash. I think the crash was a classic controlled flight into terrain, in very bad visibility. I think that the two pilots, both flight lieutenants who were flying more than their recommended hours, were pressured into making the flight in circumstances where they might otherwise have delayed and waited for better flying conditions. We will never know exactly what happened in that cockpit, but there are plenty of people in Boeing, Textron, MoD Procurement and the RAF senior officers who contributed to this crash in some way. Blaming the pilots alone is deeply unfair and smacks of some pretty disgusting expediency by the MoD and RAF.)

Producing code which is effectively free from errors is possible but very expensive. That expense may be justified, if failure would be even more expensive. More likely is that the occasional error would be acceptable as long as it is handled safely (e.g. an engine controller hitting an error condition re-initialises itself, thereby refusing operator commands for a few seconds, and logs that an error has occurred). Even more likely is that the developers hack something together that mostly works, test it as much as they can to remove the more obvious bugs, stick in exception handlers to manage the unexpected, and then charge the client for "functional upgrades" when they report operational errors or strange behaviour after the system has been accepted. But if you want a system that could possibly be made reasonably free of errors, it needs to be a design that is amenable to analysis. That is where Boeing / Textron failed in the FADEC design, and accepting a software system with such a design is where MoD Procurement and the RAF failed.
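A toy sketch of that "handled safely" pattern - in Python rather than anything you'd run on an engine, with every name invented for illustration:

#!/usr/bin/python
import logging
import time

logging.basicConfig(filename="controller.log", level=logging.INFO)

def read_command():
    """Stand-in for reading the next operator command."""
    return {"throttle": 0.7}

def apply_command(cmd):
    """Stand-in for acting on a command; may raise on bad internal state."""
    pass

def reinitialise():
    """Return the controller to a known-safe state; commands are
    effectively refused for the couple of seconds this takes."""
    time.sleep(2)

while True:
    try:
        apply_command(read_command())
        time.sleep(0.1)          # pace the control loop
    except Exception as err:
        # Fail safe: log the fault, then re-initialise rather than
        # carrying on in an unknown state.
        logging.error("controller error, re-initialising: %s", err)
        reinitialise()

The point is not the two lines of error handling but that the design makes "what happens on error" an explicit, analysable state rather than an accident.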

2011-07-13 Update: as expected, Lord Philip has overturned the verdict of gross negligence, saying, in effect, that there is sufficient doubt about the circumstances of the accident that the standard of proof for negligence can't be met. Sir William Wratten (who was Commander British Forces during Gulf War #1) and Sir John Day from the original RAF inquiry should feel suitably chastened, but I expect they won't.

2011-07-11

Picking the right tools and technologies

One thing that project managers nearly get right is spending time at the start of their project selecting the tools and technologies they want to use. The snag is that, having spent the time, so many of them get the selection completely wrong. What are they missing?

An illustrative anecdote: a project team was developing for an embedded system, written primarily in Ada, which was compiled on a VAX system since the cross-compiler for the target hardware was only available there. All ten members shared a single VAX (via remote terminal access from their Windows desktops), which was woefully underpowered for such a load. Each compile of even a small part of the system took many minutes; if you changed a public interface (an Ada package specification) and needed to recompile the whole thing, it would take half an hour. At least 50% of an active developer's day was spent waiting for compilations to complete, and you could wipe out at least 30% of the remaining time on the awkward VMS interface and its primitive editor.

What was the alternative? The GNAT Ada compiler was freely available and ran just fine on Windows. It handled the Ada 83 language perfectly well and compiled the development system in tens of seconds, not tens of minutes. Running on the desktop would have allowed any number of modern editors (e.g. emacs, vim) supporting syntax colouring, better searching and revision control integration. Productivity would have doubled at a minimum. Once a system was passing all its tests on the PC, it could finally be cross-compiled and retested on the VAX. Ada is much better than C at preserving behaviour across different architectures, so minimal changes would have been required.

So what should the project manager look for in his tools and technologies?

  • Pick well-established development languages and supporting tools (e.g. database, httpd), ideally those that you or your team have already used for a successful project;
  • Choose the most recent version of a language or tool which has been in productive use for at least 6 months, not just the most recently released;
  • Plan for changing major versions of each language or tool at least once in the project lifecycle, e.g. Python 2.4 to Python 2.7, Postgres 8.4 to 9.x; have a very small number of places where this version change needs to be made (one way of doing so is sketched below).
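A minimal sketch of that confinement, assuming a hypothetical codebase straddling the Python 2.4 to 2.7 move: all JSON handling goes through one small module, so the eventual library change happens in exactly one place.

#!/usr/bin/python
# json_codec.py - the single place that knows which JSON library we use.
try:
    import json                  # in the standard library from Python 2.6
except ImportError:
    import simplejson as json    # API-compatible third-party library on 2.4/2.5

def encode(obj):
    """Serialise obj to a JSON string, whichever library is present."""
    return json.dumps(obj)

def decode(text):
    """Parse a JSON string, whichever library is present."""
    return json.loads(text)

The rest of the codebase imports json_codec and never names the underlying library.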
Don't forget that the hardware on which you develop and test is also part of your tools:
  • Provide sufficient shared hardware to make life-like testing easy without developers or testers having to fight for resources;
  • Ensure that your company standard OS image already has the libraries and tools that you need for development and testing; if it doesn't, establish immediately how you are going to get them added (and updated);
  • Know your hardware and software ordering process and lead times; you're going to need more than you initially expected, but won't yet know what (or have the figures to justify it); and
  • Cost out one day of tester or developer non-productivity and one week of delivery slip, and use these figures to justify your additional hardware / software requests.
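To put rough numbers on that last point - every figure below is an invented example, so substitute your own rates:

#!/usr/bin/python
# Back-of-envelope costing; all figures are illustrative assumptions.
day_rate = 400              # fully loaded cost of one person-day, GBP (assumed)
team_size = 8               # developers plus testers (assumed)

one_idle_day = day_rate                     # one person blocked for one day
one_week_slip = 5 * team_size * day_rate    # whole team slips five working days

print "One idle person-day costs about %d GBP" % one_idle_day
print "One week of delivery slip costs about %d GBP" % one_week_slip

Set against numbers like these, an extra test server tends to justify itself rather quickly.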

2011-07-06

Picking the right people

There are few decisions that will doom a software development project as surely as picking the wrong people for it. The problem, of course, is that the people you actually need are quite rare, and getting hold of them, even if they're already working in your firm, may be tricky - if their current project manager is even slightly awake, they will really not want to let them go.

Joel Spolsky reckons that his key criterion for hiring is "smart and gets things done". With all due respect to him (after all, his firm has bashed out some commercially successful software over the years), I don't think that's enough. I would modify it to "smart and gets the right things done". I've known any number of smart and productive people over the years who spend at least half their time doing work that never ends up being used - either it's irrelevant to the main thrust of development, or it does the right thing but in a way that's never going to scale. Someone who's always asking themselves "what does my current work actually do for the project?" will be at least partly aligned with your goals.

Get people who are familiar with the technologies you plan on using in your project. The ideal is to find people with experience developing either at the size of code base / complexity you are aiming at, or at worst one level below that (so for an estimated 100KLoC Python codebase you should find people who have written systems with at least 10KLoC and preferably at least 50KLoC of Python). Never use your project as the basis for testing a new technology - or, if you must, confine it in one place in your design and have a fall-back plan if the new technology doesn't cut the mustard.

Good developers need an ego - they have to take pride in producing the best possible system - but they also need to be able to take criticism and deal with it appropriately. If your developer is a prima donna, you're going to end up with the system that they want to build, and damn the customer.

Always consider the one-under-a-bus rule. Your team should be able to tolerate any single team member being run over by a bus, minimising the inevitable resulting delay to the project. This means that no team member may be irreplaceable, and you should ensure that each system component (each probably developed primarily by a single team member) has at least two team members who are capable of developing and testing it. If you require that any code change be reviewed by another team member, this should fall out automatically. If you see a team member actively hoarding information and expertise, you should seriously consider dropping that person from the team. I assure you that ignoring the issue and hoping for the best will not improve matters.

You need to get your team size right, and my personal feeling is that the team should be as small as possible but no smaller. The problems caused by oversized teams, or teams that have people firehosed on them late in development, are well documented. Fred Brooks Jr's "The Mythical Man Month" is timeless, and peerless on this subject. Start by picking out your developers; you need at least two (one needs to check the other's work) but no two developers should be focused on a single part of the system. Slice up the design between developers.

Once you know the size of your development team, consider what you want to do about testing / QA. My finger-in-the-air rule is that the testing / QA headcount shouldn't be more than half the developer headcount, and quite possibly less. Perhaps you need more of them in the first couple of months when you're building out the unit/system testing and developer environments, and fewer in the middle phase before customers get their hands on the system.

If you have anyone technical on your team who is happy doing repetitive tasks, you need to re-educate them. With a small development team you don't have the spare resource for someone to spend their day pushing buttons. They should be automating wherever they can - everyone should be happy in a scripting language like bash, Perl, Python or (heaven forfend) .NET.
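As a flavour of the sort of automation meant here - the log directory, file naming and "FAIL" convention are all invented for the example - a few lines of Python can replace a morning's eyeballing of overnight test logs:

#!/usr/bin/python
# Summarise overnight test logs instead of reading each one by hand.
# The path and the FAIL marker are assumptions for illustration.
import glob

failures = 0
for logfile in sorted(glob.glob("/var/log/nightly-tests/*.log")):
    for line in open(logfile):
        if "FAIL" in line:
            failures += 1
            print "%s: %s" % (logfile, line.strip())

print "Total failures overnight: %d" % failures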

Don't forget the support that won't be part of your official team but is nevertheless vital - sysadmins who maintain your hosts, admin staff who handle your procurement and organisation. Don't try to do their jobs yourselves. Having a talented developer spend half his day deploying new OS images is not a good use of your limited resources.

2011-07-04

Building the right system

If you want to build a software system that will make you (or your company) money, it's quite important to ensure that you build what your customer really wants. This is, please note, often very different from what your customer says they want, what your boss wants you to build, what the chosen technologies allow you to build, or what you know how to build.

A classic example was the NHS über-screwup Connecting for Health, which was only successful from the viewpoint of the consulting and implementing companies that managed to squeeze a cool £10bn+ from Government before the public outcry became too loud and the relevant management saw the writing on the wall. The medical staff didn't want most of the functionality that was being built in, patients weren't interested in the much-vaunted "Choose and Book" functionality, and the Summary Care Records provoked privacy outcries. If you want to try building a massive centralised project like this, good luck, but please note that as a taxpayer I'm going to be lobbying for public whipping to be an integral part of the failure-to-deliver penalties.

So what do you need to be asking before you start planning your system?

Who are the end-user customers?
Some poor schmucks are going to be the main users of your system once delivered, and a subset of them will be trialling the early deliveries. Know who these people are. Have an idea of their daily tasks, workflow, education, expertise and blind spots. Identify not just your "normal" users but also the pathological "experts" (who will try to make your system do things it was never designed to do, and expect it to keep up) and "abusers" (who will sadistically mis-enter data, jump forwards and backwards in the workflow changing and re-changing items, and howl that the world is ending if so much as an unexpected warning box pops up).
Who's holding the purse strings?
Someone's going to be paying for this system to be developed; specifically, someone in the finance department is going to be cutting cheques (or the electronic equivalent) to you at various stages of delivery. Find out who this is, and what they need and want to see before they sign those cheques. This is going to lead you to ask:
Who does the purseholder listen to?
The purseholder is unlikely to have computer expertise beyond a grasp of Excel. They're going to have a "technical expert", who may or may not justify that title, who will tell the purseholder whether the system has met the requirements for the next cheque. You need to know exactly what that expert is really looking for, which will likely be a strict superset of:
What does your contract say that you must build?
If you're lucky, you'll get into this process before the contract is written, and you can get involved in the details of gateways, acceptance criteria, contract variation etc. You're seldom lucky, so are more likely to have the contract waved in your face as a fait accompli. Ensure that you know it backwards.

Given this knowledge about what you should be building, your next step should be to ensure that you're actually going to build this. Some of the pitfalls to avoid and tricks to employ:

Avoid early technology decisions
The temptation to nail down technologies at requirements time is nearly irresistible: "oh, I know the kind of thing that's needed, let's do Linux + Perl + Apache". It is extremely important to resist it. Apart from anything else, you don't yet have enough information to know whether your technology is good enough, can scale sufficiently or will be supported for the required timescale. To make a start on gaining this knowledge you need to:
Build a working prototype
Throw together something that demonstrates 50%+ of your system functionality, and (importantly) goes end-to-end. It doesn't have to scale, it doesn't have to be bug-free, it doesn't have to run on the target hardware. What it does have to do is allow end-users to play; to enter data, give you feedback on what works and what doesn't, tell you where they need it to be faster. Do not plan on any code in this prototype making its way into the production system, but do keep it working so that you can test e.g. proposed user interface changes.
Dogfood during development, if you can
Eating your own dogfood during system development is an excellent way to improve quality and usability. This works best when the product in question is related to your daily work, e.g. a bug tracker or revision control system; however, even if it serves a completely separate business function you can get some way towards it. As soon as it's in alpha release, get an end user or two sitting next to you and using the new release. They have carte blanche to whack your team with a rolled-up newspaper and tell them what's irritating them or making them unproductive. It's amazing what can get fixed when the results of bugs are immediately apparent.
Early worst-case scaling
Once you have a good idea of the expected data size, performance requirements and target hardware, make a performance challenge system. Have some way of loading up 10x the required data and measuring the impact. Run your user test system on underpowered hardware. (Note: don't run your automated tests like this - these need to be fast to flush out errors ASAP).
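A sketch of the 10x loading step - the file names and column layout are invented, and the only real point is that generating the challenge data is itself automated:

#!/usr/bin/python
# Multiply a realistic sample dataset into a 10x performance-challenge set.
# File names and the assumption of a key in column 0 are illustrative.
import csv

MULTIPLIER = 10

rows = list(csv.reader(open("sample_records.csv", "rb")))

writer = csv.writer(open("challenge_records.csv", "wb"))
for copy in range(MULTIPLIER):
    for row in rows:
        if not row:
            continue
        # Perturb the key column so the copies don't collide as duplicates.
        writer.writerow(["%s-%d" % (row[0], copy)] + row[1:])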

The Software Systems Delivery Minefield

I'm starting a blog category talking around the aforementioned Software Systems Delivery Minefield (SSDM) and how to avoid getting your legs blown off.

SSDM is not to be confused with SSADM, a development methodology devised by the UK Government in the 1980s. This aimed to improve the reliability and predictability of IT systems development for Government use, and was the outstanding success that any experienced software developer could have predicted.

The scope of the SSDM blogs is as follows:

  • software systems, not just isolated programs;
  • covering the full lifecycle, from inception through development and delivery, to operation;
  • taking the viewpoint of testers, developers and the project manager;
  • limited to a team size from 2-10 people; and
  • technology-agnostic, trying not to prescribe a specific technology but rather enable the reader to form a view on what properties of their candidate technologies make them likely to either help or hinder.

I hope to be able to pass on some of the lessons I've learned and show a few of my scars (in tasteful locations only), and I'd be interested in others' feedback on their experiences.