The Birth of the Requirements Wiki

As a development organization we’re by no means perfect, but we’re constantly looking for ways to improve, and one of the areas where we’ve historically had a lot of room for improvement is internal documentation of requirements and feature specifications.  We’ve come up with what we hope will be a much better long-term solution, but before I describe what that solution is, I’d like to rewind and tell the story of how we ended up where we are right now.

Imagine that your company is writing a policy administration system (such a randomly chosen example, I know), and you’d like to know the answer to the question “How do policy renewals need to work?”  How would you go about trying to answer it?

Historically, we’ve done our requirements documentation and feature specification in a fairly old-school manner:  product management would write up a document describing the requirements for a new feature, development would write up a design topic on the wiki for anything that required more thought and discussion, and development would write up a fuller specification afterwards describing how things actually work.  The end result is that the “requirements” for a given piece of functionality tend to be difficult to discern after the fact:  they’re scattered across a bunch of early-stage PM documents that 1) are generally deltas against each other, 2) don’t always resemble what was actually built, 3) tend to have a lot of ambiguities (since they’re written as normal prose), and 4) don’t capture any of the little things discussed and agreed upon by dev, PM, and QA over the course of actually building a feature.  The development specs (when they’re actually up to date) tend to describe how things currently are rather than what the underlying business requirements are, are also written as deltas, are written at a semi-arbitrary level of detail, and aren’t written for things like UI functionality, while the design topics serve more to explain why things were implemented as they were.  So if you want to know how policy renewals work, your best bet is just to ask someone; the information is so scattered, out of date, and incomplete that it’s impossible to piece it back together.

We do a lot of test-driven development, so you might ask, “But what about the tests?”  The agile philosophy is that the tests can often serve as the documentation, and that’s kind of true of well-written, complete unit tests (I still don’t think it’s 100% true, but that’s a different argument).  The problem is that unit tests are at too low a level to be useful for answering higher-level questions like “How does a policy renewal work?”  Even questions like “What happens when I click the ‘Add Vehicle’ button?” are difficult to document via tests, because they require an entirely different level of test than “unit” tests:  end-to-end tests, which tend to be harder to write and harder to read.  They’re also much more difficult to ensure completeness for, since you can’t measure their coverage with a tool or even match up the set of methods against the set of tests.  In addition, for infrastructure work the tests tend to describe the implementation, not the high-level requirements.

The other problem with using tests is, unfortunately, that they tend to get deleted when they break too badly; at some point it’s inevitable that some refactoring or other major change will break enough unit tests that you just don’t have the energy or inclination to fix them all right away, so you rewrite and fix what you can and just comment out or delete the ones you can’t.  That’s also true of tests that are written against the actual implementation rather than at some higher level of abstraction; if you change the implementation, all those tests are simply irrelevant, so you have no choice but to kill them.  That might not be ideal, but practically speaking that’s what actually happens in the real world where real people write real tests, and as such, tests are a bit shaky to rely on as the sole source of documentation about business requirements.

“What about story cards?” you might ask.  Well, one unfortunate fact is that the policy team I work on hasn’t used story cards in the past (we will be using them in the next release cycle).  One of our other teams does drive everything off of story cards, but even then I think there are some problems.  First of all, stories are inherently deltas, and over the course of a release or over many releases the same functionality is often continuously changed, which makes it difficult to piece together an answer to “How does policy renewal work?”, because doing so requires assembling all the stories relating to renewals across several releases and applying the deltas in chronological order.  Ouch.  Story cards are also inherently somewhat unorganized and can contain information relating to multiple different parts of the system, so just assembling that set of cards in the first place can be difficult.  Story cards would still be light years ahead of where we were a year ago, so perhaps if we’d had them we wouldn’t have built the tools that we did, but since we didn’t have those cards we had to find a different way to do things.

So that was our situation a year ago:  information about how things were supposed to work was largely in people’s heads, and we had scattered, generally untargeted end-to-end test coverage that touched many parts of the system.

That’s around the time we started to rework some major portions of our application, and before we started we thought the main risk we ran was that we’d break things without realizing it.  In order to mitigate that, we wanted to fill out all (or at least a good number) of the tests for a given area of the application before we changed things.  But how would we know we had “all” the tests and weren’t missing something?  Without any obvious “units” to test we’d have no chance, so we decided to make our own units.  They weren’t really stories in the traditional sense:  they were statements like “The ‘Add Vehicle’ button takes you to the ‘New Vehicle’ page” and “The ‘Clone Vehicle’ button clones all selected vehicles, cloning all of their fields except for the VIN number.”  Some of them could have been stories in the story card sense, but plenty of them were too fine-grained for story cards.  For lack of a better term, we decided to call them “requirements” instead.  Our process then became that we’d first attempt to reverse engineer the requirements for a page before we rewrote it, generally by reading any existing documentation and then by playing around with the page to see what it actually did.  After we wrote those down, we’d try to have them reviewed by the product managers for accuracy and completeness, and then we’d use the requirements to drive a set of tests around the page.  Ensuring a sufficient level of testing became much easier, because we could target the tests to the requirements just as you’d target unit tests to a method.  Once we were done we were pretty sure we’d catch most of the breaks we might introduce, and we’d go ahead with whatever refactoring/rearchitecting needed to happen.
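
For illustration, here’s roughly what a test targeted at the “Add Vehicle” requirement might look like.  This is only a sketch using JUnit 4 and Selenium WebDriver, not necessarily the tooling we actually use; the URL, element IDs, and page title below are made up.

```java
import static org.junit.Assert.assertEquals;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class AddVehicleRequirementTest {

    private WebDriver driver;

    @Before
    public void openVehiclesPage() {
        driver = new FirefoxDriver();
        // Hypothetical URL for the policy's vehicles page
        driver.get("http://localhost:8080/pc/vehicles");
    }

    // Requirement: "The 'Add Vehicle' button takes you to the 'New Vehicle' page"
    @Test
    public void addVehicleButtonTakesYouToNewVehiclePage() {
        driver.findElement(By.id("addVehicleButton")).click();  // hypothetical element id
        assertEquals("New Vehicle", driver.findElement(By.id("pageTitle")).getText());
    }

    @After
    public void closeBrowser() {
        driver.quit();
    }
}
```

The point is just that each fine-grained requirement maps to one or more small, focused end-to-end tests, the same way unit tests map to methods.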

The idea was the right one, I think, but we had questions about how exactly we’d manage the requirements docs.  What format would they be in?  How would we organize them so people could find them?  Hardest of all, how would we ensure they stayed up to date?  We also really wanted to measure coverage of the requirements, so how would we do that?  To get the ball rolling we started out just using Google Spreadsheets to track the requirements; the spreadsheet format ensured the requirements were relatively small, targeted (and hopefully unambiguous) line items instead of prose paragraphs describing things.  I even wrote a small tool, using annotations and the Google SOAP API, to generate some simple HTML reports about which requirements had tests.  It was pretty clear that was a sub-optimal solution, but it was a start.

The question really became where to go with things:  if we wanted to cover our whole application this way and drive a lot of our automated end-to-end testing off of it, we’d need everyone on the team to be on board, and doing that would probably require a much better tool for managing things.  Thankfully we had an engineer who was fairly amazing at coming up with little tools to solve all sorts of development problems, and he agreed to take the lead on formalizing things and later driving adoption of the tool.  The end result was basically an addition to MediaWiki that we called the “requirements wiki,” eventually nicknamed “The Riki.”  The modification added some special tags for listing requirements, which are assigned unique IDs on their first update.  It also allows you to tag requirements with labels like “agreed” and “implemented,” along with several other clever things.  The IDs can then be used as annotations on test methods to tie the tests back to the requirements, and the Riki has a background process that periodically takes a build and processes all the annotations to link things up.  The result is the ability to display the current test methods inline with the requirements, along with coverage reports showing what percentage of requirements have any tests at all and what percentage of the tests have actually been implemented.  The latter statistic allows us to add empty test methods as a way of sketching out a test plan without implementing it all immediately.
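
The post doesn’t show the Riki’s actual tag or annotation syntax, so here’s a minimal hypothetical sketch of the annotation side:  a made-up @Covers annotation tying test methods to requirement IDs, plus the kind of scan a background process could run over a build to link them up.  The annotation name, the REQ- IDs, and the class names are all invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical annotation tying a test method back to one or more riki requirement IDs.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Covers {
    String[] value();
}

// A couple of hypothetical test methods annotated with requirement IDs.
class VehiclePanelTest {

    // Requirement: the 'Add Vehicle' button takes you to the 'New Vehicle' page.
    @Covers("REQ-1234")
    public void testAddVehicleButtonTakesYouToNewVehiclePage() {
        // ...real test body...
    }

    // Deliberately empty: it sketches the test plan now and gets implemented later,
    // so the coverage report can show the requirement as "planned but not yet tested."
    @Covers("REQ-1235")
    public void testCloneVehicleCopiesAllFieldsExceptVin() {
    }
}

// Rough idea of what the background process does with a build: scan the test classes,
// read the annotations, and record requirement-to-test links for the coverage reports.
public class RequirementCoverageScanner {
    public static void main(String[] args) {
        for (Method method : VehiclePanelTest.class.getDeclaredMethods()) {
            Covers covers = method.getAnnotation(Covers.class);
            if (covers == null) {
                continue;
            }
            for (String requirementId : covers.value()) {
                System.out.println(requirementId + " -> " + method.getName());
            }
        }
    }
}
```

In the Riki the results feed the wiki pages and coverage reports rather than standard output, but this is the general shape of an annotation-based link between tests and requirement IDs.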

The Riki is still pretty young, so the jury is still out on our ability to really keep it up to date.  So far, though, it’s proven useful as a way to coordinate dev, QA, and PM by giving everyone a shared, authoritative reference point about how things are supposed to work.  I’m hopeful that by making it an indispensable part of our development process we’ll manage to overcome the inherent problems with keeping documentation up to date, and that it’ll drive clearer, less ambiguous requirements, better testing, and better communication between dev, QA, and PM, while serving as an ongoing reference for anyone new to the team or to a particular area.


3 Comments on “The Birth of the Requirements Wiki”

  1. Raoul Duke says:

    Thanks for sharing (seriously).

    Specs/reqs are so hard to do right. Things change so fast, and it really is an impediment to record the high-level things with clumsy tools, only to have them make that inevitable change and then (a) have no automagic tool to tell you things are out of sync and (b) have no time to spend on re-syncing. So your tool and social experiment are interesting.

    So there’s a meme from smart people like Cockburn that writing down specs is a waste of time; the humans will do human communication and it will all work out just fine. I think that is clearly going too far in the opposite direction from the other unworkable extreme of filling out ISO9XYZ forms ad nauseam.

    dream – sort of like having a GUI drawing tool that generates code and can then somehow parse human-changed code back into the GUI drawing, I wish there were something to sanity-check the supposed bijection between the specs and the bytes.

    (http://attempto.ifi.uzh.ch/site/ looks cool for the top-down side of the equation but isn’t a complete holy grail solution or anything, and they said it was not generally available.)

    (p.s. the text edit field still doesn’t work right in ff (v. 3 now), it still lets me type stuff that is invisible on the rhs of the field.)

  2. Alan Keefer says:

    Hmm, I use Firefox 3 and the textbox seems to work fine for me . . . odd.

    Yeah, I think most of the agile guys rebel a little too much against documentation, though as with everything it probably depends on what size your team is, what you’re building, and how long it’ll be around for. Some sort of documentation is critical for larger teams and complicated, long-lived projects.

    I’m skeptical of any of those code/test/natural-language tools; I’ve never seen one work well enough, and the bar is really high. Requirements specifications need to be tight, consistent, clear, and unambiguous, so good writing skills are really important. On the other side, there tends to be an N×M mapping of reqs to tests in our code, so just extracting reqs based on the tests wouldn’t be very useful: you’d either end up with too many requirements or too few tests if you tried to make anything like a 1:1 mapping. Similarly, writing good, non-fragile tests is really, really hard (especially at the UI level) and requires some serious development skill, so my guess is that anything natural-language-y will either be too limiting to test the full class of things you’d like to test or will generate reams of fragile tests.

    That said, our primary concern really is getting people to use the Riki and keep it up to date, which means it 1) needs to be totally lightweight, 2) needs to be useful for everyone, and 3) needs at least some ways to avoid getting out of date. Using a modified wiki helps with part 1: it’s just wiki-text, so you can free-form edit it, copy and paste, etc. pretty easily. We’ve actively avoided adding too many features to it (like any sort of enforcement of workflow, i.e. we decided not to try to unset the “agreed” tag when requirements change) because we want to keep it simple enough that people will use it.

    We’re trying to make it useful to everyone as a central communication point for dev, PM, and QA: if QA drives their testing and bug reporting off of it, they have an incentive to make sure it’s kept up to date, and if dev drives their development (and their testing) at least somewhat off of it, they’ll work to keep it up to date too. For PM (and all parties, really), it ideally cuts down on the number of questions since the answers are written down.

    Lastly, the test annotations help with the third point because the tests generally stay fairly up to date; if the app changes, the tests change, break, or get removed. So if a requirement doesn’t have tests, that may be an indication the app has changed and the tests have been removed, which is at least a signal that the requirements might be out of date. Likewise, if the requirements change but the tests don’t, then assuming we’re good about not re-using IDs we’ll end up with tests pointing to stale IDs, which we can detect and report on, which is yet another flag that it’s time to sync up the requirements with the tests and the app behavior. It’s certainly not perfect, but so far it’s worked pretty well.

    The real test will be what happens as we really fill out the specification of more of the application using it (we’re maybe halfway there right now). Then we’ll really see what the holes are and what kind of maintenance burden there really is.

  3. Raoul Duke says:

    Re: matching specs & tests, reminds me I need to go re-read the FITness stuff.
