Monday, 13 April 2009

Feature-creep in scientific code

[Photo by blmurch]

(with thanks to Joe Z for the comment that inspired this post)

Feature-creep can be a real killer for your carefully crafted code.  Scientific code can be especially prone to it, simply because planning is hard when you're working on the edge of human knowledge.  This article explores the topic in a bit more detail.

What is feature-creep?
Feature-creep is when "what your code should do" changes.  Perhaps it now needs to fit a different statistical model, or you need to add some new data processing modules.  It might need to generate a different set of outputs to the ones you'd originally planned on.  You may even have hit on a better underlying technique that, while great and hopefully world-changing (!), requires your code to do substantially different things to those in your original plan.  In other words, the required features of your code have moved/changed/crept.


Why is it a problem?
Because you haven't planned for it.  You've previously spent time working out the best way to structure your code, to get all its different aspects working nicely together.  And now you're having to add new features which you hadn't anticipated.  Feature-creep effectively invalidates some or all of your planning, so your code starts to suffer from the problems that plague under-planned code.  Perhaps you find that the data structure you've been using for three months actually doesn't do what you need it to.  Maybe storing all the raw data in memory isn't actually practical because there turns out to be too much of it.  Any number of unforeseen problems can crop up that your code isn't well-suited to handle.


Refactoring as you go...
If you can't avoid feature-creep, then at least try to stay on top of it.  Refactoring isn't a substitute for having planned for all the features ahead of writing the code, but it can help a lot.  If you suddenly realise that chunks of your code need to change in order to best accommodate the new features, then take the time to do it!
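As a small illustration of the kind of refactoring meant here, suppose an analysis routine originally hard-coded the mean, and a new requirement asks for the median as well. Rather than copying the routine, one option is to pass the statistic in as a parameter. This is only a sketch: the function name `summarise`, its `estimator` parameter and the example data are all hypothetical.

```python
from statistics import mean, median


def summarise(data, estimator=mean):
    """Summarise each named series with a pluggable estimator.

    `data` maps a series name to a list of numbers. `estimator` is any
    function that turns a list of numbers into a single number. It defaults
    to the mean, so existing callers keep their old behaviour.
    """
    return {name: estimator(values) for name, values in data.items()}


# Made-up example data, purely for illustration.
measurements = {
    "control": [1.0, 2.0, 9.0],
    "treated": [2.0, 4.0, 12.0],
}

by_mean = summarise(measurements)            # original behaviour
by_median = summarise(measurements, median)  # the new feature, no copy-paste
```

The next statistic that creeps in then costs one extra argument rather than another near-duplicate function.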


Automated testing
Feature-creep implies significant changes to your code.  Such changes imply bugs.  So, you need to be rigorous about testing your changing code.  As you'll likely be testing many times, building some automated tests will save you a lot of time and help you spot problems as they occur.
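Here is a minimal sketch of what such automated tests might look like in Python. The function under test, `normalise`, is a hypothetical stand-in for one of your own processing steps, and the test names are likewise invented for the example. The tests are plain functions built on `assert`, so a test runner such as pytest can collect them, or you can call them yourself as shown at the bottom.

```python
import math


def normalise(values):
    """Hypothetical processing step: rescale to zero mean and unit sample variance."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    if variance == 0:
        raise ValueError("values are all identical")
    std = math.sqrt(variance)
    return [(v - mean) / std for v in values]


def test_known_answer():
    # For [1, 2, 3] the mean is 2 and the sample standard deviation is 1.
    assert normalise([1.0, 2.0, 3.0]) == [-1.0, 0.0, 1.0]


def test_mean_is_zero():
    result = normalise([1.0, 2.0, 3.0, 4.0])
    assert math.isclose(sum(result) / len(result), 0.0, abs_tol=1e-12)


def test_rejects_constant_input():
    try:
        normalise([5.0, 5.0, 5.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for constant input")


def run_all_tests():
    """Run every test; an AssertionError signals a regression."""
    for test in (test_known_answer, test_mean_is_zero, test_rejects_constant_input):
        test()
    return True


all_passed = run_all_tests()
```

Re-running these after every change takes seconds, which is the point: when a new feature breaks an old one, you find out straight away rather than three weeks later.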


Keeping control
Try to keep control of your code.  It will tend to get more complex, messier and less well understood (even by you) as more features creep in.  This is partly a matter of refactoring, but it's also more basic than that.  Simple, tidy code is easier to handle than complex, messy code, so taking the time to keep it simple and tidy will save you time and stress at every subsequent stage.


Back to the drawing board...
Sometimes, your list of features will have changed so much that it's sensible to stop coding and redo your plan.  Treat the old plan and your old code as a prototype that you've used to learn more about what the best design solutions are.  You've also used it to find a better list of features that are required.


There's a judgement call here, because you may well be discarding quite a lot of effort.  Try to weigh up whether you'll save yourself time in the long run by keeping going with your current code, or whether it's better to bite the bullet and start a new version now.

Conclusions

The nature of developing scientific code means that feature-creep is often unavoidable, because part of the process is figuring out different and better ways to do something.  This is a particular challenge, but with some care and attention there's no reason why it has to be a killer problem.
