Monday 29 June 2009

The point of clever methods

The phrase "clever methods" is my label for statistical methods and/or algorithms that go beyond basic and/or standard approaches (which I think of as "vanilla methods"). Those of us whose research involves methodological work aim to write papers that detail new clever methods. Clever methods aim to go beyond the capabilities of the relevant vanilla methods in some meaningful way, hopefully without becoming intractably complicated or slow to run. In the same way that software engineers might try to craft better software for a given task, in researching clever methods we're trying to find better mathematical/statistical/algorithmic ways of doing something.

(By way of full disclosure, I should mention that I'm a fan of clever methods, in that I really enjoy working on them and finding cunning and sneaky ways to make a method work better. This is great for motivation, but comes with the health warning to be careful not to make something more complicated just for the sake of it :-) )

Performance vs. complexity
Clever methods tend to be more complex than the equivalent simple methods. So, in researching clever methods we're often attempting to trade complexity for performance. The trick then becomes to minimise the increase in complexity while maximising the improvement in performance. This is the benefit that most vanilla methods possess: they provide reasonable performance in a very uncomplicated way.

In many cases, reasonable performance is all we need, and a vanilla method is a very good solution. For example, if you're trying to detect local stars in an astronomical image, the signal-to-noise ratio of your image might be very high, in which case even a basic method should be able to detect them all with little trouble. Such a choice will do everything you require, and do so in a simple and easy-to-understand way (which is often a hidden benefit of vanilla methods).
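
To make "basic method" concrete, here's a minimal sketch of the kind of thresholding detector I have in mind. Everything here (the function name, the 5-sigma default, the global noise estimate) is an illustrative choice of mine, not a standard recipe:

```python
import numpy as np
from scipy import ndimage

def detect_stars_simple(image, nsigma=5.0):
    """A vanilla detector: flag pixels more than nsigma above the background.

    Toy sketch only -- real pipelines estimate the background and noise
    far more carefully than a global median and standard deviation.
    """
    background = np.median(image)
    noise = np.std(image)  # crude: assumes sources occupy few pixels
    mask = image > background + nsigma * noise
    labels, nsources = ndimage.label(mask)  # group connected bright pixels
    return ndimage.center_of_mass(image - background, labels,
                                  range(1, nsources + 1))
```

When the stars are bright, this handful of lines finds them all, and anyone can read and audit it in a minute.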

When creating clever methods, it's very easy to ignore the complexity aspect and simply go all-out for performance (there are many, many papers for which this is true). While this can be okay when performance is sufficiently vital (relative to the cost of handling the complexity), it usually leads to methods so narrow in their application that they're not very useful.

Happily, there are also many cases where a little extra complexity gives you a significantly better method with which to work. And there can be other benefits; for example, if you generalise a vanilla method, the resulting clever method will probably be more complex but it may also be more reliable or allow the automation of some parts of its use. Consider a clustering method that has a well-defined, automated way for choosing the number of clusters into which to partition the data; the user no longer has to worry about doing this, thus saving them time. They probably don't care that the underlying maths is more complicated.
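
As a hedged sketch of what such automation might look like, here's one way to pick the number of clusters using scikit-learn's KMeans and silhouette_score (the wrapper function and the range of k values tried are my own illustrative choices, not *the* method alluded to above):

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_with_auto_k(data, k_range=range(2, 11)):
    """Cluster `data`, choosing the number of clusters automatically.

    Illustrative only: the silhouette score is one of many possible
    model-selection criteria.
    """
    best = (None, -1.0, None)  # (k, score, labels)
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
        score = silhouette_score(data, labels)
        if score > best[1]:
            best = (k, score, labels)
    return best[0], best[2]
```

The user just calls cluster_with_auto_k(data) and never has to pick k by hand; the extra complexity lives inside the method, not in their workflow.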

And of course, very occasionally you'll manage to create a clever method that's no more complex (or in extreme cases, less complex) than the simple method(s). Congratulations, you've probably discovered something genuinely important!

The 10% improvement
Often, clever methods can provide an improvement of order 10% (in whatever metric is important) over the simple method. The question then becomes, "is this worth the effort?"

The answer is "it depends". If you're trying to extract a signal from some noise but the signal-to-noise ratio (SNR) is already 10^5, then increasing it by 10% may well be irrelevant. If, on the other hand, you're trying to detect signals right at the detection limit of your data, then it might be vital in uncovering that Nobel-winning new class of whatever. I've worked on astronomical source extraction where the data-set had an effective cost of tens of millions of pounds (actually quite common when the data come from a space telescope). In this case, and assuming Gaussian noise so that SNR scales as the square root of the amount of data, a 10% improvement in SNR using the simple extraction methods would require (1.1)^2 - 1 ≈ 21%, i.e. roughly 20%, extra data, at a cost of millions of pounds. Or you can just use the clever extraction algorithm.
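
The arithmetic behind that figure is just the square-root scaling:

```python
# Under Gaussian noise, SNR grows as the square root of the amount of data,
# so matching a 10% SNR gain by collecting more data costs:
extra_data = 1.10**2 - 1
print(f"{extra_data:.0%} extra data")  # -> 21% extra data
```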

One very important consideration in all of this is that if you develop a clever method that is reasonably general, then that 10% improvement will pay off many, many times over.

The undiscovered country
So far I've focused on the mundane benefits of clever methods. There is also another aspect to consider. If your method of analysis is too simple, you might miss something important.

Think about a very rich, complex data-set where it's not obvious how to model the data. Gene expression measurements of whole genomes are a good example. We can certainly use simple methods to analyse these and get some useful scientific results. But what if there is structure in the data to which our choice of method is insensitive? Imagine what would happen if you only fitted your data with straight lines! You'd miss peaks, troughs, oscillations and all manner of other interesting structure. Your methods need to be able to account for all the interesting structure in a given data-set, and if that structure is complex, a vanilla method may well miss it.
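
A quick synthetic illustration of the straight-line failure (a toy of my own construction, nothing more):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = np.sin(2.0 * x) + 0.1 * rng.standard_normal(x.size)  # oscillating signal

# Vanilla model: a least-squares straight line.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# The fit "succeeds", but the residuals retain essentially all of the
# oscillation -- the structure is invisible to the model, not absent.
print(f"data scatter: {np.std(y):.3f}, residual scatter: {np.std(residuals):.3f}")
```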

A related point is that a good way of spotting complex patterns can be to use your eyes to look at the data. There are many examples where the best signal detection method is a person (eg. objects in an image, CAPTCHAs). But this doesn't work if your data-set is too big for a person to inspect meaningfully. In that case, you need a clever method.

Clever methods as their own research discipline
There is justification for creating clever methods simply because doing so adds to the sum total of human knowledge. This is especially useful when the clever method extends an existing method and/or when it can be further built upon by you or other people. Whole new areas of methodology can be opened up in this way, whether by creating them outright or by making some existing ideas more widely known. And often, reading a clever idea in one context can spark a thought in someone's mind about their own area of research (which is why it's very important to be well-read as a researcher).

If, like me, your research involves creating new clever methods, a burden of proof falls to you. Because there are infinitely many clever methods one could create, it's important to find the ones that are actually useful (defined, at the very least, as having superior performance to the vanilla methods). And this means that you need to test your methods and compare them to existing ones. This is actually one of the real tricks of methodological research: figuring out as many ways as you can to test a new method, to see if it's worth using. A few things I think are really important for this include:

  • Test in many different ways
  • Test on many different data-sets
  • Test using many different metrics
  • Testing on synthetic data can be good because you know the right answer (see the sketch after this list)
  • Testing on real data is very important; real data will always contain more junk than synthetic data
  • Real data where you know the right answer (eg. from some other source of information) are wonderful to have
  • Realistically simulated data can be very useful. But it takes a lot of effort to build a software/hardware simulation of most types of data
  • Use your methods. Do some science with them (or help other people to do so), because in the process you'll learn more about how the methods work and how to improve them
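
As promised in the list above, here's a minimal sketch of the synthetic-data point: plant a known signal, run a (vanilla) detector, and score it against the ground truth. Every choice here (amplitudes, thresholds, metrics) is an illustrative toy:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: we planted the signal, so we know the right answer.
n, amplitude = 10_000, 2.0
truth = rng.random(n) < 0.1                 # 10% of samples carry a signal
data = amplitude * truth + rng.standard_normal(n)

detected = data > 1.5                       # vanilla detector: fixed threshold

tp = np.sum(detected & truth)               # true positives
fp = np.sum(detected & ~truth)              # false positives
fn = np.sum(~detected & truth)              # missed signals
print(f"precision={tp / (tp + fp):.2f}, recall={tp / (tp + fn):.2f}")
```

Swap in a clever detector and the two sets of numbers can be compared directly, precisely because the right answer is known exactly.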

In conclusion...
The creation of clever methods is a craft, a balancing act between performance and complexity. But the right method in the right context can be a powerful solution and even open up whole new areas of research.
