Friday, 3 October 2008

How to Optimise code

[caption id="" align="alignleft" width="300" caption="Photo by NathanFromDeVryEET"]Photo by NathanFromDeVryEET[/caption]

We've covered the theory of optimisation and how to write a profiler in previous articles.  This article is about specific optimisations that you can use to make your code go faster or use less resources.

Profile first
At the risk of sounding like a broken record: optimisations are useless if you don't have any data about your code before you put in the optimisations. Optimisation is like running an experiment on your code, and just like an experiment, you have to data to tell if your hypothesis (this piece of code runs faster) is true.
Just as in an experiment you don't just collect one set of data, when optimising always gather several sets of data under different inputs and repeat the same tests after any changes have been made. A change that is positive under one set of conditions may be negative under different conditions.
When making optimisations only make one change at a time. You don't change several variables at once in an experiment, similarly you don't do several optimisations at once as you can't tell how effective each one is.
The worst case is that you make two changes, one positive and one negative, the outcome is neutral or negative and you take both out. You have just rejected a change that had a positive impact because it was masked by a change with a negative impact.

What to optimise
In general, optimisations swap time for resources or resources for time. Improvements to algorithms or some simple code re-arrangement optimisations may give you improvements to both, but in most cases you are decreasing execution time by increasing memory usage or reducing disk accessing by increasing processor usage etc.

After you have identified where your program spends most of its time, you can then investigate those areas and work how to speed up program execution. Time can be wasted in the most unexpected places. In a recent example at Ben's work a significant speed up was achieved by removing an unused variable! A lot of work was being done in the objects construction and the compiler couldn't optimise it away. Further proof that you must profile first and optimise second.

Lets look at some common ways to improve performance. These techniques will have differing effects on different platforms, languages and compilers.

Code re-ordering
Simply taking the code you have and reorganising it can bring surprising speed ups.

Consider this code:

if(condition1) {
... some code ...
} else if (condition2) {
... some more code ...
} else if (condition3) {
... even more code ...

if condition3 happens more often than conditions 1 and 2 then moving it to the top will save conditions 1 and 2 having to be calculated. If the tests are anything other than trivial this could be a significant saving.

Move variable declaration out side of loops
A lot of code looks like this:

loop over a large number of items
create local variable
do work using local variable

Every time the loop is executed a new variable is created at the beginning of the loop and destroyed at the end which takes time.
Moving the local variable outside of the loop, such as this:

create local variable
loop over a large number of items
do work using local variable

Will remove the overhead or variable creation and destruction and will probably be quicker, especially if the cost of creation and destruction is high and the code is called a very large number of times.

The classic processor time for memory swap is to cache previously calculated values in the expectation of using them again. The idea is that after calculating a value, your code stores it in such a way that you can tell what parameters created it, then instead of re-calculating this value you can simply retrieve it when you need it again. Values are stored in a dictionary or hash (these are data structures that map a key to a value). This technique is only of use if the results of your expensive calculation are likely to appear again otherwise it is a waste of space and time!

Caching is used in CPUs to store a small amount of data and instructions in very high speed memory so that the processor doesn't have to wait for it to be fetched from slower main memory and the same technique can be used to reduce memory usage. If you are processing a large number of items (larger than will fit in memory) then rather than loading them one at a time, load them in batches. The overhead of file accessing etc. is reduced and disk drives perform better when fetching large blocks of information.

Look-up tables
Look-up tables are like caches, in that they store values rather than calculating them, but they are generally hard-coded rather than dynamically created. Sine and Cosine are common examples, because calculating Sine or Cosine is an expensive operation but there is a limited range of possible input and output values. A few hundreds or thousands of values stored in a dictionary or hash map can give a fair degree of accuracy. Values that aren't in the table are created by interpolating between the nearest values and even simple linear interpolation can provide good enough accuracy if the samples are close enough together.

Look-up tables can also be used for other expensive calculations such as Square Root and Exponential. For example, a contact of ours got drastic speed increases by using two look up tables to exploit the fact that exp((a+b) = exp(a)*exp(b) where a is an integer and b is from 0.001 to 1.000.

Reading and writing files in chunks
File accessing is, compared to fetching data from memory, incredibly slow but most data will only be available from files. To minimise the overhead of opening, reading and writing, data should be fetched and written in blocks rather than individually. We have already mentioned the use of a cache to read in blocks of data but the same can be applied to writing output. Rather than writing each result as you get it, write them into a cache and then, when the cache is full, write the whole thing out. Some languages already provide this functionality so read the documentation before re-implementing something you already have!

Files should also be read from and written to in one go rather than getting a bit of data from one file and a bit from another. This will cause the disk to have to jump around and this will waste a lot of time. If your data is in different files then better to read in chunks from each file and process them from the cache.

Vectorised languages
Some languages such as Matlab, R and IDL are vectorised.  This means that some of their in-built functions (operators such as +, -, *, /; also functions like EXP, COS, SINE etc)  can accept vectors/arrays of inputs and will perform the operation over the whole set of values.  The reason for the creation of vectorisd operations is that in these languages, FOR loops are very slow (because they're not fully compiled).  The vectorised form will be fully compiled and hence much faster.  So, if you're using one of these languages, large speed-ups can often be gained simply be replacing your FOR loops with vectorised operations.

In this article we've shown some techniques to help you increase the performance of your code. Optimising code can be a slow process as you have find and then remove each performance barrier. The big savings come quickly but they become increasingly hard to find and each optimisation produces a smaller result. It is also important to realise that as you increase the performance of the code you are often sacrificing its ability to be easily understood and this can make maintaining the code harder in the future. Be careful not to get caught up trying to wring the last drop of performance out of code that has become unreadable.


  1. Hi!
    I like your blog so much. I think it is one of the very best references for scientists and wanna-be bioinformaticians :=).

    However, can you post new articles a little less often? I don't make it in time to read all of them. I know that's a strange question to ask, but please think about it :).
    Morevoer, can you add something like a 'dashboard' o 'contact-me' area, where people can write suggestions to you?

    Thanks :)

  2. Very glad you like it!

    We're posting twice weekly right now to get content onto the site (we have quite a lot already written). We ultimately plan to go to weekly posts, once we've done that. So watch this space! (and we'll see what we can do about a "contact us" button :-) ).