Friday 5 February 2010

How to...optimise Matlab




Matlab is a language that's used a lot in science and for good reason.  It's quick to code in, plus there are loads of built-in functions and packages available for many of the tasks that Programmer-Scientists might find themselves involved in.  The main issue that a language like Matlab faces is that it's slower than languages like C++ and FORTRAN, which can be inconvenient when dealing with resource-intensive tasks like data analysis, statistical modelling or numerical simulation.  But fear not!  We proudly present our guide to how to optimise your Matlab code.

Use the profiler!

The first and most important tool you need when optimising your code is a profiler.  This is essential so that you can track down the bottlenecks in your code.  Only then is it sensible to start optimising parts of your code.


Happily, Matlab has a profiler built into it.  Here's an example of how you might use it (to run the code in a file called RunTestScript.m):

profile clear    % clear any previous records in the profiler, to start with a blank slate
profile on       % switch on the profiler (this can slow execution down, so don't leave it on all the time)
RunTestScript    % run the code you wish to profile
profile off      % switch the profiler off again, so it ceases to record
profile viewer   % open the viewer window, which lets you explore which functions etc are taking most of the run-time
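
If you'd rather look at the numbers in code than in the viewer window, the recorded statistics can also be pulled into the workspace.  Here's a minimal sketch of that.  The workload line is just a stand-in of our own (swap in RunTestScript or whatever you want to profile) and the variable names are ours too:

```matlab
profile clear
profile on
total = sum(sqrt(rand(1, 1e6)));   % stand-in workload; replace with the code you wish to profile
profile off

stats = profile('info');           % struct holding the recorded statistics
funcTable = stats.FunctionTable;   % one entry per function that was recorded

% Sort the functions by total time, slowest first, and list the top five
[~, order] = sort([funcTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.3f s  (%d calls)\n', ...
        funcTable(k).FunctionName, funcTable(k).TotalTime, funcTable(k).NumCalls);
end
```

This can be handy if you want to log profiling results from a batch job where the viewer isn't available.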


Vectorise your code
FOR loops tend to be slow in Matlab.


This slowness comes from the fact that Matlab loops aren't fully compiled (as they would be in C++, FORTRAN etc).  This has apparently been improving over the years, as Matlab now includes a Just-In-Time (JIT) compiler and accelerator.  But there is still certainly a shortfall in performance.

The solution to this is to vectorise your code wherever possible.  A range of functions in Matlab are able to accept vector/matrix inputs and will perform their operation on all elements of said vector/matrix.  Because these built-in functions can be implemented (by Mathworks) in C, they are often a much faster way to get something done than using a FOR loop.  For example:

nValues = 1e7;
firstArray = ones(1,nValues);
secondArray = 2*ones(1,nValues);
resultArray = zeros(1, nValues);


tic()
for i=1:nValues
resultArray(i) = firstArray(i) * exp(secondArray(i));
end
toc()


tic()
resultArray = firstArray .* exp(secondArray);
toc()

When we ran this, the vectorised form was almost a factor of 3 faster.


You don't need to eliminate every FOR loop in your code (look for the bottlenecks, remember), but vectorising the main loop in your code can give quite a performance boost.
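
Loops that contain an IF test can often be vectorised too, by using a logical mask.  Here's a small sketch with made-up data (the values and variable names are ours, purely for illustration):

```matlab
values = [3 -1 4 -1 5 -9 2 6];   % hypothetical data

% Loop version: add up only the positive entries
loopTotal = 0;
for k = 1:numel(values)
    if values(k) > 0
        loopTotal = loopTotal + values(k);
    end
end

% Vectorised version: build a logical mask, then sum the selected entries
isPositive = values > 0;
vecTotal = sum(values(isPositive));
```

Both versions give the same total; the second simply hands the element-by-element work to built-in functions.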

Use the (right) built-in functions wherever possible
Some built-in Matlab functions are faster than others and it can make a big difference which one you choose.  For example, consider generating some random numbers:


nValues = 1e4;
tic()
for i=1:nValues
random('gamma', 2, 2);
end
toc()


tic()
for i=1:nValues
gamrnd(2, 2);
end
toc()


Using gamrnd() rather than random() here saves about a factor of 4 in run time.  In this case, this is because random() is a wrapper that is calling gamrnd(), with the overhead being the bottleneck.  We recently encountered exactly this issue in some code Rich was working on.  The profiler showed that random() was a significant (and unexpected) bottleneck, but showed that it was calling gamrnd() to generate the actual random number.  Having identified this, it was straightforward to switch function and gain a significant speed-up in our code.  The reason random() exists is that it allows access to a range of random number generators via an input parameter (set to 'gamma' here), which is often useful but does come with a performance penalty that was significant in this instance.
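
This tip combines nicely with vectorisation: many of the random number functions can return a whole array in one call, which avoids the loop altogether.  A sketch, assuming the Statistics Toolbox is available (as it needs to be for gamrnd() in the first place); the variable name samples is ours:

```matlab
nValues = 1e4;

tic()
samples = gamrnd(2, 2, 1, nValues);   % a 1-by-nValues array of gamma draws in a single call
toc()
```

We haven't put a timing figure on this one, so run it against the loops above on your own machine to see what you gain.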


Sparse matrices
One really nice feature about Matlab is that it has a built-in implementation of sparse matrices.  These are matrices (2D arrays) that contain a lot of zeros.  When a significant proportion of the elements are zero, it can be much more efficient if you have an effective way of only storing the non-zero elements.  Matlab's sparse matrices are just that.  They are set up so that they behave pretty much as normal matrices, so the differences are largely invisible in terms of the code.  But they are a *lot* more efficient when it comes to working with large matrices of sparse data, simply because your code won't spend time performing unnecessary operations like adding or multiplying by zero (not to mention the memory you can save by not storing those zeros).
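
To give a feel for this, here's a minimal sketch comparing a dense and a sparse copy of the same matrix.  The identity matrix is just a convenient example of very sparse data and the variable names are ours:

```matlab
n = 2000;
denseMatrix = eye(n);                 % stores all n^2 elements, almost all of them zero
sparseMatrix = sparse(denseMatrix);   % stores only the n non-zero elements

% Compare the memory used by the two versions
denseInfo = whos('denseMatrix');
sparseInfo = whos('sparseMatrix');
fprintf('dense: %d bytes, sparse: %d bytes\n', denseInfo.bytes, sparseInfo.bytes);

% The sparse matrix is used just like a normal one
x = ones(n, 1);
yDense = denseMatrix * x;
ySparse = sparseMatrix * x;
```

In real code you would normally build the sparse matrix directly (for example with speye() or sparse(i, j, values)), so that the dense version never has to exist in memory.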


The Lightspeed toolbox
Tom Minka over at Microsoft Research has produced a Matlab toolbox called Lightspeed that speeds up a range of Matlab functions.  It's pretty straightforward to install and directly replaces various internal Matlab functions, so you can benefit from it without needing to change your existing code.  When we first installed it, the code we were writing immediately ran noticeably faster, so it's well worth installing this toolbox.


Multiple cores and the parallel toolbox
We're big fans of making your code run faster by simply using better hardware.  If you can access a bigger/faster/shinier computer, you can get your results more quickly without having to touch your code.  Nowadays, many computers have multiple CPU cores and there are also a lot of clusters of computers that are available to be used for distributed and/or parallel processing.  Happily, Matlab has ways to benefit from both of these.


Multiple cores are easy, because Matlab has an increasing number of built-in functions that can benefit from multiple cores (see here for more info).  This means that if you have multiple cores on your machine, you can benefit from them by just using the relevant built-in functions.

For parallel and distributed processing, Matlab also provides the parallel toolbox.  This costs extra and we've had some issues with having to buy an extra license to run on many cluster nodes at once (which felt a lot like paying twice), but once up and running we've found it to be a pretty useful toolbox.
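
As a rough illustration of what the toolbox offers, here's a minimal parfor sketch.  It assumes the toolbox is installed (as far as we understand it, without the toolbox parfor just runs as an ordinary loop) and, depending on your Matlab version, you may need to open a pool of workers first.  The task itself is a made-up one:

```matlab
nTasks = 8;
results = zeros(1, nTasks);

% Each iteration is independent of the others, so they can run on different workers
parfor k = 1:nTasks
    results(k) = sum((1:k).^2);   % stand-in for an expensive, independent calculation
end
```

The key requirement is that the iterations don't depend on each other; a loop where each step needs the result of the previous one can't be split up like this.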

The order of array indices matters

One of the really useful features of Matlab is that there are lots of good ways to handle arrays and subsets of array elements.  There is a subtlety here that can speed up your code:  for 2D arrays (matrices), the order of array indices matters when accessing subsets of array elements.  This is because 2D arrays are really 1D arrays in disguise and it's quicker to access a consecutive sequence of elements.  For example:


dumArray = ones(1000, 1000);
nLoops = 1e6;


tic()
for i=1:nLoops
dumVector = dumArray(1,:);
end
toc()


tic()
for i=1:nLoops
dumVector = dumArray(:,1);
end
toc()

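The latter method is over 5 times faster here.  This is because Matlab stores matrices in column-major order, so the elements of a column sit next to each other in memory, while the elements of a row are spread out.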


So the lesson here is that if you're spending a lot of time accessing 2D arrays (for example, a data matrix), there may be an optimal order for the array dimensions.
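
You can see the "1D array in disguise" directly by using linear indexing.  A tiny sketch (the matrix and variable names are ours):

```matlab
A = [1 3 5; 2 4 6];    % a 2-by-3 matrix

flat = A(:).';         % the underlying 1D order: [1 2 3 4 5 6]

% A single (linear) index walks down the columns first
sameElement = (A(2) == A(2,1));

% sub2ind converts (row, column) subscripts into the linear index
linearIndex = sub2ind(size(A), 1, 2);   % row 1, column 2 is element number 3
```

So if your code mostly works through one observation at a time, it may be worth storing each observation as a column rather than a row.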

mex files
If you absolutely, positively must have FOR loops in your Matlab code, all is not lost.  You can recode a bottleneck loop in C and put it inside a mex file.  This should allow you to speed up critical bottlenecks in your code.  Disclaimer:  we've never experimented with mex files, so don't have a feel for how easy/hard they are to work with.  If you have some insight, please leave a comment at the end of this post!

Code granularity
Something we've realised when using Matlab (and other similar languages) is that the granularity of your code matters a lot for performance.  For example, we're big fans of Object Oriented coding, because when done well it can make the code so much easier to work with.  But if your code ends up with too many small objects, the overheads of accessing each one can make a big dent in your code's performance.


We particularly found this when using arrays of objects (hint: don't) and we've seen advice on this in various places.  In short, objects of arrays tend to be faster than arrays of objects.  When working with large data-sets, our advice is that you want to have all the data in a single array if you can, because this will be efficient to access (and consider using sparse matrices, if your data have a lot of zeros).
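
Here's a small sketch of the two layouts.  We've used plain structs as a stand-in for objects (that's our simplification, to keep the example in one piece), and the field and variable names are made up:

```matlab
nPoints = 1000;

% Array of structs: one small struct per data point
for k = nPoints:-1:1            % counting down allocates the whole array on the first pass
    pointArray(k).x = k;
    pointArray(k).y = 2*k;
end
sumFromArrayOfStructs = sum([pointArray.x]);   % has to gather x from every element first

% Struct of arrays: one struct holding a whole array per field
points.x = 1:nPoints;
points.y = 2*(1:nPoints);
sumFromStructOfArrays = sum(points.x);         % works directly on one contiguous array
```

Both give the same answer, but the second layout keeps each field in a single array, which is the form that vectorised built-in functions want.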

In conclusion...
Every programming language has its particular tricks and tips for optimisation and Matlab is no different.  Understand these characteristics and you can end up writing much faster Matlab code!



Links to some other pages on optimising Matlab
Vectorisation tricks
Faster scripts
Optimising for speed
Video on Matlab optimisation
Tom Minka's page on accelerating Matlab

Comments:

  1. It's been a long time since I used Matlab. The last time I used it was to create my final year project using Matlab 6.0. The project was for creating an image compression tool. We were just amateurs at C back then. And when we got our hands on Matlab, we were amazed by the available features in the language.

    The program we wrote did excellent image compression though it was a bit slow because we never really thought of optimizing it :)

  2. I use mex-functions for Matlab often, exactly because of the for-loop reasons. It is quite straightforward for the usual data types (double, int, strings, etc.): the C-function has to call some Matlab functions to convert Matlab arrays into C arrays. The for-loops are of course becoming much much quicker (and also - the memory constraints of Matlab often do not apply! ). The complexity in this case depends on how familiar you are with C, and how complex your algorithm is to code it element-wise.

    I did not use mex-functions for passing and handling objects though (structures, cell arrays, etc.)

  3. I have a grid interpolation in my code, for which I am using the interpn function, which is called inside one function. It is taking around 1 hour to run the program because the interpolation is being done in every iteration. Is there any faster way to interpolate grid data using the spline method? I have read about ba_interp3 but I am not very clear how to use it.
