This is a resource post to go with the previous posts on picking a programming language (Part 1 and Part 2). It's a big list of programming languages that you might consider using for your software project, or that you might encounter when working with legacy code. The list isn't exhaustive (there are a lot of languages out there) and doesn't rank languages by how suitable they are for your project (that's up to you). We've mainly gone for languages that we know are used fairly widely by scientists. But it should give you a quick flavour of a range of languages that people find useful. Then if any take your fancy, there are web pages and books that can tell you more!
- C. A widely used compiled language, found both in science and in most other contexts where software is needed.
- C++. An extension of C that adds object-oriented features. This makes it both fast and able to benefit from all the nice (from the developers' perspective) qualities of an object-oriented language. It needs the developer to handle things like memory allocation, which can be a headache, but the skilled programmer can use this to their advantage (see Basics of ... C\C++).
- C#. Arguably Microsoft's answer to Java, this C-style language has garbage collection and can be run on non-Windows machines using the open source Mono project.
- Excel (!). Okay, we know it's not a programming language. But there are a lot of scientific tasks you can get done with a spreadsheet's built-in arithmetic functions and macros (see Visual Basic below). And a spreadsheet can be a pretty handy way of passing round a data set, because everyone will know what to do with it.
- Fortran. A family of compiled, and hence fast, languages designed to be good for scientific and engineering tasks. Fortran has been around in one form or another for literally decades, so there is a lot of scientific legacy code written in it.
- IDL. Not dissimilar to Matlab (see below), although with perhaps a focus more on data analysis and less on maths. Like Matlab, it has loads of built-in library support. Very good for prototyping data processing pipelines and for doing interactive data mining, for which it has some nice tools built in.
- Lisp. Not used in science much, but a dialect of it (Emacs Lisp) is how you program Emacs. Emacs is a very powerful, general-purpose text editor that has support for a lot of different programming languages (amongst other things).
- Java. An object-oriented language that has C-like syntax. Java has nice features like garbage collection for memory (de-)allocation, which saves a lot of the headaches experienced by users of C and C++. It is also (fairly) platform independent because it runs on a thing called a virtual machine, which is handy for porting your code to other computers. It is generally slower than languages like C and C++, but the gap is smaller than most people think; it used to be a lot worse, but the virtual machine has improved a lot. When someone tells you "Java is much slower", they're generally repeating something that used to be true but is less so now.
- Matlab. A widely used commercial language that was originally developed to perform maths operations involving matrices. It has loads of built-in library support for scientific-type analysis. It can be a lot slower than compiled languages like C and Fortran, except when using certain built-in library functions, for example the libraries that handle matrix operations (see Basics of ... Matlab).
- Python. Designed for clarity, productivity and extensibility. The language aims to follow the principle that "There should be one-- and preferably only one --obvious way to do it."
- Perl. Originally designed for text manipulation, Perl has expanded into many other areas. Based loosely on C, it is designed to be practical and supports many different programming styles (procedural, object-oriented, functional, etc.) and different ways to solve problems.
- Shell scripts. Built into your UNIX system (if you use one). These can be a very useful way of 'glueing together' different pieces of software, or of running the same software a number of times with different inputs.
- R. A statistical programming language. It has loads of statistical libraries that are either built-in or downloadable. R is open source and has a community that develops new code for it; for example, the Bioconductor toolbox for analysing gene expression data is very widely used. It can be slow to run big jobs, unless you use the built-in functions (written in C and hence very fast) or attach your own C or C++ code to speed up the critical bottlenecks. Interfacing C/C++ to R in this way takes a bit of care, but can be very powerful (see Basics of ... R).
- Visual Basic. The modern version of BASIC, a language that is very easy to learn. It is used a lot by people for whom programming is not the main part of their job. Notable in scientific circles because Visual Basic for Applications (VBA) is the macro programming language for Excel.
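
To make the 'glueing together' idea from the shell scripts entry a bit more concrete, here is a minimal sketch of running the same program a number of times with different inputs. It is written in Python rather than shell so that it runs the same way on most systems. The names `run_for_inputs` and `SQUARE` are made up for this illustration, and `SQUARE` is just a stand-in for whatever real analysis program you would actually be running.

```python
import subprocess
import sys


def run_for_inputs(command, inputs):
    """Run `command` once per input value and collect each run's stdout."""
    results = {}
    for value in inputs:
        completed = subprocess.run(
            command + [str(value)],  # append the input as a command-line argument
            capture_output=True,     # capture stdout/stderr instead of printing them
            text=True,               # decode the output as text rather than bytes
            check=True,              # raise an error if the program fails
        )
        results[value] = completed.stdout.strip()
    return results


# Hypothetical stand-in for a real analysis program: it squares its argument.
SQUARE = [sys.executable, "-c", "import sys; print(int(sys.argv[1]) ** 2)"]

if __name__ == "__main__":
    print(run_for_inputs(SQUARE, [1, 2, 3]))
```

Run as a script, this prints `{1: '1', 2: '4', 3: '9'}`. In real use you would swap `SQUARE` for the command line of your own program and probably write the results to a file rather than printing them.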
There are also exhaustive lists of programming languages on Wikipedia, and Eric Levenez maintains an interesting diagram showing the evolution of programming languages.