There are two interesting papers in the last issue of Science concerning biology education and the need for computer science and mathematics as part of it:
Computing has changed biology – biology education must catch up
Pevzner and Shamir, Science 31 July 2009: Vol. 325. no. 5940, pp. 541 – 542
Mathematical biology education: beyond calculus
Robeva and Laubenbacher, Science 31 July 2009: Vol. 325. no. 5940, pp. 542 – 543
and a great piece at Ars Technica discussing them.
The message in the two papers is that computation and mathematics are such important aspects of modern biology, especially molecular biology, that biologists need to learn more of both and need to fully understand the computational tools they use.
The Ars Technica piece asks a valid question: Why? To develop computational tools for biological data analysis, of course you need to know this material, but do you need it just to use the tools?
Now, I tend to agree that computer science and mathematics should have more focus in biology education (and not just biology but any science, really), but I also agree that we need to ask why students should learn what they learn, and to what degree.
It is not possible for everyone to be an expert on everything, so some choices must be made.
To some degree, it is quite enough to know how to use a tool to get the work done, rather than to understand all the details of how the tool works. You don’t need to be able to build a computer to program a computer, and you don’t need to know all the tricks needed for sequence alignment to align sequences.
You just need to know enough to be able to use the tools: know 1) when it is appropriate to use a given tool, 2) how to run it, and 3) how to recognize if the results are sensible. If you use the tool as a black box and don’t know what is going on under the hood, 1) and 3) are really important.
There are assumptions about the data underlying any tool, and if you don’t know what they are, you shouldn’t use the tool. For example, if you do a statistical test on data that doesn’t look at all like what the test expects, it is garbage in, garbage out. Don’t do ANOVA on data that doesn’t look normally distributed. You can end up with significant results all over the place, but they will be artifacts.
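As a rough illustration of what "checking the assumptions" can look like in practice, here is a minimal Python sketch that screens a sample for strong asymmetry before it is handed to a test that expects roughly normal data. The function names, the example numbers, and the threshold of 1.0 are all assumptions made for illustration; this is a crude screen, not a formal normality test such as Shapiro-Wilk, and it is no substitute for plotting the data.

```python
def sample_skewness(values):
    """Moment-based skewness of a sample; close to 0 for symmetric data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


def looks_roughly_symmetric(values, threshold=1.0):
    """Crude screen: flag samples whose skewness is large in magnitude.

    The threshold is an arbitrary illustrative choice, not a standard.
    """
    return abs(sample_skewness(values)) < threshold


# Made-up measurements, for illustration only.
symmetric = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
skewed = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 9.5]  # one extreme value

for label, sample in (("symmetric", symmetric), ("skewed", skewed)):
    verdict = "plausible" if looks_roughly_symmetric(sample) else "doubtful"
    print(f"{label}: skewness={sample_skewness(sample):.2f}, normality {verdict}")
```

The point is not the particular statistic but the habit: look at the data first, and only then decide whether the test you had in mind is appropriate.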
What should we teach, then?
I’m not really sure, but for mathematics I think some basic statistics is essential. Not all the arithmetic involved, which isn’t that important if you have tools for doing it anyway, but the assumptions underlying some basic tests, the importance of checking these assumptions, and what significance really means. This is essential for any data analysis, and everyone needs to know it.
For computer science, probably a bit of complexity theory. Enough to know that some problems are feasible to solve and some are not, and have some intuition about which kind of problems fall in which category. Probably not much more than that for complexity theory.
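To give a feel for the difference between feasible and infeasible problems, the sketch below compares two counts that come up in sequence analysis: the number of pairwise comparisons among n sequences, which grows quadratically, and the number of distinct unrooted binary tree topologies on n labelled leaves, which is (2n-5)!! and grows faster than exponentially. The function names are chosen for illustration, and the example is only meant to show the growth rates, not to describe how any particular tool works.

```python
def pairwise_comparisons(n):
    """Number of unordered pairs among n items: n(n-1)/2, quadratic growth."""
    return n * (n - 1) // 2


def unrooted_tree_count(n_taxa):
    """Number of unrooted binary tree topologies on n labelled leaves.

    This is the double factorial (2n-5)!! = 3 * 5 * ... * (2n-5).
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # 3, 5, ..., 2n-5
        count *= k
    return count


for n in (5, 10, 20, 50):
    print(f"n={n:>2}  pairs={pairwise_comparisons(n):>5}  "
          f"trees={unrooted_tree_count(n):.3e}")
```

Comparing every pair of 50 sequences is trivial; scoring every possible tree for 50 sequences is not, which is why an exhaustive search over trees is out of reach beyond a handful of taxa.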
Some programming (scripting, really), just to make it easier to manipulate data. Manual manipulation of data is just too tedious and error-prone.
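The kind of script meant here can be very small. As a hedged example, the following Python sketch reads FASTA-formatted text and reports the GC content of each sequence. The record names and sequences are made up, the helper names are hypothetical, and the parser assumes plain FASTA where each record starts with a ">" header line.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {name: sequence}.

    Assumes each record starts with a '>' header; the name is the first
    word of the header. Sequence lines are joined and upper-cased.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Made-up input; in practice the lines would come from open("some_file.fasta").
example = """>seq1 hypothetical example
ATGCGC
GGAT
>seq2
ATATAT
"""

records = read_fasta(example.splitlines())
for name, seq in records.items():
    print(f"{name}\t{len(seq)}\t{gc_content(seq):.2f}")
```

Doing the same thing by hand for a few hundred sequences is exactly the tedious, error-prone work a script like this removes.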
If you expect ever to have to implement some analysis yourself, then also some basic data structures and algorithms. Just the very basics. You need to know much more computer science to build really efficient tools – you probably need to come up with some new algorithms and some algorithmic engineering – and if it is not your main field, then you are probably better off getting a computer scientist or engineer to look at it.
Understanding the math is probably a lot more important than understanding the computer science for a biologist, at least for one not specialising in computational biology or bioinformatics.