Teaching programming in bioinformatics

Next term, I am teaching a class called applied programming, which is just a fancy title for script programming.

I finally got out of teaching programming a year or so ago.  I don’t really like to teach this topic.  I love programming, but I find it very hard to teach.  It is something you only learn by doing, and the typical university setup here is lectures.  That simply doesn’t work with programming classes. Period.

Anyway, the university has started a new bachelor’s program, molecular medicine, and its students need to learn a lot of bioinformatics.  For that, they need some basic programming, and I have to teach them that together with Søren Besenbacher.

Teaching programming

In some sense, all skills you learn, you learn by “doing”.  Even if what you are doing is just thinking about it.

Asking students “to think about it” is not the way to go, though.  Trust me on that.

You need some techniques to start thinking.  If you discuss a topic with your friends — construct arguments for your case — you are forced to think about the topic.  If you have to present a topic to your class, you are forced to think about the topic.

When you teach a topic, you are really forced to think about it.

Programming, you really have to think about. You cannot fool the computer.  If you don’t know how to solve a problem, there is no bullshitting the computer into believing that you can.

So in a sense, it should be easy to teach programming. Give people a problem and let them solve it.  There is no cheating, and as long as the problem highlights the points you wish to teach, the students will learn it.

Why do I still find it hard to teach programming then?

Lecturing on programming

One problem is the way we typically teach here.  Teaching is very much based on lecturing followed by practical exercises with TAs.

It is mostly a practical matter. For practical exercises, you need small groups, and we would spend all our time on teaching if we had to do all the exercises ourselves.  Thus the TAs.  But you cannot leave the entire class to TAs and small groups, so we have the lectures to cover the broad topics and make sure that all the groups are keeping up with the teaching plan.

In many classes, this setup works fine, but I don’t think it helps much to lecture on programming.  The lectures in such a class can easily end up being a waste of time.

I can show you all the language constructs of Python on a PowerPoint slide, but that alone does nothing for teaching you how to use them.  It is absolutely worthless if you need to solve a problem using Python.

With 50+ students, I’m probably stuck with the lectures, but how do you structure lectures so you actually teach something useful for programming?

Teaching problem solving

Programming is, more than anything else, problem solving. So how do you teach people how to solve problems?  You show them how you do it yourself!

This is where it gets tricky, I think.  You are not showing people how you solve problems by writing down the problem and then the solution.  You need to show them how you think when you are making your way from problem to solution.  And you need to do it very slowly!

When you are experienced, you make leaps of intuition when you solve a problem.  You recognize something you have solved before and — probably without thinking about it — you immediately think of a solution that worked before.  It is very hard to avoid this.

You need to slow down when solving the problems.  You don’t want to dumb down, though!  You need to solve the problem without your experience, not without your wits. The students are less experienced, but they are not stupid, and you won’t keep them interested if you are thinking slower than they are.

Ideally, you want the solutions you come up with this way to be the same ones you would come up with if you were using all your experience.  I personally hate it when textbooks come up with solutions you would never see in the real world, just because they haven’t introduced all the techniques you need to get the “right” solution; only enough to actually solve the problem, but in a roundabout and non-idiomatic way.

To avoid this, you need to come up with problems that not only can be solved with the techniques seen so far, but where the ideal solution only uses those techniques. I’m still trying to figure out how to do this…
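
As a rough sketch of what such a problem could look like (this example is not from the class; the function names and the DNA string are made up for illustration, and it assumes Python 3), consider computing the GC content of a sequence. If students have only seen strings and their methods, the short version below is within reach and is probably also what an experienced programmer would write. The loop version is the kind of roundabout solution a textbook might force if it had introduced loops and counters but not string methods.

```python
def gc_content_loop(dna):
    """Roundabout version: uses only a loop, counters and an if-statement."""
    gc = 0
    total = 0
    for base in dna:
        total = total + 1
        if base == "G" or base == "C":
            gc = gc + 1
    return gc / total


def gc_content(dna):
    """Idiomatic version: needs nothing beyond string methods and len()."""
    return (dna.count("G") + dna.count("C")) / len(dna)


example = "ATGCGCTA"  # made-up sequence, for illustration only
print(gc_content_loop(example))
print(gc_content(example))
```

Both versions give the same answer; the point is that the second needs no technique beyond calling methods on a string.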

Anyway, these are the ideas I am throwing around right now for how to approach lecturing on programming again.

Refusing to help (or “give them what they need, not what they want”)

So you show the students how to solve a problem, and then you give them some problems to solve themselves.  This is where the TAs come into play.  The idea is that the students try to solve the problems themselves or in small groups, and then can meet with the other students in larger groups, maybe 15-20 people, together with a TA to discuss problems and solutions.

For small classes, I’ve been doing both the lecturing and the solution discussions.  My experiences here are not so pleasant.

First of all, it is very hard to get the students to actually try to solve the problems.  Not the computer scientists. In the previous script programming classes there were always a few computer scientists.  Those are already interested in programming and, to be fair, already know how to do it.  The non-computer science students rarely made an effort.

There is, of course, a major difference between being interested in programming and considering programming a necessary tool for some other problem that is the real problem.

So there is a bigger problem in motivating the problems, but I don’t think that is the full story.

Once the students figure out that they can just show up at the meetings with the TA and then the TA will show them how to solve the problems they got stuck on, they will simply stop trying to solve anything themselves.  At the first sign of trouble, they just stop.  They do not try to work around the problems, or try to figure out why there is a problem in the first place.

This is probably the worst thing they can possibly do.  Programming is all about solving a larger problem by working around a big pile of smaller problems.  You get stuck all the time, and need to get yourself “unstuck”.  With experience you will do this faster and faster, and often manage to avoid a lot of the small problems in the first place, but you need to get this experience by actually solving the small problems.

When people ask me how I would solve a given problem, I tend to tell them.  This makes me the worst TA ever in a situation like this.  I have tried and tried, with more success the older I’m getting, not to help too much.

It is essential that you don’t help a student who could actually solve the problem himself if he just worked a little harder and a little longer.  We learn from our own experience, not from that of others.

What’s so special about programming in bioinformatics?

What are we doing teaching programming anyway?

The Department of Computer Science teaches an introduction to programming class that is mandatory for half the student programs at the Faculty of Sciences.  So why do we have our own introductory programming class for bioinformatics (and related student programs)?

This is a good question, and there has been some debate over it. If only it weren’t me who is supposed to teach the class, I would be a strong proponent of it ;)

Kidding aside, I do think there are strong arguments for having a different introductory programming class.

It is getting late, and I have a paper to read before going to bed, so I will leave those arguments for another post.

Applied programming?

Well, I didn’t choose this title for the class.  I wanted script programming, but apparently that didn’t sound serious. Go figure.


14 Responses to “Teaching programming in bioinformatics”

  1. Bob O'H Says:

    How are you assessing the students? If it’s an end of term exam, then you could make sure they know that they will have to solve problems in it, so that they have to practice. Giving them a dummy exam early on might drive the point home.

    Could you also redesign your lectures? Rather than talking at the class, have them present their solutions to problems you set? If you can make it a safe environment for them to show that they went down blind alleys, made mistakes etc. then it might help everybody’s learning.

  2. Thomas Mailund Says:

    Bob, for assessment they will get a problem at the end of term that they have to solve (probably in groups of three) and then write a small report on how they did it. During the class, we plan to have the students hand in weekly assignments of a similar nature (but much smaller problems so they can solve them individually in a few hours). In that way, the weekly exercises and the assessment should be nicely aligned… at least if everything goes according to plan.

    As for student presentations in the lectures, I am not so sure that would work. Experience shows me that most of the students are reluctant to speak up in front of the entire class. To the point where they are much more likely to show up at my office than to ask a question during class.

    Asking them to present their solutions in the lecture room, in front of the entire class, is not going to go down well.

    Anyway, presenting solutions this way is what the meetings with the TA are for, so there is an element of solution presentation planned.

    Now the trick is just to make sure that 1) the students actually prepare for this, and 2) that all the students get to present their ideas and attempts at solving the problems, not just the usual 5% of them…

  3. K W Says:

    Hi. As someone who would like to learn programming, I’m not sure where to start on my own. That is, with all the programming languages out there, which one do I start with? Thanks for your comments. Interesting article but I agree that programming can only be learned by doing (and not by listening to a lecture).

  4. Thomas Mailund Says:

    Hi KW,

    what you are asking is actually a difficult question. The reason that there are so many languages out there (besides the narcissistic tendency computer scientists have to want their own language) is that different languages are good for different problems.

    What I left out of the post here is the kind of problems I think Bioinformaticians need to worry about, and the kind of programming that they therefore need to learn.

    Anyway, a good beginner’s language that is both easy to learn and a pretty good all-round programming language is Python. That is the language I will be teaching in this class.

    It is relatively simple (at least compared to beasts such as C++) and it is very easy to experiment in and to explore. Plus, the online documentation at http://www.python.org is excellent.

    It is a scripting language, though, so for really heavy duty number crunching it is probably not the right choice. Still, even then it is very easy to integrate with other languages, and it already has some very nice modules you can use to get almost the same speed as, say, C or C++, but with the ease of use of Python.

    If you are into statistics, R might be a better choice. You can access R from Python through the module rpy, but in my experience it is nicer to work directly in R.

  5. RBH Says:

    Had the course been called “script programming” you would have had a classroom full of cinema students. :)

  6. Bob Grove Says:

    Great article Thomas, thanks!
    Here is a note for K W:
    Data structures are a great place to start: how data is represented, what is a list, array, stack, queue, heap, etc …
    In my view, languages have quite similar constructs, however two that come to mind as very useful are strong datatype enforcement and inheritance. Python is great, however if your goal is to learn programming as a means to an end ($$$) I suggest you spend some time surveying the popular commercial development languages of the day (Python among them, but also Java, C#, etc.) and keep in mind this thought: you will be spending *thousands* of hours reading, writing, loving and cursing these new languages, discussing them with their proponents and arguing with their detractors. Choose wisely!!!

  7. Vince Says:

    I agree, you can’t learn programming without doing it. One possible solution to the problem would be to require the students to have access to a personal computer where they could install and run Python code on their own. I have taken online introductory Perl and Java courses that worked very well (at UMass Lowell). For a large class size, you might need to reduce the work load and simplify the material so that you won’t get killed by question overload. Questions could be answered via email or in online chat sessions in the absence of a physical lab. You could try out a few sessions to see how it goes before expanding. A few sessions would be valuable in any case so that there would at least be some real experience.

  8. Thomas Mailund Says:

    Vince,

    the students probably all have a laptop (that is the norm rather than the exception here), but otherwise they have access to plenty of computers, either here at BiRC or in the Dept. of Computer Science, where they will have their meetings with TAs.

    I don’t worry too much about the questions from the exercises. There will be a TA per 20 students, give or take, so it shouldn’t be too much of a problem.

    The two things that I worry about are: will the exercises be too easy, so they are too boring, or will the students rely on the TAs for solving their problems rather than trying themselves.

    I guess some experimentation with this is in order to figure out how it works. After a few sessions, as you say, I should have a better idea.

  9. Vince Says:

    You will want to start out “easy”: can they install Python and run hello world? This will turn out to be unbelievably difficult for a surprising number of students. The difficulty question after that will depend on the ability range of your students and what your goal is. I expect most students should be able to do most of the work without a TA, but a few students will be difficult.

  10. Thomas Mailund Says:

    Call me optimistic, but I refuse to believe that installing Python is a major bottleneck. We are talking university students who are used to using computers for their work, and if they are on the molecular medicine program they are in a high percentile of high school grades or they simply wouldn’t have been admitted.

    I am prepared to be disappointed, but if just installing Python is a problem I probably need to rethink the entire class.

  11. Mailund on the Internet » Blog Archive » Why do we need a separate class for programming in bioinformatics? Says:

    [...] a previous post I asked “Why are we teaching an introductory programming class for bioinformatics, where [...]

  12. Mailund on the Internet » Blog Archive » Applied Programming: Course plan Says:

    [...] class with focus on bioinformatics applications).  See my previous thoughts on the class here and [...]

  13. Mailund on the Internet » Blog Archive » First week of applied programming Says:

    [...] Teaching programming in bioinformatics [...]

  14. gioby Says:

    Teach them something about software testing.
    Researchers are more comfortable with the concepts of testing than with programming.
    And it will be useful for them.
