How to Debug by Splitting the Problem Space
Posted by admin on July 12 2010 13:03:23
Debugging is fun, because it begins with a mystery. You think it should do something, but instead
it does something else. It is not always quite so simple; any examples I can give will be contrived
compared to what sometimes happens in practice. Debugging requires creativity and ingenuity. If
there is a single key to debugging, it is to use the divide and conquer technique on the mystery.

Suppose, for example, you created a program that should do ten things in a sequence. When you
run it, it crashes. Since you didn't program it to crash, you now have a mystery. When you look at
the output, you see that the first seven things in the sequence were run successfully. The last three
are not visible from the output, so now your mystery is smaller: ‘It crashed on thing #8, #9, or
#10.’

Can you design an experiment to see which thing it crashed on? Sure. You can use a debugger, or
you can add print statements (or the equivalent in whatever language you are working in) after
#8 and #9. When you run it again, your mystery will be smaller, such as ‘It crashed on thing #9.’ I
find that bearing in mind exactly what the mystery is at any point in time helps keep one focused.
When several people are working together under pressure on a problem it is easy to forget what
the most important mystery is.
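
As a rough illustration of that experiment, here is a minimal Python sketch. The names run_with_markers and make_demo_steps are hypothetical, and the ten "things" are stand-ins in which thing #9 is rigged to fail. For simplicity it prints a marker after every step rather than only after #8 and #9.

```python
import sys


def run_with_markers(steps, out=sys.stderr):
    """Run each step in order, printing a marker after every step that finishes."""
    for number, step in enumerate(steps, start=1):
        step()
        # Flush so the marker is not lost in a buffer if a later step crashes.
        print(f"finished thing #{number}", file=out, flush=True)


def make_demo_steps(broken=9):
    """Build ten hypothetical steps; the one numbered `broken` raises."""
    def make_step(number):
        def step():
            if number == broken:
                raise RuntimeError(f"simulated crash in thing #{number}")
        return step
    return [make_step(n) for n in range(1, 11)]


if __name__ == "__main__":
    try:
        run_with_markers(make_demo_steps())
    except RuntimeError as error:
        print(f"crashed: {error}", file=sys.stderr)
```

The last marker printed before the crash names the last thing that finished, so the failure must lie in the thing right after it.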

The key to divide and conquer as a debugging technique is the same as it is for algorithm design:
as long as you do a good job splitting the mystery in the middle, you won't have to split it too
many times, and you will be debugging quickly. But what is the middle of a mystery? This is
where true creativity and experience come in.
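
When the mystery happens to be a simple ordered sequence, splitting in the middle is ordinary binary search. The Python sketch below assumes a hypothetical check, prefix_is_bad(k), that reports whether running only the first k steps already shows the problem. It also assumes that once a prefix is bad, every longer prefix is bad too. Real mysteries are rarely this tidy, so treat it as an idealized case.

```python
def first_bad_prefix(n, prefix_is_bad):
    """Return the smallest k in 1..n for which prefix_is_bad(k) is True.

    Assumes prefix_is_bad(n) is True and that the check is monotone:
    once a prefix is bad, every longer prefix is bad as well.
    """
    lo, hi = 1, n  # invariant: the answer lies somewhere in lo..hi
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix_is_bad(mid):
            hi = mid       # the fault is at mid or earlier
        else:
            lo = mid + 1   # the fault is after mid
    return lo


if __name__ == "__main__":
    probes = []

    def check(k):
        # Hypothetical experiment: the fault is introduced by step 617.
        probes.append(k)
        return k >= 617

    print(first_bad_prefix(1000, check), "found after", len(probes), "experiments")
```

With a thousand candidate steps, about ten experiments are enough, because each one discards half of what remains.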

To a true beginner, the space of all possible errors looks like every line in the source code. You
don't have the vision you will later develop to see the other dimensions of the program, such as the
space of executed lines, the data structure, the memory management, the interaction with foreign
code, the code that is risky, and the code that is simple. For the experienced programmer, these
other dimensions form an imperfect but very useful mental model of all the things that can go
wrong. Having that mental model is what helps one find the middle of the mystery effectively.

Once you have evenly subdivided the space of all that can go wrong, you must try to decide in
which space the error lies. In the simple case where the mystery is: ‘Which single unknown line
makes my program crash?’, you can ask yourself: ‘Is the unknown line executed before or after
this line that I judge to be executed at about the middle of the running program?’ Usually you
will not be so lucky as to know that the error exists in a single line, or even a single block. Often
the mystery will be more like: ‘Either there is a pointer in that graph that points to the wrong node,
or my algorithm that adds up the variables in that graph doesn't work.’ In that case you may have
to write a small program to check that the pointers in the graph are all correct in order to decide
which part of the subdivided mystery can be eliminated.
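
Such a checking program might look like the following Python sketch. The Node class and find_bad_pointers are hypothetical, and "points to the wrong node" is taken here to mean "points to something that is not a member of the graph", which is only one of the ways a pointer can be wrong.

```python
class Node:
    """A hypothetical graph node: a value plus references to other nodes."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.edges = []


def find_bad_pointers(nodes):
    """Return (source name, target) pairs whose target is not in the graph."""
    known = {id(node) for node in nodes}
    bad = []
    for node in nodes:
        for target in node.edges:
            # A pointer is suspect if it is not a Node or is not one of ours.
            if not isinstance(target, Node) or id(target) not in known:
                bad.append((node.name, target))
    return bad


if __name__ == "__main__":
    a, b, c = Node("a", 1), Node("b", 2), Node("c", 3)
    stray = Node("stray", 99)  # never added to the graph
    a.edges = [b]
    b.edges = [c, stray]
    for source, target in find_bad_pointers([a, b, c]):
        print(f"{source} points outside the graph: {getattr(target, 'name', target)!r}")
```

If the check finds nothing, the pointer half of the mystery can be set aside and attention can move to the summing algorithm. If it finds something, the pointers are the place to look first.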

by Robert L. Read