One of the reasons making bug-free software is hard is that when you make a change, sometimes the effect isn't quite what you expect... Now, one of the problems in KDE3.2.x is that with some compiler flags, the MMX and SSE2 versions of some graphics routines don't work quite right, since they make assumptions which are no longer true with newer gcc versions. Fixing those things is rather painful (oy, debugging inline asm, oy), and so after assigning the bug to myself I've been procrastinating for quite a bit, partly because I can't even test the SSE2 versions, and partly because I was secretly hoping that someone else (Fredrik?) might fix it. Since there seemed to be a bit of a spike in reports of the problems lately, I decided to do the reasonable thing and disable these routines in advance of 3.3 Beta 1, until they get sorted out. Seems like a sane and low-risk decision, doesn't it?
Well, a few days later, and people are reporting similar-sounding problems to what I thought I just "fixed" (which of course worked before and after on my machine). And, what's more surprising, some of them have not had those problems before. So in other words, a very conservative change that was supposed to fix a bug, actually introduced that very same bug, perhaps even for more people.
So I look into the C++ versions of the routines, and find a statement that reads the value at data, computes a new value from it, writes the result back to that location, and increments the pointer, all in a single expression.
Code like this may look right at first sight, but it actually has a major problem. There are some things about program behavior that C++ does not specify. For example, if you call a function, there is no guarantee about the order in which the parameters are evaluated. Now, most of the time it doesn't matter. But sometimes, to get the parameter, one has to call a function, and that function may change some memory, or output to the console, or draw to the screen. Such an expression is usually said to have side effects. To be more concrete, consider the example:
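A minimal sketch of such an example follows. The names `get`, `print_sum`, `run_demo` and `trace` are made up for illustration, and the output is collected in a string stream rather than written straight to the console so that it is easy to inspect:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Collects everything the example "prints".
std::ostringstream trace;

// Hypothetical helper with a side effect: it logs its argument, then returns it.
int get(int n) {
    trace << n;
    return n;
}

// Logs the sum of its two parameters.
void print_sum(int a, int b) {
    trace << a + b;
}

// Runs the example once and returns what was logged.
std::string run_demo() {
    trace.str("");              // start from an empty log
    print_sum(get(1), get(2));  // the order of the two get() calls is unspecified
    std::string result = trace.str();
    // Either order is allowed, so only these two results are possible.
    assert(result == "123" || result == "213");
    return result;
}
```

Printing the result of `run_demo()` shows which order a particular compiler happened to pick.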
What should this output? It turns out this can output either 123 or 213, since C++ permits the compiler to call get(1) and get(2) in either order. So what does this have to do with the original code? Well, that code has pretty much the same problem. What the person who wrote the code expected was that it would perform the calculation on the contents of data, then update that location, and then update the pointer. Except there is nothing requiring things to happen in quite that order. The compiler may choose to generate code that does the increment before doing the work on the right-hand side, which is likely what it did for the people who were having problems with this. The fix?
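Move the increment into its own statement, so that it can only happen after the assignment is complete. The real routine isn't reproduced here, so this is just a sketch of the pattern: `brighten` and `brighten_all` are hypothetical names, and the per-pixel work is a placeholder.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-pixel work done by the real routine.
unsigned int brighten(unsigned int pixel) {
    return pixel + 1;
}

// The problematic pattern looks roughly like this (do not use):
//
//     *data++ = brighten(*data);
//
// Under the rules in force before C++17, nothing orders the increment
// relative to the read of *data on the right-hand side, so the behavior
// is undefined and varies between compilers and optimization flags.

// Safe version: one statement for the update, one for the increment.
void brighten_all(unsigned int* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    unsigned int* end = data + count;
    while (data != end) {
        *data = brighten(*data);  // read and write the same element
        ++data;                   // advance only after the write is done
    }
}
```

The two-statement form costs nothing at runtime, since the compiler is free to merge the steps again wherever that is safe.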
So, to those still reading: please do not use ++variable in expressions where variable is accessed anywhere except through that operator, or you may face a confusing "but it works for me, why does it fail for so many people" bug.