
Good code, bad answer

My code is 110% correct; i.e. no errors and it runs properly. See:

int _val;

int myfunc2(int val)
{
  _val = val;
  return _val;
}

int Bar(int val)
{
  return _val + val + 1;
}

void myfunc1(int val)
{
  _val += Bar(val);
}

etc
etc
etc

It doesn't give me the right answer sometimes.

HEEEEEEEELLLLLLLPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
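
A minimal, standalone sketch of how the shared global _val can make the result depend on call order (the functions are repeated so the sketch compiles on its own; main and the two call sequences are invented for illustration, not taken from the original program):

#include <stdio.h>

int _val;

int myfunc2(int val)
{
  _val = val;
  return _val;
}

int Bar(int val)
{
  return _val + val + 1;
}

void myfunc1(int val)
{
  _val += Bar(val);
}

int main(void)
{
  /* Sequence A: set _val first, then accumulate. */
  myfunc2(1);
  myfunc1(2);               /* Bar(2) = 1 + 2 + 1 = 4, so _val becomes 5 */
  printf("A: %d\n", _val);  /* prints 5 */

  /* Sequence B: same calls, opposite order. */
  _val = 0;
  myfunc1(2);               /* Bar(2) = 0 + 2 + 1 = 3, so _val becomes 3 */
  myfunc2(1);               /* overwrites the accumulated value */
  printf("B: %d\n", _val);  /* prints 1 */

  return 0;
}

The same functions, fed the same arguments, yield 5 or 1 depending purely on ordering - one way code with no compile errors can still fail to "give the right answer sometimes".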

Parents
  • It is so HARD to write even moderately bug free code.
    Circumstances might change.
    Components could be replaced rendering once functioning code useless.
    Slight differences between processors might have a huge impact on large program constructs.
    EMC is always there.
    System load will determine "correctness", too.
    Tool chain (settings) might play a role.
    Systems that "live on the edge" in terms of timing might fail without warning (I made such a mistake 1.5 years ago, only to fix it 3 weeks ago!).

    110%?

    I believe 70% is optimistic!

Children
  • I just set the compiler switch --quality=100 and take it from there.

    Then I know that if the code goes through the compiler without any reported warnings/errors, everything is dandy. It's the same as with jpeg images - if you drop the quality parameter, you get lossy results.

  • (I made such a mistake 1.5 years ago only to fix it 3 weeks ago!).

    That's nothing. I've fixed a bug like that in a subroutine which had successfully gone through more than 10 years of continuous re-use all across the company. Then mine happened to be the project where all the external conditions (speed of CPU, speed of peripheral device, etc.) were just right to trip the lurking bug. The central code sequence had been exactly the wrong way round all that time.

  • Yes, that sounds pretty nasty.
    Mine lived on for a long time undetected. Then, starting at a particular release, it was relatively easy to reproduce (the reason remains a big mystery, still). The timing margins for addressing an external analog output via the SPI bus were not conservative enough, so that some (one?) of the 12 bits heading towards one of the DAC's banks ended up nowhere, or at an identical, separate DAC (there is a chip select there). Because the DACs have a shift register to input the data, even the loss of 1 bit meant huge swings! That was fun to watch :-) but much more fun to see with a scope and solve...
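
    Purely as an illustration of why a single lost bit is so dramatic with a shift-register DAC (the MSB-first framing, the mid-scale code and the zero fill bit below are assumptions, not the original design):

    #include <stdio.h>

    /* A 12-bit DAC fed MSB-first through a shift register: if the first
       bit of a frame is lost, the remaining 11 bits land one position too
       high and the last slot is filled by whatever bit arrives next. */
    int main(void)
    {
        unsigned int intended = 0x800;   /* mid-scale code: 2048 of 4095 */
        unsigned int next_bit = 0;       /* stray bit that completes the frame */

        /* MSB lost: bits 10..0 shift up by one, next_bit fills bit 0. */
        unsigned int latched = ((intended & 0x7FF) << 1) | next_bit;

        printf("intended %u -> latched %u\n", intended, latched);  /* 2048 -> 0 */
        return 0;
    }

    Mid-scale in, zero out: a full half-scale swing from one missing bit, which is exactly the kind of jump that shows up nicely on a scope.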

  • The false (but very widely-held) belief that, because the stuff "works" (sic), it must, therefore, be right...

  • It has happened to all of us. Here is my 'war story'.
    http://www.keil.com/forum/docs/thread15893.asp - 8.6K - Nov 8, 2009
    Bradford

  • "Then mine happened to be the project where all the external conditions (speed of CPU, speed of peripheral device, etc.) were just right to trip the lurking bug."

    I am not convinced that it is a bug.

    I think it is practically impossible to write completely bug-free code that works under all circumstances.

    Instead, we all write pieces of code that have "limitations" that unravel under certain conditions. It is our job to document such limitations, so that when we are in a circumstance where the limitations become reality, we know that we need to fix the code for that particular application.

  • I am not convinced that it is a bug.

    You have no idea.

    I think it is practically impossible to write completely bug-free code that works under all circumstances.

    Except this was actually the exact opposite case: blatantly buggy code that still appeared to work under quite a number of circumstances.

    If you want details, suffice it to say that when a chip's datasheet says it takes at least time {T} after the falling edge on pin X before pin Y has settled on valid data, what you do not do is read Y immediately after pulling down X. You delay it as long as feasibly possible, typically until immediately before pulling down X again.
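
    A rough sketch of the two approaches, with the pin access reduced to stand-in helpers (set_x_low, read_y and wait_at_least_T are placeholders, not any real device's driver API):

    /* Stand-ins for the real port access, just so the sketch is complete. */
    static volatile unsigned char x_pin, y_pin;

    static void set_x_low(void)        { x_pin = 0; }
    static unsigned char read_y(void)  { return y_pin; }
    static void wait_at_least_T(void)  { /* the datasheet's settling time */ }

    /* Fragile: Y is sampled before the settling time T has elapsed.
       It may still appear to work if the CPU happens to be slow enough. */
    unsigned char read_bit_fragile(void)
    {
        set_x_low();
        return read_y();            /* read immediately after the edge */
    }

    /* Safer: honour the datasheet - wait at least T, or better, defer the
       read until just before the next falling edge on X. */
    unsigned char read_bit_safe(void)
    {
        set_x_low();
        wait_at_least_T();
        return read_y();
    }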

  • "when a chip's datasheet says it takes at least time {T} after the falling edge on pin X before pin Y has settled on valid data, what you do not do is read Y immediately after pulling down X. You delay it as long as feasibly possible, typically until immediately before pulling down X again."

    That's slightly different. Here you have a case of a moronic programmer writing a moronic piece of code that would NOT work regardless of CPU frequency, peripheral speed, or any other external conditions, as it is in direct contradiction to the datasheet.

    Basically, the programmer has no idea what s/he is doing, and that's in direct contradiction to the earlier statement that the bug was exposed only under the right combination of CPU frequency, peripheral speed, etc.

  • The programmer just admitted to having written bad code that should never have worked - and yet it did!!

    I think most experienced programmers have been in the position, at some point, of looking back at their code and thinking, "how on earth did that ever work??!"

  • Here you have a case of a moronic programmer writing a moronic piece of code that would NOT work regardless of CPU frequency, peripheral speed, or any other external conditions, as it is in direct contradiction to the datasheet.

    Sorry, but you still don't get it.

    That code does work in its original surroundings. It works by coincidence rather than by design, but still...

    The CPU was, in fact, so slow that even a read "immediately" after pulling down the clock was still happening long enough after the clock edge that valid data was available by then.
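
    Back-of-the-envelope only, with invented numbers, to show how the same instruction sequence can pass or fail purely on CPU speed:

    #include <stdio.h>

    int main(void)
    {
        double t_settle_ns  = 200.0;   /* hypothetical datasheet minimum after the edge */
        double slow_read_ns = 2000.0;  /* edge-to-read delay the slow CPU happens to give */
        double fast_read_ns = 100.0;   /* the same instructions on a faster CPU */

        printf("slow CPU: %s\n", slow_read_ns >= t_settle_ns ? "valid data (by luck)" : "garbage");
        printf("fast CPU: %s\n", fast_read_ns >= t_settle_ns ? "valid data (by luck)" : "garbage");
        return 0;
    }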

  • "That code does work in its original surroundings. It works by coincidence rather than by design, but still..."

    Not sure about that. It sounds like 1) whether it worked had nothing to do with a) CPU frequencies; b) peripheral speeds; or c) external devices, and 2) the only reason it worked is that the CPU was NOT functioning as it should per the datasheet.

    That contradicts the notion that the bug is the result of a) CPU frequencies; b) peripheral speeds; or c) external devices.

    "The CPU was, in fact, so slow that even a read "immediately" after pulling down the clock was still happening long enough after the clock edge that valid data was available by then."

    That would be irrelevant to this discussion, had the CPU worked as described in the datasheet. So the bug here is really a hardware bug, not a software one, except that the moronic programmer couldn't realize it.

  • the only reason it worked is that the CPU was NOT functioning as it should per the datasheet.

    What on earth are you talking about?
    Hans-Bernhard Broeker presented a perfectly logical explanation for the observed behavior.

  • except that the moronic programmer couldn't realize it.

    Such mistakes are rarely related to one being a moron.
    Everybody makes mistakes, and as specified already - sometimes it is the environment, not the code itself!

  • On the contrary - it is entirely pertinent!

    It is actually quite a common type of mistake - especially when using a "slow" CPU with a "fast" peripheral.

    And it is exactly the type of problem that can give rise to the OP's situation; viz. the code builds fine and works most of the time - but occasionally fails.

    It's even likely that the code would work fine in an instruction set simulator like the one in uVision.