This one is driving me crazy, so I just wanted to know if anyone has any hints as to how I might fix this.
I have compiled zlib as a library, and I have included the lib-file in my project along with the header files for the exported functions. This works, and there are no complaints from the compiler or linker.
However, a very strange thing happened when I was modifying some code I was working on. It was a very simple change: I added a missing space to a literal string somewhere in my code. This caused a completely different part of my code (in a different source file), which used zlib to decompress data, to stop working (the zlib inflate() function returned an error, even though it was given the same source data).
The object file with the code that started failing was unchanged, but there was a positional change in the final image due to the one extra space character in the literal string.
How can this happen? The library was unchanged, the code using it was unchanged, yet it failed.
If I change the optimization level I can "fix" the error, but that is of course just coincidence.
I'm using RealView MDK-ARM Version: 3.23.
I think I may have tried almost all combinations of optimizations and other options, both when compiling the library and the actual project. Sometimes it works, sometimes it doesn't.
Can anyone please shed some light for me? I am probably going bald any day soon... :-)
Thanks for your tips. Declaring one or more of the local variables in the function that uses zlib as volatile also "fixes" the problem, but it is still coincidental. I can declare a local variable in a completely different function as volatile and also "fix" the problem.
The point is, none of the local variables (in this case) needs to be volatile, so declaring one or more of them as volatile does just the same as adding the space to the string: It alters code size so that something changes alignment.
What I don't understand is that I am letting the compiler and linker take care of the alignment, I am not forcing code to particular addresses.
"What I don't understand is that I am letting the compiler and linker take care of the alignment, I am not forcing code to particular addresses."
You may not be forcing it, but you might be assuming it somewhere; eg, in a reference via a pointer...
The only way to track down this sort of problem is to get in there with the debugger; it can be hard enough with the source and target in front of you - it's impossible to do from afar without the source or the target or even a clear idea of what the thing is actually doing!
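To illustrate the kind of hidden alignment assumption meant above, here is a minimal sketch in C. The helper names read_u32_le and read_u32_native are made up for this example and come from neither zlib nor the original project. The point is only that reading a multi-byte value through a cast pointer assumes the address is suitably aligned, whereas byte-wise access or memcpy does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit little-endian value from any address.
 * Byte-wise access makes no assumption about the alignment of p. */
uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: read a 32-bit value in native byte order.
 * memcpy is defined for any source address, aligned or not. */
uint32_t read_u32_native(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* The risky pattern, shown only for contrast:
 *
 *     uint32_t v = *(const uint32_t *)p;
 *
 * This is only valid if p is 4-byte aligned. On some ARM cores an
 * unaligned word load does not fault but quietly returns rotated data,
 * so the bug shows up or disappears depending on where the linker
 * happened to place the buffer. */
```

If a buffer that is accessed this way moves by one byte (for example because a string literal placed before it grew by one character), the cast version can start returning different values while the two helpers above keep working.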
Andy, I have a certain product at work that runs on a processor that does not have a built-in debug interface (C167). The software does not support debugging via a serial port, and guess what - it has the EXACT same problem as the OP... Even if absolutely dead code is removed, the program will crash at arbitrary moments. I guess it's that damn legacy code again, but I cannot find it! It works as it is (as long as you add code at the end of modules...), which is, unfortunately, good enough until we flush it down the nearest toilet...!
I actually have been looking at it with the debugger, and all the actual data that is used by the inflate()-function is the same in both cases, but in one case inflate() returns with an error code, in the other it doesn't. The code doesn't crash in any way, it's just that inflate() is unable to do the job.
I am now guessing that the error is within the zlib-code itself. Not a bug, but perhaps some assumption that may cause trouble when the code alignment changes.
I was kinda hoping to avoid problems like this by choosing tried and proven code...
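On the inflate() error code mentioned above: which code comes back can narrow the search quite a bit. Below is a small sketch that turns the numeric return value into the symbolic name from zlib.h. The function name zlib_code_name is invented for this example; the numeric values are the ones zlib.h defines, written out as literals so the snippet compiles without the zlib headers.

```c
#include <stddef.h>

/* Hypothetical helper: map a zlib return code to its symbolic name.
 * The values match the Z_* constants defined in zlib.h. */
const char *zlib_code_name(int code)
{
    switch (code) {
    case  0: return "Z_OK";
    case  1: return "Z_STREAM_END";
    case  2: return "Z_NEED_DICT";
    case -1: return "Z_ERRNO";
    case -2: return "Z_STREAM_ERROR";   /* stream state inconsistent */
    case -3: return "Z_DATA_ERROR";     /* input data looks corrupted */
    case -4: return "Z_MEM_ERROR";      /* allocation failed */
    case -5: return "Z_BUF_ERROR";      /* no progress possible */
    case -6: return "Z_VERSION_ERROR";
    default: return "unknown";
    }
}
```

Roughly: Z_MEM_ERROR would point at the heap, Z_STREAM_ERROR at the z_stream structure itself, and Z_DATA_ERROR at the input or the decompression window being damaged. The msg field of the z_stream may also hold a short text after a failure. One thing that could be worth checking, although there is no way to tell from here whether it applies, is that zalloc, zfree and opaque are set (for example to Z_NULL) before inflateInit() is called, as the zlib documentation requires. If the z_stream is a local variable that is not fully initialized, its contents depend on whatever was on the stack, and that can change with any change in code layout or optimization.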
I tried compiling the library in ARM mode (no Thumb) and I turned off interworking. This unfortunately makes the library a lot bigger, but it seems (only time will tell) to solve the problem.
My guess is that this forces a certain alignment, but I can't be sure.
In any case, I find this solution to be borderline bearable, but I am not satisfied, because I still don't know exactly why it fails. And I would certainly have preferred being able to compile the library in Thumb mode to save some space (a few kB, actually).
So, if anyone has any more feedback on this, please chime in!