Hi,
I have a weird bug on my hands. It's C code compiled with version 7. There are 2 threads running, and each one happens to have a custom Timer declared in its own file with the same name and type.
The MyTimerStruct is just a structure with a bunch of variable declarations. The TimerStart function uses that data passed in to start a multimedia timer for the thread.
The bug is that sometimes thread 1 calls TimerStart but passes thread 2's Timer pointer to TimerStart instead of its own for some reason. This only occurs under very heavy load. The solution was to change the declaration for the 2 Timers to STATIC. But I don't understand why. Anyone? Is there some compiler option that I'm missing? Or some variable modifier for this type of thing (volatile or something; volatile didn't change anything)?
Here is the long-winded description.
#### File 1 ####
MyTimerStruct Timer;
thread1
{
while (1)
{
...
TimerStart(&Timer);
...
}
}
#### File 2 ####
MyTimerStruct Timer;
thread2
{
while (1)
{
...
TimerStart(&Timer);
...
}
}
Notice that the Timer structure is global to each file and has the SAME NAME in both files. But simple scoping says that this should be fine, and the compiler doesn't seem to care. Now for the stupid part that I cannot explain. Under very heavy load conditions (the app is streaming video, downloading RSS headlines and transferring images), occasionally the system incorrectly gives thread2 access to file1's Timer instead of file2's. Or vice versa. It took 3 days of debugging and placing lots of printf's to find this problem, and I was a bit taken aback when I finally discovered it. I re-examined the code over and over and could never see how this could occur. For example, in file1, the TimerStart routine was only called in one place with specific values. When the same information was showing up in thread2, it made no sense whatsoever. It looked like a stack overflow to me. So I cranked all the stacks up. No difference.
The only thing that kept nagging me was that the variables were defined exactly the same in 3 different files (same name). I thought, could it be the case that for some reason the thread is accessing the wrong timer structure?
So I decided to change the declarations in all files to STATIC just to prove that my thoughts about such a cure were wrong.
Well, guess what. The problem appears to have gone away (as well as a few other strange issues that I had not debugged yet). I changed the variables back to non-static. The problem came back. So it's either some sort of obscure compiler bug or an actual h/w issue where the processor is having problems thread switching. Perhaps a register not getting swapped out in time or something. Most of the time it works fine. But under very heavy load, the problem occurs occasionally.
The real kicker: I had a similar issue with a different variable that was exhibiting the same problem (a thread STATE variable). For some reason one of my threads would occasionally get an incorrect value (heavy load again) in the state variable. The value was one that could only get assigned in a different thread. So I looked at the variable declarations. The same in 2 different files again. I again changed both to static and the problem disappeared. So I know it is indeed happening, but I can't explain why.
Does anyone have a clue as to why this is happening? I cannot explain why changing it to static helps, but it does.
Thanks.
Strange bug-thread accesses wrong pointer under heavy loads.
Shakje
This is a horrible, horrible "feature" called common data. It stems from the very early days of C, when compatibility with FORTRAN was still considered by many people to be a "good thing".
As neither of these variables is marked as "extern" or has an initializer, the compiler treats them as common data (what the C standard calls tentative definitions; strictly speaking this is the old "common" model rather than COMDAT). This means that the linker merges them into a single object at runtime: it is not that one timer is getting the other timer's address; there is only one timer. As you have seen, most of the time you get lucky, and it is only when the application is stressed that "both" timers are active at the same time. As you have seen, one fix is to mark the instances as static. Another fix would be to initialize them, though since doing this would make the variables real definitions, you would then get a linker error due to multiple definitions of the same name.
The fix I would prefer would be to make them static and to give them different names.
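A rough sketch of what that fix could look like, squeezed into one listing for brevity. The struct fields, the names Thread1Timer and Thread2Timer, and timer_start_stub (a stand-in for the real TimerStart, which isn't shown in the thread) are all made up for illustration; in the real project each static variable would sit in its own source file.

```c
#include <assert.h>

/* Hypothetical layout; the real MyTimerStruct is not shown in the thread. */
typedef struct {
    int owner;          /* which thread last started the timer */
    unsigned period_ms; /* illustrative field only */
} MyTimerStruct;

/* Would live in file 1: internal linkage and a unique name. */
static MyTimerStruct Thread1Timer;

/* Would live in file 2: internal linkage and a unique name. */
static MyTimerStruct Thread2Timer;

/* Stand-in for TimerStart: it only records the values it was given. */
static void timer_start_stub(MyTimerStruct *t, int owner, unsigned period_ms)
{
    t->owner = owner;
    t->period_ms = period_ms;
}

/* Starts both timers and returns 1 if each one kept its own values. */
int timers_are_independent(void)
{
    /* With static storage the two objects can never be merged by the linker. */
    assert(&Thread1Timer != &Thread2Timer);

    timer_start_stub(&Thread1Timer, 1, 100u);
    timer_start_stub(&Thread2Timer, 2, 250u);

    return Thread1Timer.owner == 1 && Thread1Timer.period_ms == 100u
        && Thread2Timer.owner == 2 && Thread2Timer.period_ms == 250u;
}
```

Static alone is enough to stop the merging; the different names just make it obvious in the debugger and the map file which timer you are looking at.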
JohnnyAV
Wow. Thanks. I checked the map and set breakpoints to look at the pointers. They are the same. Learn something new every day.
Hey. Does this same thing apply to code compiled as C++? Why doesn't the linker kick out a warning?
Thanks.
rtpninja