03-02-2010 04:40 PM
I have written an application which seems to run correctly on some Dell
computers, but on two out of three HP Elite computers it does not (one of
the HPs does work correctly). All the computers are running Windows XP SP3
and CVI 8.5.
The problem shows up in the timer control. It is set up to execute a
callback function once per second. Inside the callback I make use of the
current time pointed to by eventData1. Normally, the current time should
increment by about 1 second each time through the callback, and this is
what happens on the machines that operate correctly.
However, on the problem machines, we see occasional bizarre jumps in the
current time. On one particular run, we saw a sequence like this:
17179882.763494 (0) 25769828.763494 (10) 30064807.050273 (21)
21474851.048108 (1) 25769829.763494 (11) 30064807.763494 (21)
21474851.763494 (1) 25769829.763494 (11) 30064808.763494 (22)
21474852.763494 (2) 25769830.764532 (12)
21474853.763494 (3) 25769831.763494 (13)
21474854.763494 (4) 25769832.763494 (14)
21474855.763494 (5) 25769833.763494 (15)
21474856.763494 (6) 25769834.763494 (16)
25769825.039438 (7) 25769835.763494 (17)
25769825.764403 (7) 25769836.763879 (18)
25769826.763494 (8) 25769837.763494 (19)
25769827.763494 (9) 25769838.763494 (20)
Weird, isn't it? Those values should be starting at zero, and incrementing
by approximately 1 each time.
I am able to work around this problem in software to get a reasonable
elapsed time value. However, I am concerned that this is really a symptom
of a deeper problem and my program (or something else) is writing to
addresses it should not be writing to. The time value is no big deal, but
maybe something else is being overwritten.
Sometimes, we can provoke a jump by moving the mouse around, but other
times not.
Another thing to note is that except for this timer quirk, the program
seems to be operating correctly in that it can read in known data and
output known results.
Does anyone have any ideas about what could be causing this problem? Any
debugging suggestions would be appreciated.
Thanks,
Marty O.
03-02-2010 08:20 PM
The timer controls aren't guaranteed to execute the callback at the timer interval 😉
I think if you read the fine print on the timer control, they say the callback will be called but that it can get delayed by the other stuff NI has happening in the RTE. Well, they don't say it that way, exactly, but they do disclaim any hard determinism as far as exactly when the timer callback will get executed. It runs on the same thread as the application, as do other callbacks, so the timer callback has to take its turn with all the others as I understand it. Plus, the app / callback thread is getting scheduled just like any other thread on the PC, so it could be getting pre-empted for any number of reasons by other threads that have nothing to do with your app.
Have you tried an asynchronous timer control? These run on a separate thread from the app (all async timers in a CVI app share one thread, as it turns out), so they're a little more reliable: the async timer callback isn't competing with the app and all the other callbacks for CPU time, though it still competes with other threads in the system. The idea here is that the OS is more likely to give the async timer thread a fair schedule than when you're using a normal timer control.
If you just want to measure elapsed time between points in your code, you can use the Pentium performance counters, which measure elapsed time with high precision. If you're interested, I posted the code for this the other day on another thread (pun intended).
Menchar
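For anyone without that other thread at hand, here is a minimal sketch of the elapsed-time idea. It is not the code referred to above. It assumes "performance counters" means the Win32 QueryPerformanceCounter / QueryPerformanceFrequency calls, and it falls back to POSIX clock_gettime when not built on Windows so it can be tried anywhere. The name hires_seconds is made up for illustration.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime on strict compilers */
#endif
#include <assert.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: seconds read from a high-resolution monotonic
   clock. Only differences between two readings are meaningful. */
double hires_seconds(void)
{
#ifdef _WIN32
    LARGE_INTEGER freq, now;
    BOOL ok = QueryPerformanceFrequency(&freq);   /* counts per second */
    assert(ok);
    (void)ok;
    QueryPerformanceCounter(&now);                /* current count */
    return (double)now.QuadPart / (double)freq.QuadPart;
#else
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
#endif
}
```

To time a stretch of code, take one reading before and one after and subtract: double t0 = hires_seconds(); ... double elapsed = hires_seconds() - t0;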
03-02-2010 08:36 PM - edited 03-02-2010 08:37 PM
menchar is right.
Use an asynchronous timer if consistency is critical.
Check this thread "Encountered inconsistency 2ms for Async Timer"
(http://forums.ni.com/ni/board/message?board.id=180&message.id=43087&query.id=1686606#M43087)
if you need higher precision using an asynchronous timer.
03-03-2010 08:23 AM
03-03-2010 11:12 AM
Are you checking for EVENT_TIMER_TICK in the callback?
I remember being puzzled once by timer behavior, then I realized I wasn't ensuring the event causing the callback was really a timer tick. Stupid, I know, but it really had me fooled for a while.
Do you have other callbacks or processes running that might be stopping CVI events from being processed for several seconds at a time?
Do you have just one timer using that callback, or do multiple timers share the same callback function?
Is your main process multi-threaded? I wonder what CVI's timer function uses for the elapsed time - what system HW is it using - and is this HW process-specific or thread-specific? I wonder if some of the PCs you're using are multi-core and some are not, or maybe some AMD and some Intel.
I've used the Pentium performance counter instead of Timer() (which CVI is using to pass the "current time" in eventData1) with good results and better resolution than Timer().
What happens if you make your own Timer() call in the callback when it's entered? Does the time jump around then?
Menchar
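One way to answer the "does the time jump around?" question without eyeballing a log is to compare each timestamp with the previous one and flag any step that is not close to the timer interval. The sketch below is plain C and only illustrative: TickTracker and tick_is_jump are made-up names, not CVI functions, and the timestamp passed in could come from eventData1 or from your own Timer() call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracker for successive timer-callback timestamps. */
typedef struct {
    double last;       /* previous timestamp in seconds */
    int    have_last;  /* 0 until the first tick has been seen */
} TickTracker;

/* Returns 1 if the step from the previous tick differs from the expected
   interval by more than tolerance, 0 otherwise. The first tick is never
   reported as a jump because there is nothing to compare it with. */
int tick_is_jump(TickTracker *t, double now, double interval, double tolerance)
{
    int jump = 0;
    assert(t != NULL);
    if (t->have_last) {
        double err = (now - t->last) - interval;
        if (err < 0.0)
            err = -err;              /* absolute deviation from interval */
        jump = (err > tolerance);
    }
    t->last = now;
    t->have_last = 1;
    return jump;
}
```

With a 1 s interval and a 0.5 s tolerance, the step from 21474856.763494 to 25769825.039438 in the original post would be flagged, while the ordinary one-second steps before it would not.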
03-03-2010 11:24 AM
The gaps in your sequence seem to correspond to roughly 49.7 days (as measured in seconds). This is approximately the amount of time it takes for a millisecond counter to overflow a 32-bit integer. Now, assuming that your timer doesn't simply freeze for 49 days, this is clearly an incorrect value for the event data that is being passed to your callback. I'd love to be able to investigate and confirm what is happening, but not having one of those computers here at my disposal, I wouldn't be able to reproduce the problem.
I think you can rest assured that you're not experiencing a memory corruption issue. This is either a bug in the system's performance counter, or in the CVI runtime. Either way, you might want to disregard the value in eventData1. If you need the current time, you can always call the Timer function inside the callback.
Luis
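To make the arithmetic above concrete: a 32-bit millisecond counter wraps every 2^32 ms = 4294967.296 s, which is about 49.7 days. The first jump in the posted sequence (17179882.763494 to 21474851.048108) is about 4294968.28 s, i.e. one wrap period plus roughly one second. The sketch below is only an illustration in plain C. WRAP_SECONDS and remove_wrap_jumps are made-up names, not CVI functions, and it assumes the bogus jumps are always whole multiples of the wrap period.

```c
#include <assert.h>

/* Seconds for a 32-bit millisecond counter to wrap: 2^32 ms. */
#define WRAP_SECONDS (4294967296.0 / 1000.0)

/* Hypothetical helper: strip whole multiples of the wrap period from a
   raw difference between two timestamps, leaving the remainder closest
   to zero. */
double remove_wrap_jumps(double raw_delta)
{
    long long wraps;
    assert(raw_delta == raw_delta);   /* reject NaN before the cast */
    /* Round to the nearest whole number of wrap periods. */
    wraps = (long long)(raw_delta / WRAP_SECONDS
                        + (raw_delta >= 0.0 ? 0.5 : -0.5));
    return raw_delta - (double)wraps * WRAP_SECONDS;
}
```

Fed the first jump from the posted sequence, remove_wrap_jumps(21474851.048108 - 17179882.763494) comes out at roughly 0.99 s, which is what a once-per-second timer should report. An ordinary one-second step passes through unchanged.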
03-03-2010 11:32 AM
03-04-2010 03:05 PM
03-04-2010 07:22 PM - edited 03-04-2010 07:27 PM
I run into timing issues from time to time with NI library functions. NI is trying to be platform agnostic (Linux/Windows), so they sometimes come up with abstracted behavior, and it's hard to figure out how it really behaves and what it really does in terms of a specific OS (Winders in my case).
Delay is another function that seems to have curious behavior in some situations.
When I have problems, I try to bypass the NI function and use the OS implementation. It's easier to reason about, closer to the machine, and there's more help/experience/info available, at the possible cost of Linux <=> Windows portability, which we typically don't care about.
Glad you got it solved, though maybe you'll never know exactly why. Async timers run on their own thread, and they may also use a different HW mechanism for the timer.
From what I read about HW support for timers on PCs, it's not unknown for there to be a BIOS or HAL bug that can screw these up.
Having used HP minicomputers for a long time, I've always harbored doubts about HP's PCs, though the last few years they seem pretty solid; I'm using one now 😉 And Compaq always goofed the PC architecture for vendor lock-in.
And with true concurrency on multi-core micros, it could be that errors show up that don't show when running pseudo-concurrent on a single core.
Menchar
12-21-2010 06:19 AM
Hi,
I've gotten better results with an OS-specific implementation.
I built this solution for use with LabVIEW, but you can use it with CVI.
I monitored the time between calls, and the worst case was 3 ms.
Bruno Costa