12-15-2010 05:27 AM
Hi all,
We have a large program that communicates with Data Translation ADC/DAC boards and with an additional board via LAN. We compiled it, installed it, and it worked. After a week, it stopped working. The log files show that it simply stopped: it sent a message to the LAN, got an ack, and... that's it. I stress that this is compiled code that couldn't possibly have changed. Sometimes it stopped after sending one parameter and sometimes after another, but always in the same general area. Installing it on a different computer showed the same problem. We tried to debug the code: nothing has changed, and single-stepping and probes show no problem.
Anyone have any idea what could be causing this?
Thanks,
Danielle
12-15-2010 06:10 AM
Danielle,
"stop working" can have different flavors:
- It can hang due to an error within a component
- It can run in an infinite loop
- It can wait for certain events which do not occur
So maybe you should trace CPU load and memory usage.
If it gets stuck in the same area of code on different machines, you should focus your search on this part (but not solely on it!). Are there chances of race conditions? Do you use parallel loops, so that you can track whether only a certain loop stops iterating?
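To illustrate the idea of tracking which loop stops iterating, here is a minimal sketch in Python (not LabVIEW). Each loop reports a "heartbeat" once per iteration, and a monitor lists the loops that have gone quiet. The class name `Heartbeat`, the loop names, and the age threshold are all assumptions made up for this example.

```python
import threading
import time


class Heartbeat:
    """Records the last time each named loop reported progress (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = {}

    def beat(self, name, now=None):
        # Call once per loop iteration; `now` can be injected for testing.
        with self._lock:
            self._last[name] = time.monotonic() if now is None else now

    def stalled(self, max_age, now=None):
        # Return the names of loops whose last beat is older than max_age seconds.
        now = time.monotonic() if now is None else now
        with self._lock:
            return sorted(n for n, t in self._last.items() if now - t > max_age)
```

Logging the result of `stalled()` every few seconds would show whether the acquisition loop, the LAN loop, or both stopped iterating when the program hangs.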
hope this helps,
Norbert
12-15-2010 06:31 AM
Hi,
Thanks for your quick reply!
Infinite loops and changes in the components are not possible, as this is already compiled. However, waiting for an event that does not occur could be (part of) the problem. Changing a few unrelated parameters caused it to work again (though I don't know for how long). I suspect a timing issue: the event happens before we are ready for it, and then our program waits forever. This is not supposed to happen as we have a timeout, but perhaps...? Or a race condition that was solved by adding/changing our few parameters?
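The "event happens before we are ready" case can be sketched as follows. This is Python, not our LabVIEW code, and the class name `LatchedReply` is hypothetical. The point is that the signal sets a flag that is remembered, so a reply arriving before the waiter starts is not lost, and the wait always carries a timeout so it cannot block forever.

```python
import threading


class LatchedReply:
    """Remembers that a reply arrived, even if nobody was waiting yet (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._arrived = False

    def signal(self):
        # Called by the receiver when the reply comes in.
        with self._cond:
            self._arrived = True
            self._cond.notify_all()

    def wait(self, timeout):
        # Returns True if the reply arrived (before or during the wait),
        # False if the timeout expired first. Clears the flag afterwards.
        with self._cond:
            ok = self._cond.wait_for(lambda: self._arrived, timeout=timeout)
            self._arrived = False
            return ok
```

If the wait were written without the flag (a bare notification), a reply that arrives a little too early would go unnoticed, which matches the symptom of hanging right after an ack.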
Even though it works now I would like to solve this because it might happen again - this time on a customer's computer :(. We will try to find the race condition/timing problem...
Thanks again,
Danielle
12-15-2010 08:54 AM
Danielle,
I have been working with Data Translation (DT) boards and their code for a while now. It is sometimes not that easy. I have had memory leaks and overrun errors occasionally. I changed my buffering and added a lot more and bigger buffers than the system was estimated to need. This helped most of the time.
I suspect another reason for problems:
The DT code I use is based on DLLs that are called with CINs by a library of DT VIs. Many of those CINs are configured to be NOT threadsafe, which causes them to be executed in the UI thread of LabVIEW and with LabVIEW's priority. This _can_ cause long breaks whenever Windows has something with higher priority to do (networking, access to HD, CD etc.). So many large buffers can help here.
Another hint is to rethink the architecture of your app and decouple your DT actions from the LAN actions. Use a producer/consumer template that sends commands and data via queues or similar, so that time-critical actions are not hindered by potentially time-consuming tasks.
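A rough sketch of that producer/consumer idea, written in Python rather than LabVIEW just to show the shape. The function names `acquire`, `send_to_lan`, and `run`, the queue size, and the dummy data are all assumptions for illustration; in a real app the producer would read from the DT board and the consumer would do the network I/O.

```python
import queue
import threading


def acquire(out_q, n_blocks):
    # Stand-in for the time-critical acquisition loop (dummy data blocks).
    for i in range(n_blocks):
        out_q.put(("block", i))
    out_q.put(None)  # sentinel: tells the consumer there is no more data


def send_to_lan(in_q, sent):
    # Stand-in for the potentially slow LAN loop.
    while True:
        item = in_q.get()
        if item is None:
            break
        sent.append(item)  # a real consumer would send over the network here


def run(n_blocks=5):
    q = queue.Queue(maxsize=100)  # bounded buffer between the two loops
    sent = []
    consumer = threading.Thread(target=send_to_lan, args=(q, sent))
    producer = threading.Thread(target=acquire, args=(q, n_blocks))
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return sent
```

The queue absorbs short stalls on the LAN side, so the acquisition loop only blocks if the buffer fills up completely.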
HTH!
Greetings from Germany!
--
Uwe
12-15-2010 09:54 AM
Changing a few unrelated parameters can work if it's a race condition, as it'll force a recompile, which can (for the moment) fix the issue.
If it works when tracing but not at full speed, it can certainly be a timing issue.
Is it possible to post the code?
/Y
12-15-2010 10:23 AM
> "Only dead fish swim downstream"
Just curious: How do living fish get upstream?
12-15-2010 01:09 PM
@LuI wrote:
> "Only dead fish swim downstream"
> Just curious: How do living fish get upstream?
By going against the current and not following the main flow. 😉
Else I suppose their upstream is 28kb/s ... ;D
/Y