NI TestStand


TestStand Process Hangs on Wait Step

From time to time we have experienced TestStand hanging on a wait step configured for a constant time period (e.g. 0.2 seconds). The execution enters the wait step but never returns from it, hanging the execution thread indefinitely. The only way to clear this is to break the execution, then terminate, killing all threads, and restart TestStand. We are using TestStand 2010 SP1 running on Windows XP but have seen this issue on previous versions of TestStand (4.2 and 4.0). Most of the time we execute these sequences via a .NET operator interface based on the full-featured example shipped with TestStand, but I do not believe this is an operator interface issue, as it only occurs when executing wait steps. Memory and CPU usage look normal to me at the time of the lock-up, and we have experienced this on multiple PCs. It's just like TestStand enters a wait process but loses the plot and never returns.

 

Has anyone else experienced this sort of issue, or have any ideas on what may be causing it?

 

Regards,

 

David

0 Kudos
Message 1 of 26
(6,036 Views)

There may be something else hanging but because of a station option you're not seeing where the hang is actually occurring, e.g. you're in a post action callback that is hung. Try turning on all tracing and breakpoint options in Station Options and see where that gets you. Or it may be a bug...

CTA, CLA, MTFBWY
0 Kudos
Message 2 of 26
(6,031 Views)
There is no known issue with wait steps hanging. Can you break in the execution when this occurs? Does a regular terminate work? What is the wait step configured to do? Are there any other threads in the execution? If so, perhaps it's a different thread that is hung, not the one with the wait step? Can you reproduce the problem?

-Doug
0 Kudos
Message 3 of 26
(6,024 Views)

Hello Doug and SnowPunter. The hang appears to occur randomly, in that nothing we have tried has been able to reproduce the problem. We are running in a single thread, and the wait steps where the problem occurs are configured to wait a set amount of time (anything from 0.2 seconds to 60 seconds). When the hang-up occurs, the execution will not respond to a terminate command. We must pause the execution, then choose terminate. Even then we have to abort all to get control of TestStand back. There are no post action callbacks that I'm aware of. We have not tried running with tracing turned on because this slows down the execution way too much, and the problem happens so infrequently that we cannot justify doing that. When the problem does occur we can lose days of testing due to having to terminate the test and restart it. It's like TestStand disappears into the wait process and gets lost, never returning.

 

Regards,

 

David

0 Kudos
Message 4 of 26
(6,020 Views)

Do you perhaps have a SequenceFilePostStep callback of some sort? If so, it's possible it's hung inside of such a callback.

 

-Doug

0 Kudos
Message 5 of 26
(6,007 Views)

I have seen this issue intermittently as well, but with multiple step types and not just a wait.  The problem seems to be worse when we are running TS 2010 SP1 as an external process from CVI 2010 SP1.

 

When TS locks up, we are not able to click on ANY menu or icon and have to use Task Manager or CVI break to terminate. We are using many custom step types as well as pre/post calls. We have used these steps and sequences successfully with CVI 2009 and TestStand 4.2.1.

 

We are using Dell Precision T3500 workstations running Windows 7 and have updated all BIOS and drivers to the latest versions as of this post.

0 Kudos
Message 6 of 26
(5,954 Views)

We are still seeing this issue, although we had also experienced it in TestStand 4.2 (but not in earlier versions of TestStand; I have been using TestStand since the days of TestStand 2). There are no post step callbacks or post action steps of any kind. Almost all of our sequences run in single threads. The issue is, as far as I can see, a lock-up on a simple wait step configured for a time period. The problem is that it occurs so intermittently that it is difficult to pin down. For instance, we are running nine automated RF test stations, executing tests for up to 4 days at a time. There can be a lot of wait executions over this period of time, as we have to wait for the product we are testing to cool to within a reference temperature window. On average we may see one lock-up in a month, but when it occurs it can kill a 3 to 4 day test, requiring it to be restarted, which is very frustrating for the users.

 

We do execute via a run-time operator interface based on the NI example with a few modifications.  However, we have never had any problems running this, and as it is the wait step causing problems I cannot think it is just caused by the operator interface.

 

Regards,

 

David

0 Kudos
Message 7 of 26
(5,946 Views)

When you break the execution after this occurs, does the background of the execution turn yellow to indicate it is paused? Is the thread on the wait step the only TestStand thread running? Since you are not tracing, how are you determining that the thread is on a wait step? Is there any CPU usage showing for the process when this occurs?

 

Thanks for any info that would help us reproduce this problem.

-Doug

0 Kudos
Message 8 of 26
(5,939 Views)

Hi Doug,

 

No, the execution does not pause (and turn yellow); it hangs. If we hit the break button, then the step that the execution is stuck on is the wait step. It is not possible to resume execution after this point. We need to terminate TestStand using Task Manager. The thread executing the wait step is the only thread TestStand is executing at the time of the lock-up. There is nothing unusual about CPU and memory usage for the TestStand process at the time of the lock-up. It is just as simple as it can get: one thread running, the execution enters the wait step but never returns. The same step/sequence works 99.9% of the time, so it must be an execution-level issue. I have tried looking for external processes that might be affecting the execution but haven't seen anything definite to date. I did have a suspicion that the Windows Update service might have been causing some issues but have not been able to prove anything conclusively.

 

Regards,

 

David

0 Kudos
Message 9 of 26
(5,928 Views)

Are you running Windows Update on the machines while TestStand is actively running? Have you noticed the problem generally happening around the time of the update?

 

The execution not turning yellow indicates that it could not be considered suspended, which is odd because wait steps use Thread.ExternallySuspended. The execution should therefore turn yellow even if the wait step is hung internally, unless one of the following is true: another thread in the execution is still executing a code module; the hang is happening in a callback rather than in the wait step; or the hang is happening in the wait step after it sets Thread.ExternallySuspended to false (there are no such known issues, and not much code is involved after that point).
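
To make the cases above concrete, here is a toy Python model of that reasoning. It is only a sketch and does not use the TestStand API: `ThreadState`, `execution_looks_paused`, and the rule that an execution shows as paused only when every thread is suspended or externally suspended are assumptions inferred from the description above, not TestStand's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    """Toy stand-in for one thread of an execution (hypothetical, not a TestStand type)."""
    name: str
    suspended: bool = False             # stopped by a break request
    externally_suspended: bool = False  # models the Thread.ExternallySuspended flag


def execution_looks_paused(threads: list[ThreadState]) -> bool:
    """Assumed rule: paused (yellow) only if every thread is suspended or externally suspended."""
    return all(t.suspended or t.externally_suspended for t in threads)


# Case 1: a single thread inside a wait step that has flagged itself.
waiting = ThreadState("wait step", externally_suspended=True)
print(execution_looks_paused([waiting]))        # True

# Case 2: a second thread is still inside a code module.
busy = ThreadState("code module")
print(execution_looks_paused([waiting, busy]))  # False

# Case 3: the wait step has already cleared its flag and then hangs.
late_hang = ThreadState("wait step", externally_suspended=False)
print(execution_looks_paused([late_hang]))      # False
```

Under this assumed rule, a single-threaded execution that stays non-yellow after a break points to case 3 or to a hang outside the wait step, such as a callback.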

 

I'm assuming you aren't able to reproduce this and that it happens only very rarely, correct? If it happens again and you are able to do so, you might want to leave the machine running with TestStand hung and contact NI support for help with debugging the process.

 

We will also try to reproduce this on our end, perhaps while running Windows Update. Thanks for any additional info. Let me know if you have any questions that might help you narrow down what's happening.

 

-Doug

0 Kudos
Message 10 of 26
(5,918 Views)