problems executing Switch Executive connect/disconnect in parallel (LabVIEW)

AnkeS · ‎03-06-2007

My LabVIEW application operates two separate relays of a PXI-2569 board using the Switch Executive niSE Connect/niSE Disconnect VIs.
I have defined the respective routes in my niSE Virtual Device.
Called separately the connect/disconnect VIs work just fine.

In my current Application there is a loop containing a niSE Connect of Route1 and a niSE Disconnect of Route2.
Both calls are not put in any sequence, so they may execute in parallel.
Repeatedly connecting or disconnecting the same route results in an error message, but that is accepted - I am not discussing application design here.

What worries me is the fact that every now and again the niSE Connect function returns the error message -2904 ("You cannot disconnect a route that does not exist") that clearly belongs to the niSE Disconnect call. It seems that both calls somehow interact on a driver level and then show unpredictable results.

I can put the calls in sequence within this small application, but it leaves me rather scared to use Switch Executive VIs in a context where switching activities go on in different parts of a larger program.

I am using LabVIEW 7.1 and SwitchExecutive 2.1

Anyone care to comment?

Thanks
Anke

chericks1 · ‎03-07-2007

Hi Anke,

You have found an issue with NI Switch Executive when it is used in a multi-threaded application. The problem is caused by the Switch Executive session handle that the two VIs (niSE_Connect and niSE_Disconnect) share. The design of your program allows these VIs to execute in separate threads, so the VI that runs first (say niSE_Disconnect) can be interrupted, and its error message can then be picked up when the second VI (niSE_Connect) executes.

Underneath the hood, the calls to niSE_Connect and niSE_Disconnect are both atomic and thread-safe. If you look at their subVI implementation, however, you'll notice that the call to retrieve the error information is made separately, and with parallel execution you cannot guarantee that the error retrieval happens immediately after the call to niSE_Connect/Disconnect. As you have witnessed, niSE_Connect can thus return niSE_Disconnect's error and vice versa. Because the execution order of niSE_Connect and niSE_Disconnect is not deterministic within a multi-threaded application, this problem is likely to occur unless you take steps to prevent it.
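The race described above can be sketched in a few lines of Python (the thread itself is about LabVIEW block diagrams, which cannot be shown as text). This is only a toy model: `FakeSession`, its methods, and the connect error code `-1` are made up for illustration and are not the NI Switch Executive API. Only the code -2904 and its message come from the original post. The two parallel callers are written out as one explicit interleaving so the outcome is reproducible.

```python
# Toy model only: FakeSession is NOT the NI Switch Executive API.
class FakeSession:
    """One session with a single 'last error' slot shared by all callers."""

    def __init__(self):
        self.connected = set()
        self.last_error = None

    def connect(self, route):
        """Atomic call: returns a status and stores details in the shared slot."""
        if route in self.connected:
            # The code -1 and its text are hypothetical.
            self.last_error = (-1, "route already connected (hypothetical)")
            return -1
        self.connected.add(route)
        return 0

    def disconnect(self, route):
        if route not in self.connected:
            # Code and message as quoted in the original post.
            self.last_error = (-2904, "You cannot disconnect a route that does not exist")
            return -2904
        self.connected.remove(route)
        return 0

    def get_error(self):
        """Separate call: returns whatever error was stored most recently."""
        return self.last_error


session = FakeSession()
session.connect("Route1")                # Route1 is now connected

# One possible interleaving of two parallel callers, step by step:
status_a = session.connect("Route1")     # caller A: connect fails
status_b = session.disconnect("Route2")  # caller B: disconnect fails, overwrites the slot
error_seen_by_a = session.get_error()    # caller A: looks up the details too late

print(status_a, error_seen_by_a)
```

In this interleaving caller A's connect fails for its own reason, yet the details it retrieves are the -2904 text left behind by caller B's disconnect.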

Are you designing your NI Switch Executive code with the intention of creating a multi-threaded application? In other words, is it a requirement to have niSE_Connect and niSE_Disconnect calls in separate threads? If multi-threading is not a requirement, the problem can be avoided by performing the niSE_Connect and niSE_Disconnect steps sequentially, which is easily enforced by chaining the error cluster or session from VI to VI. If separate threads are necessary, one workaround is to use semaphores within your LabVIEW code to ensure that the calls to niSE_Connect and niSE_GetError (or niSE_Disconnect and niSE_GetError) happen together before any other niSE function is called. The Semaphore palette is found by right-clicking on the block diagram and selecting Functions>>Data Communication>>Synchronization>>Semaphore.
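The semaphore workaround can be sketched in Python as follows, with a `threading.Lock` playing the role of the LabVIEW semaphore. Everything here is an assumption made for illustration: `FakeSession`, `guarded_call`, and the connect error code `-1` are hypothetical names, not the NI Switch Executive API. The point shown is only that the call and its error lookup sit inside one critical section.

```python
import threading


class FakeSession:
    """Toy stand-in for a driver session; not the NI Switch Executive API."""

    def __init__(self):
        self.connected = set()
        self.last_error = None  # one slot shared by all callers

    def connect(self, route):
        if route in self.connected:
            self.last_error = (-1, "route already connected (hypothetical)")
            return -1
        self.connected.add(route)
        return 0

    def disconnect(self, route):
        if route not in self.connected:
            self.last_error = (-2904, "You cannot disconnect a route that does not exist")
            return -2904
        self.connected.remove(route)
        return 0

    def get_error(self):
        return self.last_error


session = FakeSession()
session_lock = threading.Lock()  # plays the role of the semaphore
mismatches = []


def guarded_call(operation, route):
    """Run one call and its error lookup as a single critical section."""
    with session_lock:
        status = operation(route)
        error = session.get_error() if status != 0 else None
    return status, error


def worker(operation, route, repeats):
    for _ in range(repeats):
        status, error = guarded_call(operation, route)
        # Record any error whose code does not match the failing call.
        if error is not None and error[0] != status:
            mismatches.append((status, error))


threads = [
    threading.Thread(target=worker, args=(session.connect, "Route1", 1000)),
    threading.Thread(target=worker, args=(session.disconnect, "Route2", 1000)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("misattributed errors:", len(mismatches))
```

Because no other caller can run between the operation and `get_error`, each caller only ever sees the error its own call produced. The calls still fail for their own reasons; the lock only fixes which error is reported.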

Let me know if you have further questions or if this does not resolve your issue!

Chad Erickson

Switch Product Support Engineer

NI - USA

AnkeS · ‎03-08-2007

Hi Chad,

thanks for your elaborate answer.

Yes, I shall have to use NI Switch Executive VIs in a multi-threaded application.
Implementing a synchronization mechanism however presents no problem, as I already use a wrapper VI around the niSE_Connect/Disconnect functions to extract the route name string from an enumeration type.

But I still wonder why the Switch Executive VIs are marked reentrant by NI, when they definitely don't work that way?!

Anke

Oli_Wachno · ‎03-08-2007

Hi Anke,

you said you don't want to discuss architectural issues, but is it really necessary to have those connects/disconnects in parallel?

But you're right, it sounds strange. Can you post an example screenshot of the block diagram? How did you distribute the SE reference to the VIs?

BTW: choosing the multiconnect option can resolve the errors you experience when performing consecutive "Open Routes" or "Close Routes" calls.

Oli

OK, it seems I've missed Chad's answer (I hate this browser)... sorry

Message Edited by Oli_Wachno on 03-08-2007 03:12 PM

Srđan_Zirojević · ‎03-08-2007

Anke,

just to touch on the reentrancy issue - if the VI is reentrant, that only means something for instances of the same VI, and not for instances of the connect VI versus the disconnect VI. Even then, reentrancy only deals with the data held by the VI - not with the data held by an instrument driver instance (session).

So, reentrant or not, that won't help you with the synchronization issues you have with this parallel execution. If you are willing to share some broad aspects of your application, it may be a good learning experience for all of the audience (as well as you and us here at NI) to try to come up with either an alternative design for your application, or a way to handle your design in the most feasible way.

best regards,

Srdan Zirojevic

AnkeS · ‎03-09-2007

Oli, Srdan,

Yes, I know about reentrancy and a function's local data space.
In practice, being reentrant also means that more than one instance of a function may execute at the same time.
For the niSE VIs I now believe this is not the case - not because of local data but because the VIs make calls to the instrument driver that must not be made out of sequence.
So - why make them reentrant in the first place??

I simply took the niSE VI's reentrancy as an indication that there would be no trouble using them in a multithreading environment. I was wrong!

Oli suggests using the multiconnect option.

Actually that got me into trouble already:
Imagine a loop that executes say once every second, run that loop for 5 minutes repeatedly connecting a single route, and then try to disconnect that route. You will have to call the disconnect function 300 times - undoing every single connect - before your switches open again!
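The behaviour described here is what one would expect if multiconnect mode keeps a reference count per route. A minimal Python sketch under that assumption follows; `RefCountedRoutes` is a hypothetical toy class, not the NI Switch Executive API.

```python
class RefCountedRoutes:
    """Toy model of reference-counted routes; not the NI Switch Executive API."""

    def __init__(self):
        self.counts = {}

    def connect(self, route):
        # Every connect on the same route adds one reference.
        self.counts[route] = self.counts.get(route, 0) + 1

    def disconnect(self, route):
        if self.counts.get(route, 0) == 0:
            raise RuntimeError("route is not connected")
        self.counts[route] -= 1

    def is_connected(self, route):
        """The relays stay closed while at least one reference remains."""
        return self.counts.get(route, 0) > 0


routes = RefCountedRoutes()
for _ in range(300):  # one connect per second for 5 minutes
    routes.connect("Route1")

routes.disconnect("Route1")
connected_after_one = routes.is_connected("Route1")

for _ in range(299):
    routes.disconnect("Route1")
connected_after_all = routes.is_connected("Route1")

print(connected_after_one, connected_after_all)
```

A single disconnect leaves 299 references outstanding, so the route only opens after the 300th disconnect. A loop that reconnects an already connected route therefore needs its own bookkeeping if multiconnect is used.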

Does an application need to call connect/disconnect functions simultaneously in different threads?

I agree that in a good data flow driven design this should be avoided, if possible.

The small application I was working on when I stumbled upon the error above certainly falls into that category. I can easily put my calls of niSE_Connect/Disconnect into a sequence and make everything work OK that way. But I also want to know why the niSE VIs behave the way they do, to avoid further trouble in the future.

We use LabVIEW and some switching and measurement (among others) hardware to test our products. Usually a test specification gives a sequence of test steps to be performed one after the other. So no problems there...
BUT
during the tests there are other tasks, like monitoring and controlling environmental conditions, that are performed in different parts of the program.

I believe in this case a multitasking/multithreading design is the right approach.

How to implement synchronization?

I don't want to put any extra synchronisation functions into my test or monitoring tasks.
That leaves 2 solutions I can think of

1.
Modify all niSE Switch Executive VIs that call a sequence of instrument driver functions.
I would perhaps use a semaphore that is created in the niSE_Open_Session.vi, stored in an LV2-style global, and destroyed in the niSE_Close_Session.vi.
(I wish NI had done something like that!)

2.
Put a wrapper VI around the niSE VIs.
Synchronisation may then be done using a semaphore, or by a single non reentrant VI to execute all possibly dangerous niSE VIs - one at any time.
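The second variant of option 2, a single non-reentrant VI that executes every critical niSE call, has a rough Python analogue in a wrapper that hands all calls to one worker thread. The sketch below assumes that analogy holds; `FakeSession`, `SerializedSwitch`, and the connect error code `-1` are hypothetical and not part of the NI Switch Executive API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeSession:
    """Toy stand-in for a driver session; not the NI Switch Executive API."""

    def __init__(self):
        self.connected = set()
        self.last_error = None  # one slot shared by all callers

    def connect(self, route):
        if route in self.connected:
            self.last_error = (-1, "route already connected (hypothetical)")
            return -1
        self.connected.add(route)
        return 0

    def disconnect(self, route):
        if route not in self.connected:
            self.last_error = (-2904, "You cannot disconnect a route that does not exist")
            return -2904
        self.connected.remove(route)
        return 0

    def get_error(self):
        return self.last_error


class SerializedSwitch:
    """Hypothetical wrapper: every call runs on one worker, one at a time."""

    def __init__(self, session):
        self._session = session
        # A single worker thread executes submitted jobs strictly in turn.
        self._worker = ThreadPoolExecutor(max_workers=1)

    def _job(self, operation, route):
        # The call and its error lookup form one job, so nothing can interleave.
        status = operation(route)
        error = self._session.get_error() if status != 0 else None
        return status, error

    def connect(self, route):
        return self._worker.submit(self._job, self._session.connect, route).result()

    def disconnect(self, route):
        return self._worker.submit(self._job, self._session.disconnect, route).result()

    def close(self):
        self._worker.shutdown(wait=True)


switch = SerializedSwitch(FakeSession())
mismatches = []


def caller(method, route, repeats):
    for _ in range(repeats):
        status, error = method(route)
        if error is not None and error[0] != status:
            mismatches.append((status, error))


threads = [
    threading.Thread(target=caller, args=(switch.connect, "Route1", 200)),
    threading.Thread(target=caller, args=(switch.disconnect, "Route2", 200)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
switch.close()

print("misattributed errors:", len(mismatches))
```

The test and monitoring tasks call `connect` and `disconnect` without any synchronisation code of their own, which matches the stated wish to keep such functions out of those tasks.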

Any more ideas ??

Thanks for your comments
Anke

Oli_Wachno · ‎03-09-2007

Hi Anke,

to be honest, I'm also not very happy with this multiconnect mode, so I don't use it.

Regarding the synchronization issue, it would be good to know which priority the monitoring tasks have and if there are constraints on the total testing time of your DUT. For example, I can imagine having a "monitoring daemon" that runs time-triggered on the one hand, but stays in standby/suspend mode as long as there is no semaphore. This can be implemented, for example, with a named single-element queue: upon completion or before start, your test writes an element to the named queue, idles for an acceptable amount of time (depending on your allowable test times; if you have mechanical loading/unloading to wait for, put it there) and then checks the queue for an available element (and removes it). The monitoring daemon runs in parallel, monitoring the named queue on a time-triggered basis (e.g. check the queue every 2 minutes, remove the element to lock NISE against the tests, put the queue element back after completion and go to standby again for the next two minutes).

Does that make sense / did I explain that clearly enough (I fear I didn't)?
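One simplified reading of the single-element queue idea, sketched in Python: the queue holds one token, and whoever removes the token owns the switch until the token is put back. The names `token_queue`, `run_with_token`, and `use_switch`, as well as the timing values, are assumptions made for this sketch; in LabVIEW the same role would be played by a named queue of size one.

```python
import queue
import threading
import time

token_queue = queue.Queue(maxsize=1)  # stands in for the named single-element queue
token_queue.put("switch-token")       # token present means the switch is free
log = []
results = {}


def run_with_token(name, work, timeout_s):
    """Take the token, do the work, put the token back. False on timeout."""
    try:
        token = token_queue.get(timeout=timeout_s)
    except queue.Empty:
        return False  # someone else holds the switch; try again later
    try:
        work(name)
    finally:
        token_queue.put(token)  # always release, even if the work raised
    return True


def use_switch(name):
    log.append((name, "start"))
    time.sleep(0.01)  # placeholder for the real switching activity
    log.append((name, "end"))


def task(name):
    results[name] = run_with_token(name, use_switch, timeout_s=5.0)


threads = [threading.Thread(target=task, args=(n,)) for n in ("test", "monitor")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(log)
```

Whichever task gets the token first runs to completion before the other starts, so the "start" and "end" entries of one task are never split by the other. The timeout lets the monitoring side give up and retry on its next time trigger instead of blocking forever.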

Have a nice weekend!

Oli

Srđan_Zirojević · ‎03-09-2007

Anke,

I've read your description, and I agree that you would definitely benefit from parallel execution, since your application really needs to operate on independent switches concurrently. (I assume that those are different switches, and I will elaborate on this below.)

Just to quickly address some of your questions:

The driver VIs are marked reentrant to prevent artificial/unnecessary locks by LV if you are trying to call the same VI, but on a different session or with different (non-colliding) input parameters. There are no limitations on NISE C entry points - they are thread-safe internally and will run atomically, and if you plan your app carefully you may benefit from parallel execution.

I could talk more about multiconnect mode and try to understand why that one does not work for you, but we may talk about that after we remove this major issue from the table.

I would like to point out that the only thing wrong in the calls you do is the error that is reported. I.e. the calls to connect, disconnect, etc. legitimately failed. They failed because of the application design, and not because of the threading mishaps. The threading can be blamed only on the incorrect errors reported, not on the errors themselves.

So, that said, are you really developing your application in a way that you expect errors to happen and react to them? If yes, I would strongly recommend alternative approaches. First, NISE's performance is optimized for the correct case. Error cases may take longer to execute because of the unwind semantics we implement. Second, and this is my subjective view, the code is easier to write and maintain if the flow of function calls is expected to work in most cases, and failures are exceptions to handle rather than part of the normal program flow.

That said, I would like you to pay attention to the failures, and understand that those failures will not go away if you implement any/all of the sync objects you wrote about.
I mentioned at the beginning something about accessing different portions of the switch. That should always work in parallel. But if you are accessing the same relays, then you must ensure that, for correct behavior, you are able to perform those operations.

I have some other ideas about what may go wrong (i.e. you will get no errors but you will still get wrong behavior) if you are blindly hitting the same relays in parallel.

So, please tell me more about the two parallel procedures (at a high level, of course) and what they do, and what portions of the switch they bang on. I want to understand why you get any errors in the first place.

best regards,

Srdan Zirojevic

Srđan_Zirojević · ‎03-27-2007

Anke,

Did you figure out anything more about this problem? It did seem interesting that the behavior was erroneous to begin with. Please let us know if you're still seeing the same behavior, and whether or not you attempted to figure out what was at fault.

best regards,

Srdan Zirojevic

AnkeS · ‎03-28-2007

Chad, Oli, Srdan,

thanks for the help I got here!!
I was able to solve my immediate problems and I believe I've got a pretty good idea about how the SwitchExecutive API is actually working and where the pitfalls are.

I'm sorry I was not able to respond sooner. Although a perfectionist at heart, unfortunately I am paid to deliver results in time and within budget.

Having said that, I would still like to continue this discussion, maybe at a slower pace, as my questions seem to have touched quite a number of interesting issues.

I did prepare some code examples and would like to include the screenshots in the text rather than use an (invisible) attachment.
I'm still working out how to do that.

Anke
