Multifunction DAQ

wls connection reliability issue

I'm facing a serious reliability issue trying to use the WLS-9211 and WLS-9215 over a dedicated WiFi network in infrastructure mode.

Specifically, I've been able to configure the units over a wired Ethernet connection, following the clear instructions in the documentation. To keep things simple, I chose static IP addresses for them, and I tried all the available authentication choices: none, WEP, WPA-PSK and WPA2-PSK.

In every configuration I'm unable to get a satisfactory, reliable connection, even at short range between the access point and the WLS units (same desk). I also tried the simplest setup: a single WLS with a single access point, with only the WLS under test associated with the wireless network.

I used three different access points: D-Link DWL-2100AP, Netgear DG834G and Proxim AP2000.

I tried forcing the access points to 802.11b only, then to 802.11g only, but nothing seems to help.

Basically I see a very unstable connection: even at a low level, checking with ICMP, I get ping replies with times ranging from 1 ms to several seconds. Very often the WLS acts as if its TCP/IP stack hangs: no reply to ICMP, even though the WLS link LED remains active.
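To put numbers on this kind of instability, the ping results can be reduced to a loss percentage and min/max/mean round-trip times. The following is only a minimal sketch: the function name `summarize_rtts` is hypothetical, a timed-out request is represented as `None`, and the sample values are illustrative, not measurements from the units discussed here.

```python
from statistics import mean


def summarize_rtts(rtts_ms):
    """Summarize ping results; None marks a request that timed out."""
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    summary = {
        "sent": len(rtts_ms),
        "lost": lost,
        "loss_pct": 100.0 * lost / len(rtts_ms) if rtts_ms else 0.0,
    }
    if replies:
        # Round-trip statistics are computed over answered requests only.
        summary.update(
            min_ms=min(replies), max_ms=max(replies), mean_ms=mean(replies)
        )
    return summary


# Illustrative values only: two timeouts and one multi-second reply.
example = summarize_rtts([1.0, 3.0, None, 2500.0, None, 2.0])
print(example)
```

Running the same summary against a known-good client on the same access point gives a baseline to compare with.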

I see the same situation in two totally different environments (where I work and at home).

I tried changing the channel, looking for the least crowded one. I'm practically unable to get a long-term stable connection with the units. If I fire up a different client (a simple laptop: a Dell Latitude D800 with Intel Pro Wireless 2100, or a Dell Latitude D820 with Intel Pro Wireless 3945ABG), I get stable, short ping round-trip times (min 1 ms, max 45 ms, mean 1 ms).

When I use the WLS as the client, the situation is so bad that I often lose connectivity even during configuration in MAX. When I try to acquire with SignalExpress 3.0 in continuous mode, I get frequent connectivity errors.

If I use the wired connection, everything works as expected. At the moment, as a partial workaround, I'm acquiring through an added wired/wireless bridge (a D-Link DWL-2100AP in client configuration). With this workaround I can read the unit without problems.

We bought the WLS to use its WiFi connection. At the moment that's not possible. I know that the WLS is based on an Atheros wireless chipset. If I remember correctly, the D-Link and the Proxim AP2000 are also Atheros based. I also tried an old Orinoco 802.11b radio unit in the Proxim AP2000, without improvement. I don't know which reference design the Netgear access point is based on, but I'm quite sure it's not Atheros. I've played with every parameter available, including short and long preamble.

I would like to know whether anyone has been able to use these units reliably, without disconnections, and if so which access point to use to solve the issue as soon as possible (I would change AP immediately if I knew a model that works for sure).

What else to say? Well, after a disconnection, especially with the D-Link AP, I need to power cycle the WLS to regain connectivity. If I change the channel on the AP, I need to power cycle the unit too. This is strange, because in a multi-AP environment it means that the WLS cannot roam from one AP to another (they are usually on different channels).

It has been almost an entire week since I started this challenge with these units. I have faced many problems with WiFi devices in the past (I manage a medium-size network, with multiple SSIDs, multiple VLANs and so on). Now I'm supporting a project that needs to acquire data through these WLS units. If there are no hidden workarounds to solve the problem, such as updated firmware for the units or some kind of "magic formula", I'm afraid that I'll go for the ENET-plus-WiFi-bridge solution.

Hope to get some useful advice.

Thank you.

Mirko. 

Message 1 of 24

I performed further tests on the issue discussed in this topic.

First of all, I also tried a Proxim AP4000 access point configured from scratch with WPA2-PSK (AES).

Even with this approach I've been unable to keep a constant connection with the WLS devices.

As a last resort I configured the WLS for ad-hoc connectivity, also trying different channels.

I ran the test in two totally different environments.

The first used a USB 2.0 WiFi adapter (SMCWUSB-G) installed on the host PC running NI SignalExpress (Dell OptiPlex GX260). I statically assigned the IP address 192.168.1.1 to the WiFi adapter. The WLS was statically configured with IP 192.168.1.58.

Even with the devices a short distance apart I've been unable to record a log without interruption (only several minutes of continuous data streaming, at a low sample rate on the order of 2 Hz).

At home I used a laptop, again with the well-known Intel 3945ABG, and the same addressing.

I had slightly better results at home, but the logging was still interrupted (again at a slow acquisition rate, on the order of 2 Hz).

The only solution that we are going to deploy is to bypass the WiFi feature of the WLS, connecting each unit directly by wire to a dedicated AP configured as a client (D-Link DWL-2100AP, with WPA2-PSK AES), one for each WLS. In this way the logging is continuous and reliable.

Another observation about the WLS: if you power cycle the access point they are associated with, they don't reconnect to it when it restarts, even if they weren't "active" at the time (no active logging from the host).

To test everything, I also tried switching the regulatory domain configured on the WLS from ETSI (Italy) to FCC (US), because here in Italy we have 13 channels instead of the 11 available in the United States. Even with this attempt I've been unable to get a working solution.

I rule out a problem with the network configuration on the host side. During the tests with an access point, I was able to keep connectivity with its management IP (located on the same IP network). The default setup on the Dell OptiPlex host uses a wired NIC with a statically assigned address on the same IP network as the WLS. For this reason, a background reconfiguration by the Windows XP WZC (Wireless Zero Configuration) service cannot have caused the connectivity issue. There was no wireless adapter on the host during those tests, and it was directly connected to the access point.

I hope to get at least a simple "reference" configuration, with the model of an AP with which NI is able to operate these WLS devices without issues.

At the moment I'm deploying the two WLS units that I have using the WiFi-to-Ethernet bridge workaround already described. I won't be able to perform other tests on the devices.

Thank you.

Mirko.

Message Edited by Mirko1024 on 11-12-2008 02:43 AM
Message 2 of 24

Hello Mirko1024!

 

Thanks for your posts and comments on the new WiFi DAQ. The situations you are seeing are very interesting. Could you give some data on the number of disconnections you see with your WLS-9211/9215? As you know these are very new products, and R&D would of course like to hear feedback as we always strive to improve our products. The reason I ask is that I can't really account for the problems you are seeing as a whole. For instance, my setup is:

 

Dell Precision 380 Desktop PC with PCI NIC card

Wireless Netgear WGT624 4 port router

WiFI DAQ 9215 

 

I have been running a test for 12 hours, and I saw two disconnection errors during the test. The error in particular was:

 

Error -50400 occurred at an unidentified location

Possible reason(s):

The transfer did not complete within the timeout period or within the specified number of retries.

 

I have to be honest, though: when I saw these errors I was watching The Office on nbc.com, so I think that pretty much destroyed my bandwidth. To address your points more specifically, in your post you mention the following:

 

When I use the WLS as client, the situation is so bad that I often lose the connectivity also during a MAX configuration. When I try to acquire by means of SignalExpress 3.0 in continuous mode, I get frequent errors for connectivity reason.

 

So how long do these connection breaks last, and what errors do you see? I have to admit that when I first began running my tests I had my Netgear router set to channel 11. With that setting I noticed frequent errors from the WiFi DAQ 9215, and I later found out that our entire company infrastructure runs on this channel, so I was seeing error -50400 almost every 30 minutes or so. Then I changed the Netgear router to channel 2 and this fixed all my problems. I have been running tests for 2 hours, 4 hours and 5 hours at rates of 1,000 Hz, 25,000 Hz and 100,000 Hz (the max rate of the 9215) and did not see any dropped communication.

However, this is not to say that I didn't see any network lag. There is some RF testing going on in the area next to me, and every time they fired up a test I would see maybe a 2 or 3 second delay in my data before it updated on my LabVIEW VI. This is of course expected, because all the RF signals are sharing the same air space. When the RF tests stopped, my acquisitions went back to normal.

I have also never seen the problem you mention, where you power down the AP and the WiFi DAQ 9215 never reconnects. I would strongly advise checking the settings here, as I have never seen this problem with Netgear, Linksys or Cisco routers or APs, and in particular not with the Netgear WGT624 that I am using. I constantly move my setup to different locations, and the WiFi 9215 reconnects every time, no matter which security mode is in use.
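As a rough aid for choosing a channel, the spacing between 2.4 GHz channels can be worked out directly. This is a minimal sketch under stated assumptions: channels 1 to 13 have centre frequencies of 2407 + 5n MHz, and the 22 MHz default is the nominal 802.11b channel width. The function names are illustrative, and the result says nothing about how busy a channel actually is at a given site.

```python
def center_mhz(channel):
    """Centre frequency in MHz of a 2.4 GHz channel in the range 1 to 13."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be between 1 and 13")
    return 2407 + 5 * channel


def channels_overlap(a, b, width_mhz=22):
    """True if two channels of the given width share part of the spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


print(center_mhz(2), center_mhz(11))  # centres of the two channels mentioned above
print(channels_overlap(2, 11))        # far apart: no shared spectrum
print(channels_overlap(11, 9))        # only two channels apart: they overlap
```

With these assumptions, two channels are clear of each other only when they are at least five channel numbers apart, which is why 1, 6 and 11 are the usual non-overlapping set.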

 

I should also mention that there is always a possibility of losing communication when it comes to wireless devices. All I do in my LabVIEW programs to account for these errors is some simple error handling. It requires that I restart the task, but if I am just doing machine monitoring this is not a big concern, as it takes milliseconds to restart a task. If you need real-time monitoring of a system, then I would strongly advise against using wireless DAQ. I do want to promote NI products, but the WiFi industry standards are not perfect by any means and can be unreliable at times. This is just the nature of wireless.

 

I would like to help you with your WiFi problems, so please let me know how frequent your errors are. Also, what version of LabVIEW are you using? I have developed a program to handle wireless communication errors that might be great for you. Let me know if you would like to try it out! Let us know your findings and have a great day!

 

Cheers!

 

Corby_B

Message 3 of 24

I would also like to mention that my WiFi DAQ 9215 was about 3 to 4 meters away from my AP for all my tests. I hope some of this information helps and let me know!

 

Cheers!

 

Corby_B

http://www.ni.com/support 

Message 4 of 24
Hi Corby.
Thank you for your kind reply to my questions.
To give you more information about the issues I was dealing with, I must say that I'm totally new to NI devices. I am, however, quite experienced with networking and wireless connectivity issues, because I manage the network over which the data acquired by the WLS will be transferred.
To explain the situation in detail: our acquisition PC isn't located in the field where the measurements have to be taken.
The measurements come from a site about 1.5 km away from the university where I work (as network manager, by the way).
We have 4 K-type thermocouples acquired through the WLS-9211, and an E+E Elektronik humidity/temperature transmitter (EE21-FT3A21/T24) that outputs its values as voltages in the 0-10 V range, acquired through two channels of the WLS-9215.
On the measurement site we have a D-Link DWL-2100AP access point configured with WPA2-PSK, because the wireless connection is necessary. This is the access point that the WLS should associate with to access the network.
The Ethernet port of the access point is the uplink to the wired port of another DWL-2100AP configured as a bridge and fitted with a directional antenna. This device is paired with another unit in line of sight, placed in a good location at our university about 1.5 km from the corresponding "on field" unit. From there, a dedicated VLAN extends the network up to the acquisition PC, the Dell OptiPlex GX260. This system has two network cards. One of them is for the standard network connection. The other one, a 3Com 3C905C-TXM, is connected to an access port of the dedicated VLAN. We decided to keep the LAN for the measurement devices totally isolated, to avoid issues caused by other Ethernet traffic on the wireless bridge link. I used two NICs instead of a single one configured as an 802.1Q trunk for two VLANs, because this way I was able to perform configuration and tests before deploying the solution, connecting directly through the second NIC on the OptiPlex.
To test the solution, I wasn't going to study all the documentation available for LabVIEW, even though I'm sure an experienced user could set up the solution in a matter of minutes. I don't have LabVIEW installed on the OptiPlex. I used the CDs supplied with the WLS: I installed SignalExpress and fired it up, configuring everything as explained. I was able to log data without issues when the WLS units were connected over Ethernet.
The problems arose when I switched to wireless.
I won't repeat in this post all the tests that I've done and explained previously.
At the moment I've deployed the solution, including the SignalExpress acquisition, by adding a couple of D-Link DWL-2100AP units configured as clients of the access point available in the field. They are connected to the ENET-9163 by short 50 cm patch cables.
Without them the wireless association of the WLS was too weak. I was unable to log data continuously as required by the application.
Now, I believe that with LabVIEW it should be possible to poll the devices at a specified rate while managing errors and connectivity issues. But if I lose connectivity with the WLS (no ping reply), I'm almost sure that no "error handling procedure" can overcome the issue. First of all I must get a reliable network connection with the device, one that recovers automatically from unwanted disconnection or disassociation. The WLS firmware at the moment is unable to give me this recovery, even though I tried 4 different access points.
Just to give an example: if I change the channel on the access point, the WLS won't be able to keep the connection after the reconfiguration. I need to power cycle the WLS for it to associate successfully with the access point again.
The solution should be able to buffer data in the WLS unit for the short time during which the Ethernet (WiFi) connection is unavailable.
I'm not acquiring at 100 kS/s. I could even rely on 1 sample every minute, but I've been unable to find a way to configure SignalExpress to give me reliable results (with buffering of the data acquired during disconnections). I'm almost sure that LabVIEW should permit this approach (if the WLS carrier is able to buffer the data).
As for your question, the connection breaks are quite short when the WLS is able to reconnect: on the order of 1 missed ping at a 1 s rate.
When the WLS remains isolated, well, the answer is "connection break undefined".
I noticed that with the D-Link access point (based on Atheros, like the WLS) the disconnections were more disruptive (no reconnection). I experienced reconnections only with the Netgear DG834Gv3 (based on T.I.) and perhaps with the Proxim AP4000 (again based on Atheros).
Naturally I tried changing the acquisition rate (mainly to get a sample period of about 1 s). In every condition I got at least error -50405.
In any case, even without acquisition, and just using ICMP, I've seen long reply times on the order of hundreds of ms from the WLS. In this condition even MAX had trouble connecting to the devices.
I've sometimes been able to reduce these problems by changing channel. But in the tests performed in two different locations the underlying issues were always present. I also ran tests varying the distance of the WLS from the access point (I tried 50 cm, 1 m, 5 m and 40 m). During the tests there were no other wireless devices nearby (such as the PtP bridge mentioned). I had the AP directly connected to the NIC of the PC.
Now I can't perform other tests, because the solution is deployed and I must improve the bandwidth of the bridge link to avoid problems with the data acquired from the ENET-9163 when there is other activity on the link (we need to manage another remotely deployed device, which is basically a remote desktop for an embedded application taking some other measurements in the field). It's strange that your WLS reassociates reliably when you power cycle just the AP, while I had totally unreliable results. But now I can't check it any further (and in any case I worked on them for an entire week, without a solution).
If I may ask, how long would it take someone completely new to LabVIEW, like me, to poll the devices from the OptiPlex PC once every minute and log the measurements to a text file?
If I could implement this easily, we have a license for LabVIEW 8.5 (even though I don't know anything about it). I think that the final user of the data would be satisfied with the results I could collect through LabVIEW, because perhaps the SignalExpress solution is too sensitive to connection interruptions. I would like to retry the acquisition if for some reason it fails. I've read in your posts that this should be easy to do. After all, we don't need "real time" processing of the data. It's enough to get the measurements with their timestamps logged, at about 1 sample every minute. I think that perhaps I could look directly at your "program", if it isn't too hard to understand. I don't know the learning curve of LabVIEW...
The user needs to write the data to a database in order to publish them graphically (through an HTTP daemon). The daemon should read the data from the database. I should give him the data coming from the field. If it is easy, I could write directly to the database (I don't know whether LabVIEW has options for dealing with database formats). Otherwise a simple file-based data log should also be good enough for the user.
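To make the logging requirement concrete, here is a minimal sketch of timestamped samples written to a database that a web front end could read. It is written in Python with the standard `sqlite3` module purely as an illustration; it is not LabVIEW code and does not describe LabVIEW's own database support. The table layout, the channel names and the functions `open_log` and `log_sample` are all assumptions made for the example.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path):
    """Open (or create) the sample log. The schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "taken_at TEXT NOT NULL, channel TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def log_sample(conn, channel, value, taken_at=None):
    """Store one measurement together with the time it was taken (UTC)."""
    taken_at = taken_at or datetime.now(timezone.utc)
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO samples (taken_at, channel, value) VALUES (?, ?, ?)",
            (taken_at.isoformat(), channel, float(value)),
        )


# In-memory database and made-up readings, for demonstration only.
conn = open_log(":memory:")
log_sample(conn, "thermocouple_0", 21.4)
log_sample(conn, "humidity", 48.0)
rows = conn.execute("SELECT channel, value FROM samples ORDER BY rowid").fetchall()
print(rows)
```

Because every row carries its own timestamp, a reader of the database can tell gaps caused by lost connectivity from periods with data.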
Thank you.
Mirko.

Message 5 of 24

I encountered a similar problem, too.

 

The device I used is a WLS-9234. I connected my PC to an access point with a cable, and through that AP had the WLS-9234 acquire data over WiFi in infrastructure mode. But I usually got an error while measuring, and then it disconnected automatically.

 

I also did some tests, like "ping -l 65500 [WLS-9234 IP]", and there is no answer from the WLS-9234, only the "request timed out" message. Why?


What is the communication port number? Does the WLS-9163 communicate over UDP or TCP?


If it is TCP, do you have any suggestions for the TCP window size and MTU settings?


Thank you.

 

Cathy

 

Setting Info :
Radio Mode : 802.11g
Wireless Mode : Infrastructure
Authentication : OPEN
Radio Channel : 2
Distance (AP to WLS-9234) : 5 m
Antenna : 7 dBi omnidirectional
Measurement : 4 channels with 10k sampling rate and 5k sample points, time out 50 sec.
Software : LabVIEW 8.2.1

Error Message: -50405 occurred at DAQmx Read (Analog 1D Wfm NChan NSamp). No transfer is in progress because the transfer was aborted by the client. The operation could not be completed as specified.

Message 6 of 24

Hello Mirko and Cathy!,

 

Thanks for your posts!  

 

Mirko, I would first like to say that even though SignalExpress is a bit easier to use and great for someone with little knowledge of LabVIEW, it does have some limitations. For instance, the error handling that you can do in LabVIEW is far better than in SignalExpress. It seems that I am experiencing error -50405 less frequently than you, but all I do to handle it is clear the error and then set up the DAQmx task again. Once I clear the error, I wait a short time, maybe 5 to 10 seconds, to make sure that my WiFi DAQ has successfully reconnected to the access point (AP). Take a look at the attached picture. It should be able to guide you in making your LabVIEW program even if you have little experience. I am going to try out some things on the access points you mentioned you were using. I have never seen the problem where the WLS-9163 doesn't reconnect to the AP after a disconnect (error -50405). I have been using a 4-port wireless Netgear WGT624 router and several Linksys routers without a problem.
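For readers without the attached picture, the recovery pattern described above (discard the failed task, wait a few seconds, build the task again, retry the read) can be sketched in text form. This is a Python illustration, not the attached LabVIEW VI and not an NI API: `create_task`, `read_once` and `clear_task` are hypothetical callables that stand in for the driver calls, and the attempt limit is an assumption.

```python
import time


def read_with_recovery(create_task, read_once, clear_task=lambda task: None,
                       max_attempts=5, wait_s=5.0, sleep=time.sleep):
    """Read once; on failure clear the task, wait, rebuild it and try again."""
    task = create_task()
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return read_once(task)
        except Exception as err:  # real code should catch the driver's own error type
            last_error = err
            clear_task(task)      # the old task is not reused after an error
            if attempt == max_attempts:
                break
            sleep(wait_s)         # give the device time to rejoin the network
            task = create_task()
    raise RuntimeError(f"read failed after {max_attempts} attempts") from last_error


# Simulated device: the first two reads fail, the third succeeds.
calls = {"created": 0, "reads": 0}


def fake_create():
    calls["created"] += 1
    return object()


def fake_read(task):
    calls["reads"] += 1
    if calls["reads"] < 3:
        raise TimeoutError("simulated transfer error")
    return [0.1, 0.2]


data = read_with_recovery(fake_create, fake_read, wait_s=0.0)
print(data, calls)
```

The wait between attempts is a parameter so that it can be set to the 5 to 10 seconds mentioned above, or to zero in a test.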

 

Cathy,

 

The WLS-9163 with a 9234 inserted does support UDP and TCP/IP. Here is the link to the manual that talks about this in more detail.

 

NI WLS/ENET-9163 User Guide and Specifications

 

Also take a look at the attached picture to see if it helps with your programming approach. The attached picture shows how to do some simple error handling in case you see a disconnection between your router and the WLS-9163 carrier.

 

Let me know if any of this helps, and I wish you both the best with your applications. Mirko, I am going to see if I can get hold of some different access points to address your situation more specifically. Have a great day!

 

Cheers!

 

Corby_B

Message Edited by Corby_B on 11-30-2008 09:14 PM
Message 7 of 24

Hi Corby,

thank you for your kind reply.

Regarding your suggestion to use LabVIEW in place of SignalExpress, I agree that it makes it possible to manage errors and connectivity issues. Nevertheless, in my networking experience it is best to investigate and solve problems at the lowest layer whenever possible.

Now, it seems clear to me that if replacing the wireless section provided by the WLS-9163 carrier with the simple workaround I described previously (a DWL-2100AP configured as a client) solves the instability of the wireless connection, then there is some kind of "unknown" interaction between (at least) the AP and the WLS. For example, the cause of the problems I was facing could be the second (AUX) antenna being left enabled for "diversity" operation in the WLS firmware. Where possible, I disable diversity operation on devices that have the option to select it. Diversity requires specific support, and also some kind of antenna tuning. After disabling it, the connection is usually much more reliable.

In any case, since the solution is deployed and working at the moment, I don't have the option of going back to test the native, direct WLS-to-AP connection.

Instead I'm trying to implement some kind of upper-layer recovery that gives us reliable data collection even if the network is temporarily unavailable (let's say, to overcome possible interruptions of the bridge link).

I understand that to get this result I must go with the LabVIEW solution, and I will try to use the provided picture to implement it as a first stage.

At the moment I have already installed LabVIEW 8.5, and I've been able to read from the WLS in a few simple steps. Now I'm looking for something that gives me a sample every 10 minutes and can buffer data for half a day (something like 72 samples) if the network connection becomes unavailable for some hours. The problem is that I'd like to log the data to a text file or, even better, through a database framework, but from what I've read I must wait for the buffer to be full before reading and visualizing the data. This would give me a latency of 12 hours, which would not be good. It would be better to "wait" for temporarily unavailable data (buffered on the WLS-9211/9215) and acquire it in a burst when connectivity is re-established. I'm quite sure that by looking through the examples I'll find a suitable solution, but in the meantime I'll try the "error handling example" that you kindly provided, also to become more familiar with LabVIEW.
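The behaviour I have in mind is a store-and-forward buffer: each sample is forwarded immediately while the link is up, held while it is down, and delivered in a burst once connectivity returns. The sketch below is only an illustration of that idea in Python. It does not describe any buffering actually implemented in the WLS firmware or in DAQmx; the class name, the `send` callable and the use of `ConnectionError` to signal a dead link are assumptions.

```python
from collections import deque


class StoreAndForward:
    """Hold samples while the link is down and deliver them in order later."""

    def __init__(self, send, capacity=72):
        self._send = send
        # Bounded buffer: when it is full, the oldest sample is dropped.
        self._pending = deque(maxlen=capacity)

    def submit(self, sample):
        """Queue a new sample and try to deliver everything that is pending."""
        self._pending.append(sample)
        return self.flush()

    def flush(self):
        """Send pending samples oldest first; stop at the first failure."""
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # link still down: keep the sample for the next attempt
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)


# Simulated link that is down for the first three samples.
link = {"up": False}
received = []


def send(sample):
    if not link["up"]:
        raise ConnectionError("link down")
    received.append(sample)


buf = StoreAndForward(send)
for i in range(3):
    buf.submit(i)
link["up"] = True
buf.submit(3)
print(received, len(buf))
```

With a capacity of 72 and one sample every 10 minutes, the buffer covers a 12-hour outage, and while the link is up the latency is a single sample instead of a full buffer.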

Thank you again.

 

Cathy, thank you for your post. I've tried it with my setup and I can confirm what you see: I don't get replies to ICMP requests of a similar size either, and I'm using the carrier over a wired Ethernet connection. But it could be some kind of "by design" choice.
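For a sense of what such a ping asks of the receiver, the request has to be split into many IP fragments that the device must reassemble before it can reply. The calculation below is a minimal sketch assuming a standard 1500-byte Ethernet MTU, a 20-byte IPv4 header without options and the 8-byte ICMP echo header. Whether the carrier declines to reassemble that many fragments is not confirmed anywhere in this thread.

```python
import math


def icmp_fragments(payload_bytes, mtu=1500, ip_header=20, icmp_header=8):
    """Number of IPv4 fragments needed for one ICMP echo request."""
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    total = payload_bytes + icmp_header
    return math.ceil(total / per_fragment)


print(icmp_fragments(65500))  # the size used with "ping -l 65500"
print(icmp_fragments(32))     # the default payload size of the Windows ping
```

A small embedded TCP/IP stack may have limited memory for reassembly, so testing with payloads just below and just above one fragment (1472 bytes under these assumptions) would show whether fragmentation is the deciding factor.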
Mirko.

Message 8 of 24

Hello everybody,

 

I have a problem with the WLS-9206. When the connection to the WLS is lost, even for a short time, the DAQmx Read function returns an error. I have set the timeout to 10, and even to -1 (infinite wait), but DAQmx Read gives the same error.

 

This is the error message: Error -88710 occurred at DAQmx Read (Analog 2D DBL NChan NSamp)

 

I also checked with a simulated device: when I delete it from Measurement & Automation Explorer and immediately add it again, I see the same problem from DAQmx Read.

 

I expected that after removing the device and adding it again, DAQmx Read would be able to read and continue normal operation. In other words, it seems that a timeout of -1 is not working at all.

 

I would appreciate it if somebody could answer me. Thanks!

Message 9 of 24

Hello rezaravani,

 

Thanks for your post. 

 

The first thing I should mention is that we have new firmware out for the WLS-9163 carrier that significantly improves network connectivity and streaming in saturated wireless environments. You can find this new firmware here.

 

I see that you are getting error -88710 when you perform a DAQmx Read. Are you removing the device from the system and then re-adding it while the VI you attached to this forum is still running? This error would occur if a device is deleted from MAX and then re-added with the same name, or if the device dropped out of wireless communication and then came back. When this happens, it may be the same device and even have the same name, but the reference that was opened to the device is no longer the same. So when you encounter an error at a DAQmx Read, you need to clear the task and the error and start again by calling DAQmx Create Virtual Channel.vi and the corresponding configuration VIs.

 

So in the simulated device situation, if you leave the VI running, delete the device from MAX and then read from it while the VI is still running, I would expect error -88710, because the device that you initially opened a reference to was removed. The -1 timeout on the DAQmx Read function is therefore working correctly; the error occurs because the device has been removed from the system. In other words, the DAQmx driver can no longer guarantee the status of the device you are using, because its physical location and/or situation has changed.

 

A good rule to follow with DAQmx is that any error you see needs to be handled. So any time you see a buffer overflow, an underflow or a device disconnected from the system (as could be the case with wireless), you should clear the current task and try again, as described above. See the VI I have attached below for a good example of how to do this.

 

Let me know if this helps with your application and some concerns you might have about using our wireless and DAQ products. Feel free to post back if you have any questions at all. 

 

Cheers

 

Corby Bryan

NIC AE-Specialists

Message 10 of 24