LabVIEW

Error 66 with DataSockets

Hello Megan,

After about 4 error-free days, yesterday I had 3 occurrences of this problem. I was at the computer for two of them (the third brought down the data collection in the middle of the night), and was therefore able to find out some more information.

-As per your suggestion, I had put a second while loop into one of my VI's, and during the slow-down this second loop was unaffected. Therefore I think it is only the DataSocket VI's, and not all LabVIEW processes, that experience the problem.

-As far as I can tell, the error does not seem to originate at the same VI every time. The VI's which run fast loops tend to "notice" the slow-down first, but of these there doesn't seem to be an individual culprit. Because each of my VI's contains dozens to hundreds of DataSocket reads/writes, it is not really feasible for me to tailor the error checking for every call to a DataSocket VI in order to pinpoint which exact call is the first to get into trouble. Instead, error information (including date, time and error number) is passed from one DataSocket VI to the next, then written to a text file and front panel indicators once every loop iteration.

-We only have these two computers running LabVIEW, so at the moment I can't check to see if a third computer can access the DataSocket server. However, when the slow-down is afflicting computer #1, the VI's running on computer #2 (which read and write to the DataSocket server running on computer #1) do not seem to be affected; I do not know this for sure, but I've now added timing indicators to the computer #2 VI loops, so the next time the error happens I'll be able to find out whether or not their DataSocket actions are affected. I have also written a simple VI on computer #2 that queries any one of the DataSockets (on computer #1's DataSocket server) -- next time I see the error occur, I'll check to see if this VI can still access them.

-Perhaps the most enlightening clue yet: the first time this problem happened yesterday, I was not taking any data, so only 4 VI's were running on computer #1 and 1 VI on computer #2. When the slow-down occurred, all 4 VI's on computer #1 were affected, but the VI on computer #2 seemed to be ok. I stopped the VI running on computer #2, and immediately all of the VI's on computer #1 recovered. I could then restart the VI on computer #2, and everything was back to normal. When the problem happened again later on in the day (again affecting only VI's running on computer #1), this time with several more VI's running on both computers, stopping the same single VI as before once again caused the system to recover. The VI in question does not do anything particularly dangerous; just a few DataSocket reads and writes in a loop that waits 200 ms at the end of each iteration. Also, computer #2 did not seem to be having any trouble that might cause these reads/writes to go slower than usual.

-Last week I poked around a bit and found out that both computers had hyperthreading enabled. After reading about some badness between LabVIEW and hyperthreading on these forums (probably old version of LabVIEW, however), I disabled hyperthreading on computer #2 with no noticeable effect. Disabling hyperthreading on computer #1 degraded performance so heavily that no data could be taken (our data is heavily-processed in real time), so I re-enabled it.

One of the most frustrating things about this error is the fact that I can't cause it to occur on command: it just occurs sporadically on its own. I tried today to push the DataSocket server as hard as my VI's are capable, but it performed just fine.

Hopefully these additional pieces of the puzzle will give you a more complete picture,

Patrick

Message 11 of 18
(2,450 Views)
Patrick,
Can you tell us more about what this VI on computer 2 is doing (from bullet point 4)? I find it really interesting that resetting this VI frees up the server. I know we've gone over the computer stats on the server, but what about memory / CPU usage on computer 2?
 
Chris C
Message 12 of 18
(2,439 Views)
Hi Chris,
 
I've attached the computer #2 VI (called "DewNerveCentre") to this message. Basically, it lets computer #1 (called "Max") open & run and stop & close VI's on computer #2 (called "Dew"). These VI's are called "RTA", "DSS" and "RTAconf". The DataSocket server is running on computer #1. One loop iteration of DewNerveCentre goes like this:
 
#1. Wait 200 ms
#2. Read from the boolean DataSockets "spawnRTA", "spawnDSS", "killRTA", "killDSS" and "killRTAconf".
#3. a) if all of these boolean DataSockets are false and/or the "spawnkilldone" DataSocket is true, write false to "spawnkilldone".
      b) if any of the boolean DataSockets are true and the "spawnkilldone" DataSocket is false, open & run or stop & close the appropriate VI, then write true to the "spawnkilldone" DataSocket.
#4. Read from the boolean "DNCcleanup" DataSocket.
#5. a) if "DNCcleanup" is false, write false to the "DNCcleanupDone" DataSocket.
      b) if "DNCcleanup" is true, make sure the run-state DataSockets for the computer #2 VI's match their actual run-state. Then write true to "DNCcleanupDone" and false to "DNCcleanup".
#6. Handle any errors that occurred in the DataSocket reads/writes during this iteration.
#7. Check to see how long this iteration took.
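
For readers without the attached VI, the iteration above can be sketched roughly in Python. This is only an illustration of the described logic, not the actual block diagram: the dict `ds` stands in for the DataSocket server, and `act` and `wait_s` are hypothetical names for whatever opens/closes the VI's and for the 200 ms wait. Step 6 (error handling) is left out.

```python
import time
from typing import Callable, Dict

# Request flags read in step 2 (names taken from the post).
SPAWN_KILL_FLAGS = ["spawnRTA", "spawnDSS", "killRTA", "killDSS", "killRTAconf"]


def nerve_centre_iteration(
    ds: Dict[str, bool],
    act: Callable[[str], None],
    wait_s: float = 0.2,
) -> float:
    """Run steps 1-5 and 7 once and return how long the iteration took.

    `ds` is a plain dict standing in for the DataSocket server (assumption);
    `act` is a hypothetical callback that opens/runs or stops/closes a VI.
    """
    start = time.monotonic()
    time.sleep(wait_s)  # step 1: wait 200 ms

    # Step 2: read the boolean request flags.
    requests = {name: ds.get(name, False) for name in SPAWN_KILL_FLAGS}
    done = ds.get("spawnkilldone", False)

    # Step 3a: nothing requested, or the last request was already handled.
    if not any(requests.values()) or done:
        ds["spawnkilldone"] = False
    else:
        # Step 3b: carry out each request, then report completion.
        for name, requested in requests.items():
            if requested:
                act(name)
        ds["spawnkilldone"] = True

    # Steps 4-5: optional cleanup pass.
    if ds.get("DNCcleanup", False):
        act("cleanup")  # hypothetical: re-sync the run-state flags
        ds["DNCcleanupDone"] = True
        ds["DNCcleanup"] = False
    else:
        ds["DNCcleanupDone"] = False

    # Step 7: report the iteration time.
    return time.monotonic() - start
```

In the idle case (branches 3.a and 5.a), each iteration amounts to six reads and two writes of boolean items.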
 
All of the occurrences of the "slowdown problem" that I have observed happened when DewNerveCentre was not trying to open or close anything; i.e. 3.a) and 5.a) were executing.
 
In terms of computer stats, both computers are 3.2 GHz Pentium 4 single-processor machines, with 1 GB of RAM. On computer #2, when we're not taking data (i.e. only DewNerveCentre is running), LabVIEW is using 0-2% of the processor and ~25 MB of memory. When we are taking data (DewNerveCentre + RTA + sometimes DSS) LabVIEW's processor usage oscillates around 60-75% and the memory usage sits at ~450 MB. No other programs are using any sizeable amount of CPU power or memory. Of course, I've seen the error 66 problem occur both when we are taking data and when we are not, so I don't think CPU/memory issues on computer #2 are at fault.
 
Patrick
Message 13 of 18
(2,436 Views)

Hi Patrick,

I still have the problem using DS on a DualCore-machine (DCM):

  • First start the DSS on a DCM
  • Link several readers (on the same machine) to the DSS
  • When the readers appear on the diagnostic screen of the DSS, link writers from a remote PC on the same items

The connections of the readers are closed (with error 56) when the writers connect! I tested this behaviour on different DCMs: on a new Dell PC (only WXP SP2 and LV RTE 8.0.1, firewall disabled) the problem only occurred when I activated dual-core processing in the BIOS. With dual-core processing deactivated, everything was OK.

If your problem has a similar origin, perhaps you can try to avoid connecting the readers before the writers. In that case I had no problem. Anyway, I'm still waiting for a real solution to the problem.

Matthias

Message 14 of 18
(2,427 Views)

Hi Patrick,

Thanks for providing us with so much information!  Here's another idea.  I see that if you stopped and restarted the VI, that the system recovered--does this happen with closing and reopening the DataSocket connection?  Although this is not an ideal solution, perhaps you can catch the error and simply close and reopen the DataSocket programmatically.  This method would at least minimize the downtime of your data collection.
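
As a rough illustration of that catch-and-reopen idea (a Python sketch rather than LabVIEW code), the pattern could look like the following. The `Connection` interface, `open_conn` and `max_reopens` are hypothetical stand-ins for DataSocket Open/Read/Close, and `OSError` stands in for whatever error the read reports.

```python
from typing import Callable, Protocol, Tuple


class Connection(Protocol):
    """Hypothetical stand-in for an opened DataSocket connection."""

    def read(self) -> object: ...

    def close(self) -> None: ...


def read_with_reopen(
    conn: Connection,
    open_conn: Callable[[], Connection],
    max_reopens: int = 1,
) -> Tuple[object, Connection]:
    """Read once; on failure, close and reopen the connection and retry.

    Returns the value and the connection that produced it, so the caller
    can keep using the fresh connection on later iterations.
    """
    reopens = 0
    while True:
        try:
            return conn.read(), conn
        except OSError:
            if reopens >= max_reopens:
                raise  # give up and let the caller log the error
            reopens += 1
            try:
                conn.close()  # best effort: the link may already be dead
            except OSError:
                pass
            conn = open_conn()
```

Capping the number of reopen attempts keeps a persistent fault from turning into an endless reconnect loop inside the acquisition loop.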

Best Regards,
Megan B.
National Instruments

Message 15 of 18
(2,416 Views)

Matthias,

We do not have a dual core machine (DCM) at the moment, but our computer #1 (on which the DataSocket server runs) does have hyperthreading enabled, so I thought it was worth a try to attempt to reproduce your error. Following the steps you described, I was unable to cause any errors. Additionally, I tried with both dynamic and pre-defined DataSockets, various combinations of readers and writers on both computers (including having multiple writers to the same DataSocket on both computers simultaneously, but only for pre-defined DataSockets -- it's not allowed for dynamically created DS's), with different orders of linking and unlinking the readers and writers to the server. Again, no errors occurred. Also, my problem seems to happen after hours or days of continuous operation, rather than when a reader or writer is first connected.

 

Megan,

We have not had any further occurrences of this error since last Wednesday (the last time I posted new information), so I haven't been able to perform any new tests.

For the future, I have inserted some code into the potential "problem" VI ("DewNerveCentre") which will reset the DataSockets read/written from that VI when the user presses an appropriate button. For each DataSocket, the new code does a DataSocket Open, a DataSocket Close, then a DataSocket Open again. Does this sound reasonable? I do that first DataSocket Open in order to get the datasocket id that is needed by the DataSocket Close. At this stage it's not feasible to do a similar reset of all of the DataSockets, because there are just too many of them (>500). However, I'm hoping that the next time we get this error, simply resetting the DataSockets accessed by DewNerveCentre will get around the problem (just as stopping it currently does).

In fact, in all of the rest of my VI's I never explicitly run any DataSocket Opens or Closes -- I just run DataSocket Reads and Writes when I need to, and therefore the connections are automatically created or destroyed in the DataSocket server. Furthermore, all of my DataSockets are pre-defined in the DataSocket Server, so in practice they are never created or destroyed.

 

Patrick

Message 16 of 18
(2,406 Views)
Hello all,
 
A quick update: We had another occurrence of this problem last Thursday, and resetting the DataSockets in my "problem" VI did not resolve the matter. Once again, only stopping and restarting execution of the VI fixed things.
 
So, we know that during one of these problems:
-the DataSocket server is running on computer #1
-DataSocket reads/writes coming from computer #1 VI's slow down greatly
-LabVIEW execution is fine otherwise
-it is still unknown whether the DataSocket reads/writes coming from computer #2 slow down, but I suspect that they do not
-stopping the "DewNerveCentre" VI (the "problem" VI) on computer #2 causes the whole system to recover, even if other VI's on computer #2 are still reading/writing to DataSockets
-simply resetting the DataSockets in DewNerveCentre has no effect
 
Frustrated by the loss of data, I implemented a crude "band-aid" solution -- a new VI that watches for a slowdown in the most sensitive VI on computer #1 and restarts the DewNerveCentre on computer #2. Communication for this new VI is done through file reads and writes, so it is independent of the DataSocket server. This new mechanism seems to have rescued our system once Thursday night; and although we have been taking data steadily through the weekend, we have not had any further slowdowns.
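
The general shape of such a file-based watchdog, sketched in Python under assumptions of my own: the monitored loop is assumed to write its latest iteration time (in seconds) to a text file, and the restart request is assumed to be a second file that something on computer #2 polls. The function names, file names and threshold are all hypothetical.

```python
from pathlib import Path


def slowdown_detected(timing_file: Path, limit_s: float) -> bool:
    """Return True if the last reported loop time exceeds `limit_s`.

    A missing or half-written file is treated as "no data", not a slowdown.
    """
    try:
        return float(timing_file.read_text().strip()) > limit_s
    except (FileNotFoundError, ValueError):
        return False


def request_restart_if_slow(
    timing_file: Path, restart_file: Path, limit_s: float
) -> bool:
    """Write a restart request file when a slowdown is seen.

    Only files are used, so the check does not depend on the
    DataSocket server being responsive.
    """
    if slowdown_detected(timing_file, limit_s):
        restart_file.write_text("restart DewNerveCentre\n")
        return True
    return False
```

A periodic caller would run `request_restart_if_slow(Path("loop_time.txt"), Path("restart_request.txt"), 1.0)` every few seconds, with the paths pointing at a location both computers can reach.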
 
Patrick

Message Edited by PMCR on 07-04-2006 03:34 PM

Message Edited by PMCR on 07-04-2006 03:35 PM

Message 17 of 18
(2,381 Views)

The solution is to add this code at the end of the URL on both sides, in the URL of the server and the URL of the client:

 

?sync=”true”

 

Add this and your VI's will work without disconnecting.

Message 18 of 18
(1,473 Views)