10-26-2015 08:21 AM - edited 10-26-2015 08:23 AM
Hello all,
To the few brave souls who clicked on this thread, (thank you) here's what we've got:
I'm assisting in debugging a particularly elusive crash in some LabVIEW code:
Shoot... I just realized he has a thread open, too: (Slightly embarrassing on my part)
http://forums.ni.com/t5/LabVIEW/Microsoft-Visual-C-Runtime-Library-Crashes/td-p/3205010
TestStand is calling a complex bit of LabVIEW code, as alluded to in this thread:
http://forums.ni.com/t5/LabVIEW/Exact-Meaning-of-Resetting-VI-Dialog-Specific-Context/td-p/3206889
At varying, unpredictable intervals we get a Microsoft Visual C++ Runtime Library error dialog stating that LabVIEW.exe has requested the runtime to terminate it in an unusual way.
My colleague here managed to catch an instance of the crash, and I'm now going through the dump file.
(The following is for the benefit of anyone else who ever has to do this)
.load wow64exts
.effmach x86
!analyze -v
That yielded the exception analysis (full text in attached file)
The first few lines of which look like:
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

GetPageUrlData failed, server returned HTTP status 404
URL requested: http://watson.microsoft.com/StageOne/LabVIEW_exe/12_0_1_4014/522f27d1/KERNELBASE_dll/6_1_7601_18869/556363bc/e06d7363/0000c42d.htm?Retriage=1

FAULTING_IP:
KERNELBASE!RaiseException+58
7669c42d c9              leave

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000007669c42d (KERNELBASE!RaiseException+0x0000000000000058)
   ExceptionCode: e06d7363 (C++ EH exception)
  ExceptionFlags: 00000001
NumberParameters: 3
   Parameter[0]: 0000000019930520
   Parameter[1]: 000000000d0bf9c4
   Parameter[2]: 00000000022ea4a8
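For anyone reading such a record by hand: exception code e06d7363 is what the Microsoft C++ runtime raises for a C++ throw. Its low three bytes spell "msc" in ASCII, and Parameter[0] = 19930520 is the magic number that runtime stores in the record. The snippet below is only a minimal Python sketch of that decoding, using the values from the record above; the function names are hypothetical and not part of WinDbg or any NI tool.

```python
# Values copied from the exception record in the dump.
EXCEPTION_CODE = 0xE06D7363
PARAMETER_0 = 0x19930520

# Magic number the MSVC runtime places in Parameter[0] of a C++ exception.
MSVC_EH_MAGIC = 0x19930520


def decode_exception_tag(code: int) -> str:
    """Return the three characters packed into the low 24 bits of the code."""
    raw = (code & 0x00FFFFFF).to_bytes(3, "big")
    return raw.decode("ascii", errors="replace")


def is_msvc_cpp_exception(code: int, param0: int) -> bool:
    """True if the code/parameter pair looks like an MSVC C++ 'throw'."""
    return (
        (code >> 24) == 0xE0
        and decode_exception_tag(code) == "msc"
        and param0 == MSVC_EH_MAGIC
    )


print(decode_exception_tag(EXCEPTION_CODE))  # prints: msc
print(is_msvc_cpp_exception(EXCEPTION_CODE, PARAMETER_0))  # prints: True
```

In other words, the dump records an unhandled C++ exception thrown somewhere below LabVIEW.exe, not an access violation, which is consistent with the runtime's "terminate in an unusual way" dialog.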
This is killing our tests pretty frequently, and the crash occurs much more often the more test sockets there are. (Yeah, this might be better served in the NI TestStand forum, but it's a LabVIEW crash...)
I know that this will probably end up as a support ticket, but can anyone give me a direction to go in? Is it a driver issue? .NET?
Is there any particular behavior or action that I should be looking for in the LabVIEW code?
We're trying to isolate the offending LabVIEW code, but in the meantime, any advice would be much appreciated.
Thank you so much,
Jim
10-27-2015 06:41 AM
Alright, it seems that this thread is turning into a tutorial of sorts. (Or at least that's how I'm now redeeming it)
Again, for anyone who ever has the unfortunate need to do this, once you've performed the above, enter the command "kb" in WinDbg to reveal the call stack. Alternatively, you can go to View->Call Stack from the upper menu.
For me, that yielded the following:
0:023:x86> kb
ChildEBP RetAddr  Args to Child
0d0bf970 7412df60 e06d7363 00000001 00000003 KERNELBASE!RaiseException+0x58
0d0bf9a8 00f082ce 0d0bf9c4 022ea4a8 09f292b8 msvcr90!_CxxThrowException+0x48 [f:\dd\vctools\crt_bld\self_x86\crt\prebuild\eh\throw.cpp @ 161]
WARNING: Stack unwind information not available. Following frames may be wrong.
0d0bf9d4 00f0f23b 577278e8 195b1478 00000000 LabVIEW!DevClose+0x199ae
0d0bf9f0 013b6590 09f292b8 195b1478 0d0bfa64 LabVIEW!AutoClose+0x432b
0d0bfa00 04cda803 09f292b8 195b1478 00000000 LabVIEW!ClearTDR+0x1a40
0d0bfa64 04ccea06 09f292b8 195b1478 00000000 tdcore_12_0!TD::CopyData+0x63
0d0bfbe0 04cd0556 09f292b8 2f408e69 00000000 tdcore_12_0!LvVariant::SetContents+0x546
0d0bfe4c 04ccfef4 2f408e38 00000000 00000000 tdcore_12_0!LvVariant::SetContents+0xd6
0d0bfe74 04cd0ea4 2f408e38 595e99b4 00000000 tdcore_12_0!LvVariantCopyWithContext+0x74
0d0bfea4 01ce1597 04491180 00000000 51b974d0 tdcore_12_0!LvVariantCopy+0x14
0d0bff58 01cc6f78 04491380 00000001 76c614db LabVIEW!GetCurrentExecutingVIPath+0x6f17
0d0bff74 1005a6e9 00000100 00000000 00000000 LabVIEW!EnqPrRunQ+0x428
0d0bff88 76c6337a 0e0c45c8 0d0bffd4 775392e2 mgcore_SH_12_0!ThLocalStorageFreeKey+0x39
0d0bff94 775392e2 0e0c45c8 7a689632 00000000 kernel32!BaseThreadInitThunk+0xe
0d0bffd4 775392b5 1005a6c0 0e0c45c8 00000000 ntdll_77500000!__RtlUserThreadStart+0x70
0d0bffec 00000000 1005a6c0 0e0c45c8 00000000 ntdll_77500000!_RtlUserThreadStart+0x1b
Because I don't work for NI and quite understandably don't have access to their symbols, I only have addresses to look at in the above. However, things are thankfully rather intuitively named, for the most part. (Thank you, NI developers)
I now have a lot of clues about where the crash is occurring.
Therefore, I started searching the LabVIEW code for anything corresponding to the functions above. I found only one instance that matches, and consequently have a very educated guess as to where the crash is occurring: a dynamic call of a VI using the "Start Asynchronous Call" node.
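If the stack is long, the frame-reading step can be done with a few lines of script. What follows is a rough Python sketch, not an official tool: it takes an excerpt of the kb output from this thread (source-file annotation dropped), extracts module, symbol, and offset from each frame, and counts frames per module. The regular expression and function name are my own assumptions about the kb layout shown here.

```python
import re
from collections import Counter

# Excerpt of the kb output from this thread.
KB_OUTPUT = """\
0d0bf970 7412df60 e06d7363 00000001 00000003 KERNELBASE!RaiseException+0x58
0d0bf9a8 00f082ce 0d0bf9c4 022ea4a8 09f292b8 msvcr90!_CxxThrowException+0x48
0d0bf9d4 00f0f23b 577278e8 195b1478 00000000 LabVIEW!DevClose+0x199ae
0d0bf9f0 013b6590 09f292b8 195b1478 0d0bfa64 LabVIEW!AutoClose+0x432b
0d0bfa00 04cda803 09f292b8 195b1478 00000000 LabVIEW!ClearTDR+0x1a40
0d0bfa64 04ccea06 09f292b8 195b1478 00000000 tdcore_12_0!TD::CopyData+0x63
0d0bfbe0 04cd0556 09f292b8 2f408e69 00000000 tdcore_12_0!LvVariant::SetContents+0x546
0d0bfea4 01ce1597 04491180 00000000 51b974d0 tdcore_12_0!LvVariantCopy+0x14
0d0bff58 01cc6f78 04491380 00000001 76c614db LabVIEW!GetCurrentExecutingVIPath+0x6f17
"""

# Five 8-digit hex columns (ChildEBP, RetAddr, three args), then module!symbol+0xoffset.
FRAME_RE = re.compile(
    r"^(?:[0-9a-f]{8}\s+){5}(?P<module>[^!\s]+)!(?P<symbol>\S+?)\+0x(?P<offset>[0-9a-f]+)"
)


def parse_frames(text):
    """Return a list of (module, symbol, offset) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(
                (match["module"], match["symbol"], int(match["offset"], 16))
            )
    return frames


frames = parse_frames(KB_OUTPUT)
for module, symbol, offset in frames:
    print(f"{module:12} {symbol:32} +0x{offset:x}")

print(Counter(module for module, _, _ in frames))
```

Keep in mind that without NI's private symbols WinDbg resolves each address to the nearest exported name, so a name with a large offset (such as DevClose+0x199ae) may not be the function that was actually executing, and the frames after the WARNING line are flagged by WinDbg itself as possibly wrong.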
I hope this monologue helps someone someday.
If it does, kudos are always appreciated.
10-28-2015 09:54 AM
Mr._Jim, have you made any progress in your debugging? If the issue occurs more frequently when you add more test sockets, it could well be a memory issue, as suggested in the other forum post.
You may want to try to decouple your LabVIEW code from TestStand to help isolate the issue. That might show whether the problem is on the TestStand side or the LabVIEW VI side.
Clemens | Applications Engineer | National Instruments
10-28-2015 03:14 PM
Hi Clemens,
Thanks for the reply!
We've tried everything. I thought initially that it was the ACBR node issue addressed by CAR 492898, but we've tried every workaround we can think of to no avail. We changed the method by which we were closing the VI ref and switched to the "Run VI" method as well, but neither of those eliminated the issue. The code's pretty tightly coupled, so it's tough to break it into manageable chunks that can be debugged individually. I'm starting to think that there's something else going on here that isn't related to the issue in that CAR.
We're about to try migrating the whole thing to LabVIEW 2015 to see if that helps. If indeed it is the problem addressed by the above CAR, at least that will have been fixed as of LabVIEW 2014 SP1.
If that doesn't help, then we may have to do as you've suggested - take the various code components out of TestStand and put it in a calling VI.
Again, I very much appreciate the reply.
Kind regards,
Jim
10-29-2015 05:29 PM
Mr_Jim
Do you have the Desktop Execution Trace Toolkit on the computer with LabVIEW on it? If so, I would run a trace while your LabVIEW code is running to see if it catches any memory leaks or memory resource issues.
To isolate whether it is a TestStand issue or a LabVIEW issue, it is worth running the TestStand sequence in single pass and seeing whether the crash ever occurs (make sure that you restart TestStand first). That way, if it is a reference issue inside LabVIEW, the crash should never occur, since the code is only being called once and reference issues should not arise. If the crash does occur, I would watch the tracing in TestStand to determine which module is causing the crash. That isolates the VI to debug, presuming there are multiple LabVIEW VIs in TestStand calling C++ code.
Another thing to check is that the LabVIEW code closes every reference it opens.
Let us know if the upgrade to LabVIEW 2015 helps. If the problem is associated with that specific CAR, the upgrade should fix it.
From my understanding the TestStand sequence is calling a LabVIEW module and then inside that LabVIEW module it is calling some C++ code, is that correct?
Depending on whether it is a LabVIEW or a TestStand issue, different memory troubleshooting steps can be taken.
10-30-2015 07:08 AM
Hi Sarina,
Again, thanks for the reply. I've recommended the DETT as a possibility to the individual maintaining the station code, but I don't think they've gone that route yet. I'm hopeful that that could yield some success, especially if it catches a crash. We know which VIs it is running when it crashes, but there are several possibilities due to parallelism.
From what I'm hearing they are trying the LabVIEW 2015 upgrade, but progress is delayed because of a shortage of UUTs - it will likely be next week before we know whether that will yield any success.
We could try a single pass, but unfortunately it usually takes a few runs for the crash to occur. To my knowledge it has never occurred on the first run - it's intermittent and unpredictable. The time before it appears varies from 3 hours to 3 days.
> From my understanding the TestStand sequence is calling a LabVIEW module and then inside that LabVIEW module it is calling some C++ code, is that correct?
You're partly correct - the TestStand sequence is calling a LabVIEW module which has several dynamic calls within it. Various daemon VIs are spun up to run in the background and then shut down, but via LabVIEW, not TestStand. There is no C++ code - the crash dump above pertains to LabVIEW itself.
Thanks again for your input - we'll likely have more information next week.
Regards,
Jim