LabVIEW 2012: Crash Dump Exception Analysis

Mr._Jim · ‎10-26-2015

Hello all,

To the few brave souls who clicked on this thread (thank you), here's what we've got:

I'm assisting in debugging a particularly elusive crash in some LabVIEW code:

Shoot... I just realized he has a thread open, too: (Slightly embarrassing on my part)

http://forums.ni.com/t5/LabVIEW/Microsoft-Visual-C-Runtime-Library-Crashes/td-p/3205010

TestStand is calling a complex bit of LabVIEW code, as alluded to in this thread:

http://forums.ni.com/t5/LabVIEW/Exact-Meaning-of-Resetting-VI-Dialog-Specific-Context/td-p/3206889

At varying, unpredictable intervals we get a Microsoft Visual C++ Runtime Library error dialog stating that LabVIEW.exe has requested the runtime to terminate it in an unusual way.

My colleague here managed to catch an instance of the crash, and I'm now going through the dump file.

(The following is for the benefit of anyone else who ever has to do this)

I got the SDK, as documented in this KB article.
I ran WinDbg.
I set the symbol file path using the File menu: SRV*C:\Windows\symbol_cache*http://msdl.microsoft.com/download/symbols
I saved the workspace (File menu).
I opened the crash dump file using File->Open Crash Dump.
Because LabVIEW is 32-bit in our case, running on a 64-bit PC, I ran the first two commands below to switch the debugger to x86 mode, followed by !analyze -v:

.load wow64exts
.effmach x86
!analyze -v
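
For anyone who has to go through many dumps, the same three commands could probably be run non-interactively through cdb.exe, the console debugger that installs alongside WinDbg. The following is only a sketch under assumptions: it assumes cdb.exe is on the PATH, the function names are made up for illustration, and the trailing q is there only so the debugger exits when the commands finish.

```python
import subprocess

# Symbol path exactly as set through the WinDbg File menu above.
SYMBOL_PATH = r"SRV*C:\Windows\symbol_cache*http://msdl.microsoft.com/download/symbols"

# The three commands from this post, plus "q" so cdb quits afterwards.
COMMANDS = [".load wow64exts", ".effmach x86", "!analyze -v", "q"]


def build_cdb_command(dump_file, cdb_exe="cdb.exe"):
    """Build the cdb command line (hypothetical helper, not an NI or Microsoft API).

    -z names the crash dump, -y sets the symbol path, and -c passes the
    debugger commands, separated by semicolons.
    """
    return [cdb_exe, "-z", str(dump_file), "-y", SYMBOL_PATH, "-c", "; ".join(COMMANDS)]


def analyze_dump(dump_file, cdb_exe="cdb.exe"):
    """Run cdb on one dump and return everything it printed (hypothetical helper)."""
    result = subprocess.run(
        build_cdb_command(dump_file, cdb_exe),
        capture_output=True,
        text=True,
        check=False,  # cdb's exit code is not treated as an error here
    )
    return result.stdout


# Usage (the path is a placeholder):
# print(analyze_dump(r"C:\dumps\crash.dmp"))
```

Keeping the command list in one place means every dump gets analyzed the same way, which makes the outputs easier to compare.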

!analyze -v yielded the exception analysis (full text in the attached file), the first few lines of which look like:

*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

GetPageUrlData failed, server returned HTTP status 404
URL requested: http://watson.microsoft.com/StageOne/LabVIEW_exe/12_0_1_4014/522f27d1/KERNELBASE_dll/6_1_7601_18869/556363bc/e06d7363/0000c42d.htm?Retriage=1

FAULTING_IP: 
KERNELBASE!RaiseException+58
7669c42d c9              leave

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000007669c42d (KERNELBASE!RaiseException+0x0000000000000058)
   ExceptionCode: e06d7363 (C++ EH exception)
  ExceptionFlags: 00000001
NumberParameters: 3
   Parameter[0]: 0000000019930520
   Parameter[1]: 000000000d0bf9c4
   Parameter[2]: 00000000022ea4a8
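
A note on reading that exception record: e06d7363 is the code the Microsoft C++ runtime raises for a thrown C++ exception (0xE0 followed by the ASCII letters "msc"). As far as I understand it, Parameter[0] is a magic number identifying the exception format, Parameter[1] is the address of the thrown object, and Parameter[2] is the address of the compiler-generated throw information. The sketch below just checks those assumptions against the values above; the function names are mine, not part of any debugger API.

```python
CPP_EH_EXCEPTION_CODE = 0xE06D7363  # ExceptionCode from the dump above
EH_MAGIC_NUMBER = 0x19930520        # Parameter[0] from the dump above


def code_tag(code):
    """Decode the low three bytes of an exception code as ASCII text."""
    return (code & 0x00FFFFFF).to_bytes(3, "big").decode("ascii", errors="replace")


def describe_exception(code, params):
    """Summarize an exception record (illustrative helper, assumed parameter layout)."""
    is_cpp = (
        code == CPP_EH_EXCEPTION_CODE
        and len(params) == 3
        and params[0] == EH_MAGIC_NUMBER
    )
    info = {"tag": code_tag(code), "is_msvc_cpp_exception": is_cpp}
    if is_cpp:
        # Assumed layout: [1] = thrown object, [2] = throw information.
        info["thrown_object_address"] = hex(params[1])
        info["throw_info_address"] = hex(params[2])
    return info


# Values copied from the EXCEPTION_RECORD above.
record = describe_exception(0xE06D7363, [0x19930520, 0x0D0BF9C4, 0x022EA4A8])
print(record)
```

In other words, the dump is consistent with an uncaught C++ exception inside the process, which would also explain why the Visual C++ runtime puts up the dialog.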

This is killing our tests pretty frequently, and the crash occurs much more often the more test sockets there are. (Yeah, this might be better served in the NI TestStand forum, but it's a LabVIEW crash...)

I know that this will probably end up as a support ticket, but can anyone give me a direction to go in? Is it a driver issue? .NET?

Is there any particular behavior or action that I should be looking for in the LabVIEW code?

We're trying to isolate the offending LabVIEW code, but in the meantime, any advice would be much appreciated.

Thank you so much,

Jim

Mr._Jim · ‎10-27-2015

Alright, it seems that this thread is turning into a tutorial of sorts. (Or at least that's how I'm now redeeming it)

Again, for anyone who ever has the unfortunate need to do this, once you've performed the above, enter the command "kb" in WinDbg to reveal the call stack. Alternatively, you can go to View->Call Stack from the upper menu.

For me, that yielded the following:

0:023:x86> kb
ChildEBP RetAddr  Args to Child              
0d0bf970 7412df60 e06d7363 00000001 00000003 KERNELBASE!RaiseException+0x58
0d0bf9a8 00f082ce 0d0bf9c4 022ea4a8 09f292b8 msvcr90!_CxxThrowException+0x48 [f:\dd\vctools\crt_bld\self_x86\crt\prebuild\eh\throw.cpp @ 161]
WARNING: Stack unwind information not available. Following frames may be wrong.
0d0bf9d4 00f0f23b 577278e8 195b1478 00000000 LabVIEW!DevClose+0x199ae
0d0bf9f0 013b6590 09f292b8 195b1478 0d0bfa64 LabVIEW!AutoClose+0x432b
0d0bfa00 04cda803 09f292b8 195b1478 00000000 LabVIEW!ClearTDR+0x1a40
0d0bfa64 04ccea06 09f292b8 195b1478 00000000 tdcore_12_0!TD::CopyData+0x63
0d0bfbe0 04cd0556 09f292b8 2f408e69 00000000 tdcore_12_0!LvVariant::SetContents+0x546
0d0bfe4c 04ccfef4 2f408e38 00000000 00000000 tdcore_12_0!LvVariant::SetContents+0xd6
0d0bfe74 04cd0ea4 2f408e38 595e99b4 00000000 tdcore_12_0!LvVariantCopyWithContext+0x74
0d0bfea4 01ce1597 04491180 00000000 51b974d0 tdcore_12_0!LvVariantCopy+0x14
0d0bff58 01cc6f78 04491380 00000001 76c614db LabVIEW!GetCurrentExecutingVIPath+0x6f17
0d0bff74 1005a6e9 00000100 00000000 00000000 LabVIEW!EnqPrRunQ+0x428
0d0bff88 76c6337a 0e0c45c8 0d0bffd4 775392e2 mgcore_SH_12_0!ThLocalStorageFreeKey+0x39
0d0bff94 775392e2 0e0c45c8 7a689632 00000000 kernel32!BaseThreadInitThunk+0xe
0d0bffd4 775392b5 1005a6c0 0e0c45c8 00000000 ntdll_77500000!__RtlUserThreadStart+0x70
0d0bffec 00000000 1005a6c0 0e0c45c8 00000000 ntdll_77500000!_RtlUserThreadStart+0x1b

Because I don't work for NI and quite understandably don't have access to their symbols, I only have addresses to look at in the above. However, things are thankfully rather intuitively named, for the most part. (Thank you, NI developers)

I now have a lot of clues about where the crash is occurring.

We're very likely using a "Set Variant Attribute Function" twice
Some LabVIEW variant data is getting copied (possibly at a tunnel of a structure?)
I see a "Current VI's Path" LabVIEW function
Something called LabVIEW!EnqPrRunQ, which I think may correlate to the "Open VI Reference" function. (A guess)
What appears to be a memory manager function (mgcore)
...followed by several signs that a new thread is spinning up using Windows calls

Therefore, I started searching for instances of the above functions in the LabVIEW code. Upon doing that, I found only one instance of the coinciding LabVIEW functions and consequently have a very educated guess as to where the crash is occurring: A dynamic call of a VI using the "Start Asynchronous Call" node.
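
If you have several dumps to compare, the module-by-module reading of the kb output can be roughed out in a script. This is a minimal sketch: the regular expression assumes the 32-bit kb layout shown above (five hex columns, then module!symbol+offset), and the 0x1000 threshold is an arbitrary value of mine. Without NI's symbols, a large offset hints that the named export may only be the nearest one, so treat those names as clues rather than facts.

```python
import re

# A few frames copied verbatim from the kb output above (raw string keeps backslashes).
KB_OUTPUT = r"""
0d0bf970 7412df60 e06d7363 00000001 00000003 KERNELBASE!RaiseException+0x58
0d0bf9a8 00f082ce 0d0bf9c4 022ea4a8 09f292b8 msvcr90!_CxxThrowException+0x48 [f:\dd\vctools\crt_bld\self_x86\crt\prebuild\eh\throw.cpp @ 161]
WARNING: Stack unwind information not available. Following frames may be wrong.
0d0bf9d4 00f0f23b 577278e8 195b1478 00000000 LabVIEW!DevClose+0x199ae
0d0bfa64 04ccea06 09f292b8 195b1478 00000000 tdcore_12_0!TD::CopyData+0x63
0d0bfea4 01ce1597 04491180 00000000 51b974d0 tdcore_12_0!LvVariantCopy+0x14
0d0bff58 01cc6f78 04491380 00000001 76c614db LabVIEW!GetCurrentExecutingVIPath+0x6f17
0d0bff88 76c6337a 0e0c45c8 0d0bffd4 775392e2 mgcore_SH_12_0!ThLocalStorageFreeKey+0x39
0d0bff94 775392e2 0e0c45c8 7a689632 00000000 kernel32!BaseThreadInitThunk+0xe
"""

# ChildEBP, RetAddr and three args (five hex columns), then module!symbol+0xoffset.
FRAME_RE = re.compile(
    r"^(?:[0-9a-f]{8}\s+){5}(?P<module>[^!\s]+)!(?P<symbol>\S+?)\+0x(?P<offset>[0-9a-f]+)"
)


def parse_frames(text):
    """Return one dict per stack frame, in the order kb printed them."""
    frames, reliable = [], True
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("WARNING: Stack unwind information not available"):
            reliable = False  # the debugger is guessing from here on
            continue
        match = FRAME_RE.match(line)
        if match:
            frames.append({
                "module": match.group("module"),
                "symbol": match.group("symbol"),
                "offset": int(match.group("offset"), 16),
                "reliable": reliable,
            })
    return frames


def far_from_symbol(frames, threshold=0x1000):
    """Frames whose offset from the named symbol exceeds an arbitrary threshold."""
    return [f for f in frames if f["offset"] > threshold]


frames = parse_frames(KB_OUTPUT)
for f in frames:
    note = "" if f["reliable"] else "  (after unwind warning)"
    print(f'{f["module"]}!{f["symbol"]}+{f["offset"]:#x}{note}')
```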

I hope this monologue helps someone someday. If it does, kudos are always appreciated.

Kemens · ‎10-28-2015

Mr._Jim, have you made any progress in your debugging? If the issue occurs more frequently when you add more test sockets, it could well be a memory issue, as suggested in the other forum post.

You may want to try decoupling your LabVIEW code from TestStand to help isolate the issue. That might show whether the problem is on the TestStand side or the LabVIEW VI side.

Clemens | Applications Engineer | National Instruments

Clemens | Technical Support Engineer | National Instruments

Mr._Jim · ‎10-28-2015

Hi Clemens,

Thanks for the reply!

We've tried everything. I thought initially that it was the ACBR node issue addressed by CAR 492898, but we've tried every workaround we can think of to no avail. We changed the method by which we were closing the VI ref and switched to the "Run VI" method as well, but neither of those eliminated the issue. The code's pretty tightly coupled, so it's tough to break it into manageable chunks that can be debugged individually. I'm starting to think that there's something else going on here that isn't related to the issue in that CAR.

We're about to try migrating the whole thing to LabVIEW 2015 to see if that helps. If indeed it is the problem addressed by the above CAR, at least that will have been fixed as of LabVIEW 2014 SP1.

If that doesn't help, then we may have to do as you've suggested - take the various code components out of TestStand and put them in a calling VI.

Again, I very much appreciate the reply.

Kind regards,

Jim

StarSarina · ‎10-29-2015

Mr_Jim

Do you have the Desktop Execution Trace Toolkit on the computer with LabVIEW on it? If so, I would run a trace while your LabVIEW code is running to see if it catches any memory leaks or memory resource issues.

To isolate whether it is a TestStand issue or a LabVIEW issue, it is worth running the TestStand sequence in single pass to see if the crash ever occurs (make sure that you restart TestStand first). If it is a reference issue inside LabVIEW, the crash should never occur, since the code is only being called once and reference issues should not arise. If the crash does occur, I would watch the tracing in TestStand to determine which module is causing the crash; that way you have isolated the VI to debug, presuming there are multiple LabVIEW VIs in TestStand calling C++ code.

Another thing to check is that the LabVIEW code closes all the references it opens.

Let us know if the upgrade to LabVIEW 2015 helps. If the problem is associated with that specific CAR, the upgrade should fix it.

From my understanding the TestStand sequence is calling a LabVIEW module and then inside that LabVIEW module it is calling some C++ code, is that correct?

Depending on whether it is a LabVIEW or a TestStand issue, different memory troubleshooting steps can be taken.

Sarina
Applications Engineering
National Instruments

Mr._Jim · ‎10-30-2015

Hi Sarina,

Again, thanks for the reply. I've recommended the DETT as a possibility to the individual maintaining the station code, but I don't think they've gone that route yet. I'm hopeful that that could yield some success, especially if it catches a crash. We know which VIs it is running when it crashes, but there are several possibilities due to parallelism.

From what I'm hearing they are trying the LabVIEW 2015 upgrade, but progress is delayed because of a shortage of UUTs - it will likely be next week before we know whether that will yield any success.

We could try a single pass, but unfortunately it usually takes a few runs for the crash to occur. To my knowledge it has never occurred on the first run - it's intermittent and unpredictable. It can take anywhere from 3 hours to 3 days to appear.

> From my understanding the TestStand sequence is calling a LabVIEW module and then inside that LabVIEW module it is calling some C++ code, is that correct?

You're partly correct - the TestStand sequence is calling a LabVIEW module which has several dynamic calls within it. Various daemon VIs are spun up to run in the background and then shut down, but via LabVIEW, not TestStand. There is no C++ code - the crash dump above pertains to LabVIEW itself.

Thanks again for your input - we'll likely have more information next week.

Regards,

Jim
