DQMH woes

LVUser94 · ‎09-15-2021

I am now responsible for maintaining and completing a LabVIEW project that was developed using DQMH (DQMH would not have been my choice, but I inherited the code from another developer who left).

The software runs a durability test on 8 DUTs. The DUTs are inside an environmental chamber and operate simultaneously (they are driven by a common motor). The DUTs operate twice during each cycle. The cycle time is about 4 to 5 seconds, and the durability test runs for 5 days, at the end of which the DUTs will have experienced roughly 100,000 cycles.

What I am finding is that DUT events get fired when they are not supposed to. For example, each DUT is supposed to operate 2 times during each cycle (so it should send two data events), but sometimes one of the DUTs will send 4 or 6 data events. There is no way to catch the DUT in the act, because this happens randomly about once or twice a day. This is weird because all the DUTs operate synchronously, so they should have the same number of events at all times. But over a period of 5 days, the operation counts of the DUTs are not equal and can differ by as much as +/- 100 cycles.
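One way to pin down when the extra events occur is to count data events per DUT per cycle offline. The following is only a minimal sketch in Python (the project itself is LabVIEW); it assumes each received data event can be reduced to a hypothetical (cycle, dut) pair, and the sample values are made up for illustration.

```python
from collections import Counter

EXPECTED_EVENTS_PER_CYCLE = 2  # from the post: each DUT operates twice per cycle


def find_miscounts(events, expected=EXPECTED_EVENTS_PER_CYCLE):
    """events: iterable of (cycle, dut) pairs, one per data event received.

    Returns {(cycle, dut): count} for every pair whose count differs
    from the expected number of events.
    """
    counts = Counter(events)
    return {key: n for key, n in counts.items() if n != expected}


# Hypothetical sample: DUT 3 fires four times in cycle 101.
events = [
    (100, 1), (100, 1), (100, 3), (100, 3),
    (101, 1), (101, 1), (101, 3), (101, 3), (101, 3), (101, 3),
]
print(find_miscounts(events))
```

Note that a DUT that sends no events at all in a cycle never appears in the input, so this sketch only catches extra or partial counts, not fully missing ones.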

Another oddity is that sometimes an old data event (say from 4 hours ago) will occur again. Suddenly all the channels revert to the data they had when the event first happened (in this example 4 hours ago) but with the current time stamp. This happens about 2 or 3 times a day. The data file clearly shows it: the current Cycle # suddenly drops by a few hundred to an old Cycle #, and the data from the old Cycle # gets recorded to the data file with a new time stamp! Then in the next cycle, the Cycle # count resumes and increments correctly. There are about 16 channels x 8 DUTs, and all the values are somehow retrieved from memory and recorded again. As far as I can tell, the software does not have a memory buffer that holds on to data from 4 hours ago. It holds data for about 10 cycles (less than 1 minute) before it writes to the data file, then purges it (10-element FIFO). So I can only conclude that an old data event fired again.
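The drop in Cycle # described above is easy to search for mechanically once the Cycle # column has been pulled out of the data file. A minimal Python sketch follows; it assumes the cycle numbers are already available as a list in file order, and the sample numbers are invented.

```python
def find_cycle_regressions(cycles):
    """Return (index, previous, current) for every record whose cycle
    number is lower than that of the record immediately before it."""
    hits = []
    for i in range(1, len(cycles)):
        if cycles[i] < cycles[i - 1]:
            hits.append((i, cycles[i - 1], cycles[i]))
    return hits


# Hypothetical excerpt: the count drops back a few hundred cycles once,
# then resumes on the next record.
logged = [5000, 5001, 5002, 4650, 5003, 5004]
print(find_cycle_regressions(logged))
```

The index of each hit points at the first replayed record, which gives a time stamp to correlate against any other log that is available.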

Has anyone experienced this sort of weirdness with these programmatically fired dynamic events? I have of course used dynamic events and user events on many, many projects, but not on the scale DQMH uses.

Sorry I cannot post any code. Thanks for any tips.

Certified LabVIEW Developer (since 2005)
LabVIEW Developer since Version 2.0

drjdpowell · ‎09-16-2021

I strongly doubt this is a DQMH or dynamic event problem. Strange bug though. Are you sure none of your DUT VIs is getting stuck for 4 hours?

LVUser94 · ‎09-16-2021

Yes, strange bug indeed. The DUT VIs are not getting stuck. They are sending data normally, then suddenly I get a repeat of old data, then back to normal operation.

ChrisFarmerWIS · ‎09-16-2021

Hi LVUser94,

I don't know the solution, but I do have some questions/suggestions for you:

1) Download and install Antidoc via VIPM, and use this tool to generate a document of the source code to help understand the relationship between all of the DQMH modules. This might highlight any design issues present.

2) What is the structure of the application? I.e. what are all of the DQMH modules, and what are their dependencies? Antidoc will spit this information out.

3) Is the "DUT" a DQMH module, and is it cloneable? Or are all 8 DUTs managed by one singleton DQMH module?

4) In order to "catch" the events, I suggest an activity logger where every event is logged to a file. Then you can search through to find when the events happen, and perhaps work out why from this information. At Wired-in, for all of our device-based modules we have an activity logger. Or does your current data file already show this?

5) Are the DUTs struggling to keep up? Are you potentially getting a backlog of events/requests, so that you get random entries when it catches up?

6) How are the DUT events being generated? What is triggering them?
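To illustrate suggestion 4, here is a minimal sketch of an activity log that can also tell a replayed event from a new one. It is written in Python rather than LabVIEW, and it assumes something the thread does not confirm: that each sender stamps its events with a hypothetical, ever-increasing sequence number. The class and field names are made up for the example.

```python
import time


class ActivityLog:
    """Records every received event and flags any (source, seq) pair
    that has already been seen, i.e. a possible replay."""

    def __init__(self):
        self.entries = []   # (received_at, source, seq, name, is_repeat)
        self._seen = set()  # (source, seq) pairs already recorded

    def record(self, source, seq, name, now=None):
        received_at = time.time() if now is None else now
        key = (source, seq)
        is_repeat = key in self._seen
        self._seen.add(key)
        self.entries.append((received_at, source, seq, name, is_repeat))
        return is_repeat

    def repeats(self):
        return [entry for entry in self.entries if entry[4]]


log = ActivityLog()
log.record("DAQ", 1, "Data")
log.record("DAQ", 2, "Data")
log.record("DAQ", 1, "Data")  # same sequence number again: flagged
print(len(log.repeats()))
```

In a real application the entries would be appended to a file as they arrive; the point of the sequence number is that a repeated entry then shows whether the sender fired twice or something downstream duplicated the data.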

Christopher Farmer

Certified LabVIEW Architect and LabVIEW Champion
DQMH Trusted Advisor
Automated Production Test Specialty Premier Partner
https://wiredinsoftware.com.au

LVUser94 · ‎09-17-2021

Thanks Christopher for your tips!

I downloaded Antidoc via VIPM and tried it. I pointed it to the project's lvproj file (which has about 2000 VIs, controls etc). It ran for a few minutes and created a "Project Documentation.adoc" file and over 1000 png files. I also installed the necessary Chrome extension to read the .adoc file (Asciidoctor.js). But after restarting Chrome and opening the .adoc file, all I get is an inscrutable text dump.

To answer your other questions:

3) Yes, the DUT is a DQMH module. It is cloneable and can operate up to 8 DUTs in parallel. But the DUT module is just a simple GUI to display status, cycle count, and pass/fail counts for each DUT. It does not do any DAQ.

4) I was dreading this approach. Just the suggestion of an Activity Tracker contradicts the supposed robustness of the underlying DQMH architecture! I suppose I can write an Activity Tracker and put it in, but there are 15 DQMH modules, each generating all manner of events. So I would need more than one Activity Tracker and would then need to comb through thousands of entries to figure out what happened.

5) I don't think the DUT modules are struggling to keep up. Most of the time they are operating normally and on schedule. Just once in a while, they re-generate old events.

6) The DAQ DQMH module reads data from an 8-channel NI cDAQ counter module (count edges). Each counter channel monitors one physical DUT. It then sends data as events after each DAQmx Read. Typically new data should be available once per cycle, which is 4 seconds long. All 8 DUTs are running concurrently (all driven by one common motor). But the nature of the DQMH architecture is that there is no straight path from the DAQ module to the Data Logging module (another DQMH module). The data travels through layers of other DQMH modules, so who knows where it got stuck or duplicated?

ChrisFarmerWIS · ‎09-17-2021

You need to enable extensions etc to get Antidoc fully working. Once you have it set up, it's easier to use.

Watch this: Antidoc FAQ #1: How to preview the output in my web browser WITH the graphs? - Video - VIPM by JKI

I encourage you to embrace the activity logging idea. I don't think it contradicts. It's a massively helpful way to verify your source code is running correctly - ESPECIALLY executables or real time projects, and it also proves your software is working - it's a handy piece of test evidence.

One way to do this easily is in all of the modules, use the Status Updated broadcast (and Error Reported broadcasts) a bit more in your modules, and capture them all in one place, and log that. That way you have a pretty accurate vision of what's happening with your software.
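The "capture them all in one place" idea can be sketched outside LabVIEW as a single collector draining one shared queue. This Python example only mirrors the pattern; it does not use any DQMH API, the tuple layout and message texts are assumptions, and a real collector would write to a file instead of a list.

```python
import queue
import threading


def run_collector(inbox, sink):
    """Drain (module, kind, text) tuples from inbox into sink, in arrival
    order, until the None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        sink.append(item)


inbox = queue.Queue()
collected = []
worker = threading.Thread(target=run_collector, args=(inbox, collected))
worker.start()

# Each module would post here from its own status/error broadcast handler.
inbox.put(("DAQ", "status", "read 8 counters"))
inbox.put(("Data Logger", "error", "file write failed"))
inbox.put(None)  # sentinel: stop the collector
worker.join()
print(collected)
```

Because every module reports through the same channel, the resulting log has a single ordering, which is what makes it usable for reconstructing who sent what and when.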

There are other debugging tools too, such as the Event Inspector window and the Desktop Execution Trace Toolkit.

drjdpowell · ‎09-18-2021

@LVUser94 wrote:

But the nature of the DQMH architecture is that there is no straight path from the DAQ module to the Data Logging module (another DQMH module). It travels through layers of other DQMH modules so who know where it got stuck or duplicated?

From your description you have only one module in the path: DAQ --> DUT --> Logging. So you shouldn't have much code to debug. DQMH is just wrapping User Events and Queues, standard and very well-tested LabVIEW technologies, so DQMH is not the source of your problem.

joerg.hampel · ‎09-21-2021

Hello LVUser94,

First of all, I'm sorry for your struggles with the software you inherited. I'd also be interested to hear why you would not have gone with DQMH. It seems like a good fit from what you've described here so far?

Without the actual source code, or a complete description of your application (at least a back-of-the-napkin sketch of which modules there are, and how they communicate), it is hard for us to be of much help. The only thing we can claim is what drjdpowell already stated: It's highly unlikely that the problems you're facing are rooted in DQMH itself or in the events/queue functions of LabVIEW.

I'd be happy to try and help figure this out, but from what you've described here so far, I'm not sure I understand correctly how the modules work together:

How are the modules connected? I.e. which module uses which requests of others, and which modules are registered for other modules' broadcasts?
You're mentioning 15 modules (8x DUT, 1x DAQ, 1x Logger, ...?)
You mention 8x 16 channels in one place, but only 1 counter channel per DUT (8 in total) coming from a cDAQ counter module in another place

4) I was dreading this approach. Just the suggestion of an Activity Tracker contradicts the supposed robustness of the underlying DQMH architecture!

I find this statement interesting, as I'm of the exact opposite opinion. The very first thing we put in place whenever starting a new project is our HSE Logger (a small, free, open-source logging tool, designed after the Python logging library), which out-of-the-box can log to files or user events.
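For readers unfamiliar with the Python logging library mentioned above, the sketch below shows the pattern in Python itself: one logger with a file handler plus a second handler that forwards each record to a callback (standing in for a user event). This is not the HSE Logger API; the logger name, format string, and message are assumptions for illustration.

```python
import logging
import os
import tempfile


class CallbackHandler(logging.Handler):
    """Forwards each formatted record to a callable
    (a stand-in for posting to a user event)."""

    def __init__(self, callback):
        super().__init__()
        self.callback = callback

    def emit(self, record):
        self.callback(self.format(record))


received = []
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

logger = logging.getLogger("durability_test")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep records out of the root logger

formatter = logging.Formatter("%(levelname)s %(name)s %(message)s")
file_handler = logging.FileHandler(log_path)
for handler in (file_handler, CallbackHandler(received.append)):
    handler.setFormatter(formatter)
    logger.addHandler(handler)

logger.info("DAQ read complete, cycle=%d", 5002)
file_handler.close()
print(received)
```

The same record reaches both destinations, so the file gives a permanent trace while the callback can feed a live display.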

We don't start off that way to debug DQMH (the framework) itself, but to monitor the execution of our application, and the interaction of modules etc within it. As this will always be project-specific, it makes perfect sense in my opinion to have this kind of logging in place, and would not want to work without it. I would really recommend for everybody to have something like this in their toolbox.

But the nature of the DQMH architecture is that there is no straight path from the DAQ module to the Data Logging module (another DQMH module).

DQMH helps with creating proper APIs for your modules, so you can use the default DQMH mechanisms to exchange messages from and with your modules in a standardised way. DQMH is not limited to these mechanisms, though: If your design requires a direct (eg. high-throughput) communication channel, you're free to implement one.

It travels through layers of other DQMH modules so who know where it got stuck or duplicated?

It is quite common - and probably even recommended best practice in most cases - to design your modules in a way that they don't (statically) depend on each other. One way to do that is to design a tree-like structure for how modules communicate. I think that's often mentioned when talking about Actor-based design. Messages sent from one module to another will travel from the sender up the "tree" to a common node, then back down to the recipient.
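The "up the tree to a common node, then back down" routing can be made concrete with a small sketch. This Python example assumes a hypothetical module tree described by a parent map; the module names are invented and do not come from the application discussed in this thread.

```python
def route(parent, sender, recipient):
    """Return the hop-by-hop path from sender to recipient through their
    nearest common ancestor. parent maps each module to its parent
    (the root maps to None)."""

    def chain_to_root(node):
        chain = []
        while node is not None:
            chain.append(node)
            node = parent[node]
        return chain

    up = chain_to_root(sender)
    down = chain_to_root(recipient)
    down_set = set(down)
    for i, node in enumerate(up):
        if node in down_set:
            j = down.index(node)
            # Climb to the common node, then descend to the recipient.
            return up[:i + 1] + down[:j][::-1]
    raise ValueError("modules are not in the same tree")


# Hypothetical tree: Main owns Acquisition and Storage.
parent = {
    "Main": None,
    "Acquisition": "Main",
    "Storage": "Main",
    "DAQ": "Acquisition",
    "Logger": "Storage",
}
print(route(parent, "DAQ", "Logger"))
```

Every hop in the returned path is a place where a message is handled and re-sent, so it is also the list of places worth logging when a message seems to be delayed or duplicated.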

At first look, the fact that data is routed through a module (or modules) adds complexity to your application. On second look, it also reduces complexity by cutting down on the number of direct connections between all your modules (and thus, helping with cohesion and coupling). There are various methods that can help with adding more transparency to the communication. Logging, as mentioned above, is one method that I feel is definitely worth a look.

Again, if you're able to share more information, we (both the Consortium and the community) would be happy to look into it and try to help solve your issue.

DSH Pragmatic Software Development Workshops (Fab, Steve, Brian and me)
Release Automation Tools for LabVIEW (CI/CD integration with LabVIEW)
HSE Discord Server (Discuss our free and commercial tools and services)
DQMH® (Developer Experience that makes you smile)

LVUser94 · ‎09-21-2021

Thanks Christopher for your help with Antidoc. I kept adjusting all the settings (allow file URLs, picture extensions etc) and loaded the .adoc file into multiple tabs in Chrome. After a lot of back and forth, in one of the tabs the .adoc document finally turned into a nicely structured document with pictures. I don't know why it does not work right away, and I am still having problems re-opening the file even though all the settings for the extension are correct. But it works sometimes, and when it works it is very impressive. It showed me in a picture all the DQMH modules and their connections. I have attached the picture below.

I was hoping to find some hidden module or VI sending random events, but I don't see any in the picture.

I have now created an Activity Logger to capture all events that come into the Data Logger VI. I hope this will help me catch the events responsible for sending old data at random times.

Taggart · ‎09-21-2021

I do see some circular dependencies there, particularly around Main UI, Step Executor and Test Manager.

Looking closer, there are lots of big circular loops.

For example, I can trace from Main UI to Motor Controller to DI Pulse Counter to Data Formatting back to Main UI.

I can also trace from Unit Display through Display Formatting, Start Screen, Step Executor, back to Unit Display.

Those are just 2 of a ton of circular paths. Not surprising you are having issues.

These circular dependencies could be causing your problem, or at least contributing to it. If nothing else, they are probably killing your load and build times. You probably want to look at fixing those.
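Tracing loops by eye gets tedious on a large diagram; a depth-first search finds them mechanically. The Python sketch below is a generic cycle finder, not an Antidoc feature. The sample graph encodes only the first path traced above, and treating each step of that path as a "depends on" edge is an assumption.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes (first == last),
    or None if the graph has no cycle. graph maps node -> list of deps."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for dep in graph.get(node, ()):
            state = colour.get(dep, WHITE)
            if state == GREY:  # back edge: dep is already on the path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in graph:
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


modules = {
    "Main UI": ["Motor Controller"],
    "Motor Controller": ["DI Pulse Counter"],
    "DI Pulse Counter": ["Data Formatting"],
    "Data Formatting": ["Main UI"],
}
print(find_cycle(modules))
```

Running this repeatedly while removing one edge at a time from the reported loop is a simple way to work through a tangle of cycles.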

https://youtu.be/_YcfT-Ec0ws

Edited to add: Dependency management is hard!

Sam Taggart
CLA, CPI, CTD, LabVIEW Champion
DQMH Trusted Advisor
Read about my thoughts on Software Development at sasworkshops.com/blog
