LabVIEW

Wait for less than 1ms.

That function may make some people happy, but it cannot even remotely guarantee that it waits 50us. The only guarantee you get is that it waits at least 50us. If Windows decides to start a virus scan right at that moment, or any of a million other things, it may take 1 ms, 10 ms, or even more than a second before that function returns. And it is a polling wait, so something is spinning in a loop somewhere to check whether those 50us have elapsed.

 

If someone is really concerned about those 50us rather than waiting about 1ms, this is NOT the function to use.
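To make the "at least" point concrete, here is a minimal sketch of a polling wait in Python. It is not the LabVIEW VI and not how NI implements it; `polling_wait_ns` is a made-up name, and `time.perf_counter_ns()` stands in for the high resolution timer. The loop can only return once the target time has passed, so the requested time is a lower bound and nothing more.

```python
import time


def polling_wait_ns(wait_ns: int) -> int:
    """Spin until at least wait_ns nanoseconds have passed.

    Returns the time actually spent, in nanoseconds. Hypothetical helper
    for illustration only.
    """
    start = time.perf_counter_ns()
    target = start + wait_ns
    now = start
    # Busy loop: keeps one core occupied until the target time is reached.
    # If the OS suspends this thread, the loop simply resumes later and
    # returns late; nothing here can bound the overshoot.
    while now < target:
        now = time.perf_counter_ns()
    return now - start


elapsed_ns = polling_wait_ns(50_000)  # ask for 50 us
print(f"requested 50.0 us, waited {elapsed_ns / 1000:.1f} us")
```

The returned value is never below the request, but how far above it lands depends entirely on what else the machine is doing.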

Rolf Kalbermatter  My Blog
DEMO, Electronic and Mechanical Support department, room 36.LB00.390
Message 21 of 26

Hi Rolf,

Yesterday, before posting my previous comment, I briefly tested the High Resolution Polling Wait.vi on a Windows 11 laptop running LabVIEW 2025 Q3 64-bit. During that brief test the VI performed very well. I measured the amount of time it waited by surrounding it with a "Timing Sequence" (a flat sequence structure that uses two instances of High Resolution Relative Seconds.vi to measure the execution time). The measured execution time was always very close to what I had fed as input into the High Resolution Polling Wait.vi (a few microseconds difference).

 

I don't know enough to dispute that the High Resolution Polling Wait.vi may take 1 ms, 10 ms, or even more than a second, but my gut feeling after testing is that this is unlikely.

 

What makes you think it may take even more than a second?

Message 22 of 26

So you created a test VI that does nothing else than call the High Resolution Timer value before and after High Resolution Polling Wait. And concluded that function works exactly as you desire. On a computer that is most likely idle, meaning no other applications actively doing anything on the system. Basically there are 8 or more cores doing nothing and you grab one of them to spin in a loop (internally in the High Resolution Polling Wait) and declare victory.

 

Now you build a real application: acquiring 16 analog signals from a DAQ card, talking to 4 serial devices, communicating with a network database, saving all the data to disk and displaying it on your screen. To make matters a little more interesting, your IT department starts a background download for an OS update that will be installed after you log out, and Windows decides it needs to do a system check and security scan. Now the 50us wait that you so happily depended on, since you had "proven" that it works so reliably, takes several milliseconds or more, and your motion control system runs over its limits and destroys the $10,000 sample!

 

Yes, that High Resolution Polling Wait can be fairly accurate on modern computers, IF the computer is more or less only busy doing that Wait. It's an entirely different story if your application and computer also do real-world work besides that.

 

Moral of the story: if you need 50us accuracy, do it in hardware. Anything else may work while the system does nothing else, but will sooner or later go off into the woods. If you don't need 50us accuracy, why not just do a 1ms delay? It won't guarantee 1ms either, but if you do timing in software on a non-realtime system you simply can't depend on ANY specific timing. Yes, in most cases it will be accurate to within the ms range, but there is simply no guarantee! A delay of 1 second or more is an extreme outlier, but absolutely not unheard of. After about 1 to 2 years of use, most of my work computers tend to acquire the annoying habit of sometimes simply "freezing" up for several seconds whenever I do something specific, for instance starting up an application!

 

Rolf Kalbermatter
Message 23 of 26

@rolfk wrote:

So you created a test VI that does nothing else than call the High Resolution Timer value before and after High Resolution Polling Wait. And concluded that function works exactly as you desire. On a computer that is most likely idle, meaning no other applications actively doing anything on the system. Basically there are 8 or more cores doing nothing and you grab one of them to spin in a loop (internally in the High Resolution Polling Wait) and declare victory.


Come on, Ralf, not everything is that bad, and Windows is not that stupid. It can hold a 50 µs loop even under nearly 100% CPU load:

[Attached image: 50mks.gif]


@Petru_Tarabuta wrote:

Hi Rolf,

Yesterday, before posting my previous comment, I briefly tested the High Resolution Polling Wait.vi on a Windows 11 laptop running LabVIEW 2025 Q3 64-bit. During that brief test the VI performed very well


Yes, it works quite reliably. Technically, behind the scenes of High Resolution Relative Seconds.vi and High Resolution Polling Wait.vi, the WinAPI function QueryPerformanceCounter() is used. The resolution of this counter is 100 ns (just check what QueryPerformanceFrequency() returns: usually 10,000,000 Hz, i.e. 10 MHz). This makes it fully suitable for achieving a 50 µs delay: 50 µs is 500 counter ticks, so a tight polling loop needs at most around 500 iterations. By the way, the function is intelligent enough not to poll over very long intervals: Wait (ms) is used for the initial sleep, and only the last two milliseconds are polled.
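The tick arithmetic and the "sleep first, poll only the last two milliseconds" idea can be sketched roughly as follows in Python. This is only an illustration of the scheme described above, not the actual VI: `hybrid_wait_ns` and `SPIN_WINDOW_NS` are names invented for the example, `time.sleep()` stands in for Wait (ms), and the 10 MHz frequency is the typical value quoted above rather than something queried from the system.

```python
import time

# Typical QueryPerformanceFrequency() value quoted in the post (assumed, not queried).
QPC_FREQUENCY_HZ = 10_000_000
TICK_NS = 1_000_000_000 // QPC_FREQUENCY_HZ  # 100 ns per counter tick
TICKS_IN_50_US = 50_000 // TICK_NS           # 500 ticks in a 50 us delay

SPIN_WINDOW_NS = 2_000_000  # poll only the last 2 ms of the wait


def hybrid_wait_ns(duration_ns: int) -> int:
    """Sleep for the bulk of the wait, then spin for the remainder.

    Returns the time actually spent, in nanoseconds. Hypothetical helper.
    """
    start = time.perf_counter_ns()
    target = start + duration_ns
    coarse_ns = duration_ns - SPIN_WINDOW_NS
    if coarse_ns > 0:
        # Coarse part: gives the core back to the OS, but may return late.
        time.sleep(coarse_ns / 1e9)
    # Fine part: busy loop until the target time is reached.
    while time.perf_counter_ns() < target:
        pass
    return time.perf_counter_ns() - start


print(f"{TICK_NS} ns per tick, {TICKS_IN_50_US} ticks in 50 us")
print(f"5 ms wait took {hybrid_wait_ns(5_000_000) / 1e6:.3f} ms")
```

For a 50 µs request the coarse part is skipped entirely and the whole wait is polled, which is why short delays are the case where the spinning cost shows up.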

 

Anyway, when this tight polling loop is running, there are three major reasons why the delay might exceed requested time (50 µs in our case):

 

  1. Thread preemption by the scheduler. If another thread with equal or higher priority needs to run on the same CPU core, your thread will be preempted and suspended. This is the most common cause of long delays. However, if your loop is already executing and the core is fully occupied, Windows will typically schedule the competing thread on another available core — that’s normal load‑balancing behavior.

  2. Thread migration (cross‑core context switching). Every 10–20 ms, Windows may migrate your thread to a different CPU core to maintain balanced utilization. A migration is not free: register values and state must be saved and restored, and CPU caches may need to be repopulated. On multi‑socket (NUMA) systems, this can cost more than 100 µs due to cross‑node memory access and cache invalidation.

  3. Hyper‑Threading resource contention. If the sibling logical processor on the same physical core is fully occupied, execution resources (ALUs, Ports, etc.) may be saturated. This reduces the effective throughput of your loop and introduces timing jitter.

For a relatively short polling loop of 50 µs, the probability of these events is fairly low, so the timing is “good enough”, as you observed, but occasional spikes are unavoidable because Windows is not a real‑time operating system. And yes, sometimes you may see delays in the millisecond range. Delays on the order of whole seconds could happen too, but by then the entire Windows system would already be nearly “dead” or completely unresponsive.
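One way to see how often such spikes occur on a given machine is to repeat the wait many times and look at the worst case instead of a single reading. The following Python sketch does that under stated assumptions: `measure_overshoots_ns` is a hypothetical helper, the 1 ms spike threshold is arbitrary, and the numbers it prints depend entirely on the machine and its load at that moment.

```python
import time


def measure_overshoots_ns(wait_ns: int, repeats: int) -> list[int]:
    """Run a polling wait `repeats` times; return each overshoot in ns."""
    overshoots = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        target = start + wait_ns
        now = start
        while now < target:
            now = time.perf_counter_ns()
        # Overshoot = how long past the requested time the loop noticed.
        overshoots.append(now - target)
    return overshoots


samples = measure_overshoots_ns(50_000, 2_000)  # 2000 waits of 50 us each
spikes = [s for s in samples if s > 1_000_000]  # overshoot above 1 ms
print(f"worst overshoot: {max(samples) / 1000:.1f} us")
print(f"spikes above 1 ms: {len(spikes)} of {len(samples)}")
```

Running it once on an idle machine and once while the system is busy with disk, network and other processes should give a feel for both positions in this thread.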

 

By the way, accuracy can be improved further. Two common approaches are:

• moving the entire polling loop into a DLL written in a more efficient language (such as C or Rust), which reduces overhead and improves instruction‑level efficiency.

• replacing QueryPerformanceCounter() with __rdtsc(), the compiler intrinsic that wraps the RDTSC CPU instruction. RDTSC returns a timestamp counter that ticks at the CPU’s base frequency (3.59 GHz in my case) with extremely low overhead.

 

It is technically possible to measure extremely short intervals on Windows, for example by using RDPMC to read hardware performance counters and measure instruction‑level latency, throughput, instructions‑per‑cycle (IPC), cache misses, branch mispredictions, etc. But a 50 µs delay is rarely required. In my entire career, I needed such a short microsecond‑range delay only once, when synchronizing the displayed image with the monitor’s VSync to achieve tearing‑free output. The GDI bit‑blitting had to be delayed slightly after receiving the new VSync.

Message 24 of 26

Rolf, my apologies: I realized that I misspelled your name in the previous comment. I’m truly sorry for the oversight; it was a bit late when I wrote it.

 

Additionally, for better timing accuracy, it can sometimes help to disable hyper‑threading and pin the process/thread to a dedicated core to avoid context‑switch penalties. This may improve determinism a little bit.
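As a rough illustration of the pinning part (hyper‑threading itself is a BIOS/firmware setting), a thread can be restricted to one logical core as sketched below in Python. `pin_current_thread_to_core` is a made-up helper. On Windows it calls the WinAPI functions GetCurrentThread() and SetThreadAffinityMask() through ctypes; on Linux it falls back to `os.sched_setaffinity`. Pinning only keeps the thread from migrating; other threads can still be scheduled on the same core, so this is not a guarantee either.

```python
import os
import sys


def pin_current_thread_to_core(core: int) -> bool:
    """Try to restrict the calling thread to one logical core.

    Returns True on success, False otherwise. Hypothetical helper.
    """
    if sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentThread.restype = wintypes.HANDLE
        kernel32.SetThreadAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
        kernel32.SetThreadAffinityMask.restype = ctypes.c_size_t
        # The mask has one bit per logical processor; 0 is returned on failure.
        previous = kernel32.SetThreadAffinityMask(
            kernel32.GetCurrentThread(), 1 << core
        )
        return previous != 0
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {core})  # 0 means the calling thread
        except OSError:
            return False
        return True
    return False  # no affinity API available on this platform
```

A typical use would be to call `pin_current_thread_to_core(n)` once, for some core `n` the process is allowed to run on, before entering the timing-critical loop.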

Message 25 of 26

No offense taken, Andrey.

 

Our arguments aren't opposites, just two sides of the same coin, and you even concede as much at the end of your second-to-last post, where you state that you needed this timing accuracy only once in your entire career. It's extremely rare, and in most cases it is something like premature optimization, just as it was for the original poster of this thread. He wanted 50us because he knew he needed at least 50us, but felt that "wasting" one ms instead was a declaration of defeat. He would happily have chosen a high-performance spinning loop that keeps one CPU core maxed out for 50us plus some unknown amount over a low-performance yield that either releases the core to another task or simply lets the CPU adapt its clocks and peacefully wait out the time if there is nothing else to do. All because he felt that waiting 1ms each time was wasting time. And if the argument is that he really needs to wait 50us every time, not 55, or sometimes 100 or even 1000 or more, a software timer simply is not a safe choice. Windows can't and won't guarantee such accuracy, and even for a real-time OS it is very borderline. The much safer approach in this case is to go with full hardware for this particular functionality.

You can of course put the entire loop into a DLL and call that to eliminate the LabVIEW scheduler (or you could set that VI to subroutine priority to achieve the same), but that still leaves two places where something can interfere: in LabVIEW, the stretch between knowing that you have to wait and calling that function, and the stretch afterwards until you actuate whatever you need to actuate after the delay. There the LabVIEW scheduler could step in and simply handle something else on the diagram instead. And more importantly, if you really want to avoid Windows context switching too, you would have to disable it from the moment you start detecting whether the delay should begin until you have finished actuating whatever you want to start after the delay.

 

We are talking here about low-level Windows DLL calls, with particularities that 99.999% of LabVIEW users have never dealt with and probably should never want to deal with, even more so in the context of the OP's request. He mentioned that he does lots of measurements but never really quantified that. Maybe by "lots" he did indeed mean something like 100 million measurements, but then I question his choice of single-point analog output generation followed by single-point analog input reading. More likely it was a few thousand points or less, as part of a measurement routine that probably had other things to do as well, and he was trying to shave off 1s of "wasted" delay over the whole run.

Rolf Kalbermatter
Message 26 of 26