Lockup possibly caused by Sleep function?

HatleySoftware · ‎03-05-2008

I'm developing an application in CVI 8.0, and I've recently started running into frequent lockups,

where the application just freezes, and has to be manually terminated and restarted.

I'm wondering if this might be related to my usage of the Sleep() function? I've read warnings

that this function can cause lockups under certain circumstances, but the warnings are a bit

on the vague side.

My program runs several threads, including a main thread to do event processing:

int main (void)
{
    if ( 42 != InitApp() ) {return -1;} // initialize application and start 2 other threads

    QuitMainLoop = 0;
    while (!QuitMainLoop)              // main thread loops until application exits
    {
        ProcessSystemEvents();
        Sleep(100);
    }

    ppAtinI();         // Clean-up before exit.  ("InitApp" spelled backwards)

    return 0; // Exit application.

} // end main()

The other threads are based around a for loop with a Sleep function. These functions are scheduled

in InitApp() using CmtScheduleThreadPoolFunction().

Here's an example of one of my thread functions:

int CVICALLBACK CommunicationsThreadFunction (void *functionData)
{
    for ( ; !KillTimers ; Sleep(101) ) // loop once every 101ms
    {
        CheckCommPort();

        CommunicationsActivityStateMachine();
    }

    return 0;
}

Is there anything inherently wrong or dangerous with my usage of Sleep() in this context? Or is my

current "lockup" problem more likely caused by something else?

--

Puzzled,

Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant

menchar · ‎03-05-2008

A deadlock (AKA deadly embrace) can occur when you have at least two resources and two processes trying to acquire both.

Similar to a race in logic design, under certain timing conditions you might be getting the case where:

1. Process A grabs resource 1 and tries to grab resource 2.
2. BUT Process B has grabbed resource 2 and is trying to grab resource 1.
3. Neither process is then able to proceed.

A resource could be a COM port, a file, or any number of things.
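One common way out of the two-resource scenario above is to make every process or thread take the locks in the same fixed order. The following is only a minimal sketch of that idea, not code from this application: `SpinLock`, `acquire_pair()` and `release_pair()` are hypothetical names, and it assumes a C11 compiler with `<stdatomic.h>` (which CVI 8.0 may well not provide; on Win32 a pair of critical sections would play the same role).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal lock type, used only for this illustration. */
typedef struct { atomic_flag held; } SpinLock;

static void lock_acquire(SpinLock *l)
{
    /* atomic_flag_test_and_set returns the previous value: spin while it was already set. */
    while (atomic_flag_test_and_set(&l->held)) { /* busy-wait */ }
}

static void lock_release(SpinLock *l)
{
    atomic_flag_clear(&l->held);
}

/* Acquire two locks in one global order (lowest address first), so that
   "A holds 1 and wants 2 while B holds 2 and wants 1" cannot arise.
   Comparing addresses via uintptr_t is assumed to give a stable ordering. */
static void acquire_pair(SpinLock *a, SpinLock *b)
{
    assert(a != NULL && b != NULL);
    if ((uintptr_t)a > (uintptr_t)b) { SpinLock *t = a; a = b; b = t; }
    lock_acquire(a);
    if (b != a) lock_acquire(b);
}

static void release_pair(SpinLock *a, SpinLock *b)
{
    lock_release(a);
    if (b != a) lock_release(b);
}
```

With this in place, one caller can write acquire_pair(&res1, &res2) and another acquire_pair(&res2, &res1); both end up taking the same lock first, so neither can hold one resource while waiting forever for the other.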

The OS interleaves processing time amongst the threads per its thread scheduling algorithm when you haven't put them to sleep, and they interleave at a machine instruction level granularity that can lead to some very non-obvious situations.

If your process includes any code that's not "thread safe" then you might be deadlocking there and not in any code that you wrote. I think all or nearly all of the NI libraries are threadsafe.

If you do have a deadlock, there are several things you can do to manage the situation. NI has a lot of info on multi-threading, including a description of deadlock and solutions to it.

Menchar

HatleySoftware · ‎03-06-2008

"menchar" wrote:

> A deadlock ... is ... Similar to a race in logic design ... The OS interleaves processing time amongst

> the threads ... when you haven't put them to sleep ...and they interleave at a machine instruction level

> granularity that can lead to some very non-obvious situations ...

Yes. However, none of this really addresses my original question, which was, "Is my usage of the

Sleep() function in the given context inherently dangerous?" Also implied, but not stated (goes without

saying, doesn't it?) is the question "If so, what would be a better way of doing it?"

Also, I'm careful to restrict access to files and comm ports using locking mechanisms. (Very simple,

actually. For example, any communications routine just returns an error code if global boolean variable

"bCommLock" is not 0. (If it's 0, the routine sets it to 1 on entry, then resets it to 0 on exit. That way,

if some other thread tries to access the comm port through one of the comm routines, the other routine

will see that bCommLock is 1 and just exit.))

> If your process includes any code that's not "thread safe" then you might be deadlocking there and

> not in any code that you wrote.

Where is "there"? Any "un-thread-safe" code I write into my code is going to be, well, in my code.

Also, this program is full of "un-thread-safe code". (Not by my design. The person who started writing

this program loved global, non-thread-safe variables, and used hundreds. I've been peeling them

slowly away and replacing with passed function arguments, but there's still plenty of globals.)

However, I don't think this is where my problem is.

My experiments last night determined something interesting: this deadlock occurs only in the debug

configuration, and only when I have breakpoints set in both of my two user-defined threads. (This

program seems to be running 5 threads: The "main()" thread, the "Background" thread, the

"Communications" thread, and 2 unnamed threads which I'm assuming are part of the CVI runtime

engine.) If I take all of the breakpoints out of the Background thread, then the "deadlock" bug goes

away. The breakpoints in the "Communication thread" are hit normally, and execution continues

normally on tapping F5 key.

Whereas, with breakpoints in both threads, I keep getting into situations where execution freezes,

and yet no breakpoint is hit in either thread.

How does one determine what breakpoint is going to be hit first in multiple threads? It seems

quite likely that breakpoints may be reached in multiple threads at once. So what determines

which breakpoint is highlighted in red? And what determines which thread is re-activated on

tapping F5? Is there a way to switch views between threads?

--

Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant

menchar · ‎03-06-2008

A Sleep() or SleepEx() call isn't inherently "dangerous" - all you're doing is telling the OS not to schedule that thread for the period of time you request. A suspend or sleep forever call is of course potentially dangerous - Win32 does let you suspend a thread with SuspendThread(), and you can also specify an INFINITE sleep period, which has much the same effect.

A problem with a thread that is asleep is that it won't process GUI events and depending on your design, could miss other important events. What most people do is implement a SleepWithEvents () function that wakes up periodically, processes events, then goes back to sleep.
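A rough sketch of what such a SleepWithEvents() helper might look like follows. Everything in it is an assumption made for illustration: `sleep_with_events()`, `PumpFn` and `NapFn` are hypothetical names, and the two `fake_` functions are stand-ins so the sketch can be compiled anywhere. In a CVI program the pump callback would presumably wrap ProcessSystemEvents() and the nap callback would wrap Sleep().

```c
#include <assert.h>

/* Hypothetical callback types: "pump" processes pending events,
   "nap" blocks the calling thread for the given number of milliseconds. */
typedef void (*PumpFn)(void);
typedef void (*NapFn)(unsigned ms);

/* Sleep for total_ms in slices of at most slice_ms, pumping events after
   each slice. Returns how many times events were pumped. */
static unsigned sleep_with_events(unsigned total_ms, unsigned slice_ms,
                                  PumpFn pump, NapFn nap)
{
    unsigned pumped = 0;
    assert(slice_ms > 0);
    while (total_ms > 0) {
        unsigned chunk = (total_ms < slice_ms) ? total_ms : slice_ms;
        nap(chunk);          /* short nap instead of one long one */
        pump();              /* handle whatever queued up meanwhile */
        total_ms -= chunk;
        ++pumped;
    }
    return pumped;
}

/* Stand-ins for the real sleep and event calls, for demonstration only. */
static unsigned g_napped_ms = 0;
static unsigned g_pump_calls = 0;
static void fake_nap(unsigned ms) { g_napped_ms += ms; }
static void fake_pump(void)       { ++g_pump_calls; }
```

For example, sleep_with_events(250, 100, fake_pump, fake_nap) naps for 100, 100 and 50 ms and pumps events three times, so the thread never goes more than one slice without looking at its event queue.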

As far as using locks, you may have a problem if you're using your own mechanism instead of a Win32 synchronization object. As it turns out, an integer read and write is atomic, but you can still wind up with an ambiguous situation. You need an atomic "test and set" to make a lock work and you don't have that. What keeps one thread from reading the integer, deciding to set it, but another thread decides to do the same thing and they both read 0, then both write 1, and then both proceed thinking they have the lock, and then get hung up on trying to access the same resource?

Use a critical section or interlocked variable instead. If your threads are in different processes, then you need a system synchronization object like an event, mutex, or semaphore.
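To make the "test and set" point concrete, here is a minimal sketch of a busy-flag lock in which the test and the set happen as one indivisible step. It is an illustration under assumptions, not the poster's code: `comm_lock`, `comm_try_lock()`, `comm_unlock()` and `send_command()` are hypothetical names, and it uses C11 `<stdatomic.h>`, which CVI 8.0 may not support. On Win32 the same pattern is usually written with InterlockedExchange(), which also returns the previous value, or with a critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One flag guarding the comm port (hypothetical name). */
static atomic_flag comm_lock = ATOMIC_FLAG_INIT;

/* Atomically set the flag and report whether it was clear before.
   Returns true only for the single caller that actually got the lock. */
static bool comm_try_lock(void)
{
    return !atomic_flag_test_and_set(&comm_lock);
}

static void comm_unlock(void)
{
    atomic_flag_clear(&comm_lock);
}

/* Hypothetical comm routine: returns -1 if the port is busy, 0 on success. */
static int send_command(void)
{
    if (!comm_try_lock()) {
        return -1;            /* someone else owns the port: bail out */
    }
    /* ... talk to the comm port here ... */
    comm_unlock();
    return 0;
}
```

The difference from a plain global int is that no second thread can slip in between "read 0" and "write 1": only one caller ever sees the flag as previously clear.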

If you use a DLL or a library that was not designed to have multiple threads executing in it at the same time, it can deadlock or fail. If you're not using a non-threadsafe DLL or library then it's not a problem. Most modern code is threadsafe, but there was plenty of Win32 code written that is not. If a vendor doesn't specifically state that his DLL or library is threadsafe, I ask if I can, and if I can't I assume it's not.

My experience with the CVI debugger is that it breaks with whichever thread first hits the breakpoint, you can see the thread ID I think so you know which one it was. If you're suspended for a breakpoint and a second thread tries to execute the breakpoint, I'm not sure what happens - I imagine the second thread might suspend waiting for the first thread to clear the breakpoint, or it might proceed past it. NI can tell you I'm sure.

It could be you're running on a multi-core processor with true concurrency, and you're encountering some nuance in the CVI debugger code.

You can set the boot.ini file to force the use of a single core for all threads to see if that's what's happening. There is a class of errors where multi-threaded code runs fine on a single core (pseudo-concurrency) but fails on a multi-core processor (true concurrency). Maybe the CVI debugger is vulnerable, though NI makes a big deal about supporting multi-core micros so it should be OK. It depends on the debugger implementation. If they protect the breakpoint handling with a critical section or synchronization object, it should work fine.

I suggest reading Multithreading Applications in Win32, Beveridge and Weiner, ISBN 0-201-44234-5. Old book now but no better available.

Menchar

HatleySoftware · ‎03-06-2008

"menchar" wrote:

> A Sleep() or SleepEx() call isn't inherently "dangerous" ... A suspend or sleep forever call is of course potentially

> dangerous ...

Good. That's what I figured, but I wanted to be clear on that. If CVI's handling of user-interface events (keyboard

activity, mouse activity) is anything like how the Win32 API does it, the events are put in queues, so even if some

events happen while a thread is ZZZ, I assume that the thread, if it's looking for events at all, would pick up its

messages on each wakeup. I'm depending on my "main()" thread to do just that, as it has the only call in the

whole program to ProcessSystemEvents(), which is called once every 100ms.

> What keeps one thread from reading the integer, deciding to set it, but another thread decides to do the same

> thing and they both read 0, then both write 1, and then both proceed thinking they have the lock, and then get hung

> up on trying to access the same resource?

Two things:

1. Even if two threads tried to do this, to hang-up, they'd have to attempt to grab the same resource simultaneously,

to within about a nanosecond (one or two CPU clock cycles), else one would set the lock before the other tested it.

For comm events that are happening once every 10 seconds or so, the probability of this is about 1 in 10 trillion.
2. The only functions in the program attempting to access a comm port are in the "Communications" thread.

And that thread doesn't launch any further threads, so once it executes a comm function, that function will hog the

thread until it completes and releases its lock. So the locks are really just safeguards against the unknown. (Such as

future addition of other threads which may attempt communications.)

> ... critical section ... interlocked variable ... event, mutex, or semaphore ...

All very useful, I'm sure, for multiple threads fighting over same resources; but at present, I think my other,

simpler ways of handling this will suffice. If I really do get 2 or more threads fighting over a resource, I'll

have to look into those.

Seems to me that to be 100% sure of a lock, the locking mechanism would have to be part of the resource being

vied for, not part of the competitors. The resource itself can then say, "Thread Argle, sorry, but thread Bargle made

a request 2.3ns before you did, so you'll have to wait." Argle then has to keep trying until the resource's arbiter says

"Argle, you are now cleared for access." When Argle is done, it says "Finished, thanks" and the resource releases

the lock.

> My experience with the CVI debugger is that it breaks with whichever thread first hits the breakpoint

Ok.

> you can see the thread ID I think so you know which one it was.

I know what functions are in what thread by the call tree: a function is always in the thread that called it,

and I'm only starting two threads on startup, and only one calls a bunch of functions.

> If you're suspended for a breakpoint and a second thread tries to execute the breakpoint, I'm not sure

> what happens - I imagine the second thread might suspend waiting for the first thread to clear the breakpoint,

> or it might proceed past it. NI can tell you I'm sure.

I'm pretty sure that the threads continue, unless they, too, hit a breakpoint, in which case they halt.

I think now that the "deadlock" problem I've been having is due to one or more threads other than the "active"

thread (the one showing in red in the Run/Threads list) being halted at breakpoints. The breakpoint is

highlighted, but in pale-grey instead of in bright-red, and tapping the F5 key has no effect. I haven't figured

out how to switch the "grey" thread to "active" yet.

> It could be you're running on a multi-core processor with true concurrency, and you're encountering some

> nuance in the CVI debugger code. You can set the boot.ini file to force the use of a single core for all threads

> to see if that's what's happening.

Good idea. I'll try that.

> I suggest reading Multithreading Applications in Win32, Beveridge and Weiner, ISBN 0-201-44234-5.

> Old book now but no better available.

Thanks for the reference. Sounds like something I should read, as I get more into multi-threaded programming.

--
Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant

menchar · ‎03-06-2008

Never say never when dealing with multi-threaded code.

I have worked with multi-threaded apps that worked for days or weeks before breaking due to a threading issue.

Personally, I wouldn't want the risk, however small. The various Win32 thread schedulers have some very non-obvious behavior, and it can be dangerous to make assumptions about thread scheduling.

Critical Sections are very easy to use - not much of a learning curve. Interlocked variables are also very easy to use but less popular than critical sections.

The limitation of critical sections and interlocked variables is that all threads using these mechanisms must be in the same process.

With multi-core micros, multi-threading with true concurrency is mainstream programming these days. While the trend is to provide compilers and tools that will multi-thread transparent to the source code, there's no real substitute for understanding concurrency principles. NI has tried to help with debugging support and some whitepapers - they have a webpage dedicated to multi-threading I believe.

Nearly all commercial Win32 apps are multi-threaded. To prove this, use the task manager to look at all of the processes and their thread count, and you almost certainly won't find any that are single threaded. If you're going to be a professional Win32 app developer, you'd be well served to master concurrent programming.

I'll step down off my soapbox now 🙂

Menchar

Message Edited by menchar on 03-06-2008 05:00 PM

HatleySoftware · ‎03-06-2008

"menchar" wrote:

> With multi-core micros, multi-threading with true concurrency is mainstream programming these days.

About time. Perhaps not too useful for stuff such as word-processing, etc. But anything that has several

tasks happening in real time (like, reading and writing comm lines + writing data to files + graphing

equations on the video screen + handling events, all at the same time, like my program here at work

does) has huge need of this technology.

> While the trend is to provide compilers and tools that will multi-thread transparent to the source code

Sounds like a recipe for disaster. Even with the relatively-simple 5-thread program I'm working on,

I'd hate to depend on the compiler to decide what threads things go in. In fact, that wouldn't work, period,

due to the way this program is structured. Nothing in it is inherently "thread-safe", so to make threads

work at all requires knowledge of what is running in what thread, with care taken to prevent the threads

from stepping on each other's toes. The only thread-safety equipment here is the programmer.

> there's no real substitute for understanding concurrency principles.

Damn right! I've gained some understanding through struggle, trial, and error; but I can certainly

stand to learn more. And believe me, if I had created the project I'm working on now as a

multi-threaded project ab ovo, my first step would have been to learn all about multi-threading,

then do it right, with proper thread-safe variables, proper resource locking, etc. But alas, when

I started working on this project, it was already written as a very thread-unsafe single-threaded

application, so I had to adapt it to multithreading as well as I could. Messy, but it (mostly) works.

> NI has tried to help with debugging support and some whitepapers - they have a webpage dedicated

> to multi-threading I believe.

I haven't seen that. I'll have to look for it. Thanks for the tip.

--

Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant

dummy_decoy · ‎03-07-2008

>> While the trend is to provide compilers and tools that will multi-thread transparent to the source code
>

> Sounds like a recipe for disaster. Even with the relatively-simple 5-thread program I'm working on,

> I'd hate to depend on the compiler to decide what threads things go in. In fact, that wouldn't work, period.

the compiler can't decide exactly what goes into which thread, but some assumptions can be made. that's the way your own processor works: it really performs multiple instructions at the same time and you don't even notice. but i don't think menchar meant this in his statement.

what's difficult when writing a multithreaded application is synchronisation. the C language was defined when multithreading was young or nonexistent, and there is nothing the language knows about multithreading. unfortunately, the design of multithreading libraries for C is unsafe: they rely heavily on the developer to ensure correct synchronisation and they are too low-level, exposing a lot of implementation details the developer should not have to worry about.

now, let's forget a bit about the C macrocosm. C is not a panacea, it is even a relatively bad language. but other languages exist and some have been designed with multitasking built in, which hides the details, letting the developer focus on the "real stuff". one such language is Ada, with its task construct (think of it as a function running in a thread) with the entry mechanism allowing synchronisation among multiple tasks, or the protected construct (think of it as a mutually exclusive set of functions: when one executes, other function calls have to wait until it completes). if this kind of language were used as a learning tool, multithreading would become far less messy and developers would easily grasp the concepts behind multithreading.

menchar · ‎03-07-2008

Direct language support for concurrency is provided in Ada, Java, and C# for example. The Ada process model is a bit goofy IMHO, I don't think a lot of new code is being written in Ada but who knows.

C and C++ do not, as you know; you have to learn how to use operating system constructs or use a threading library. But with Windows being multi-threaded, and Unix multi-processing (Unix does have lightweight processes and fibers, and POSIX has concurrency too), threading libraries are pretty much a thing of the past.

Recent C / C++ compilers, such as the Intel C++ Compiler for Windows, can indeed auto-parallelize loops for example, by partitioning the loop and scheduling a second thread to run half the loop at the same time as the first thread, all transparent to the source code. It turns out the loop must meet certain criteria before the compiler knows it can do it, but it knows how to decide by itself. You can also use OpenMP and other primitives provided by Intel to ease the concurrent threading issues. I use the Intel compiler for production builds in CVI. I see execution time reductions of 50% typically for code that's primarily compute bound, and I've seen as high as 75% reduction in execution time when running on a multi-core micro.
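For a feel of what loop-level parallelism looks like when it is requested explicitly rather than discovered by the compiler, here is a small OpenMP sketch. It is not taken from any CVI or Intel project: `sum_of_squares()` is a hypothetical function, and the pragma only has an effect when the compiler is run with OpenMP enabled (otherwise it is ignored and the loop runs serially with the same result).

```c
#include <assert.h>
#include <stddef.h>

/* Sum of squares of n values. Each iteration is independent except for the
   accumulation into sum, which is declared as a reduction so that every
   thread keeps a private partial sum that is combined at the end. */
static double sum_of_squares(const double *x, int n)
{
    double sum = 0.0;
    int i;

    assert(n >= 0);
    assert(x != NULL || n == 0);

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; ++i) {
        sum += x[i] * x[i];
    }
    return sum;
}
```

The loop qualifies because no iteration reads anything another iteration writes, which is the kind of criterion an auto-parallelizing compiler also has to establish before it can split a loop across threads by itself.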

I don't think the average programmer will be able to master thread-level concurrency - it's hard to learn and implement - and it pretty much has to work perfectly or it'll break. That's why Intel and others will scramble to find a way to automate it if at all possible so they can keep selling higher performance in their micros using multi cores - they're tapped out on processor speed pretty much.

I'd say concurrency is worth learning, there's a lot to it, but there's way more info available nowadays than when I learned it.
If you came up with a scheme to effectively auto-parallelize code beyond the loop level, you'd make a lot of money and be famous.

Menchar
