Thinking in Threads - 1

In search of a versatile synchronization primitive...

S V Ramu (2002-08-03)

Prelude

First, I must apologize for stealing the title prefix ('Thinking in...') from Bruce Eckel, who gave this very intuitive prefix its rightful place. I envy and hence respect his clarity, and thus this mimicry is justified ;-). Knowing the syntax or semantics of some concept is one thing and using it is another. And understanding something is not always a completable task, but a continuing one. I don't mean to make knowledge an elite thing, but it is usually so profound that it becomes an addiction. More than Knowing something, we have to be Thinking in it. Understanding something is seeing it from as many angles as possible, so much so that the whole picture soon becomes bigger than its parts.

In this article I'll be trying to explore the concept of Threads, and hence Parallel Programming as a whole, by mainly focusing on Locks, the Synchronization Primitives. I must thank my friend Senthil, with whom it has now become a routine for me to argue the details of a concept, and invariably benefit from his original questions and understanding. By the way, some time back I compiled a series of six articles on threads. But I don't dare to publish them, as they contain far more quotes than can be allowed under the Fair Use clause of copyright law. Anyway, since it was done only to present those ideas under one collective perspective, and not to reduce their readership, I'm including these useful quotes as a separate file.

Initially I thought this would be a standalone article, but due to a time crunch, I'm making this part one of a series. In this instalment, I vent my passionate thoughts on the basics of parallel programming. This done, the next one should be a bit more practical and concrete.

Why Threads?

Thinking in Threads is a profitable endeavor, if at all you need a reason. The thread is a remarkable abstraction of parallel processing. By training ourselves to use threads properly, we prepare ourselves for the inevitable future boom of parallel processing. What Modularity is for Complexity, Parallelism is for Utilization.

(From an old article, 2001-11-25)

My take was, and is, that there are two areas that are becoming more and more prominent and getting more focus: Reducing Complexity & Increasing Parallelism. All the tools and technologies that are coming up, as I see it, seem to be heading towards one of these two. At present the stress is more towards busting Complexity myths than answering Parallelism questions. Of course, I do see that as the Modularity (Complexity Reducer) goal matures with Extreme OOP and Web-Services, the Parallelism theme is conspicuous in the IT community's stress on Distributed Computing and Threads (for single and multi-processors). Somehow, web services are posing to be the culmination of both Parallelism and Modularity. Slowly, the goal of seeking Modularity is not just to reduce complexity but to increase the utilization of resources through parallelism. The success of a given Web Service is not just dependent on how much it does for Modularity's sake, but more on how well it collaborates with other services in utilizing the distributed resources.

The Single Sequence idiom: Process and Thread

Checkout the accompanying master quotes

A clean Idiom for imagining parallel processes is vital for successful Parallel Algorithms. The Idiom should be broad enough to capture all of Parallelism's features without going into the implementation details of the OS. It is also important that the idiom be the same, or at least consistent, between single-processor machines (pseudo parallelism) and multi-processor machines. This mimicking of parallelism on a single-processor machine is vital for simplifying the already tough domain of creating algorithms for parallel operations, and hence is supported on all modern OS platforms as Preemptive Multi-Tasking.

A program's binary form must be loaded into memory before it can run. This runtime instance of the binary code is called a Process. Of course, a single binary can have multiple processes running simultaneously. Now, parallel algorithms are just code, compiled into a single binary file, and hence need some facility to create, destroy and control new processes. However restricted the memory used by a process might be (for security reasons), there must be some common memory for communicating between the processes, and a way to synchronize the parallel execution.

It should be noted that a Threading Model is only an idiom for conveniently thinking about Parallel Programming. It is up to the platform (the OS mainly, and the CPU to some extent) whether it actually executes the threads in parallel or not. Though the main MS Windows flavors support threads as such, many of the UNIX variants don't. In UNIX, we can very easily create a new process using the fork system call, which just creates an identical copy of the calling process. The parent and child continue to run 'simultaneously' (truly so on multi-CPU machines). Maybe due to this ease of process forking, each thread is still implemented as a separate 'process'!

The difference between a process and a thread is not just semantic; usually a full-blown process is much bulkier than a thread (a sub-process). Hence, an OS like Linux is typically slower with threads than, say, Windows NT is. While UNIX does not have native threads, it does simulate them, and even attempts parallelism if more CPUs are available. The Windows 95 series, on the other hand, does support native threads (a light sub-process context) but doesn't use multiple CPUs even if they are available. At the lowest end, MS-DOS supported switching between different 'threads', but not only did it lack native thread support, it even lacked time slicing altogether: it just suspended the parent process until the child completed.

The issues of parallel programming

Checkout the accompanying master quotes

Usually, having two or more processes running simultaneously is not catastrophic, but the danger becomes very visible when the same file or database is used by two different processes without some form of synchronization. In the case of threads, this problem of sharing the same resource safely is much more stark, as all of the process's global variables are now common (instead of just external files and databases).

It should be remembered full well that almost all the solutions provided by OSes and programming languages for synchronizing a common resource are very much dependent on the programmer's diligence, as the problem of synchronization is very much a part of any kind of sharing. The coding of a parallelly running algorithm is itself not very different from that of a sequential one, except for the added responsibility of coordinating and restricting the competing threads so that they don't run over each other's effort, starve some threads of processor time, or simply do nothing but wait. At present I don't see any miracle happening to save us from worrying about this complex issue. However, we can always hide the synchronization details from novices who can only work with simple models. But the synchronization is still very much there, albeit deep inside some obscure low-level library package.
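
To make the danger concrete, here is a small sketch of my own (the class name RaceDemo and the numbers are purely illustrative, not from any quoted source): two threads increment the same unguarded counter, and because each increment is really a read, an add and a write, their updates can interleave and some increments are simply lost.

// Illustrative sketch: an unsynchronized shared counter loses updates.
public class RaceDemo {
    static int count = 0;   // the shared, unguarded resource

    public static void main(String[] args) throws InterruptedException {
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000000; i++) {
                    count++;   // not atomic: a read, an add and a write
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        // We expect 2000000, but the program typically prints something smaller.
        System.out.println("count = " + count);
    }
}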

The synchronization primitives

Checkout the accompanying master quotes

It was an interesting study to look into the busy waiting models to get an overview of the mindset of yesteryear (all dating back to the 1960s). Busy waiting means that the waiting process uses its full quota of CPU time just waiting. The alternative is to sleep: the CPU just checks whether the process's waiting condition is over; if not, it simply dumps the process and moves to the next one for servicing. This way, CPU time is not wasted in waiting. Various primitives like Semaphores, Event Counters and Monitors are available for this non-busy waiting.
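
The contrast is easy to see in Java. The sketch below is mine (the class and method names are made up for illustration): the first method burns its whole CPU quota polling a flag, while the second parks the thread until somebody notifies it.

// Busy waiting versus passive (blocking) waiting on a flag.
public class FlagDemo {
    private volatile boolean ready = false;   // volatile so the spin loop sees the update

    // Busy waiting: the thread keeps the CPU occupied just checking the condition.
    public void busyWait() {
        while (!ready) {
            // spin: wastes CPU time doing nothing useful
        }
    }

    // Passive waiting: the thread gives up the CPU (and this object's lock)
    // and is rescheduled only after someone calls setReady().
    public synchronized void passiveWait() throws InterruptedException {
        while (!ready) {
            wait();           // blocks without consuming CPU time
        }
    }

    // Announce that the condition now holds and wake any passive waiters.
    public synchronized void setReady() {
        ready = true;
        notifyAll();
    }
}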

Like the thread, the Critical Section too is just a concept, and depends on the application's needs. Not all access to common resources is a Critical Section, if there is no danger of being in competition with other threads. But usually, common resources are potential danger zones, and hence should mostly be mutually excluded, since we invariably outgrow our small expectations. The Critical Section, though available as a construct in some programming environments (like Delphi), is just a convenient way of naming a potential conflict ground during sharing. There are a number of solutions put forth to ward off the dangers of sharing, and these are called the Synchronization Primitives. It is like saying that 'being good avoids all evil', but what is good is context dependent. Or, the different forms of critical section techniques can be like do-while, while-do and for-loop: though they are equivalent, they have their own benefits. Why, we can even compare this to the different languages that are available!
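
For instance, the lost-update counter sketched earlier becomes safe once its read-modify-write step is named as a critical section. In Java, the synchronized keyword is one way of doing that naming; the class below is again my own illustration.

// The counter from the earlier sketch, with its read-modify-write step
// guarded as a critical section: only one thread at a time may execute
// a synchronized method of the same object, so no increment is lost.
public class SafeCounter {
    private int count = 0;

    public synchronized void increment() {
        count++;              // now effectively atomic with respect to other callers
    }

    public synchronized int get() {
        return count;
    }
}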

(Modern Operating Systems - Andrew S. Tanenbaum)

Equivalence of Primitives: E. W. Dijkstra (1965) suggested using an integer variable to count the number of wakeups saved for future use (the Semaphore)... a special kind of variable called an event counter (without using mutual exclusion, by Reed and Kanodia, 1979)... Hoare (1974) and Brinch Hansen (1975) proposed a higher level synchronization primitive called a monitor... Reed and Kanodia (1979) described a different inter-process communication method called sequencers. Campbell and Habermann (1974) discussed a method called path expressions. Atkinson and Hewitt (1979) introduced serializers. While the list of different methods is not endless, it is certainly pretty long, with new ones being dreamed up all the time. ...Furthermore... are similar to other ones.

Over the years, each one has accumulated supporters who maintain that their favorite way is the best way. The truth of the matter is that all these methods are essentially semantically equivalent (at least as far as single CPU systems are concerned). Using any of them, you can build other ones... equivalence... provides more insight and understanding about how the primitives work and how they can be implemented.

I have put many of the quotes on the accompanying page, but I cannot do away with the above one. It is in fact the very crux of this whole article. True to its words, in analyzing the equivalence of synchronization primitives, I have come to understand many subtle nuances of threads. By the way, it is interesting to note that Tanenbaum is in the Netherlands and Dijkstra, whom he quotes profusely, is Dutch too. Also, do you recognize the name Hoare? He is the same person who was mentioned in the Eiffel article, along with Dijkstra, as a pioneer of the Design By Contract paradigm. A legendary clan indeed!

The Java Model

Checkout the accompanying master quotes

When I was with Delphi, having tasted its neat Win32 documentation on Threads and Synchronization, I became bold enough to try the book Modern Operating Systems by Andrew S. Tanenbaum. This masterpiece talks extensively about synchronization and its equivalence. Then, coming to Java and recognizing for the first time its synchronized keyword, along with the wait() and notify() methods that are available on every Java object (a lock, if need be), I was thrilled to understand firsthand the patterns of great minds. It is only after this that I browsed the Java Language Specification (JLS) by James Gosling, Bill Joy and Guy Steele, thus proving my own finding for myself. For more details you must read the relevant sections of these two authentic and clear sources and, of course, the concrete Java 2 Platform Std. Ed. v1.4.0 documentation from Sun, 2001.
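
To see the three pieces working together, here is a minimal sketch of the standard pattern (the single-slot Mailbox and its names are my own invention, not taken from the JLS): every object can serve as the lock, synchronized guards the critical section, and wait()/notify() coordinate the threads that contend for it.

// A single-slot mailbox built on Java's monitor: synchronized for mutual
// exclusion, wait()/notifyAll() for signalling between the threads.
public class Mailbox {
    private Object message;            // the shared resource

    public synchronized void put(Object m) throws InterruptedException {
        while (message != null) {      // slot occupied: wait until it is taken
            wait();                    // releases this object's lock while blocked
        }
        message = m;
        notifyAll();                   // wake any thread waiting in take()
    }

    public synchronized Object take() throws InterruptedException {
        while (message == null) {      // slot empty: wait until something arrives
            wait();
        }
        Object m = message;
        message = null;
        notifyAll();                   // wake any thread waiting in put()
        return m;
    }
}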

Ok, now a teaser: what is the difference between the sleep() method and wait()? To confuse the gullible, both can have similar signatures (i.e. taking milliseconds as input), and both block the thread that calls them... Well, don't fail to note that sleep cannot be called with no arguments, but wait can be. When wait is called with no arguments, the thread blocks, and will start again only if some other thread notifies it. Importantly, while it waits, the lock it held on that object is disowned. But when a thread sleeps, it cannot be awakened by anybody; it has to wait out its own timeout duration (barring an interrupt). And, again importantly, all the locks it owns stay with it. Thus when two threads are competing against each other for a common resource, a call to sleep doesn't buy anything for the other. If anything, for the duration of the sleep, the common resource is used by nobody! But when a thread calls wait in the same scenario, the other thread enjoys the resource, with one competitor less. Then why do we need sleep? In case we have to wait for an email from an overseas server, say, but don't want to lose ownership of the printer for that time. Something like that.
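
A small sketch of the teaser (my own, with hypothetical names): both methods enter a synchronized block on the same lock object. The thread that wait()s gives the lock up and lets another thread in; the thread that merely sleep()s keeps the lock and shuts everyone else out for the whole duration.

// Sketch of the sleep() versus wait() difference.
public class SleepVsWait {
    private static final Object lock = new Object();

    // While wait()ing, the thread gives up 'lock', so others may enter
    // their own synchronized (lock) blocks in the meantime.
    static void politeWait() throws InterruptedException {
        synchronized (lock) {
            lock.wait(5000);      // releases 'lock' for up to five seconds
        }
    }

    // While sleep()ing, the thread keeps 'lock'; nobody else can enter
    // a synchronized (lock) block until the five seconds are over.
    static void selfishSleep() throws InterruptedException {
        synchronized (lock) {
            Thread.sleep(5000);   // holds 'lock' for the whole five seconds
        }
    }
}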

Another teaser: why do we need a synchronized keyword and two methods (wait and notify) to achieve synchronization in Java? This is a fertile question. First let us finish dealing with synchronization primitives in general, and then come back to it.

Putting threads in perspective

First, what is a thread, after all? If you consider a top-down program as a flowchart, a thread is just one flowchart, or one flow of statements. Multiple threads are just multiple flows. If we already know how to code one flow, we also know how to code one hundred of them, if need be. In fact, Java makes this flow idea obvious by abstracting a thread into an interface, Runnable, which contains just one method, run(). So, programming with threads is, after all, programming multiple programs together, or allowing many instances of a given program to run together. What spoils this beautiful picture? The need for communication between threads.
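
In Java this 'one flow' is literally one run() method, and a hundred flows are just a hundred threads started over it. A minimal sketch of my own (the class name Flows is illustrative):

// One flow of statements, expressed as a Runnable, started a hundred times.
public class Flows {
    public static void main(String[] args) {
        Runnable flow = new Runnable() {
            public void run() {
                // one independent flow of statements
                System.out.println("Hello from " + Thread.currentThread().getName());
            }
        };
        for (int i = 0; i < 100; i++) {
            new Thread(flow).start();   // a hundred parallel instances of the same flow
        }
    }
}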

Take a moderately trivial task: to add the numbers from 1 to a million (1,000,000). If we have two CPUs in our machine, it would be wasteful to ask one CPU to calculate the whole loop while the other sleeps (of course, pretend that we don't know the formula for it). One strategy would be to split the problem into two, asking one CPU to sum the numbers from 1 to 500,000, and the other to sum the rest, from 500,001 to 1,000,000. That is, two threads of the same activity, only with different parameters. Now there is a master thread which spawns these two summing threads, waits for their answers, adds them, and then stops. This is a typical scenario of the sharing we are dealing with. It becomes more complex as we have more CPUs (and hence threads), and if all CPUs are not equal (so we have to split the problem unequally, and wait for unequal times). It can become far more complex if we are not just summing but running a much more complex algorithm, where each thread performs a completely different task and, say, has to wait for its co-threads from time to time. Well, I hope you get the picture.
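
Here is a sketch of that strategy (my own code, not tuned for real use): two worker threads sum their halves of the range, and the master thread join()s them before combining the partial results.

// Splitting the 1..1,000,000 summation between two worker threads.
public class SplitSum {
    static class Summer extends Thread {
        private final long from, to;   // inclusive range to sum
        long result;                   // partial answer, read after join()

        Summer(long from, long to) { this.from = from; this.to = to; }

        public void run() {
            long sum = 0;
            for (long i = from; i <= to; i++) {
                sum += i;
            }
            result = sum;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Summer low  = new Summer(1, 500000);
        Summer high = new Summer(500001, 1000000);
        low.start();
        high.start();
        low.join();                    // the master thread waits for both answers
        high.join();
        System.out.println("total = " + (low.result + high.result));
    }
}

A real program would, of course, size the split according to the number (and speed) of the CPUs available, which is exactly where the complexity mentioned above creeps in.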

Now, threading is no longer a simple set of parallel flows, but of parallel cooperating flows; and that is a wholly different ball game. And that is where synchronization and locks come into play. Simply put, we just need a way to block ourselves, or wait. Of course we can go into an active loop and check for the required condition at each iteration. This is obsolete nowadays, as it means the processor has to stay busy just looping on our behalf. The modern alternative (if you call the 1950s and 60s modern... I call it prehistoric) is to allow our thread to just be dumped into some part of RAM, and allow other processes to execute. Where then is the communication? It is the lock. If a thread demands a lock before proceeding, and that lock is held by somebody else, the CPU mercilessly dumps this thread and goes on serving others. Periodically it comes back to this dumped thread and tries to find its lock for it; if found, fine, the sleeping thread is awakened and served; if not, it is allowed to sleep for another round. If some other thread asks for the same lock, it faces the same fate: getting dumped (a passive wait). Only when the lock-holding thread finishes (and announces that it is releasing) can one of the sleeping threads be served.

What would be the minimal Synchronization Primitive?

By that I mean, what is the minimum set of keywords or concepts that we need to grasp before being able to deal with any kind of thread orchestration? Java says that Monitors are enough, i.e. one keyword and two methods in all. But remember, what is sufficient may not be necessary. To live a day, say, $1000 is sufficient; but that doesn't mean you need that much. If you are ready to be frugal, you can live with much less. Of course, over time we'll arrive at a figure that suits our level of lifestyle. All the same, knowing how little we need will definitely allow us to appreciate better what we already have.

Going basic, we just need a way to passively wait for some condition to happen. If we can wait, we also need a way to be notified (else we'll be in an eternal sleep). And that is all we need! A primitive to say that we are waiting for something, and another primitive to announce that the waiting can cease and all waiting threads should get going. But don't we need a way to say that something has NOT happened, and that anybody requiring it should wait? Yes, but this can be clubbed with the waiting primitive itself: if we wait for something and that something is available, we acquire it and possess it, till we explicitly announce that we are releasing it. But if that something is NOT available, we wait. And if we possess something, then anybody else asking for it will NOT get it and will consequently wait. Thus the tentative basic primitives we are talking about are wait and release, or, choosing new words, say hold (the same as wait, in a complementary sense) and allow (the same as notify). Armed with these, we are ready to answer the second teaser that we posed in the 'Java Model' section above.
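
To make the idea concrete, here is one possible rendering of these two primitives on top of Java's monitor. The names hold and allow are the ones coined above; the class itself is my own sketch, not a standard API.

// A sketch of the two proposed primitives: hold() waits until the 'something'
// is free and then possesses it; allow() announces the release so that one
// of the waiting threads can get going.
public class HoldAllow {
    private boolean held = false;

    // Wait for the resource; if it is free, possess it, otherwise block passively.
    public synchronized void hold() throws InterruptedException {
        while (held) {
            wait();            // somebody else possesses it: wait without burning CPU
        }
        held = true;           // now it is ours until we explicitly allow()
    }

    // Announce that the waiting can cease; one of the waiters will acquire it next.
    public synchronized void allow() {
        held = false;
        notifyAll();           // wake the waiters; one of them will re-acquire
    }
}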

Epilogue

I realize that this is a hard way to conclude an article. But with the motto of releasing often, I would rather churn out a partially complete essay than wait for it to be complete. Needless to say, a sequel is on its way.

One thing I've been fighting with while writing this article is deciding the amount of quoting that is legitimate. I really don't want to get credit for an article that is just a compilation of intelligent quotes. On the contrary, the quotes included here and elsewhere are only my humble tribute to those great souls who have allowed me to grasp the unknown. I feel it is far better to quote the original author than to produce a sloppy digest in our own words. Yes, this cannot license me to reproduce the whole work of the original author, but within honest limits, it is a more reasonable approach. My dilemma has only heightened when contrasting my own high-handed serving of quotes with that of a friend who wanted to publish his compilation here. I said no to him, but I'm committing the same sin here. The only consolation is that I'm indicating that something is a quote, and what its source is, as clearly as possible. All the same, my partiality is irksome, to put it mildly. Of course this matter is not closed; I'll keep pondering over the issue, try if possible to get permission from the original authors, or maybe replace these quotes with just links to their pages, if available. After all, hasn't the old style of reading a book to somebody we love, highlighting something here and warning about something there, been effective? Maybe there is a way to do this within the copyright framework.

Whatever the case, the next article in this series will explore these primitives in much more detail, with sample code.