Learning threading

I'm making it a personal goal to become more skilled in multithreaded programming, and I plan to come up with a discrete goal for achieving that. Maybe join an open source project that deals a lot with threading, maybe make a phone app, maybe research a topic and give a talk to fellow programmers... anyway, some project for this year. So I'm asking a very open-ended question: how could I immerse myself in threading with some goal in mind? Could any of you with experience toss some ideas my way? No idea is dumb or too random.
 
I've been digging into this a bit lately and have no idea what I'm doing. I opted to try and break it down to a thread pool. This meant some redesigns in places to generate consumable "Tasks" instead. I don't have it really nailed down how I want to handle it at all - but it is vaguely sort of functional, albeit in no way robust or optimal.


Here's some pictures of said escapades
Trying to draw a Mandelbrot set; something is fucked and I just draw the same splits somehow on a different thread...
http://i.imgur.com/jcQzNei.png

Still fucked
http://i.imgur.com/Xus6WpW.png

Drawing a different fractal, colored per thread
http://i.imgur.com/1UmI8EP.png
 
Well, to be honest, multi-threading itself isn't difficult, especially when there's no conflict over data reads/writes, as in the case of ray-traced rendering. You just create a bunch of threads for each tile/pixel you want rendered and they go off and do work. Because none of them should be reading or writing other parts of the texture, there's no conflict and nothing to synchronize. You only have to wait for all the threads to finish so you can present the image to the screen.
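
To make that concrete, here's a minimal sketch of the tile/band idea in C++ with std::thread. Everything in it (the image size, RenderPixel(), the thread count) is made up for illustration; the point is that each thread writes only its own band of rows, so join() is the only synchronization.

Code:
// Each thread renders a disjoint band of rows, so no locking is needed;
// the only synchronization is waiting for every thread to finish.
#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

const int kWidth  = 800;
const int kHeight = 600;

uint32_t RenderPixel(int x, int y)   // stand-in for the real per-pixel work
{
    return static_cast<uint32_t>(x ^ y);
}

int main()
{
    std::vector<uint32_t> image(kWidth * kHeight);
    unsigned numThreads = std::thread::hardware_concurrency();
    if (numThreads == 0) numThreads = 4;

    int rowsPerThread = (kHeight + numThreads - 1) / numThreads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t)
    {
        int yBegin = static_cast<int>(t) * rowsPerThread;
        int yEnd   = std::min(kHeight, yBegin + rowsPerThread);
        workers.emplace_back([&image, yBegin, yEnd] {
            for (int y = yBegin; y < yEnd; ++y)      // this thread's rows only
                for (int x = 0; x < kWidth; ++x)
                    image[y * kWidth + x] = RenderPixel(x, y);
        });
    }

    for (auto& w : workers)  // wait for all threads to finish,
        w.join();            // then the image is safe to present
}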

So it's the synchronization and correctness that kills you. You can have a program work correctly 99.99% of the time, and then fail on another platform. Or it just happened to not run into the edge case on any of your machines.

A book I started with, which is good about introducing the patterns you often find in multi-threaded scenarios, is The Little Book of Semaphores. It's free, it covers a good number of topics, and it presents lots of common "problems" and solutions. You can try implementing them yourself in more complex programs.

Edit: I guess I was wrong; this book doesn't present broken code, it just presents "correct" code. There are other books/talks that do present broken programs and then the fixes for them, making it clearer what the problem was.
 
Thanks guys,
@socK, sounds like a fun challenge (and perhaps frustrating at the same time)

@Blazestorm, thanks! I'll look at that. I hadn't thought about it; it sounds like there are a lot of common pitfalls that can be catalogued.
 
I will try to give you a very high level overview of threading:

Think of a function in your program that you are calling. The program is running sequentially. Once you hit a function call, you "step into" the function, and that function runs sequentially until it exits. Then control returns to the parent code.

Threading is exactly the same thing, except the program continues in the parent while the child function runs simultaneously.

I'm sure you already know that. But problems occur if the rest of the program needs results from that thread, or needs to interact with it. You will have no idea when things have completed, or data is available, etc.

Here are the things you need to think about when designing a multi threaded program:

-How do threads communicate with each other?
-How do they deal with global memory?
-How do they start/exit cleanly within the system?

I have no idea what OS you are programming on, or what programming language. So I will use Windows with C++ as the language.

Threads communicate using events. Look up these functions:

CreateEvent();
SetEvent();
ResetEvent();
WaitForMultipleObjects();
WaitForSingleObject();
CloseHandle();

The event is a thread-safe "signal" within the system. One thread can trigger a signal (SetEvent) while another thread might be listening for the signal to fire (WaitForSingleObject). These Wait functions do not consume CPU cycles in a tight loop; the thread will effectively sleep until the event fires, then wake up and continue processing.

It's best if I describe the infrastructure within your program next, so you can see how the puzzle is put together.

-Main program creates whatever events it needs with CreateEvent, such as "Run Logic" or "Shutdown".

-Main program creates a new thread.

-That new thread has a tight loop that waits for signals from any other thread (using WaitForMultipleObjects, bound to the Run Logic and/or Shutdown event or whatever events you need). Even though it's a tight loop, WaitForMultipleObjects will stall the thread until an event fires.

-If the Shutdown Event is triggered, just exit the thread function by jumping out of your tight loop.

-If the Run Logic event is triggered, run another function that performs the logic.

Now the cool thing with events is that you can set them up to automatically reset (there is a parameter in CreateEvent). What this means is that if you have multiple threads bound to a single event via WaitForMultipleObjects, once one thread gets the signal, it consumes that signal and activates, but the other threads on that event do not fire. Using this, you can create an array of worker threads, effectively creating a thread pool. Once a thread completes the logic, it loops back to the top and waits for another signal to consume. For the shutdown event you want the complete opposite: if the program needs to shut down, you want the signal to fire for all the threads so they can all shut down.
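
Here's a rough sketch of that loop in Win32 C++, assuming just the two events described above; DoLogic() is a placeholder for the real work. Note the Shutdown handle is listed first so it wins if both events happen to be signaled at once.

Code:
// Rough sketch of the worker loop described above (Win32).
// "Run Logic" is auto-reset so one SetEvent wakes exactly one worker;
// "Shutdown" is manual-reset so it stays signaled and every worker exits.
#include <windows.h>

HANDLE g_hRunLogic;
HANDLE g_hShutdown;

void DoLogic() { /* placeholder for the real work */ }

DWORD WINAPI WorkerThread(LPVOID)
{
    // Shutdown first: WaitForMultipleObjects favors the lowest index
    // when more than one handle is signaled.
    HANDLE handles[2] = { g_hShutdown, g_hRunLogic };
    for (;;)
    {
        // Sleeps here (no CPU spin) until one of the events fires.
        DWORD which = WaitForMultipleObjects(2, handles, FALSE, INFINITE);
        if (which == WAIT_OBJECT_0)        // Shutdown: jump out of the loop
            break;
        if (which == WAIT_OBJECT_0 + 1)    // Run Logic (auto-reset consumed it)
            DoLogic();
    }
    return 0;
}

int main()
{
    g_hRunLogic = CreateEvent(NULL, FALSE, FALSE, NULL); // auto-reset
    g_hShutdown = CreateEvent(NULL, TRUE,  FALSE, NULL); // manual-reset

    HANDLE hThread = CreateThread(NULL, 0, WorkerThread, NULL, 0, NULL);

    SetEvent(g_hRunLogic);                 // ask a worker to run the logic once
    Sleep(100);                            // demo only: let it do the work

    SetEvent(g_hShutdown);                 // tell every worker to exit
    WaitForSingleObject(hThread, INFINITE);

    CloseHandle(hThread);
    CloseHandle(g_hRunLogic);
    CloseHandle(g_hShutdown);
    return 0;
}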

Now, when threads share the same variables, you want to use a mutex to prevent them from both reading/writing the same variable simultaneously (which can cause errors).

CreateMutex();
ReleaseMutex();
WaitForSingleObject();
CloseHandle();

The concept is really similar to the events above. You create a mutex. When you want to use the variable, you call WaitForSingleObject on the mutex you created. If another thread holds the mutex, WaitForSingleObject will wait until it's finished, without using CPU cycles. Once the other thread finishes with the mutex, it calls ReleaseMutex, releasing that mutex so your thread will wake up. When your thread wakes up, it will hold that mutex until it finishes with its own ReleaseMutex, and so on.

The other thing to worry about is when you create your thread: you can pass in a variable as the starting parameter. Since the thread is a function, it has a stack, so you can pass whatever you want as a thread starting parameter and not worry about synchronization or threading issues on that variable. Whatever is passed in is on the stack. (But if you pass a pointer, that pointer is global, and you'd need to mutex it)
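
A small sketch tying those two points together, with illustrative names: each worker gets its amount by value (a stack copy, so no locking needed), while the shared total is guarded by the one mutex all the threads share.

Code:
// The start parameter is copied by value (safe); the shared global is not.
#include <windows.h>
#include <cstdio>

HANDLE g_hMutex;      // one mutex, shared by every thread that touches g_total
long   g_total = 0;   // read-write shared, so it needs the mutex

DWORD WINAPI AddWorker(LPVOID param)
{
    long amount = (long)(LONG_PTR)param;      // on this thread's stack: no locking needed

    WaitForSingleObject(g_hMutex, INFINITE);  // acquire (sleeps if another thread holds it)
    g_total += amount;                        // the protected section
    ReleaseMutex(g_hMutex);                   // release so a waiting thread can wake
    return 0;
}

int main()
{
    g_hMutex = CreateMutex(NULL, FALSE, NULL);

    HANDLE threads[4];
    for (int i = 0; i < 4; ++i)
        threads[i] = CreateThread(NULL, 0, AddWorker,
                                  (LPVOID)(LONG_PTR)(i + 1), 0, NULL);

    WaitForMultipleObjects(4, threads, TRUE, INFINITE); // wait for all four
    printf("total = %ld\n", g_total);                   // always 10

    for (int i = 0; i < 4; ++i)
        CloseHandle(threads[i]);
    CloseHandle(g_hMutex);
    return 0;
}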

I hope this made some sense. I know the .NET languages have the same concepts for threading: they have events, WaitFor commands, etc., like C++, so the same ideas apply there. Not sure about Java or Linux.
 
Use as few mutexes as possible.

If you can use none, even better.

If you can get your program to a state where you can guarantee that no thread will need to write to the memory another thread writes to or need to read memory at the same time another thread is writing to it, then you are golden.

Mutexes slow things down, but a lot of the time they are unavoidable.

There are also various libraries that ease the use of threads (at least in C/C++).

For an example of a program that only uses one mutex to wait for all threads to end, see here.
http://sourceforge.net/projects/fastestprimes/?source=directory
 
It's best if I describe the infrastructure within your program next, so you can see how the puzzle is put together.
The proposed arrangement is kind of broken; at best, it only works if one thread is setting the event. Setting an event that's already set just results in a set event: it doesn't release the waiting thread twice, and it may not release a single waiting thread multiple times.

Now, when threads share the same variables, you want to use a mutex to prevent them from both reading/writing the same variable simultaneously (which can cause errors).
It seems important to point out that the mutex itself is shared. There's only one mutex in your example; not one per thread, or one per thread relationship.

(But if you pass a pointer, that pointer is global, and you'd need to mutex it)
Data needs protection only if it is read-write shared.
 
Locks and mutexes by definition are single threaded (they turn off multithreading). They should only really be used as a last resort.

Threading is quite a deep subject, ranging from the low level (dealing with CPU caches and atomic operations, such as test-and-set) to designing high-level architecture that scales across multiple cores/machines (a main reason for immutable data structures).

If you're interested in threading from an academic point of view, have a look at Erlang. It's a different programming model from pretty much everything else, and it works well even when dealing with millions of threads/connections (it's heavily used in telecoms for this reason). It's a bit of an unsung hero behind Snapchat; if they had tried to start with the same tech Twitter did, I don't believe they could have sustained anywhere near such insane growth. Erlang seems to be becoming more popular recently. The creators, Robert Virding and Joe Armstrong, have some interesting talks on the net (I've also met Robert in person).

Another interesting language is Go. It has quite a nice threading syntax; they even named the language after the keyword for running code concurrently!
 
Locks and mutexes by definition are single threaded (they turn off multithreading). They should only really be used as a last resort.
This isn't very accurate. In particular, "used as a last resort" can lead to very poor decisions.

While threads are low-level constructs, developers can make very effective use of them without worrying about CPU caches or atomic test-and-set operations. If you tell them they shouldn't use pre-built abstractions like locks and mutexes, you're obliging them to implement similar constructs themselves, and they'll probably make bad implementations when they do, because they'll trip over those subtleties, have to worry about machines with different memory models, and so on.
 
Locks force code to run in one thread. While they're the easiest way to ensure thread safety, they have a cost (namely performance; see Amdahl's law). The lock order must be tightly choreographed or you end up with deadlocks.

There's no problem with them being used sparingly, but there are often alternatives, like using atomic operations as mentioned before. Also volatile types, which stop threads from creating their own copies of a variable. And MVCC (multiversion concurrency control), which is quite a popular alternative to locking in most database engines.

You won't find big locks in any system where performance is a factor. Threading is difficult and there isn't just one technique that works everywhere. It's also notoriously hard to test (and debug!).
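
For what it's worth, here's a minimal sketch of the atomic-operations alternative mentioned above (C++11 std::atomic; the counts are just for illustration). No mutex, and the result is still deterministic:

Code:
// A shared counter incremented from several threads with no mutex.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

std::atomic<long> g_hits(0);

int main()
{
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t)
        workers.emplace_back([] {
            for (int i = 0; i < 100000; ++i)
                g_hits.fetch_add(1);   // atomic read-modify-write, no lock
        });

    for (auto& w : workers)
        w.join();

    printf("hits = %ld\n", g_hits.load());  // always 400000
}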
 
Locks force code to run in one thread.
This simply isn't true. When a lock is acquired, other threads can't acquire the same lock. But they can be off doing other work that doesn't involve that lock instance. Further, any thread can [eventually] acquire the lock, so the code in question isn't running in only one thread. I think it's more accurate to say that "a lock prevents more than one thread from entering a specific block of code", or maybe "locks are used to prevent more than one thread from simultaneously manipulating the same data".
 
Locks force code to run in one thread. While they're the easiest way to ensure thread safety, they have a cost (namely performance; see Amdahl's law). The lock order must be tightly choreographed or you end up with deadlocks.

There's no problem with them being used sparingly, but there are often alternatives, like using atomic operations as mentioned before. Also volatile types, which stop threads from creating their own copies of a variable. And MVCC (multiversion concurrency control), which is quite a popular alternative to locking in most database engines.

You won't find big locks in any system where performance is a factor. Threading is difficult and there isn't just one technique that works everywhere. It's also notoriously hard to test (and debug!).

Wiseguy, no offense, but your information about locks is completely incorrect. I'm only calling you out on it because I wouldn't want the OP to be confused.

Locks or mutexes do not, in any way, turn off multi-threading and force code to run in one thread. In fact, the opposite is usually the case. You need the locks because the code is being executed by multiple threads.

Volatile types do not prevent you from needing locks and mutual exclusion.

Locks' cost does not relate to Amdahl's law that directly. Amdahl's law says the maximum possible speedup from concurrent processing is limited by the time spent in sequential processing. The sequential combination step of concurrent processing is related more to the nature of the problem and the algorithm than to the performance of the locking mechanism.

Claiming locks are not used in systems where performance is a big factor also sounds suspect. They're very common in performance-oriented software, though of course it is best to limit the amount of time threads wait on a lock, and therefore to limit the locking to the smallest portion of code possible.




This simply isn't true. When a lock is acquired, other threads can't acquire the same lock. But they can be off doing other work that doesn't involve that lock instance. Further, any thread can [eventually] acquire the lock, so the code in question isn't running in only one thread. I think it's more accurate to say that "a lock prevents more than one thread from entering a specific block of code", or maybe "locks are used to prevent more than one thread from simultaneously manipulating the same data".

Mike, I would prefer the 2nd description, because the first may not hold (i.e. the developer screws up), though there are locking mechanisms defined on a code block that would help to prevent such a mistake.
 
I'm making it a personal goal to become more skilled in multithreaded programming, and I plan to come up with a discrete goal for achieving that. Maybe join an open source project that deals a lot with threading, maybe make a phone app, maybe research a topic and give a talk to fellow programmers... anyway, some project for this year. So I'm asking a very open-ended question: how could I immerse myself in threading with some goal in mind? Could any of you with experience toss some ideas my way? No idea is dumb or too random.

Keep it simple; there seems to be a lot of confusion and overcomplication about threading in this thread.

Here's what I might suggest: come up with an algorithm and implement it using a single thread, such as a divide-and-conquer algorithm (merge sort, matrix multiplication, etc.). Time how long it takes to solve a large data set. Then modify it to make it multi-threaded and see if you can speed it up when executed on a multi-core processor. Measure the time it takes and compare it to your expectations. How does it compare to Amdahl's law when you compare the merge step to the portion computed in parallel?
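
To sketch the shape of that experiment (with a divide-and-conquer sum standing in for merge sort, and made-up sizes and cutoffs), you might end up with something like this:

Code:
// Time a sequential divide-and-conquer reduction against a threaded one.
#include <chrono>
#include <cstdio>
#include <future>
#include <numeric>
#include <vector>

long long SumSeq(const long long* a, std::size_t n)
{
    return std::accumulate(a, a + n, 0LL);
}

long long SumPar(const long long* a, std::size_t n, int depth)
{
    if (depth <= 0 || n < 100000)   // small pieces: not worth a thread
        return SumSeq(a, n);
    // Divide: left half on a new thread, right half on this one.
    std::future<long long> left =
        std::async(std::launch::async, SumPar, a, n / 2, depth - 1);
    long long right = SumPar(a + n / 2, n - n / 2, depth - 1);
    return left.get() + right;      // conquer: the sequential merge step
}

int main()
{
    std::vector<long long> data(50000000, 1);

    auto t0 = std::chrono::steady_clock::now();
    long long s1 = SumSeq(data.data(), data.size());
    auto t1 = std::chrono::steady_clock::now();
    long long s2 = SumPar(data.data(), data.size(), 3);  // up to 8 threads
    auto t2 = std::chrono::steady_clock::now();

    using std::chrono::duration_cast;
    using std::chrono::milliseconds;
    printf("seq: %lld in %lld ms\n", s1,
           (long long)duration_cast<milliseconds>(t1 - t0).count());
    printf("par: %lld in %lld ms\n", s2,
           (long long)duration_cast<milliseconds>(t2 - t1).count());
}

Comparing the two timings against your core count gives you the Amdahl comparison: the final left.get() + right combines are the sequential fraction.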

If you want to move on to something more complicated, you can implement a client/server application. For example, you could write a proxy server which receives HTTP requests, retrieves the document, and sends it to the client. The server would listen on the main thread on the main socket/port. When it receives a new connection, it usually receives a new socket and then passes it to a new thread to handle communication with the client (sending the files or data). Create a cache for your server to store commonly requested documents, and make sure that the cache is not corrupted when accessed by multiple threads at the same time.
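
For the cache piece, a minimal sketch might be a map guarded by a single mutex; FetchDocument() here is a made-up stand-in for the real HTTP retrieval:

Code:
// A document cache that multiple client threads can hit safely.
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

std::string FetchDocument(const std::string& url)
{
    return "<html>stub for " + url + "</html>";  // stand-in for the real fetch
}

class DocumentCache
{
public:
    std::string Get(const std::string& url)
    {
        {   // hold the lock only while touching the map
            std::lock_guard<std::mutex> guard(m_lock);
            auto it = m_docs.find(url);
            if (it != m_docs.end())
                return it->second;             // cache hit
        }
        std::string doc = FetchDocument(url);  // slow work done unlocked
        std::lock_guard<std::mutex> guard(m_lock);
        m_docs[url] = doc;                     // publish for other threads
        return doc;
    }

private:
    std::mutex m_lock;
    std::unordered_map<std::string, std::string> m_docs;
};

int main()
{
    DocumentCache cache;
    std::thread t1([&] { cache.Get("http://example.com/a"); });
    std::thread t2([&] { cache.Get("http://example.com/a"); });
    t1.join();
    t2.join();
}

Note the deliberate trade-off: two threads missing on the same URL may both fetch it, but the map itself can never be corrupted.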

Otherwise, use your imagination! Do you like chess? Implement something that processes chess games or chess moves in parallel!
 
Arguing over semantics; I never once referred to locks in the singular.

Wiseguy, no offense, but your information about locks is completely incorrect. I'm only calling you out on it because I wouldn't want the OP to be confused.
None taken. I don't think you interpreted my last post correctly, so I highlighted a couple of things.

Locks force code to run in one thread. While they're the easiest way to ensure thread safety, they have a cost (namely performance; see Amdahl's law). The lock order must be tightly choreographed or you end up with deadlocks.

There's no problem with them being used sparingly, but there are often alternatives, like using atomic operations as mentioned before. Also volatile types, which stop threads from creating their own copies of a variable. And MVCC (multiversion concurrency control), which is quite a popular alternative to locking in most database engines.

You won't find big locks in any system where performance is a factor. Threading is difficult and there isn't just one technique that works everywhere. It's also notoriously hard to test (and debug!).
 
None taken. I don't think you interpreted my last post correctly, so I highlighted a couple of things.
Multiple people have taken the same interpretation, so maybe semantics are actually the problem.

Java Concurrency in Practice is a great reference for Java programmers. I haven't seen a book I'd recommend for C++ or C#.
 
Thanks a bunch, guys! I appreciated every post; I'll try to address all of them in one. A lot of posts got my ideas flowing on possible projects, and thanks for making sure I don't miss some of those lesser-known topics.

On the topic of volatiles: is it true that you don't have to worry so much about things going haywire when the operations are atomic, like setting a boolean value? And as far as reading during a write, do nasty problems only occur if the data type is big enough that changing it takes more than one atomic CPU cycle?

On the subject of languages, I spend roughly equal time in C#/.NET and Objective-C on OS X. The latest C# trivializes a lot of threading by hiding what's under the hood (which kind of keeps you from needing a deeper understanding, though that's still important for the nastier bugs). Objective-C has some libraries and then it just sits on C, so that might be a better opportunity to mess around and get a deeper understanding. I'm a bit rusty on C.

I've come across Java Concurrency in Practice as the book that seems to get the most recommendations, and it does look tempting to get into Java threading as an extra excuse to try my hand at Android programming. I have little experience with Java, though; going through the book as I learn the language sounds tempting, but maybe it's too much at once.
 
The Art of Multiprocessor Programming was the other book I was thinking of.

An operation is either atomic or not atomic. On all current Intel processors, aligned 32-bit reads and writes are atomic. If T1 writes a value to such a location, any other thread reading it will see the complete write.

To explain what might go wrong, let's say the memory was misaligned. In that case, maybe the processor writes the first 16 bits, then skips a cycle or two, then writes the last 16 bits. T1 might write the value while T2 tries reading it, so T2 sees a half-baked update: only the first 16 bits are set! It of course doesn't know the update was half-baked, so it carries on working -- with bad data. Meanwhile, T1 writes the remaining 16 bits, and now everything is broken.
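
A way to see the contrast in code: std::atomic guarantees whole-value reads and writes, so the reader below can never observe mismatched halves (a sketch; the bit pattern is just a cheap way to detect a torn read):

Code:
// The writer keeps both 32-bit halves equal; a torn read would see them differ.
#include <atomic>
#include <cstdint>
#include <cstdio>
#include <thread>

std::atomic<uint64_t> g_value(0);   // reads and writes are always whole

int main()
{
    std::thread writer([] {
        for (uint64_t i = 0; i < 1000000; ++i)
            g_value.store((i << 32) | i);   // halves always match
    });
    std::thread reader([] {
        for (int i = 0; i < 1000000; ++i) {
            uint64_t v = g_value.load();    // never half-baked
            if ((v >> 32) != (v & 0xFFFFFFFFu))
                printf("torn read!\n");     // never prints; a plain (non-atomic)
                                            // uint64_t on a 32-bit build could
        }
    });
    writer.join();
    reader.join();
    return 0;
}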

The meaning of "volatile" changes from language to language. In C++, it means the compiler must not enregister the value as an optimization. Consider this code:

Code:
// global quit flag!
bool bQuitting = false;

// how much work is done?
long nLoops = 0;

void DoWork();  // the real work, defined elsewhere

void MyThreadFunction()
{
   while (!bQuitting)
   {
      DoWork();
      nLoops += 1;
   }
}

In this example, we have a thread function that's going to loop, doing work, until someone sets bQuitting to true. Presumably, some other thread can come along, decide we're shutting down, and set bQuitting to true. Further, each time our thread does some work, it increments a loop-count variable to show what's going on.

As it is written, the compiler is allowed to optimize this code by assuming that no other code can write to the bQuitting flag. It may either enregister the value, or it might optimize it out completely and just loop while(true). Either way, it's a disaster: we can't ever quit our thread! The controlling thread will poke bQuitting to true, but this code isn't ever looking at it and runs forever.

The same is true for the nLoops variable. Maybe it gets loaded into a register, and the code just increments the register. The increased value is never visible to anyone outside this thread.

These are contrived examples, but they're meaningful. They can both be fixed by marking the two variables "volatile", which tells the compiler to never enregister them and to assume their values can be changed outside the visible semantics of the program.
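
Here's the snippet with that fix applied. (Worth noting: in standard C++ today, std::atomic<bool>/std::atomic<long> is the portable way to get this guarantee plus well-defined cross-thread visibility; volatile alone promises nothing about atomicity or ordering.)

Code:
// The same loop with both variables marked volatile: the compiler must
// re-read bQuitting and re-write nLoops through memory on every iteration.
volatile bool bQuitting = false;
volatile long nLoops = 0;

void DoWork();  // the real work, defined elsewhere

void MyThreadFunction()
{
   while (!bQuitting)   // re-loaded from memory each time around
   {
      DoWork();
      nLoops += 1;      // stored back to memory each time around
   }
}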

Java has a more rigorous definition because Java is both a language and a machine. (C++ is just a language, running on someone else's machine.) TBH, I haven't looked up the formal definition of "volatile" for C#, but I figure it's a bit more like the Java definition.
 
TBH, I haven't looked up the formal definition of "volatile" for C#, but I figure it's a bit more like the Java definition.

Yeah, the same: https://msdn.microsoft.com/en-us/library/x13ttww7.aspx

The volatile keyword indicates that a field might be modified by multiple threads that are executing at the same time. Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread. This ensures that the most up-to-date value is present in the field at all times.

The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access.

Compare to: http://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html
Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up to the change.

There is also the language spec for Java: http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4 and for C# (still download-only/installed with VS; not sure why there isn't an online version), section 10.5.3: https://www.microsoft.com/en-us/download/details.aspx?id=7029
 
This simply isn't true. When a lock is acquired, other threads can't acquire the same lock. But they can be off doing other work that doesn't involve that lock instance. Further, any thread can [eventually] acquire the lock, so the code in question isn't running in only one thread. I think it's more accurate to say that "a lock prevents more than one thread from entering a specific block of code", or maybe "locks are used to prevent more than one thread from simultaneously manipulating the same data".

As a concrete example, one might consider taking a look at this method, then contemplating some of its possible uses.
http://en.cppreference.com/w/cpp/thread/mutex/try_lock

As they mention in the comment inside the else block, the particular resource examined wasn't available, but there could be some other work to be done while the thread waits its turn. The threads are not sharing the resource concurrently, but that doesn't mean the program isn't working in parallel.
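
A small sketch of that pattern with std::mutex::try_lock: when the lock isn't available right now, the thread does something else useful instead of blocking (the loop counts and sleep are just for illustration):

Code:
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>

std::mutex g_resource;      // guards some shared resource
int        g_uses = 0;

void Worker(int id)
{
    int otherWork = 0;
    for (int i = 0; i < 5; ++i)
    {
        if (g_resource.try_lock())   // grab it only if it's free right now
        {
            ++g_uses;                // use the shared resource briefly
            g_resource.unlock();
        }
        else
        {
            ++otherWork;             // busy? spend the time on other work
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    printf("thread %d: %d units of other work\n", id, otherWork);
}

int main()
{
    std::thread a(Worker, 1);
    std::thread b(Worker, 2);
    a.join();
    b.join();
    printf("resource used %d times\n", g_uses);
}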
 