For example, if I had an array of 32-bit ints with 32 positions and I wanted to use 2 threads, could I write to positions 0-15 with the first thread and to positions 16-31 with the second thread at the same time?
I thought I read somewhere that this wasn't possible without using locks.
You can't use a 64-bit int because the OS doesn't use a 64-bit thread ID.
Locks are only needed where there's a chance of resource collision - let's say that both threads want to flip a bit in the same int. You could have T1 read, modify & write, and then T2 read, modify & write, and you don't have a problem. The danger comes in when you have T1 read, T2 read, T1 modify & write, T2 modify & write - T2 will overwrite the change made by T1.
If it's possible to apply a bitmask in an atomic fashion, you're set. And if you can change your fundamental data type to something where you don't have to worry about conflicts, you're also fine (i.e. if you're using ints as bools, it doesn't much matter - all writes are going to push them to 0).
Setting a bit to 1 isn't atomic; it's read-modify-write. Only if each thread has its own memory will you avoid contention.
Look for environment variable NUMBER_OF_PROCESSORS.
Not sure what the point of this thread is. The maximum speed-up is gained by using the smallest number of threads that occupies all cores and all hyperthreads. Any more than that and you are doing a lot of context switching and other stuff with memory. This is not "work" but "overhead", which takes away from the available processing time.
> Not sure what the point of this thread is.

It's to help cyclone3d learn how to program with threads, I think. That is, if it is okay with you!

The way I have the threads set up, they keep track of what areas of the array they are assigned to work on. No two threads ever work on the same part of the array.

Also, in looking up atomic bit functions, it looks like they automatically use locks, which I definitely don't want... or am I missing something here?
> Look for environment variable NUMBER_OF_PROCESSORS.

This environment variable is unreliable and incomplete. You should call GetSystemInfo, then look at the dwActiveProcessorMask and dwNumberOfProcessors fields of the returned SYSTEM_INFO to see how many processors there are and which ones are available. You can then call GetProcessAffinityMask to see which processors the user has configured for your run of the application.
> The way I have the threads set up, they keep track of what areas of the array they are assigned to work on. No two threads ever work on the same part of the array.

If your code is such that more than one thread never touches the same part of the array at the same time, then you have no contention.
> Also, in looking up atomic bit functions, it looks like they automatically use locks, which I definitely don't want... or am I missing something here?

To which specific functions are you referring?
> To which specific functions are you referring?

The ones in atomic.h ... but after looking over them again, it looks like those are just made "atomic" by use of locks, so they really aren't atomic in the first place.
Is there any way to set a single bit in a variable to 0 without doing a read-compare-write?
> ...it looks like those are just made "atomic" by use of locks, so they really aren't atomic in the first place.

The locks make them atomic.
> Is there any way to set a single bit in a variable to 0 without doing a read-compare-write?

No. This should be obvious; think of what you're asking. You want to read a word (or byte, or something) from memory. Then you want to change one bit. Then you want to write it back. If the value you're reading to start with is in the processor's cache, then it has to be sure that no other processor (or core) has it cached as well. Maybe it turns out that what's in cache is dirty, and that test isn't atomic. If the value you're reading to start with isn't in cache, the processor has to go to memory to get it. That costs more than 100 clock cycles, usually. Locking a cache line of memory for that long would significantly impact the performance and scalability of multi-core and multi-processor systems.