I am trying to understand various approaches to threading in Maple. I can naturally parallelize some computations, but only if I can figure out threading. The punchline is this: a computation takes significantly longer when I break it into threads than when I run it on a single core.
I picked a command that takes some time - suppose I have a Lie algebra G, and I want to know if H is a subalgebra. If I do this 16 times in a row, it takes about 3 seconds consistently. Here I made an array of 16 copies of H. Then I can simply ask which element is a subalgebra:
for i in list_of_algs do
Timing the loop as follows gives me an answer of about 3 seconds:
st := time();
time() - st;
Suppose I want to run this on cores = 4 or 5 or...
are_subalgebras_t := proc(list, cores)
local ...., mutex, threadlist;
mutex := Threads[Mutex][Create]();
<create cores chunks of list, call it sublist>
for i in sublist do
threadlist:= [op(ts), Create(are_subalgebra(i))];
I have removed the unnecessary code, but the procedure does the following:
1: Create a mutex
2: Chunk up my list into as many pieces as I want
3: Lock my mutex
4: Spawn a thread on the current chunk, save the thread id in a list
5: Unlock the mutex (I just want to add thread ids to my list safely)
6: Repeat 3-5 for each chunk
7: Wait until all threads are done
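For concreteness, the seven steps above could be sketched roughly as follows. This is only a sketch under assumptions, not the procedure from the post: slow_check is a hypothetical stand-in for the subalgebra test, and check_chunk, run_threaded and the shared results Array are names made up for illustration. It also assumes a non-empty input list.

```maple
# Hypothetical stand-in for the subalgebra test (assumption, not the real query).
slow_check := proc(n)
    local k, s;
    s := 0;
    for k to n do
        s := s + k;
    end do;
    return s;
end proc:

# Work done by one thread: process a chunk and store the answers in slot j.
check_chunk := proc(chunk, out, j)
    out[j] := map(slow_check, chunk);
end proc:

run_threaded := proc(items::list, cores::posint)
    local n, size, chunks, results, ids, m, j;
    n := nops(items);
    size := ceil(n / cores);
    # Step 2: split the list into at most `cores` contiguous chunks.
    chunks := [seq(items[(j - 1)*size + 1 .. min(j*size, n)], j = 1 .. ceil(n / size))];
    results := Array(1 .. nops(chunks));
    ids := [];
    # Step 1: create the mutex.
    m := Threads[Mutex][Create]();
    for j to nops(chunks) do
        Threads[Mutex][Lock](m);      # step 3
        # Step 4: spawn a thread on the current chunk and record its id.
        ids := [op(ids), Threads[Create](check_chunk(chunks[j], results, j))];
        Threads[Mutex][Unlock](m);    # step 5
    end do;
    # Step 7: wait until all threads are done.
    Threads[Wait](op(ids));
    Threads[Mutex][Destroy](m);
    return [seq(op(results[j]), j = 1 .. nops(chunks))];
end proc:
```

A call such as run_threaded([seq(10^5, i = 1 .. 16)], 4) could be timed with st := time[real](); before it and time[real]() - st; after it. time[real]() reports wall-clock time, which is a choice made for this sketch rather than the time() call used in the post.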
My machine has 16 cores, and if I run this on, say, 8 cores it takes almost 10 seconds. I can see that the threads start executing immediately: in an example run, two threads might be spawned, then the first thread starts, then the third is spawned, and then they complete in some order. Predictably, every run is different. This tells me that I am getting parallelization, but to the detriment of runtime.
Can anyone tell me where my approach is going wrong? I was unable to get any Map functions to work at all.
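For reference, a minimal sketch of the kind of Map call in question might look like the following. It assumes the Threads[Map] command and uses a hypothetical slow_check procedure in place of the subalgebra test; whether the Lie algebra routines themselves are thread-safe is not established here.

```maple
# Hypothetical stand-in for the subalgebra test (assumption, not the real query).
slow_check := proc(n)
    local k, s;
    s := 0;
    for k to n do
        s := s + k;
    end do;
    return s;
end proc:

# Apply slow_check to all 16 inputs; Maple decides how to split the work across threads.
Threads[Map](slow_check, [seq(10^5, i = 1 .. 16)]);
```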