I received this comment on an older post of mine and was going to post a reply. However, the reply turned out to be pretty long, and I was also missing a post for last Friday, so I figured I'd cheat and make the reply a blog post of its own.
First, I'd like to thank everyone who comments on my posts; the comments are very helpful to me. I will try to keep posting regularly, but I am once again running low on topics that apply to Maple. I'm thinking about expanding to topics that involve parallel programming but are not closely related to Maple. Would people be interested in this kind of material?
It is hard for me to say anything concrete without seeing other people's code, but I can give some general advice.
First, using the Task Model does not make your code thread safe, although it can make it easier for you to write thread-safe code. You still need to be very careful when using shared values. If you are getting different behaviour between running your code single threaded and running it in parallel, the odds are that your code is not thread safe (or that you are calling a thread-unsafe function).
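To illustrate the point about shared values outside of Maple, here is a minimal sketch in Python (not Maple code; the function names are made up for this example). Each task only reads its own input and returns a value, and the partial results are combined in one place after all the tasks finish, so nothing shared is written while the tasks run. Comparing the parallel answer against the single-threaded one is a cheap sanity check, although agreement on one run does not prove thread safety.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    # Pure function: reads only its argument and returns a new value,
    # so no shared state is written while tasks are running.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(values, n_chunks=4):
    # Split the input into roughly equal chunks, one per task.
    size = max(1, (len(values) + n_chunks - 1) // n_chunks)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # The partial results are combined here, after every task has finished.
    return sum(partials)

values = list(range(1000))
serial = sum_of_squares(values)
parallel = parallel_sum_of_squares(values)
print(serial == parallel)  # True: both compute the same sum
```

If the two answers ever differ, that is a strong hint that some task is touching state it shares with another.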
Using a Mutex can make critical sections thread safe, but too much locking can reduce the parallelism of your code and introduce more overhead (acquiring a mutex is not free). Good parallel code will use locks sparingly. Of course, in most situations correctness is more important than performance.
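As a rough illustration of locking sparingly, here is a Python sketch (again not Maple; `locked_total` and its parameters are hypothetical names for this example). Each thread does its work on a local variable and takes the lock only once, for the short update of the shared total, instead of locking inside the loop.

```python
import threading

def locked_total(n_threads=4, per_thread=1000):
    total = 0
    lock = threading.Lock()

    def add_range(lo, hi):
        nonlocal total
        # Do the bulk of the work on a local variable, with no locking.
        local = 0
        for k in range(lo, hi):
            local += k
        # Take the lock once, only for the brief update of the shared value.
        with lock:
            total += local

    threads = [
        threading.Thread(target=add_range,
                         args=(i * per_thread, (i + 1) * per_thread))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

print(locked_total() == sum(range(4000)))  # True
```

Locking around every `local += k` would give the same answer, but the threads would spend most of their time waiting on each other.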
The current garbage collector does occasionally run into problems when running parallel code. It is currently the biggest bottleneck we have. I've definitely seen it allocate way more memory than it should. We are working to improve the garbage collector, but it is a big, fundamental piece of the kernel, and so it is going to take some time.
The overhead of the Task Programming Model can be expensive if the time taken to compute the individual tasks is small. Much like a recursive algorithm, picking an appropriate size for your base case can have a significant effect on the running time of your algorithm. Reducing this overhead is another area that we are working on.
For well-parallelized code, the largest speedups we are seeing are in the 3 to 4 times range. We believe the garbage collector is the current limiting factor. If you are seeing speedups less than this, there may be ways to improve your code.
Are you using Arrays or arrays? The Maple "array" is an old structure that has been replaced by Array, which should be used instead. I suspect that an Array will be more efficient in parallel than an array. Similar comments apply to matrix vs Matrix.
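To make the base case idea concrete, here is a small divide-and-conquer sketch in Python (not Maple and not the Task Programming Model itself; `task_sum` and `base_size` are names invented for this example). It splits a range in half until the pieces are no larger than `base_size`, and reports how many leaf tasks were created, so you can see how quickly the task count grows as the base case shrinks.

```python
import threading

def task_sum(lo, hi, base_size):
    """Return (sum of range(lo, hi), number of leaf tasks used).

    Assumes base_size >= 1.
    """
    # Base case: the range is small enough to compute directly.
    if hi - lo <= base_size:
        return sum(range(lo, hi)), 1
    mid = (lo + hi) // 2
    left = []
    # Run the left half in a new thread; compute the right half here.
    worker = threading.Thread(
        target=lambda: left.append(task_sum(lo, mid, base_size)))
    worker.start()
    right_sum, right_leaves = task_sum(mid, hi, base_size)
    worker.join()
    left_sum, left_leaves = left[0]
    return left_sum + right_sum, left_leaves + right_leaves

for base_size in (1000, 5000, 20000):
    total, leaves = task_sum(0, 20000, base_size)
    print(base_size, total, leaves)
```

The total is the same for every `base_size`; only the number of tasks, and therefore the overhead, changes. A base case that is too small creates many tiny tasks, while one that is too large leaves no parallelism to exploit.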
In Maple 13, no functions implemented in the Library are verified to be thread safe. Functions that are implemented by the kernel are thread-safe. For Maple 14, we will have a help page that documents which functions have been verified to be thread-safe (and it should include some library functions).
Of course, using the newest version of Maple available will help. We are constantly improving Maple's parallelism. An especially significant change made during the development of Maple 13 makes it much better than Maple 12 in this respect.
I suspect that this blog is the best place to get parallel programming advice. If you are willing to share your code with us, I can take a look (assuming it does not require too much specialized knowledge). That said, I have work of my own to do, so I can't make any promises about how quickly I'll be able to comment.
-- Kernel Developer Maplesoft