# MaplePrimes Commons General Technical Discussions

The primary forum for technical discussions.

### OMP_NUM_THREADS and 64bit Maple on Windo...

December 10 2011 by Maple

Yesterday I wrote a post that began,

"I realized recently that, while 64bit Maple 15 on Windows (XP64, 7) is now using accelerated BLAS from Intel's MKL, the Operating System environment variable OMP_NUM_THREADS is not being set automatically."

But that first sentence is about where the post stopped being correct, at least in how I was interpreting the performance of 64bit Maple on Windows. So I've rewritten the whole post, and this is the revision.

In the original post I concluded that, by setting the Windows operating system environment variable OMP_NUM_THREADS to 4, performance would double on a quad core i7. I even showed timings to help establish that. And since I know that memory management and dynamic linking can cause extra overhead, I re-ran all my examples in freshly launched GUI sessions, with the user-interface completely closed between examples. But I got caught out in a mistake, nonetheless. The problem was that there is extra real-time cost to having my machine's Windows operating system dynamically open the MKL dll the very first time after bootup.

So my examples done first after bootup were at a disadvantage. I knew that I could not look just at measured cpu time, since for such threaded applications that reports as some kind of sum of cycles for all threads. But I failed to notice the real-time measurements were being distorted by the cost of loading the dlls the first time. And that penalty is not necessarily paid for each freshly launched, completely new Maple session. So my measurements were not fair.

Here is some illustration of the extra real-time cost, which I was not taking into account. I'll do Matrix-Matrix multiplication for a 1x1 example, to try and show just how much this extra cost is unrelated to the actual computation. In these examples below, I've done a full reboot on Windows 7 where so annotated. The extra time cost for the very first load of the dynamic MKL libraries can be from 1 to over 3 seconds. That's about the same as the cpu time this i7 takes to do the full 3000x3000 Matrix multiplication! Hence the confusion.

Roman brought up hyperthreading in his comment on the original post. So part of redoing all these examples, with full restarts between them, is testing each case both with and without hyperthreading enabled (in the BIOS).

Quad core Intel i7. (four physical cores)

-------------------------------

> restart: # actual OS reboot
> getenv(OMP_NUM_THREADS);   # NULL, unset in OS

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=217.18KiB, alloc change=127.98KiB, cpu time=219.00ms, real time=3.10s

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns

> restart: # actual OS reboot
> getenv(OMP_NUM_THREADS);
"4"

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=216.91KiB, alloc change=127.98KiB, cpu time=140.00ms, real time=2.81s

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns

------------------------------

> restart: # actual OS reboot
> getenv(OMP_NUM_THREADS);    # NULL, unset in OS

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=217.00KiB, alloc change=127.98KiB, cpu time=202.00ms, real time=2.84s

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns

> restart: # actual OS reboot
> getenv(OMP_NUM_THREADS);
"4"

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=215.56KiB, alloc change=127.98KiB, cpu time=187.00ms, real time=1.12s

> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns



Having established that the first use after reboot was incurring a real time penalty of a few seconds, I redid the timings in order to gauge the benefit of having OMP_NUM_THREADS set appropriately. These too were done with and without hyperthreading enabled. The timings below appear to indicate that slightly better performance can be had for this example when hyperthreading is disabled. The timings also appear to indicate that having OMP_NUM_THREADS unset results in performance competitive with having it set to the number of physical cores.

Hyperthreading disabled in BIOS
-------------------------------

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=217.84KiB, alloc change=127.98KiB, cpu time=141.00ms, real time=142.00ms

> getenv(OMP_NUM_THREADS);  # NULL, unset in OS

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.50s, real time=1.92s

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=217.84KiB, alloc change=127.98KiB, cpu time=141.00ms, real time=141.00ms

> getenv(OMP_NUM_THREADS);
"1"

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.38s, real time=7.38s

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=217.11KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms

> getenv(OMP_NUM_THREADS);
"4"

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.57s, real time=1.94s

Hyperthreading enabled in BIOS
------------------------------

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=216.57KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms

> getenv(OMP_NUM_THREADS);  # NULL, unset in OS

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=8.46s, real time=2.15s

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=216.80KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms

> getenv(OMP_NUM_THREADS);
"1"

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.35s, real time=7.35s

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=216.80KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms

> getenv(OMP_NUM_THREADS);
"4"

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=8.56s, real time=2.15s

> restart:
> CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
memory used=216.80KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms

> getenv(OMP_NUM_THREADS);
"8"

> M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
> CodeTools:-Usage( M . M ):
memory used=68.67MiB, alloc change=68.74MiB, cpu time=8.69s, real time=2.23s


With all those new timing measurements, it appears that setting the global environment variable OMP_NUM_THREADS to the number of physical cores may not be necessary: performance is comparable when that variable is left unset. So, while this post is now a non-story, it's interesting to know.
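As a rough check on that reading, the single-thread real time in each group can be divided by the multi-thread real times reported above. These ratios are plain arithmetic on the posted timings, not additional measurements:

```latex
\begin{aligned}
\text{first group (hyperthreading disabled):}\quad
  & \frac{7.38}{1.92} \approx 3.84 \ \text{(unset)}, \qquad
    \frac{7.38}{1.94} \approx 3.80 \ \text{(set to 4)} \\[4pt]
\text{second group (includes the 8-thread run):}\quad
  & \frac{7.35}{2.15} \approx 3.42 \ \text{(unset, and set to 4)}, \qquad
    \frac{7.35}{2.23} \approx 3.30 \ \text{(set to 8)}
\end{aligned}
```

Within each group the unset case gives essentially the same speedup as an explicit setting of 4, which is what the conclusion above rests on.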

And the lesson about comparative timings is also useful. Sometimes even a complete GUI/kernel relaunch is not enough to get a level and fair field for comparison.
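One way to guard against this kind of distortion is to make the warm-up call part of the timing routine itself. The following is only a minimal sketch: the procedure name `TimeProduct` is hypothetical (it is not from the original post), and it assumes a Maple version in which `time[real]()` is available for wall-clock time. It reports both the cpu time, which sums over threads, and the real time, which is the figure that matters for threaded BLAS.

```maple
# Hypothetical helper, not part of the original post.
# Times an n x n float[8] Matrix product only after a throwaway call
# has forced the external libraries to be loaded.
TimeProduct := proc(n::posint)
    local M, t_cpu, t_real;
    # Warm-up: a 1x1 product, so that any first-load cost of the
    # external libraries is not billed to the measurement below.
    Matrix([[3.]]) . Matrix([[3.]]);
    M := LinearAlgebra:-RandomMatrix(n, datatype = float[8]);
    t_cpu  := time();        # cpu time, summed over all threads
    t_real := time[real]();  # wall-clock time
    M . M;
    return time() - t_cpu, time[real]() - t_real;
end proc:

TimeProduct(3000);
```

The warm-up only absorbs the load cost within the current session. It does not make a run taken immediately after a reboot comparable with later runs in any other respect, so repeating the measurement a few times is still advisable.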

### Supported platforms for Maple 16

December 09 2011 by Maple

As we look ahead to our next Maple release, I wanted to let you know of some changes that we are making to the platforms and operating systems that will be supported by Maple 16.

With Maple 16 we will be adding support for Linux Ubuntu 11.10 and Macintosh OS X 10.7 while dropping support for Linux Ubuntu 10.10 and Macintosh OS X 10.5. As a result, we will no longer support Maple on the PPC platform (Apple stopped PPC support as of OS X 10.5).

If...

### publisher rejects Maple graphs

December 05 2011 by Maple

I recently had a journal article accepted but was told that several of the graphs had to be redone because the axes were not thick enough to reproduce. Unfortunately, Maple has no way to edit the axes for thickness, so I had to export the differential equations to Matlab and integrate them there to get publication quality graphs. I have been having more trouble every year with the quality of Maple's graphics output, and this really puts a cherry on it. I would be pleased as punch if Maple could include a graph editor that would let us customize a graph to make it presentable to publishers. Maple does so many things so well it is a shame to leave this on the back burner.

### Computing Implicit Derivatives

November 22 2011 by

Over the weekend I was attempting to estimate the tension change in a bicycle spoke due to an applied load.  After various simplifications and approximations, the problem was reduced to the following.

Given a constraint, F(x,y) = 0, and functions G(x,y) and H(x,y), find dG/dH at a particular point, here (0,0).

The constraint, F, was sufficiently complicated that solving for either variable was not feasible, so implicit differentiation seemed the best...
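One standard way to set this up (a sketch, and not necessarily the route taken in the full post) is to treat y as a function of x along the constraint curve. Assuming subscripts denote partial derivatives, all evaluated at the point of interest, and that F_y is nonzero there:

```latex
\begin{aligned}
F_x + F_y \,\frac{dy}{dx} &= 0
  \quad\Longrightarrow\quad \frac{dy}{dx} = -\frac{F_x}{F_y}, \\[4pt]
\frac{dG}{dH}
  &= \frac{G_x + G_y \,\dfrac{dy}{dx}}{H_x + H_y \,\dfrac{dy}{dx}}
   = \frac{G_x F_y - G_y F_x}{H_x F_y - H_y F_x}.
\end{aligned}
```

The final form needs only the first partial derivatives of F, G and H at the point, and it is defined whenever the denominator H_x F_y - H_y F_x is nonzero.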

### 3D plot rotation efficiency

October 12 2011 by Maple

I was recently looking at rotating a 3D plot, using plottools:-rotate, and noticed something inefficient.

In the past few releases of Maple, efficient float[8] datatype rtables (Arrays or hfarrays) can be used inside the plot data structure. This can save time and memory, both in terms of the users' creation and manipulation of them as well as in terms of the GUI's ability to use them for graphic rendering.

What I noticed is that, if one starts with a 3D plot data structure containing a float[8] Array in the MESH portion, then following application of plottools:-rotate a much less efficient list-of-lists is produced in the resulting structure.

Likewise, an efficient float[8] Array or hfarray in the GRID portion of a 3D plot structure gets transformed by plottools:-rotate into an inefficient list-of-lists object in the MESH portion of the result. For example,

restart:

p:=plot3d(sin(x),x=-6..6,y=-6..6,numpoints=5000,style=patchnogrid,
axes=box,labels=[x,y,z],view=[-6..6,-6..6,-6..6]):

seq(whattype(op(3,zz)), zz in indets(p,specfunc(anything,GRID)));
hfarray

pnew:=plottools:-rotate(p,Pi/3,0,0):

seq(whattype(op(1,zz)), zz in indets(pnew,specfunc(anything,MESH)));
list


The efficiency concern is not just a matter of occupying space in memory. It also relates to the optimal attainable methods for subsequent manipulation of the data.

It may be nice and convenient for plottools to get as much mileage as it can out of plottools:-transform, internally. But it's suboptimal. And plotting is a topic where dedicated, optimized helper routines for some particular data format are justified and of merit. If we want plot manipulation to be fast, then both Library-side and GUI-side operations need more case-by-case optimization.

Here's an illustrative worksheet, using and comparing memory performance with a (new, alternative) procedure that does inplace rotation of a 3D MESH. plot3drotate.mw

### pre-sized plots

September 24 2011 by Maple

The goal here is to produce plots for inclusion inside Worksheets or Documents of the Standard GUI at specific sizes.

When manually resizing an existing plot, using the mouse pointer, there is no visual cue as to what pixel size has been attained. Hence any worksheet author who wishes to produce a plot of size 600x600 is presented with two barriers. The first is that resizing must be done manually, and the second is that there is no convenient mechanism showing the actual size attained.

The Resize package attempts to address these barriers by allowing construction of a plot, inside a worksheet, with programmatically specified width and height in pixels.

The default behaviour of the package is to produce the plot inside a new Worksheet, from whence it may be selected and copied. An optional behaviour is to show the constructed plot inside a Task Template (a form of help-page), where it may be previewed for correctness and inserted into the current Worksheet or Document at the press of a single button.

It appears to function for both 2D and 3D single plots.

It won't work for so-called Array plots, which are collections of multiple plots displayed side-by-side inside a worksheet table.

This first version is a bit rough. The plot is currently being inserted as input, which is why it isn't centered on the page. I suspect that it would be best to insert the first argument (e.g. a plot call) as input to an execution group, and then have the plot be the output. That would look, and hopefully act, just as usual. And with the plot call inserted as input, the original Resize call could be neatly deleted if desired.

To install this thing, use the File->Open from the Standard GUI's menubar. Choose this .mla file as the thing to open. (You may have to slide a scrollbar, and select a view of "All Files", in order to see it in the pop-up File Manager.) Double-clicking on the file, to launch it, should ideally also open it but it looks like that functionality broke for Maple 15.

Resize_installer.mla

Alternatively, you could run the command,

march( 'open', "...full...path...to...Resize_installer.mla");

The attached .mla archive is a (graphically) self-unpacking installer, when opened in this way.

The bundled materials include a pre-built .mla containing the package itself, the source code and a worksheet that rebuilds it from source if desired, a short example worksheet, and a worksheet that rebuilds the whole installer (and re-bundles all those files into it). I used the InstallerBuilder to make the self-unpacking .mla installer, as I think it's a handy tool that is under-appreciated (and, alas, under-documented!).

It's supposed to work without the usual hassle of having to set libname. This is an automatic consequence of the place in which it gets installed.

It seems to work in Maple 12, 14, and 15, on Windows 7. Let me know if you have problems with it.

acer

### comments should be up-vote-able and...

September 15 2011 by

I have remarked on this ever since the launch of the second incarnation of mapleprimes. And I recall others expressing similar feelings.

1. It should be possible to "vote" for comments in the same way as we can vote for "answers".

### MRB constant P

September 04 2011 by Maple

The MRB constant =

Concerning the following divergent and convergent series, we see that

=

### Numbers and the Debut of the Algebra...

August 26 2011 by Maple

### Pushing dsolve to its limits

August 19 2011 by Maple 15

And so with this provocative title, "pushing dsolve to its limits", I want to share some difficulties I've been having in doing just that. I'm looking at a dynamic system of 3 ODEs. The system has a continuum of stationary points along a line. For each point on the line, there exists a stable (center) manifold, also a line, such that the point may be approached from both directions. However, simulating the converging trajectory has proven difficult.

I have simulated as...

### Not working

August 19 2011 by Maple 12

I am on Maple 12 and this is not working.

My programming skills are very limited, but now that I see roughly how to do this (thanks to everyone on here) I can try to tweak it in order to make it work.

As for the sequence... it isn't as nice as it looks; the pattern doesn't continue.

### ACM SIGSAM Richard Dimick Jenks Memorial...

July 28 2011 by Maple

It may be of interest to this community that the original Maple Project at the University of Waterloo was awarded the ACM SIGSAM Richard Dimick Jenks Memorial Prize in June, at the joint ISSAC/SNC 2011 conference (overlap with FCRC 2011).

See here for the announcement.

### MapleSim 5 - Taking Physical Modeling...

June 15 2011 by MapleSim 5

MapleSim has been delivering unique advantages in physical modeling and system simulation for many years. Today we release the latest iteration: MapleSim 5. Looking back at some of the earlier versions of our software, it is hard to believe that this is the same product; from the user interface to the component libraries to the simulation engine, every part of the system has experienced a striking evolution.

Like its predecessors, MapleSim 5 is based on the Maple mathematical...

### The order of 1 in GF

June 15 2011 by Maple

The order of 1 in any finite field (that I tried) created by GF is NULL. For example,

F:=GF(3,2):
use F in order(one) end;


It should be 1.

Alec

### Convergents Constants

May 30 2011 by

Here is the progress made in the investigation of what I call the convergents constants:
https://oeis.org/wiki/Table_of_convergents_constants

I wonder if anyone would be interested in adding anything to it. I would like to see the convergents constants studied some in Maple to compare with my Mathematica results; my investigation is in dire need of some proof other than my...
