These are replies submitted by acer

One has to be extra careful with identify(). A small edit to the above example yields this (incorrect) conclusion,

> identify(evalf(arctan((11+2*5^(1/2))^(1/2)/(5^(1/2)-1))));

                            (1/3)  (1/3)
                        49 2      7     
                        ----------------
                                  (5/6) 
                        80 Zeta(3)      

That returned symbolic value shares only 8 decimal digits with the true value.

Hence, for the given example, an identify() approach is only suitable if the result is subsequently tested. But if you tack on the step of converting the relevant trig call back to radicals, then the method as a whole is no longer so simple.
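
A minimal sketch of such a subsequent test, re-using the example above: compare the candidate against the original at higher working precision.

ee := arctan((11+2*5^(1/2))^(1/2)/(5^(1/2)-1)):
cand := identify(evalf(ee)):
evalf[30](ee - cand); # a residual far from zero exposes the misidentification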

acer

I forgot to mention that there is an item in the Typesetting:-RuleAssistant command's pop-up window that might be relevant. It's a checkbox labeled "Subscript Derivatives".

The help-page ?Typesetting,RuleAssistant has this,

   Subscript Derivatives - Use a subscripted sequence of variable names for partial derivatives
                           of functions with suppressed dependencies.

Unfortunately I don't fully understand what this means. The ?Typesetting,Suppress help-page doesn't mention it, and using that command to suppress the dependencies of f(x), say, and then checking the described RuleAssistant box doesn't appear to enable any subscripted f as a derivative of an operator f. I had my document typesetting level set to "Extended", and the Typesetting setting userep=true.

In fact this rule seems to work in the opposite direction from what this post is all about: it takes something like ∂f/∂x and prints it as f_x. It doesn't look like it's intended to do anything for f_x as 2D Math input. (PDEtools[declare] also appears to act in the opposite direction from this post's topic.)
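
For the record, the programmatic side of what I tried looks roughly like this (a sketch only; the "Subscript Derivatives" checkbox itself has to be set in the RuleAssistant pop-up):

interface(typesetting=extended):
Typesetting:-Settings(userep=true):
Typesetting:-Suppress(f(x)):
diff(f(x),x); # still did not display as a subscripted f for me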

acer

@belief111 

I'm sorry to disappoint you, but there really isn't much information to be had on this.

This topic is a bit advanced. It hinges on the fact that Maple uses a special form of structure, as an intermediary object, for the purpose of 2-dimensional mathematical typesetting (a.k.a. 2D Math). There isn't a full set of commands for constructing such objects in a program. I accomplished it above by programmatically (crudely) concatenating together the various portions of such objects.

There are a few other ways to construct such objects (which get interpreted by Maple's GUI as things to pretty-print as 2D Math). In a 2009 workshop, Paulina Chin described some of the basics, labelling the internal form of typeset objects as TypeMK. That is the form that has been used in this post. So I could also do the following:

You might get some insight by issuing these commands as 1D Maple Notation input in a Worksheet.

expr:=f[x]; # a table reference, not an atomic identifier
lprint(expr); # demonstrating that it is a table reference
Typesetting:-Typeset(f[x]); # now as function calls to unassigned Typesetting exports
lprint(%); # demonstrating the last claim
convert(%,`global`); # now as function calls to global function names
cat(`#`,convert(%,name)); # now a so-called atomic identifier
lprint(%);
`#msub(mi("f"),mi("x"))`;

Apart from such translation of an expression as a whole to TypeMK (as done above), there are few, if any, other commands for stitching together TypeMK structures. Hence the few people who accomplish that in programs usually do so by brute-force concatenation.
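
As a rough sketch of what such brute-force concatenation can look like (the variable names here are just for illustration), one can splice string pieces into a quoted name with cat:

base := "f": subscr := "x":
cat(`#msub(mi("`, base, `"),mi("`, subscr, `"))`);
lprint(%); # `#msub(mi("f"),mi("x"))`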

acer

@epostma sort[inplace] is still nifty, and a nice suggestion, even if it only sorts in a few possible ways for float[8].

If I recall correctly, the BLAS function daxpy might not accept negative strides on all platforms' implementations. But an `adder` might still be compiled, and made to act on the various negative and/or positive parts of the Aliased Array. It could be made to perform specialized adding, striding backward or forward from the neg/pos boundary as you suggest.
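
A minimal sketch of such a compiled adder (assuming Compiler:-Compile accepts it as written): it sums n entries of a float[8] Vector, starting from a given 1-based index, with an explicit stride which may be negative.

adder := Compiler:-Compile(
   proc(V::Vector(datatype=float[8]), start::integer, stride::integer, n::integer)
      local i::integer, s::float;
      s := 0.0;
      for i from 0 to n-1 do
         s := s + V[start + i*stride]; # stride < 0 walks backward
      end do;
      s;
   end proc ):

A call like adder(W, k, -1, k) could then sum the k entries at or below the boundary, walking backward.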

I believe that it can all be done easily, both using the 'method' option of evalf/Int and avoiding cut and paste (perhaps via a handy use of `subs`, even if that is not simpler). If you upload the worksheet, we could more easily demonstrate it to you. That's less work for us than having to cook up an example, and besides, your sheet may have other aspects that pertain. The upload avoids guesswork.
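
Here is the kind of thing I mean by a handy use of `subs`, with a placeholder integrand (since we haven't seen your worksheet) and MM a stand-in name for the method:

J := Int(exp(-x^2), x = 0..1, method = MM): # Int is inert, so MM just sits there
evalf(subs(MM = _d01ajc, J));
evalf(subs(MM = _Dexp, J));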

acer

sort[inplace] isn't documented, and it is buggy for float[8] rtables when combined with custom sorting procedures (except maybe for `>` and friends).

v := Vector[row]([-3,-7,1,9],datatype=float[8]);
                           v := [-3., -7., 1., 9.]

cv1:=copy(v):
cv2:=copy(v):

p:=proc(a,b)
   if signum(a)<>signum(b) then
      a<b;
   else
      abs(a)<abs(b);
   end if;
end proc:

sort(v); # ok
                             [-7., -3., 1., 9.]

sort(cv1,p); # ok
                             [-3., -7., 1., 9.]

sort[inplace](cv1,p); # oops, should be same as last
                             [-7., -3., 1., 9.]

cv1;
                             [-7., -3., 1., 9.]

sort[inplace](cv2,`>`); # ok
                             [9., 1., -3., -7.]

cv2;
                             [9., 1., -3., -7.]

sort(v,`>`); # ok
                             [9., 1., -3., -7.]

sort[inplace](v,(a,b)->a>b); # oops, should be same as last
                             [-7., -3., 1., 9.]

v;
                             [-7., -3., 1., 9.]


I'll submit some SCRs.
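
In the meantime, a possible workaround is to sort out-of-place (which does respect `p`) and then copy the result back over the original's data:

w := sort(cv1, p):         # out-of-place sort honours the custom proc
ArrayTools:-Copy(w, cv1):  # overwrite cv1's data with the sorted result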

acer

@Axel Vogt 

The following call, displaying (internal Maple Library) routine source, shows me that ArrayTools:-AddAlongDimension is using NAG's f06ecf for the datatype=float[8] case.

showstat(ArrayTools::AddAlongDimension2D);

And a web search shows me that NAG's f06ecf is the same as the standard BLAS function daxpy.

But there are two extra arguments in that external call from the `AddAlongDimension2D` routine. I.e.,

    ExtCall(ncols,alpha,y,i-1,incy,x,0,incx)

The two extra arguments are the ones that come right after y and x: the `i-1` and the `0`. These are offsets into the contiguous memory used for the double-precision data of those two Maple float[8] rtables.

Such offsets allow the daxpy routine (which is "technically" for 1-D arrays of data) to operate along any row or column of a 2-D array of data. Simply offset to the entry that starts the row or column in question, and then stride (incx, or incy) through the data by m, n, or 1, according to whether it is stored in Fortran order or C order (column-major or row-major). For example, when passing a Fortran-order array `x` in C, one could (carefully) pass it as daxpy(..., x+3, ...), offsetting by that many double widths the initial address that daxpy sees.
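
To see the offset arithmetic in plain Maple (no external call, and the numbers are just for show): in an m x n Fortran-order array, entry (i,j) sits at 0-based offset (i-1)+(j-1)*m, so a column has stride 1 and a row has stride m.

m, n := 3, 4:
A := Matrix(m, n, (i,j) -> 10*i + j, datatype = float[8], order = Fortran_order):
V := ArrayTools:-Alias(A, [m*n]):        # flat view of the same memory
seq(V[(2-1) + (k-1)*m + 1], k = 1..n);   # walks row 2: 21., 22., 23., 24.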

My procedure `P` calls the Nag name of the BLAS function daxpy, I believe. And that does this (in pseudo-syntax):

  x -> alpha*y + x

which is an inplace operation on `x`. So `P` uses a 1x1 float[8] Vector for x. And it walks y with a stride of 1, and x with a stride of 0 (!). So x ends up accumulating the sum of all that is in the n-Vector `y`. Something like that.

And `y` is just an Alias of the original 2D Array as a 1D Vector. So the end result is that all the entries in the Array are summed, in one big "add-along" shot.
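
A toy model, in plain Maple, of that stride-0 trick (daxpy_like here is hypothetical, merely mimicking the update rule): with incx = 0 every update lands on the same cell, so xv[1] accumulates the sum.

daxpy_like := proc(n, alpha, y, incy, x, incx)
   local i;
   for i from 0 to n-1 do
      x[1 + i*incx] := alpha*y[1 + i*incy] + x[1 + i*incx];
   end do;
end proc:
xv := Vector(1, datatype=float[8]):  # the 1x1 accumulator
yv := Vector([1.0, 2.0, 3.0, 4.0], datatype=float[8]):
daxpy_like(4, 1.0, yv, 1, xv, 0):
xv[1]; # 10., the sum of yv's entries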

You could try writing or finding an inplace sorter, and compiling it. (...and maybe give its define_external call the THREAD_SAFE option, down the road.) It needn't sort inplace on the data rtable (the Aliased Vector, say). It might be easier to find one which merely wrote out the sorted result to another temp Vector. And that Vector might be re-usable. (That way is far easier for handling, and is good for non-threaded parent code. But when parallelizing it becomes tricky again, as each thread needs its own re-usable temp Vector to hold sorted results.) This is the interesting bit. I see where you are at now. You'd like to test a fast version of this, sorting the data to get it ready for Kahan or similar careful summation. Fun.
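
For the careful-summation part, here is a sketch of Kahan (compensated) summation written so that the Compiler should accept it (assuming the data has already been placed, sorted, into the float[8] Vector):

kahan := proc(V::Vector(datatype=float[8]), n::integer)
   local s::float, c::float, t::float, u::float, i::integer;
   s := 0.0; c := 0.0;
   for i from 1 to n do
      u := V[i] - c;     # compensated addend
      t := s + u;
      c := (t - s) - u;  # recover the low-order error
      s := t;
   end do;
   s;
end proc:
ckahan := Compiler:-Compile(kahan):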

The BLAS on Windows will be MKL. I'm not sure whether their daxpy is threaded, or just cache-tuned. As you can see, 5000 summations of 110x110 float[8] Arrays in under 1 second is not bad. Unfortunately, the Intel people say that one is not advised to call MKL from within a thread of another threaded app. They don't come right out and say that MKL is wholly, or partly, thread-unsafe. I've never tried to force it by placing the option on that daxpy's define_external call from within Maple. On Linux the BLAS is ATLAS, and I've never seen a described limitation on whether it can be called from within a higher-level thread.

You may note that `P` operated on full rectangular-storage 2D Arrays (albeit Aliased). Having my two procs be clever about handling triangular data could be a lot of work... for only a factor-of-two benefit. The daxpy is so quick that copying out subsections from below the main diagonal (or making repeated calls with tricky offsets) might well blow away such a small speedup. I'd be tempted to stick with full rectangular storage, thus permitting Maple to do the Alias and a single external daxpy call. I'd likely only suggest divvying up the data if it were very sparse. (In which case we could have even more fun discussing the sparse float[8] structure.)

It'd be interesting to hear how you get along with this task.

NB. I'm not saying that your code is affected by the following. But you might care to see this post on 2D Math runtime efficiency. I saw a wickedly brutal double whammy of this in the last month or so. Someone had gotten bitten not only by the delayed-multiplication parsing within a proc definition, but had alongside that used the syntax A[i][j] instead of A[i,j] for Matrix entry access. The former creates, as A[i], a whole new temporary row Vector, which is then accessed at its jth entry and disposed of as garbage. The overall penalty was a slowdown of a few hundred times...
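
A quick way to see that penalty for oneself (timings are illustrative only):

A := Matrix(300, 300, datatype = float[8]):
st := time(): for i to 300 do for j to 300 do A[i][j] end do end do: time()-st;
st := time(): for i to 300 do for j to 300 do A[i,j] end do end do: time()-st;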

acer

This is interesting. I'd like to look at it, but likely cannot work on it for a few days...

acer

@Alejandro Jakubi I hadn't noticed that this was branched from another post. Thanks.
