acer

32490 Reputation

29 Badges

20 years, 9 days
Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are answers submitted by acer

Can you not simply solve for omega_1 once, up front, then create R as a simple, explicit expression in `a`, and then plot in the usual way?

omega:=solve(eval((10000*Pr^2*(Pr^2*(Pr-1)*(1+a^2/Pi^2)^2
                  +10*Pr^2*(Pr+Pr[m])+Pr[m]^2*omega_1*(Pr-1))
                  /((1+a^2/Pi^2)*(omega_1*(1+a^2/Pi^2)^2*(Pr+Pr*Pr[m])^2
                  +(Pr^2*(1+a^2/Pi^2)^2-Pr[m]*omega_1+10*Pr^2)^2))
                  +10*Pr^2*(Pr-Pr[m])/(Pr[m]^2*omega_1+Pr^2
                  *(1+a^2/Pi^2)^2)+(1+Pr)),[Pr[m]= 0]),omega_1):

R:=eval(((1+a^2/Pi^2)*Pi^2*((1+a^2/Pi^2)^2-omega/Pr+10*(omega*Pr*Pr[m]
        +Pr^2*(1+a^2/Pi^2)^2)/(Pr[m]^2*omega+Pr^2*(1+a^2/Pi^2)^2)
        +10000*Pr*((Pr^2*(1+a^2/Pi^2)^2-Pr[m]*omega+10*Pr^2)
        *(Pr*(1+a^2/Pi^2)^2-Pr[m]*omega)+omega*(1+a^2/Pi^2)^2
        *(Pr+Pr[m])*(Pr+Pr*Pr[m]))/((1+a^2/Pi^2)*(omega*(1+a^2/Pi^2)^2
        *(Pr+Pr*Pr[m])^2+(Pr^2*(1+a^2/Pi^2)^2-Pr[m]
        *omega+10*Pr^2)^2)))/a^2*Pi^4), [Pr[m] = 0]):

R:=eval(simplify(R,size),[Pr = 0.025]);

plot(R,a=4.56..7.56);

plot(R,a=4.56..7.56,adaptive=false,numpoints=10,style=point);

If you really want to build a list of lists, so as to call plots:-pointplot, then it's better to get in the habit of doing it with `seq` than by repeatedly augmenting a list (which is inefficient for large lists).

L:=[seq([A,evalf[15](eval(R,a=A))], A=4.56..7.56, 0.5)];

plots:-pointplot(L);

acer

Perhaps it is a grammatical slip, and that line in the help-page for last_name_eval was instead supposed to read, "...a name assigned a value that is one of the special types, such as procedures, modules and tables (hence, matrices, vectors and arrays, but not rtables), is not fully evaluated during normal evaluation." Would that make sense?

Since (now-deprecated) things such as matrix, vector, and array are all of type table, they too are of type `last_name_eval`. A list or a set is not of type `last_name_eval`, and neither are any of the rtable subtypes Matrix, Vector, or Array.

A Record (of type `record`) is a kind of module. And Records are thus of type `last_name_eval`, since modules are.

> r := Record( 'a', 'b' ):

> type(r,record);
                                     true

> type(r,`module`);
                                     true

> type(r,last_name_eval);
                                     true

> subtype(record,`module`);
                                     true

> subtype(record,last_name_eval);
                                     true

[edited] In conversation I've been informed that the issue of whether things of type object are also of type `last_name_eval` is messy. If I create Obj a named module with option `object` in Maple 16.01 then type(Obj, last_name_eval) returns true. But this might be a coding oversight in `type`, and it would be interesting to test whether it actually behaves that way under evaluation. (It may be a bug that the command subtype(object,`module`) returns FAIL in Maple 16, while the command type(Obj,`module`) returns true for Obj created as a named Object. I'm not at all sure, but this may be on purpose, as an implementation detail.)

It appears that `last_name_eval` is a super-type of `module`, `procedure`, and `table` (and hence also of all their subtypes). The exception to this rule is `object`, as mentioned.

You also asked whether Matrices and Vectors (i.e. rtables) evaluate fully. The mutable data structures are tricky. When the Help writes of "...under normal evaluation" it's often describing the situation of passing arguments to a procedure. When a mutable structure (table, rtable, but not list or set) is passed as an argument in a procedure call it is desirable to have it be passed as if "by reference". The rationale is that it's desirable to be able to act on its entries "in-place", and to avoid inefficient copying of data. So having the procedure receive the name of the thing, rather than a copy of the thing, is desirable. The crazy-brilliant way that old (lowercase, deprecated) matrices, vectors, and arrays attained this was by being of type `last_name_eval`, so that only the assigned name got received by the procedure after normal up-front evaluation of the arguments. The way that Matrices, Vectors, and Arrays attain the same kind of behaviour is by (de facto) not evaluating fully under normal evaluation. That behaviour of rtables works pretty well, but there are some corner cases of usage which gave rise to the rtable_eval command.
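A small illustration of that pass-by-reference effect (just a sketch, with names of my own choosing): a table argument arrives inside the procedure as merely its name, and an rtable arrives without being copied, so in both cases the caller's structure can be modified in-place.

```
restart:
p := proc(x) x[1] := 99; NULL; end proc:

t := table([1 = 10]):   # table: last_name_eval, so p receives the name t
p(t):
eval(t[1]);             # now 99; the original table was changed

V := Vector([10]):      # rtable: not fully evaluated under normal evaluation
p(V):
V[1];                   # also 99; the original Vector was changed
```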

acer

If your expression is assigned to the name `e`, for example, then you should be able to pull out that a[0] term with the command,

'collect(e, a[0])'

acer

There are various ways to produce the sorting list [2,3,1] which you want to use to sort the rows of Matrix A. Once you have that list there are also several ways to construct the new sorted Matrix.

A := Matrix(3, 3, [1, 3, 4, 2, 3, 1, 2, 4, 1]);

                                    [1  3  4]
                                    [       ]
                               A := [2  3  1]
                                    [       ]
                                    [2  4  1]

B := Array([8, 6, 7]);

                               B := [8, 6, 7]

Bhat:=[seq(r[2], r in sort([seq([B[i],i],i=1..3)],(a,b)->b[1]>a[1]))];

                              Bhat := [2, 3, 1]

# one way
Anew:=Matrix([seq([A[r,1..-1]],r=Bhat)]);

                                      [2  3  1]
                                      [       ]
                              Anew := [2  4  1]
                                      [       ]
                                      [1  3  4]

# another way
Anew:=Matrix(3,3):
for i from 1 to 3 do
  Anew[i,1..-1]:=A[Bhat[i]];
end do:

Anew;

                                  [2  3  1]
                                  [       ]
                                  [2  4  1]
                                  [       ]
                                  [1  3  4]

I've made no effort to make the above especially efficient. It's likely that there are ways which are much faster if you have to do it many times, or do it for a much larger size Matrix.
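As an aside (untimed, so treat it only as a sketch): a list of row indices can be used directly inside the Matrix indexing brackets, which may well be the quickest route for larger Matrices since the row extraction is then done in a single indexing operation.

```
# yet another way: index the rows of A directly by the sorting list
Anew := A[Bhat, 1..-1];
```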

acer

The `Int` command produces an inert integral but does not compute a result. Maple has some functionality to allow one to manipulate such inert integrals (eg. changing variables, changing the form of the integrand, combining several such definite integrals by merging their ranges of integration, etc).

The `value` command can be applied to an inert `Int` integral, to try and compute an actual result of integration. This is pretty much the same as calling the active `int` command instead.

Int(sin(x),x=a..b);

                           /b          
                          |            
                          |   sin(x) dx
                          |            
                         /a 
           
value(%);

                        cos(a) - cos(b)

int(sin(x),x=a..b);

                        cos(a) - cos(b)

Unless the end-points of the range of integration of a call to the `int` command are both of type realcons and at least one of them is a float, that integral will be attempted using purely exact symbolic methods.
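A tiny sketch of that rule: with exact end-points `int` works purely symbolically, while making even one end-point a float causes the result to be computed in floating-point.

```
int(sin(x), x = 0 .. 1);     # exact end-points: returns 1 - cos(1)

int(sin(x), x = 0 .. 1.0);   # float end-point: returns a float result
```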

I know from direct email that you are doing computations which involve Matrices of integrals which you need as floating-point results. The following distinction seems relevant to those computations.

Consider the integrand f below,

restart:

f := (1/3)*(-4*eta+15+6*Pi)^2
     *(5-6*cos((1/3)*eta-1/2)+18*cos((1/3)*eta-1/2)^2)
     /sin((1/3)*eta-1/2);

                  /                      /                    
        1         |                    2 |         /1       1\
 ---------------- |(-4 eta + 15 + 6 Pi)  |5 - 6 cos|- eta - -|
      /1       1\ \                      \         \3       2/
 3 sin|- eta - -|                                             
      \3       2/                                             

                       2\\
            /1       1\ ||
    + 18 cos|- eta - -| ||
            \3       2/ //

An inert integral, with exact end-points to its range of integration, might be,

F := Int(f, eta=3+(3/2)*Pi..4+(3/2)*Pi);

      /4 + 3/2 Pi                                           
     |                             /                      / 
     |                   1         |                    2 | 
     |            ---------------- |(-4 eta + 15 + 6 Pi)  |5
     |                 /1       1\ \                      \ 
    /3 + 3/2 Pi   3 sin|- eta - -|                          
                       \3       2/                          

                                             2\\     
              /1       1\         /1       1\ ||     
       - 6 cos|- eta - -| + 18 cos|- eta - -| || deta
              \3       2/         \3       2/ //     

By applying the `evalf` command to that integral one can quickly obtain a floating-point approximation which has pretty good accuracy. This computation gets done by so-called numerical integration (or "numerical quadrature"), which is often a form of weighted average of evaluations of the integrand at float values of `eta` taken from the stated range of integration. Since this integrand `f` attains only real, float values under evaluation on this range, the computed approximation of the integral is also a purely real, float value.

ans1 := evalf(F); # numerical integration

                          12.63839218

Now let's look at what happens here, using `int` instead. A quite long exact, symbolic expression is returned, because Maple is able to compute this particular integral symbolically. The computed symbolic expression happens to contain `I` the imaginary unit. But it turns out that this imaginary component is really just zero in disguise. If we are lucky then we might be able to simplify this symbolic result to some purely real exact expression which does not contain `I` at all. If we are not lucky then we might not be able to find such a simplification.

Or perhaps a clever change of variables could be done to Int(f,...) so that subsequently applying `value` would produce an exact result without any `I` appearing. But we might not be able to find such a change of variables.

It's also worth noting that the exact definite integration might take quite a bit longer than the numerical integration performed above. So if you intend on just applying `evalf` to an exact symbolic result, because all you want is a floating-point approximation anyway, then you might be better off doing numeric integration instead.

G := int(f, eta=3+(3/2)*Pi..4+(3/2)*Pi); # exact integration

         /        /1             \\
648 Pi ln|-1 + exp|- I (3 + 3 Pi)||
         \        \6             //

               /       /1             \\
   - 2484 Pi ln|1 + exp|- I (3 + 3 Pi)||
               \       \6             //

            /       /1             \\
   + 1224 ln|1 - exp|- I (3 + 3 Pi)||
            \       \6             //

           /        /1             \\
   - 486 ln|-1 + exp|- I (5 + 3 Pi)||
           \        \6             //

                 /       /1             \\   958  
   - 8352 polylog|3, -exp|- I (5 + 3 Pi)|| - --- I
                 \       \6             //    3   

           2   /        /1             \\
   + 216 Pi  ln|-1 + exp|- I (3 + 3 Pi)||
               \        \6             //

           2   /       /1             \\
   - 828 Pi  ln|1 + exp|- I (3 + 3 Pi)||
               \       \6             //

            /       /1             \\
   - 1602 ln|1 + exp|- I (3 + 3 Pi)||
            \       \6             //

                 /   /1           \\
   + 3726 arctanh|exp|- I (1 + Pi)||
                 \   \2           //

                 /      /1             \\
   + 4896 polylog|3, exp|- I (5 + 3 Pi)||
                 \      \6             //

             / 1           \
   + 2511 exp|-- I (1 + Pi)|
             \ 2           /

                 /       /1             \\
   + 8352 polylog|3, -exp|- I (3 + 3 Pi)||
                 \       \6             //

                    /   /1           \\
   + 4968 Pi arctanh|exp|- I (1 + Pi)||
                    \   \2           //

            2        /   /1           \\
   + 1656 Pi  arctanh|exp|- I (1 + Pi)||
                     \   \2           //

              /1           \            / 1           \
   + 648 I exp|- I (1 + Pi)| - 648 I exp|-- I (1 + Pi)|
              \2           /            \ 2           /

                   /      /1           \\
   - 1224 I polylog|2, exp|- I (1 + Pi)||
                   \      \2           //

               /   /1           \\
   - 1296 Pi ln|exp|- I (1 + Pi)||
               \   \2           //

           2   /   /1           \\
   - 432 Pi  ln|exp|- I (1 + Pi)||
               \   \2           //

            /       /1             \\
   - 1360 ln|1 - exp|- I (5 + 3 Pi)||
            \       \6             //

           /        /1             \\
   + 486 ln|-1 + exp|- I (3 + 3 Pi)||
           \        \6             //

            /       /1             \\   
   - 1836 ln|1 - exp|- I (5 + 3 Pi)|| Pi
            \       \6             //   

           /       /1             \\   2
   - 612 ln|1 - exp|- I (5 + 3 Pi)|| Pi 
           \       \6             //    

                  /       /1             \\
   + 696 I polylog|2, -exp|- I (5 + 3 Pi)||
                  \       \6             //

                  /      /1             \\
   - 408 I polylog|2, exp|- I (5 + 3 Pi)||
                  \      \6             //

              / 1             \            /1             \
   - 216 I exp|-- I (5 + 3 Pi)| + 216 I exp|- I (5 + 3 Pi)|
              \ 6             /            \6             /

                        2            /        /1             \\
   - 432 I Pi - 144 I Pi  - 648 Pi ln|-1 + exp|- I (5 + 3 Pi)||
                                     \        \6             //

               /       /1             \\
   + 2484 Pi ln|1 + exp|- I (5 + 3 Pi)||
               \       \6             //

               /   /1             \\
   + 1296 Pi ln|exp|- I (5 + 3 Pi)||
               \   \6             //

           2   /        /1             \\
   - 216 Pi  ln|-1 + exp|- I (5 + 3 Pi)||
               \        \6             //

           2   /       /1             \\
   + 828 Pi  ln|1 + exp|- I (5 + 3 Pi)||
               \       \6             //

           2   /   /1             \\
   + 432 Pi  ln|exp|- I (5 + 3 Pi)||
               \   \6             //

                    /   /1             \\
   - 4968 Pi arctanh|exp|- I (5 + 3 Pi)||
                    \   \6             //

            2        /   /1             \\
   - 1656 Pi  arctanh|exp|- I (5 + 3 Pi)||
                     \   \6             //

             / 1             \          /       /1             \\
   - 2583 exp|-- I (5 + 3 Pi)| + 1834 ln|1 + exp|- I (5 + 3 Pi)||
             \ 6             /          \       \6             //

            /       /1             \\   
   + 1836 ln|1 - exp|- I (3 + 3 Pi)|| Pi
            \       \6             //   

           /       /1             \\   2
   + 612 ln|1 - exp|- I (3 + 3 Pi)|| Pi 
           \       \6             //    

                   /       /1             \\
   + 2088 I polylog|2, -exp|- I (3 + 3 Pi)||
                   \       \6             //

           /   /1             \\
   + 972 ln|exp|- I (5 + 3 Pi)||
           \   \6             //

                 /   /1             \\
   - 3726 arctanh|exp|- I (5 + 3 Pi)||
                 \   \6             //

                 /      /1           \\
   - 4896 polylog|3, exp|- I (1 + Pi)||
                 \      \2           //

           /   /1           \\           /1             \
   - 972 ln|exp|- I (1 + Pi)|| - 2583 exp|- I (5 + 3 Pi)|
           \   \2           //           \6             /

             /1           \
   + 2511 exp|- I (1 + Pi)|
             \2           /

Now let's evaluate that long exact symbolic result using the `evalf` command. It turns out that the nature (long, complicated, etc) of this particular exact expression means that at any particular working precision the `evalf` command might not produce a very accurate answer. The real part of this application of `evalf` might not even be as accurate as the earlier result using numerical integration. And the floating-point approximation of the exact integration result `G` might also contain imaginary artefacts. (By imaginary artefact I mean that the float imaginary term is small, and might get smaller as the working precision is increased but might not ever vanish altogether.)

ans2 := evalf(G);

                        12.638398 - 0.000018 I

evalf[50](G);

                                                                 -45 
       12.6383921817028241494492915097389371376468101157 - 1.1 10    I

We can notice that `ans2` does not have a real component as accurate as `ans1`. And `ans2` contains a (spurious) imaginary artefact. Now, there will be cases and examples where exact symbolic integration is what one needs and prefers. But for your particular computations it looks as if there is some justification in doing numeric integration with `evalf(Int(...))` rather than doing float approximation of exact integration. Keep in mind also that the `evalf(Int(...))` command allows one optionally to specify increased working precision and/or a tighter error tolerance.
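For instance, a tighter tolerance and higher working precision can be requested right in the inert integral (see the help-page for evalf/Int for the exact option names; this is just a sketch):

```
# numeric integration of F, requesting more accuracy than the defaults
evalf(Int(f, eta = 3+(3/2)*Pi .. 4+(3/2)*Pi, epsilon = 1e-12, digits = 15));
```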

acer

In general `a` and `Db` cannot just come out of the square root. But they can if they are positive reals.

e:=sqrt(a^2*Db^2*tan(b)^4/(cos(g)^4*(tan(g)+tan(b))^4)
         +a^2*Db^2*tan(g)^4/(cos(b)^4*(tan(g)+tan(b))^4));

                                                                   (1/2)
          /       2   2       4                2   2       4      \     
          |      a  Db  tan(b)                a  Db  tan(g)       |     
     e := |-------------------------- + --------------------------|     
          |      4                  4         4                  4|     
          \cos(g)  (tan(g) + tan(b))    cos(b)  (tan(g) + tan(b)) /     


simplify(e,radical) assuming a>0, Db>0;

                                                        (1/2)
                    /      4       4         4       4 \     
                    |tan(b)  cos(b)  + tan(g)  cos(g)  |     
               a Db |----------------------------------|     
                    |      4                  4       4|     
                    \cos(g)  (tan(g) + tan(b))  cos(b) / 

acer

You're not just asking how to factor the polynomial expression, right?

> factor(expr);

                                7   2         2
                               x  (x  + x + 1)

Are you asking how you might go about finding the possible factoring terms, or about dividing the polynomial expression by them?

Maybe this can give you some ideas.

> H:=proc(e,f) local y; if divide(e,f,y) then f*y else e end if end proc:

> expr:=x^11+2*x^10+3*x^9+2*x^8+x^7:                                     

> H(expr,x^3);                                                           

                        3   8      7      6      5    4
                       x  (x  + 2 x  + 3 x  + 2 x  + x )

> facs:=evala(Factors(expr))[2];                                         

                                  2
                       facs := [[x  + x + 1, 2], [x, 7]]

> poss:=[seq(seq(r[1]^j,j=1..r[2]),r in facs)];                          

                  2            2         2      2   3   4   5   6   7
        poss := [x  + x + 1, (x  + x + 1) , x, x , x , x , x , x , x ]

> seq(print(H(expr,p)), p in poss):                                      

                            2            9    8    7
                          (x  + x + 1) (x  + x  + x )

                                7   2         2
                               x  (x  + x + 1)

                           10      9      8      7    6
                       x (x   + 2 x  + 3 x  + 2 x  + x )

                        2   9      8      7      6    5
                       x  (x  + 2 x  + 3 x  + 2 x  + x )

                        3   8      7      6      5    4
                       x  (x  + 2 x  + 3 x  + 2 x  + x )

                        4   7      6      5      4    3
                       x  (x  + 2 x  + 3 x  + 2 x  + x )

                        5   6      5      4      3    2
                       x  (x  + 2 x  + 3 x  + 2 x  + x )

                        6   5      4      3      2
                       x  (x  + 2 x  + 3 x  + 2 x  + x)

                         7   4      3      2
                        x  (x  + 2 x  + 3 x  + 2 x + 1)

acer

I have not measured this for performance. (It might possibly do relatively a bit better for larger datatype=float[8] Matrices since MTM:-sum calls ArrayTools:-AddAlongDimension. But maybe not.)

with(MTM):
A . diag(1/~sum(A,1));

[addendum] For datatype=anything the above is significantly slower than Adri's suggestion. For a larger 300x400 float[8] Matrix it looks like a faster way (at present) is,

A . DiagonalMatrix(1/~MTM:-sum(A,1),storage=rectangular)

In principle that might become even faster (one day), if the underlying method used for multiplication of full float[8] Matrix A by diagonal shape float[8] Matrix B were improved. Right now it is using the same BLAS function as gets used for a pair of full rectangular unshaped float[8] Matrices. So it is O(n^3) instead of O(n^2).
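Until such an improvement arrives, one could avoid the Matrix-Matrix multiplication altogether by scaling the columns directly, which is only O(n^2) work. A rough, untimed sketch (here I assume `s` is a Vector holding the column sums of A, however you choose to compute them):

```
# scale each column of A in-place by the reciprocal of its column sum
for j to LinearAlgebra:-ColumnDimension(A) do
    A[1..-1, j] := A[1..-1, j] / s[j];
end do:
```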

acer

If you execute the command,

Typesetting:-Settings('functionassign'=false):

in your worksheet then you will not get that popup query. Your example 2D Math input statement (and others like it) would get automatically interpreted as a remember-table assignment.

You should also be able to place that call to Typesetting:-Settings in the Startup Code section of your Worksheet/Document.

acer

The exponent is the second operand of a call to `^`.

> op((x+y)^n);

                                   x + y, n

> op(2,(x+y)^n);

                                       n

> expr:=(x+y)^(-p)+4*x^n*y^m+(x+y)^n+x^3:        

> seq([U,op(2,U)],U in indets(expr,`^`));

            n        m              n              (-p)         3
          [x , n], [y , m], [(x + y) , n], [(x + y)    , -p], [x , 3]

> map2(op,2,indets(expr,identical(x+y)^anything));

                                    {n, -p}

acer

The CoefficientVector or CoefficientList commands will be more efficient than a sequence of calls to `coeff`, for picking off all the coefficients of a dense polynomial of non-small degree.

The reason is that CoefficientVector should walk the polynomial structure just once, while each coeff call in the sequence would also walk the structure (which might be unordered).

By creating `a` as a Vector you can use round-brackets for accessing its entries (which you seem to want to do...).

g := taylor(exp(z), z = 0, 100):

a := PolynomialTools:-CoefficientVector(convert(g,polynom),z):

af := evalf(a):

a(3), a(7);

                             1   1 
                             -, ---
                             2  720

evalf(a(3));

                          0.5000000000

af(7);

                         0.001388888889

If you want to access all the entries as floating-point numbers, then you may as well create Vector `af` rather than have to wrap a(k) with `evalf` every time you want it.

acer

Yes, to your question, "... so you have to install the 32 bit version instead?"

No, to your question, "Is it possible to move the finance.dll-file from Maple16(32-bit) to Maple16(64-bit)?" if you expect it to work with the 64-bit Maple.

acer

You can use fdiff to get numeric partial derivatives of B (which acts like a black box function).

And there is an alternate syntax for getting that: evalf@D will call fdiff. (See also the help-page for D).

D[1](B)(17.5,26.0);

                      D[1](B)(17.5, 26.0)

evalf(%);  # hooks into fdiff !

                         0.01950091280

evalf(D[2](B)(5.0,2.0));

                         -5.217525923

Is that good enough for your purposes?

acer

One easy way (amongst several) is,

WM:=LinearAlgebra:-RandomMatrix(2,outputoptions=[shape=diagonal]);

                                     [27   0]
                               WM := [      ]
                                     [ 0  57]


Matrix(4,6,[WM,WM],scan=diagonal);

                           [27   0   0   0  0  0]
                           [                    ]
                           [ 0  57   0   0  0  0]
                           [                    ]
                           [ 0   0  27   0  0  0]
                           [                    ]
                           [ 0   0   0  57  0  0]

acer

@jimmyinhmb There is an ArrayTools:-BlockCopy command, which might help.

As you have figured out, you don't really want to be making individual Maple function-calls to copy each of the columns of the block. Maple function calls are relatively expensive (compared to compiled C). So having BlockCopy call out to a precompiled function (to do the looping over all the columns of the block) is probably better than doing that with interpreted code.

If we dig right down into the internal Library routine LinearAlgebra:-LA_External:-MatrixAdd we can find some define_external calls which set up call_externals to something much like the dscal (NAG f06edf) and daxpy (NAG f06ecf) BLAS functions. In some scenarios you might get a noticeable improvement by using the call_external to the undocumented daxpy rather than MatrixAdd.

showstat((LinearAlgebra::LA_External)::MatrixAdd,22..24);

(LinearAlgebra::LA_External):-MatrixAdd := proc(A, B, c1, c2)
local extfns, rowsA, colsA, rowsB, colsB, localA, localB, localc1, localc2, extlib, HWCall, lengthA, DatatypeA, IndexFnA, StorageA, DatatypeB, IndexFnB, StorageB, prefStorageB, kl, ku;
       ...
  22   extfns := LinearAlgebra:-`MatrixAdd/HardwareTable`[IndexFnA,DatatypeA,StorageA,prefStorageB];
  23   HWCall[1] := ExternalCalling:-DefineExternal(extfns[1],extlib);
  24   HWCall[2] := ExternalCalling:-DefineExternal(extfns[2],extlib);
       ...
end proc

showstat((LinearAlgebra::LA_External)::MatrixAdd,57..61);

(LinearAlgebra::LA_External):-MatrixAdd := proc(A, B, c1, c2)
local extfns, rowsA, colsA, rowsB, colsB, localA, localB, localc1, localc2, extlib, HWCall, lengthA, DatatypeA, IndexFnA, StorageA, DatatypeB, IndexFnB, StorageB, prefStorageB, kl, ku;
       ...
  57     if evalf(localc1)-1. <> 0. then
  58       userinfo(1,'LinearAlgebra',"NAG",extfns[1]);
  59       HWCall[1](lengthA,evalf(localc1),localA,0,1)
         end if;
  60     userinfo(1,'LinearAlgebra',"NAG",extfns[2]);
  61     HWCall[2]('N',rowsA,colsA,localB,localA,evalf(localc2))
       ...
end proc

kernelopts(opaquemodules=false):
LinearAlgebra:-`MatrixAdd/HardwareTable`[[],float[8],rectangular,rectangular];
                     [hw_f06edf, hw_f06ecf]

So I suppose that you could use a pair of temp Matrices, T1 and T2, with steps like these:
1) BlockCopy [A21 A22] into T1, and [B21 B22] into T2
2) use the daxpy to add T2 inplace to T1
3) BlockCopy T1 back to [A21 A22]

I have doubts that ArrayTools:-Alias could work with that whole subblock, when it gets passed externally later on. The external function that performs the later addition likely wouldn't have the necessary knowledge about the offset and jumps between successive column-end and column-starts of the subblock. Setting up Aliases for each column of the subblock, as an alternative, might be costly. But whether it is much more expensive than all the block-copying in the scheme above... I don't know. Any special offset into the data to specify the start of the subblock, or any special stride to walk the columns of the subblock, won't help with the jump between subblock columns I think.

And then there is the Compiler. You could write a procedure which accepts the two full Matrices, and all scalar details of the dimensions and block boundaries, and then adds inplace into [A21 A22]. And then run Compiler:-Compile against that procedure. No multicore benefit, and likely less cache-tuning, and no Aliasing. But no extra block-copying.
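A rough, untested sketch of that compiled approach (the procedure name and the convention that r1..r2, c1..c2 delimit the subblock are my own invention):

```
# add the subblock of B inplace onto the matching subblock of A
addblock := proc(A::Matrix(datatype=float[8]),
                 B::Matrix(datatype=float[8]),
                 r1::integer, r2::integer,
                 c1::integer, c2::integer)
  local i::integer, j::integer;
  for j from c1 to c2 do      # walk column-by-column (Fortran order)
    for i from r1 to r2 do
      A[i,j] := A[i,j] + B[i,j];
    end do;
  end do;
  NULL;
end proc:

caddblock := Compiler:-Compile(addblock):
```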

Without trying out each of these three approaches I am not sure which would be fastest.

I think that this is a good question. The `ArrayTools:-BlockCopy` command was needed precisely because even with all the fanciness of `ArrayTools:-Alias`, and `ArrayTools:-Copy`, etc, it still wasn't possible to get the best speed. Maybe there should be a BlockAdd as well, which calls daxpy in a loop in a precompiled external function.

acer
