acer

32490 Reputation

29 Badges

20 years, 7 days
Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are replies submitted by acer

@roman_pearce thanks, Roman. Somehow I missed that. I'm still adjusting.

I'd like to point out that the posted code is not faster than LinearAlgebra:-Transpose when the datatype of the Matrix is float[8] or complex[8] (or sfloat or complex(sfloat) when Digits>15 or UseHardwareFloats=false, etc).

For those datatypes LinearAlgebra:-Transpose is very fast for storage=sparse (and no indexing function). At size 500x500 it is about 1000 times faster than the code here. The performance difference grows with the size, as well.

I realize that Roman's post was about datatype=anything Matrices, and that is an important case. But the claim that LinearAlgebra:-Transpose does not handle sparse storage is not strictly true. For the float datatypes, it does.
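For example, here is a rough timing sketch along these lines (the size and density are chosen just for illustration; timings will vary by machine and Maple version):

> with(LinearAlgebra):
> A := RandomMatrix(500, 500, 'density'=0.01,
>                   'outputoptions'=['storage'='sparse','datatype'=float[8]]):
> st := time(): Transpose(A): time() - st;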

acer

When not logged in to Mapleprimes the "Recent" tab takes me to,

    http://www.mapleprimes.com/recent

When logged in, it takes me to,

    http://www.mapleprimes.com/recent/unread

The first of those two gives me the full set, while the second gives me only the ones I haven't read yet. But I would prefer the full view even when logged in. So I could make a bookmark for that one in my web browser.

acer

It is front page material.

acer

"When a true genius appears in this world, you may know him by this sign,
that the dunces are all in confederacy against him." - Swift

I look forward to finding out what mark Alec's qsort gets on the assignment. ;)

The essence of the mechanism by which mysort works looks the same as that of numsorter above (with a few differences such as evalhf speed versus evalf flexibility, ~ versus map, and so on).
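A small sketch of that shared mechanism, for the curious: compare entries by their hardware-float values (the list here is just for illustration):

> L := [Pi, exp(1), sqrt(2), ln(2)]:
> sort(L, (a,b) -> evalhf(a) < evalhf(b));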

acer

This can be made more efficient in several ways. The transposition could be done inplace on Vt and U, or one could transpose only the Vectors (also inplace, though that is less of a savings). Also, Maple doesn't know that it can invert the diagonal S directly, without resorting to a full LU solve. And the thin option of SingularValues can reduce the size of the output (only a little here, but more for "taller" problems).

> U,S,Vt := SingularValues(A,output=['U','S','Vt'],'thin'):

> (b^%T.U.DiagonalMatrix(Vector(3,i->evalhf(1/S[i]))).Vt)^%T;
                             [2.63293929474353927]
                             [                   ]
                             [3.66652382290048129]
                             [                   ]
                             [5.20251471911851304]
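As another sketch of a possible saving (reusing A, b, and the thin factors from above): the scaling by the reciprocal singular values can be done elementwise on the small Vector, avoiding the DiagonalMatrix product altogether,

> y := U^%T . b:
> x := Vt^%T . zip(`/`, y, S);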

acer

This is my favourite kind of post -- putting it all together to get the desired functional result. I look forward to "modding it up" in the new Primes.

Use of ssystem that would work in the Standard GUI, with various quotes in the argument, is an achievement in itself. That function in Standard is quite different from the one in the commandline interface, and some examples of it which work as expected in the latter go wrong in the former.

acer

Yes, I knew that it was builtin, thanks. But I can think of no reason for that last one's returning a float. I will submit an SCR against it.

acer

The calls to combinat:-randperm produce a lot of collectible garbage, which takes time to manage.
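One can get a rough sense of that garbage production with the kernelopts(bytesused) query; a sketch, with the count and permutation size chosen just for illustration:

> st := kernelopts(bytesused):
> for k to 10^5 do combinat:-randperm(5): end do:
> kernelopts(bytesused) - st;  # cumulative bytes handled by the allocator over the loop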

In a way, this boils down to discussions of a few years back: the trade-off between producing a lot of random values up front (with a correspondingly large memory allocation) versus repeated Vector (or list) creation of many smaller samples (and the ensuing time cost of memory management of those temporary objects). I still say that the ideal solution would be the ability to repopulate a single Vector with a multitude of random samples (inplace, using a compiled routine). That's how other good RNGs work. I got the impression, Joe, that you were expressing the same sort of idea.

My solution below is to generate a huge sample up front, based on how many iterations I plan to do and how many values I expect to need (at most, with a bit of leeway).

First, if I may make a few minor corrections, Joe. Your ShuffleMatrix3 had a typo in the typecheck of parameter T. And your example M and T would need subtype=Matrix to get past the parameter typing of ShuffleMatrix3. But I'll get rid of ShuffleMatrix3 altogether below and call PermuteMatrixRows directly in the loop, to save all those extra function calls.

I also commented out the part of PermuteMatrixRows that updated M, to try to cut its work and time in half. I think that it's fine to have a routine which re-uses a container like T, and with such a routine in hand it's not necessary to "fake" inplace operation on input M.

> restart:
>
> PermuteMatrixRows := proc(M :: Array(datatype=integer[4])
>                           , T :: Array(datatype=integer[4])
>                           , P :: Vector(datatype=integer[4])
>                           , m :: nonnegint
>                           , n :: nonnegint
>                          )
> local i,j;
>     for i to m do
>         for j to n do
>             T[i,j] := M[P[i],j];
>         end do;
>     end do;
>     ## Letting it off the hook for updating M.
>     #for i to m do
>     #    for j to n do
>     #        M[i,j] := T[i,j];
>     #    end do;
>     #end do;
>     return NULL;
> end proc:
>
> kernelopts(printbytes=false):
>
> (m,n):=3,5:
> M := rtable(1..m,1..n,random(-99..99),'subtype'=Matrix,'datatype'=integer[4]):
> T := rtable(1..m,1..n,'subtype'=Matrix,'datatype'=integer[4]):
>
> maxnum:=100000:
>
> st:=time():
> cPermuteMatrixRows:=Compiler:-Compile(PermuteMatrixRows):
> for k from 1 to maxnum do
>   P := Vector(combinat:-randperm(m), 'datatype'=integer[4]):
>   cPermuteMatrixRows(M, T, P, m, n);
> end do:
> time()-st;
                                    20.465

Now, using a method which explicitly creates its own permutation of 1..m. (I hope that it is a valid implementation of Fisher-Yates, and that any mistake will be pointed out.) It's a bit wasteful in this manner: whenever it needs a random value from 1..j with 1<=j<=m, it draws from the same big sample of values in 1..m, rejecting values greater than j, which wastes quite a lot of the precomputed values. So the memory use could be improved by utilizing several reservoirs instead.

> restart:
>
> Mshuffle:=proc(M1::Matrix(datatype=integer[4]),
>                M2::Matrix(datatype=integer[4]),
>                m::integer, n::integer,
>                V::Vector(datatype=integer[4]),
>                ransam::Vector(datatype=integer[4]))
# Copies row i of mxn M1 into row V[i] of mxn M2,
# for i=1..m, where V is an inplace-generated random
# shuffle of 1..m.
# ransam is a source of precomputed random values in 1..m,
# with ransam[1] storing the position of the previous run's
# last accessed entry.
> local i::integer, j::integer, count::integer,
>       need::integer, cand::integer;
>   count:=ransam[1];
>   for i from 1 to m do
>     V[i]:=i;
>   end do;
>   for i from m to 2 by -1 do
>     need:=1;
>     while need=1 do
>       count:=count+1;
>       cand:=ransam[1+count];
>       if cand<=i then
>         need:=0;
>       end if;
>     end do;
>     V[i],V[cand]:=V[cand],V[i];
>   end do;
>   for i from 1 to m do
>     for j from 1 to n do
>        M2[V[i],j]:=M1[i,j];
>     end do;
>   end do;
>   ransam[1]:=count;
>   NULL;
> end proc:
>
> kernelopts(printbytes=false):
>
> (m,n) := 3,5:
> M := rtable(1..m,1..n,random(-99..99),'subtype'=Matrix,'datatype'=integer[4]):
> T := rtable(1..m,1..n,'subtype'=Matrix,'datatype'=integer[4]):
>
> maxnum:=100000:
>
> st:=time():
> cMshuffle:=Compiler:-Compile(Mshuffle):
> B:=Vector[row](m,datatype=integer[4]):
> rsrc:=LinearAlgebra:-RandomVector(maxnum*m*m,generator=1..m,
>                            outputoptions=[datatype=integer[4]]):
> for k from 1 to maxnum do
>   cMshuffle(M,T,m,n,B,rsrc);
> end do:
> time()-st;
                                     0.837

For 1000000 repetitions cMshuffle took 7.5 sec and 110MB for the session while cPermuteMatrixRows&randperm took 201.5 sec and 35.8MB for the session.

I'd like to get a little philosophical. I don't think that implementing simple algorithms (for sorting, or standard deviation, or shuffling, etc) is too much of a pain. The Compiler gives the current best performance possible in Maple for many purely numerical problems (except compared to calling out externally to some other program...). I like to think that examples which illustrate it (along with inplace and memory-management refinements) are going to help some people with other similar tasks.

BTW, I think that your idea, Joe, about efficiently permuting the entries of an rtable of any datatype is feasible. The OpenMaple API allows for it, no problem. But it might require writing a (reusable) compiled external routine.

acer

Yes, numapprox, also mentioned in an earlier reply above, is a good approach for this.

But be careful: the achieved accuracy might not be as good as what that last numeric argument to minimax appears to request.

For 8 digits (was that actually requested by the OP!?) one could try something like this, after evaluating at the values for Km, S0 and v,

> Digits:=15:
> numapprox:-minimax(s,t=0..135,[5,6],.5e-9);

And so on.

That gives a result that is easy to code in double precision, with 15-digit float coefficients.

And the numerator and denominator are even returned already in Horner form.

A tougher question might be: how to get or demonstrate that a numapprox result is adequate for all the parameter ranges mentioned in the OP's followup replies.
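One partial check along those lines might use numapprox:-infnorm to measure the achieved uniform error of the returned approximant directly (a sketch; s is the expression after substituting the numeric parameter values, as above):

> approx := numapprox:-minimax(s, t=0..135, [5,6], .5e-9):
> numapprox:-infnorm(s - approx, t=0..135);

Repeating that over a grid of the parameter values mentioned in the followups would be one way to build confidence, short of a proof.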

acer
