acer


MaplePrimes Activity


These are replies submitted by acer

The author starts out with an omission, claiming that modern software is experienced almost exclusively through pictures, or through pointing and pushing. Mathematical software such as Maple also allows interaction by communicating via language. Language is another distinct and important mode of interaction and exchange of information, and it's surprising to see it overlooked in what purports to be a near-complete characterization. The communication of mathematics, even when done using specialised notation, is a mechanism with linguistic aspects. acer
Perhaps this anecdote might explain it. acer
Planck's constant has a dimension. It's not so important what the dimensions are (angular momentum) so much as what consequences that brings with it. *You* get to pick your own units for that dimension. So you can decide that Planck's constant is just about any number in magnitude. It depends on what system of units you choose. It's not as if the SI system were god-given! Choose an appropriate system of units, and the constant can then be of size 1 in that system. By that I mean, choose a base size for the units of length, time, etc, so that the constant is 1 in that system. If the question were rephrased to be something like, "is Planck's constant irrational when expressed using SI units?" then it might depend on stuff like whether space itself (distance, or time) is quantized. acer
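As a small worked illustration of the rescaling argument above (the numerical value of h is the standard SI figure; the unit name u is made up for the example):

```latex
h \approx 6.62607\times10^{-34}\ \mathrm{J\,s}
\quad\Longrightarrow\quad
h = 1\,u
\quad\text{where}\quad
1\,u \equiv 6.62607\times10^{-34}\ \mathrm{J\,s}.
```

Defining the base units of mass, length, or time so that one unit of action equals h makes the constant exactly 1 in that system; this is the same convention behind "natural units", where one chooses units so that ħ = 1.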
Keeping the total memory allocation down is often an important part of efficiency. But keeping the memory usage (garbage production and cleanup) down can also lead to time savings.

I suspect that ArrayTools:-Reshape produces a full copy of its rtable argument, while ArrayTools:-Alias provides an alternate view of the same data in memory without making a copy. So using Alias can be more efficient than using Reshape.

Also, in test3 as given, it was not necessary to keep all of the instances of Vector a around. And indeed in Joe's test3a the Vector a was overwritten, since a was reassigned. But each assignment to a in test3a extracts a new row of A, producing a new Vector each time. Memory usage can be reduced by instead forming the container for Vector a just once, and then doing a fast copy of the appropriate row of A into it. So, assuming that I coded it correctly,

test3b := proc(N,n)
  local i,a,X,A;
  use Statistics in
    X := RandomVariable(Normal(0,1));
    A := ArrayTools:-Alias(Sample(X,n*N),[N,n],Fortran_order);
    a := Vector[row](n,datatype=float,order=Fortran_order);
    for i from 1 to N do
      ArrayTools:-Copy(n,A,n*(i-1),1,a,0,1);
    end do;
  end use;
end proc:

Now, to measure the efficiency, let's also look at memory used and memory allocated, as well as time.

> (st,ba,bu):=time(),kernelopts(bytesalloc),kernelopts(bytesused):
> test3(30000,100):
> time()-st,kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;
                    18.419, 8518120, 899923608

> (st,ba,bu):=time(),kernelopts(bytesalloc),kernelopts(bytesused):
> test3a(30000,100):
> time()-st,kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;
                    0.711, 56088544, 79456288

> (st,ba,bu):=time(),kernelopts(bytesalloc),kernelopts(bytesused):
> test3b(30000,100):
> time()-st,kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;
                    0.571, 31844664, 35960072

Of course, what would be really nice would be to get the speed of test3b or test3a together with the lower memory allocation (by at least a factor of four) of test3.
This may not currently be possible. One way to get it might be with an enhancement to the Statistics:-Sample routine, so that it accepts as an optional argument the container Vector for the result. It could then re-use the same Vector container, filling it with new sample data in-place. That would keep the total memory allocation low, generating only n pieces of data at any one time, while avoiding the production of a lot of Vectors as garbage. It's a bit of a shame that it isn't implemented this way. Since at default Digits=10 the Vectors have hardware double-precision float[8] datatype, there wouldn't be garbage produced from all the entries as software floats.

ps. It may be that A should be C_order, according to how I set up the strides for ArrayTools:-Alias, so that it walks rows of A and not columns. It shouldn't affect the timings much. acer
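To check the suspicion above that Alias shares storage while Reshape copies, a small experiment (just a sketch; the numeric values are arbitrary):

```
A := Array([1.0,2.0,3.0,4.0],datatype=float[8]):
B := ArrayTools:-Alias(A,[2,2]):    # intended: a view of A's data
C := ArrayTools:-Reshape(A,[2,2]):  # suspected: a full copy
B[1,1] := 99.0:
A[1];    # if Alias shares memory, this now shows 99.0
C[1,1];  # if Reshape copied, this still shows 1.0
```

If writing through B is visible in A but not in C, that confirms that Alias avoids the copy.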
Jacques' points about pragmatism and making progress in the face of technical or theoretical difficulties are very much to the point.

m := matrix(2,3,[1,1,a,1,1,b]):
linalg[LUdecomp](m,'U1'='u1'):
seq(`if`(type(u1[i,i],Non(constant)),u1[i,i],NULL),i=1..2);

# Are these bugs below?
# How about better documentation of the following?
# Eg, as an Example on the gaussjord help-page.
Testzero := proc(x)
  if is(x=0) = true then true else false; end if;
end proc:
linalg[gaussjord](m) assuming b-a=0;

# Should it work for LinearAlgebra too?
# Or is Normalizer used when Testzero ought to be instead?
M := Matrix(m):
LinearAlgebra[ReducedRowEchelonForm](M) assuming b-a=0;

So I also wish for more thorough documentation of Testzero and Normalizer: where they are used, and why.

ps. Yes, I realize that I could probably have gotten away with an even simpler true/false/FAIL conditional in my Testzero. Apologies if it brings up anyone's pet peeves. acer
One can also consider memory usage as an important component of efficiency. Apart from wanting to keep memory allocation down for its own benefits, one can also try to keep memory use and re-use down so as to minimize garbage-collection time. Other benefits of this sort of array memory optimization are that total memory allocation can sometimes be reduced, that large problems can become tractable, and that speed is sometimes improved. The reasons for this can be complicated, but they relate to memory fragmentation and to the fact that garbage is not immediately freed.

For example,

Y := (X.beta)^%T;

might become

Y := X.beta;
LinearAlgebra:-Transpose(Y,inplace=true);

That should avoid producing an unnecessary object of the size of Y.

The creation of Arrays n1 and n2 by forming each N12 sub-Array might also be improved. One might be able to allocate empty n1 and n2 Vector[row]'s of the desired size and datatype just once, and then use ArrayTools:-Copy each time through the loop to get the right portion of N12 into them. The key would be using the right offset and stride.

One might also be able to allocate space for Y1 and Y2, to be used in procedure `compute`, just once. Eg, produce Y1 and Y2 as empty Vector[row]'s of the desired datatype, outside of `compute`, just once. Then, outside of `compute` but each time through the loop, use ArrayTools:-Copy to get Y into Y1, followed by VectorAdd(Y1,n1,inplace=true). And similarly for Y2.

Notice also that n1 and n2 might only get used in `compute` to produce Yp. So why not produce a re-usable container for Yp just once, and never produce Y1 and Y2 at all!
How about something like this,

Yp := Vector[row](num,datatype=float);  # just once
n1 := Vector[row](num,datatype=float);  # just once

# and now, inside the i loop
ArrayTools:-Copy(...,N12,...n1...);  # get data into n1
ArrayTools:-Copy(...,N12,...Yp...);  # get n2 into Yp
VectorAdd(Yp,n1,inplace=true);
VectorScalarMultiply(Yp,0.5,inplace=true);  # (n1+n2)/2
VectorAdd(Yp,Y,inplace=true);  # (Y+n1 + Y+n2)/2 = (Y1+Y2)/2

Now consider the lines in `compute` like,

add(Yp[k]*X[k,3]/(n/2),k=1..n):

These lines are the only places where Yp gets used, yes? So why not first scale Yp, in-place, by 2/n (i.e. divide by n/2), and then have those lines be like,

add(Yp[k]*X[k,3],k=1..n):

The Y1-Y2 object is just n1-n2, no? So the Y1-Y2 = n1-n2 object could also be created just once, as a re-usable Vector `Y1minusY2` outside the loop. But one already has n1 from the code above, and n1 is no longer needed once Yp is formed. So use it to hold Y1-Y2 = n1-n2. Ie, outside the loop do,

VectorAdd(n1,n2,1,-1,inplace=true);

Hopefully I haven't made major mistakes. I'm sure that there are other improvements possible, and some of these techniques won't make a big difference for smaller problems. acer
Hi Joe,

I wasn't trying to say that copy() wasn't needed for anything but tables. As you reiterated, it behaves differently from the rtable constructors themselves.

What happens if you try this in Maple 11, versus Maple 10?

a := array(1..3,[A,B,C]);
b := array(a);
a[1] := z:
eval(a); eval(b);

It seems to me that in Maple 11 the command array(a) produces a copy of a, and that this is even documented on the ?array help-page. But I make no such claim about the vector constructor.

I agree with you that better documentation of the assignment operations would be of benefit. More obvious explanations of the differences among the array, table, list, rtable, etc, data structures would be good, as well as general advice on what each is typically useful for, and why. acer
Does this serve?

proc(n::posint)
  rtable(1..n,1..2,frandom(0.0,1.0),subtype=Array,datatype=float[8]);
end proc:

acer
The constraints are named like Constaint1, etc (note the spelling), but the call to NLPSolve has them as Constraint1, etc. acer
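For what it's worth, a minimal sketch of that kind of mismatch (the objective and constraint here are hypothetical, not from the original post):

```
Constaint1 := x^2 + y^2 <= 1:             # note the misspelling
Optimization:-NLPSolve( x + y, {Constraint1} );
# Constraint1 (spelled correctly) was never assigned, so NLPSolve
# receives an unevaluated name rather than the intended relation.
```

Fixing the spelling so that the assigned name and the name passed to NLPSolve agree resolves it.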

Axel made some good points. I believe that his comment on Theta being uninitialized is mistaken, however.

I too am not a big fan of 2D math for authoring Maple programs. I hope that my conversion to 1D math input is correct.

Here are some suggestions and comments.

- Use float[8] and complex[8] datatypes, if possible. This helps avoid temporary memory allocation by pushing some of the computation out to compiled external code.
- Initialize x, R, and Theta using the initializer available within the Matrix constructor. This is more efficient than initializing them after creation, with loops.
- Vpq is enormous, requiring >800MB of memory with complex[8] datatype. This may not be avoidable. More on this at the end.
- Rpq is as big as Vpq, but it is not needed. I understand that Vpq may be wanted after the work is done. (It could be returned, with moments, instead of being a global.) But it seems that only two or three entries of Rpq are needed at any time within the inner part of the double loop at the end of the code. So by getting rid of Rpq you might be able to save >800MB.
- Use ArrayTools[Alias] instead of ArrayTools[Reshape]. The latter produces an unnecessary copy, while the former does not.
- Use evalhf where possible. This gives you double precision, and actually saves memory too.
- In the double loop to initialize pq, observe that the inner ("and") condition of q<=p can be dropped if the inner loop variable q is only allowed to go as high as p.
- Use evalhf for all the computations done in the nasty, super-expensive double loop at the end of the procedure. More on this later.
- The moments Vector can be created by the multiplication. You don't need to create it and then loop to assign to it.
- Don't use A . B . C for efficient linear algebra. Make use of the LinearAlgebra commands directly. That saves at least one unnecessary copy, or perhaps two. Make sure you control these operations. Scale moments by coef, in-place, after the multiplication.
(The worst scenario would be scaling Vpq, which is enormous, by coef. Bad because it's so big, and horrible if it produces an enormous scaled copy.)

For the dim=200, pmax=70, qmax=70 case this code below took about 830MB and 3h20min on a fast P4. Please check that I didn't make mistakes in the translation. You can try running smaller examples against the original and mine. If I did make mistakes then they can likely be corrected while remaining efficient. Most of my changes are in somewhat independent chunks. I tried to check that moments[1] stayed the same, but I didn't test the Vpq results.

zernike:=proc(dim,pmax,qmax,image)
local i,j,y,kmax,k,r,t,Z,Ztot,coef,exponente,size,p,q,pq,s,
      H1,H2,H3,im,imtot,Theta,R,Image,Vpq,moments,x,
      Rpqis,Rpqis1,Rpqis2,bainit,st,oldDigits;
  st,bainit := time(),kernelopts(bytesalloc);
  # Raise Digits during initialization of x, R, and Theta.
  oldDigits,Digits := Digits,max(Digits,trunc(evalhf(Digits)));
  x:=Matrix(dim,1,(i)->evalf(((sqrt(2)*(i-1)))/((dim-1))-1/(sqrt(2))),
            datatype=float);
  y:=x;
  R:=Matrix(dim,dim,(j,i)->evalf(sqrt(x[i,1]^2+y[dim-j+1,1]^2)),
            datatype=float);
  Theta:=Matrix(dim,dim,(j,i)->evalf(arctan(y[dim-j+1,1],x[i,1])),
            datatype=float);
  Digits := oldDigits;
  if type(pmax,odd)=true then
    size:=1/4*(pmax+1)*(pmax+3);
    pq:=Matrix(size,2,datatype=float);
  else
    size:=1/4*(pmax+2)^2;
    pq:=Matrix(size,2,datatype=float);
  end if;
  userinfo(1,`zernike`,`stage 1`, time()-st, kernelopts(bytesalloc)-bainit );
  i:=0;
  for p from 0 to pmax do
    for q from 0 to p do
      if type(p-abs(q),even)=true then
        i:=i+1;
        pq[i,1]:=p;
        pq[i,2]:=q;
      end if;
    end do;
  end do;
  userinfo(1,`zernike`,`stage 2`, time()-st, kernelopts(bytesalloc)-bainit );
  R:=ArrayTools[Alias](R,[dim^2,1]);
  Theta:=ArrayTools[Alias](Theta,[dim^2,1]);
  Image:=ArrayTools[Alias](image,[dim^2]);
  Vpq:=Matrix(size,dim^2,datatype=complex(float));
  userinfo(1,`zernike`,`stage 3`, time()-st, kernelopts(bytesalloc)-bainit );
  Rpqis1,Rpqis2:=0.0,0.0;
  for s from size by -1 to 1 do
    for i to dim^2 do
      if pq[s,1]=pq[s,2] then
        Rpqis:=evalhf( R[i,1]^(pq[s,1]) );
      elif pq[s,1]-pq[s,2]=2 then
        Rpqis:=evalhf( pq[s,1]*R[i,1]^(pq[s,1])
                       - (pq[s,1]-1)*R[i,1]^((pq[s,1]-2)) );
      else
        H3:=evalhf( (-4*(pq[s+2,2]-2)*(pq[s+2,2]-3))/
                    ((pq[s,1]+pq[s+2,2]-2)*(pq[s,1]-pq[s+2,2]+4)) ):
        H2:=evalhf( (H3*(pq[s,1]+pq[s+2,2])*(pq[s,1]-pq[s+2,2]+2))/
                    (4*(pq[s+2,2]-1))+(pq[s+2,2]-2) ):
        H1:=evalhf( (pq[s+2,2]*(pq[s+2,2]-1))/2 - pq[s+2,2]*H2 +
                    (H3*(pq[s,1]+pq[s+2,2]+2)*(pq[s,1]-pq[s+2,2]))/8 ):
        Rpqis:=evalhf( H1*Rpqis2+(H2+H3/(R[i,1]^2))*Rpqis1 ):
        Rpqis2,Rpqis1:=Rpqis1,Rpqis;
      end if;
      Vpq[s,i]:=evalhf( Rpqis*(exp(I*pq[s,2]*Theta[i,1])) );
    end do;
    coef:=evalhf( ((2*pq[s,1]+2))/(Pi*((dim))^2) );
  end do;
  userinfo(1,`zernike`,`stage 4`, time()-st, kernelopts(bytesalloc)-bainit );
  moments := LinearAlgebra[MatrixVectorMultiply](Vpq,Image);
  LinearAlgebra[VectorScalarMultiply](moments,coef,inplace=true);
  userinfo(1,`zernike`,`stage 5`, time()-st, kernelopts(bytesalloc)-bainit );
  return moments, Vpq;
end proc:

kernelopts(printbytes=false):
#(dim,pmax,qmax):=200,70,70:
(dim,pmax,qmax):=20,7,7;  # Just for testing.
mI := LinearAlgebra[RandomMatrix](dim,dim,
        outputoptions=[datatype=complex(float)]):
# Let it be verbose, about how it's doing.
infolevel[`zernike`]:=1:
solmom,solVpq:=zernike(dim,pmax,qmax,mI):
solmom[1];

Now, what can be said about this double loop, with something like 1200 and 10000 iterations at each level? That's where almost all of the 3hr20min of computation time goes. If it didn't have exp(I*pq[s,2]*Theta[i,1]) in it, you could stick it inside its own procedure and run Compiler[Compile] against it. You still could do that, actually, by wrapping an eval() around that problematic line involving nonreal, complex computation. But then memory usage might go up high. I'm not sure how much garbage collection is possible while a Compiled routine runs. In order to try this, you would have to use ArrayTools[ComplexAsFloat] on Vpq. I was thinking something like this...
First, get rid of the double loop at the end of zernike, and instead make that just one line like, say,

coef := stage4_compiled(dim,size,pq,R,real_Vpq,Theta);

And then, outside of zernike, have something like all this.

st4_r := proc(x,y,z) local temp;
  temp := x*exp(I*y*z);
  return Re(temp);
end proc:
st4_i := proc(x,y,z) local temp;
  temp := x*exp(I*y*z);
  return Im(temp);
end proc:

stage4 := proc(dim::integer, size::integer,
               pq::Matrix(datatype=float[8]), R::Matrix(datatype=float[8]),
               Vpq::Matrix(datatype=float[8]), Theta::Matrix(datatype=float[8]))
local Rpqis, Rpqis1, Rpqis2, H1, H2, H3, s, i, coef;
global st4_r, st4_i;
  Rpqis1,Rpqis2:=0.0,0.0;
  for s from size by -1 to 1 do
    for i to dim^2 do
      if pq[s,1]=pq[s,2] then
        Rpqis:= R[i,1]^(pq[s,1]);
      elif pq[s,1]-pq[s,2]=2 then
        Rpqis:= pq[s,1]*R[i,1]^(pq[s,1]) - (pq[s,1]-1)*R[i,1]^((pq[s,1]-2));
      else
        H3:= (-4*(pq[s+2,2]-2)*(pq[s+2,2]-3))/
             ((pq[s,1]+pq[s+2,2]-2)*(pq[s,1]-pq[s+2,2]+4)):
        H2:= (H3*(pq[s,1]+pq[s+2,2])*(pq[s,1]-pq[s+2,2]+2))/
             (4*(pq[s+2,2]-1))+(pq[s+2,2]-2):
        H1:= (pq[s+2,2]*(pq[s+2,2]-1))/2 - pq[s+2,2]*H2 +
             (H3*(pq[s,1]+pq[s+2,2]+2)*(pq[s,1]-pq[s+2,2]))/8:
        Rpqis:= H1*Rpqis2+(H2+H3/(R[i,1]^2))*Rpqis1:
        Rpqis2,Rpqis1:=Rpqis1,Rpqis;
      end if;
      Vpq[2*s-1,i]:= eval(st4_r(Rpqis,pq[s,2],Theta[i,1]));
      Vpq[2*s,i]:= eval(st4_i(Rpqis,pq[s,2],Theta[i,1]));
    end do;
    coef:= ((2*pq[s,1]+2))/(Pi*((dim))^2);
  end do;
  return coef;
end proc:

stage4_compiled := Compiler[Compile](stage4);

You'd almost certainly want to check that I hadn't done the wrong thing with the assignments into Vpq above.

You might also try to run CodeGeneration[C] on something like stage4, replacing the exp call by a dummy unrecognized name. Afterwards you could replace that dummy in the produced C source with a call to cexp(). Then you could compile it into a dynamic library outside of Maple, and replace all the related code in the example by setting up a Maple procedure that points to the compiled external function using define_external(). acer


The gauss procedure uses ilcm. Its parameter-processing calls `type/listlist`. The setup of the data A, prior to calling gauss(), uses RandomTools. acer
You showed a result from Maple of 19.99909999. Presumably that was done at the default value of Digits=10. But evalf[11](exp(Pi)-Pi) returns 19.999099979. So why isn't the first result, at the default value of Digits=10, 19.99909998 instead? acer
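One way to probe this, sketched in Maple, is to evaluate the same expression at several working precisions and compare the rounded results:

```
for d in [10, 11, 15, 20] do
    print( d, evalf[d]( exp(Pi) - Pi ) );
end do;
# Comparing the Digits=10 result against the correctly rounded value
# obtained from the higher settings shows whether its last digit differs.
```

The point being that the final digit at Digits=10 may reflect intermediate roundoff in computing exp(Pi) and the subtraction, rather than correct rounding of the true value.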
Notice that the answer obtained from A. Vogt's method was 0.0197... times I. That is, it was a purely imaginary number; the result was not the real number 0.0197... Notice also that if you plug in the imaginary number, remembering to make it I*0.0197.., then that does produce a value very close to 0.5. Ie,

evalf(abs(f(nTst,0.01970538187*I)));

So the answer I*0.0197.. was correct, given that you did not specify a real-valued range for the result from fsolve. You can supply a purely real-valued range to fsolve, eg,

f := (n,v) -> Zeta(0,n+1,1-2*Pi*v*I)/Zeta(n+1):
nTst := 5:
fsolve( abs(f(nTst,v))=0.5, v=0..0.2 );

That gives the result 0.08039... that you mentioned. acer