Hello,
I would like to use Maple instead of Matlab, well I rewrite the Matlab code to Maple like following. The output is correct, but it spends more time than in Matlab. Is it possible to rewrite my Maple code to get output faster?
restart:
linearni_model:=proc(beta,sigma)
local n1,n2,Y,X,Y1,Y2,se,se2,t1,t2,t3,t4,vbeta,t,si2,e1,e2,e3,e4,Yp,N,s2,n;
use Statistics in
X:=Matrix([[1, 1, 1, 1, 1], [1, 1, 1, 1, -1], [1, 1, 1, -1, 1], [1, 1, 1, -1, -1], [1, 1, -1, 1, 1], [1, 1, -1, 1, -1], [1, 1, -1, -1, 1], [1, 1, -1, -1, -1], [1, -1, 1, 1, 1], [1, -1, 1, 1, -1], [1, -1, 1, -1, 1], [1, -1, 1, -1, -1], [1, -1, -1, 1, 1], [1, -1, -1, 1, -1], [1, -1, -1, -1, 1], [1, -1, -1, -1, -1]]);
n:=16;
N := RandomVariable(Normal(0,sigma)):
n1:=Sample(N,16);
n2:=Sample(N,16):
Y:=X.beta:
Y:=convert(Y,Vector[row]):
Y1 := Y + n1:
Y2 := Y + n2:
Yp := (Y1 + Y2)/2:
e1:=add(Yp[k]*X[k,2]/(n/2),k=1..16):
e2:=add(Yp[k]*X[k,3]/(n/2),k=1..16):
e3:=add(Yp[k]*X[k,4]/(n/2),k=1..16):
e4:=add(Yp[k]*X[k,5]/(n/2),k=1..16):
si2:=Vector[row](1..16):
si2 := 1/2*map(x->x^2, Y1[1..16]-Y2[1..16]):
s2:=add(si2[k],k=1..16)/n:
se2 := 4*s2/(2*n):
se:=sqrt(se2):
t1 := e1/se;
t2 := e2/se;
t3 := e3/se;
t4 := e4/se;
t := 2.1199:
#% z tabulek k=16, alfa=0,05
#% testy vyznamnosti efektu,1-je vyznamny,0-neni vyznamny
vbeta := <0,0,0,0>:
if abs(t1)>t then
vbeta[1]:=1:
end if;
if abs(t2)>t then
vbeta[2]:=1:
end if;
if abs(t3)>t then
vbeta[3]:=1:
end if;
if abs(t4)>t then
vbeta[4]:=1:
end if;
end use;
return vbeta;
end proc:
test:=proc(beta, sigma, pocet_opakovani)
local i,vyznamnost;
vyznamnost := <0,0,0,0>;
for i from 1 to pocet_opakovani do
vyznamnost := vyznamnost + linearni_model(beta,sigma);
end do;
vyznamnost := map(x->x/pocet_opakovani,vyznamnost)*100;
end proc:
beta:=<11,7,5,-9,6>;
linearni_model(beta,sigma);
test(beta,1,1000);
Thank you very much.
Vladimir
View 551_mapleprimes.mw on MapleNet or Download 551_mapleprimes.mw
View file details
Comments
Getting faster: start with CodeTools[Profiling]
The first question you need to answer is how much faster do you need it to be.
Before attempting any serious optimization, you should get some idea as to what steps are taking most of the time. The CodeTools[Profiling] subpackage is useful for that. It can be used as follows:
sigma := 3.0: # this wasn't assigned with(CodeTools[Profiling]: Profile(linearni_model): test(beta,1,1000): PrintProfiles(); linearni_model linearni_model := proc(beta, sigma) local n1, n2, Y, X, Y1, Y2, se, se2, t1, t2, t3, t4, vbeta, t, si2, e1, e2, e3, e4, Yp, N, s2, n; |Calls Seconds Words| PROC | 1000 13.430 40194262| 1 | 1000 0.310 1537769| X := Matrix([[1, 1, 1, 1, 1], [1, 1, 1, 1, -1], [1, 1, 1, -1, 1], [1, 1, 1, -1, -1], [1, 1, -1, 1, 1], [1, 1, -1, 1, -1], [1, 1, -1, -1, 1], [1, 1, -1, -1, -1], [1, -1, 1, 1, 1], [1, -1, 1, 1, -1], [1, -1, 1, -1, 1], [1, -1, 1, -1, -1], [1, -1, -1, 1, 1], [1, -1, -1, 1, -1], [1, -1, -1, -1, 1], [1, -1, -1, -1, -1]]); 2 | 1000 0.000 0| n := 16; 3 | 1000 0.490 660757| N := Statistics:-RandomVariable(Normal(0,sigma)); 4 | 1000 1.160 3383694| n1 := Statistics:-Sample(N,16); 5 | 1000 1.200 3267384| n2 := Statistics:-Sample(N,16); 6 | 1000 1.740 4796540| Y := X.beta; 7 | 1000 0.240 662748 | Y := convert(Y,Vector[row]); 8 | 1000 1.870 5232405| Y1 := Y+n1; 9 | 1000 1.850 5270749| Y2 := Y+n2; 10 | 1000 0.000 153004| Yp := 1/2*Y1+1/2*Y2; 11 | 1000 0.470 1638911| e1 := add(2*Yp[k]*X[k,2]/n,k = 1 .. 16); 12 | 1000 0.410 1737491| e2 := add(2*Yp[k]*X[k,3]/n,k = 1 .. 16); 13 | 1000 0.370 1770133| e3 := add(2*Yp[k]*X[k,4]/n,k = 1 .. 16); 14 | 1000 0.430 1740252| e4 := add(2*Yp[k]*X[k,5]/n,k = 1 .. 16); 15 | 1000 0.210 738338 | si2 := Vector[row](1 .. 16); 16 | 1000 2.130 6121594| si2 := 1/2*map(x -> x^2,Y1[1 .. 16]-Y2[1 .. 16]); 17 | 1000 0.350 877050| s2 := add(si2[k],k = 1 .. 16)/n; 18 | 1000 0.030 58980| se2 := 2*s2/n; 19 | 1000 0.070 327876| se := sqrt(se2); 20 | 1000 0.020 19010| t1 := e1/se; 21 | 1000 0.010 29733| t2 := e2/se; 22 | 1000 0.010 29423| t3 := e3/se; 23 | 1000 0.010 32201| t4 := e4/se; 24 | 1000 0.000 0| t := 2.1199; 25 | 1000 0.000 23919| vbeta := `<,>`(0,0,0,0); 26 | 1000 0.030 13338| if t < abs(t1) then 27 | 1000 0.000 8245| vbeta[1] := 1 end if; 28 | 1000 0.000 15171| if t < abs(t2) then 29 | 1000 0.000 5659| vbeta[2] := 1 end if; 30 | 1000 0.010 25612| if t < abs(t3) then 31 | 1000 0.010 4869| vbeta[3] := 1 end if; 32 | 1000 0.000 8043| if t < abs(t4) then 33 | 1000 0.000 3364| vbeta[4] := 1 end if; 34 | 1000 0.000 0| return vbeta end procFrom that it is apparent that there are a few trivial improvements you can make. There is no use reassigning X each time inside the procedure, either make a global, pass it in as an argument, or lexically scope it. Statement 15, the initialization of si2, is not needed and can be deleted. Those should be done, but the overall effect is small. To make a serious dent you're probably going to have rewrite this so that evalhf can be used. That won't work, alas, on the statements that use the Statistics package, which currently take about 20% of the total time. So without redoing that, the best you can get is a 5X improvement.
Another improvement you can make is to precompute X.beta, no since doing that each time in the inner loop. Following is a rewrite with these changes. I've split your procedure into two parts, the second part (now called `compute') can be called with the evalhf procedure.
restart: X := Matrix([[1, 1, 1, 1, 1], [1, 1, 1, 1, -1], [1, 1, 1, -1, 1], [1, 1, 1, -1, -1], [1, 1, -1, 1, 1], [1, 1, -1, 1, -1], [1, 1, -1, -1, 1], [1, 1, -1, -1, -1], [1, -1, 1, 1, 1], [1, -1, 1, 1, -1], [1, -1, 1, -1, 1], [1, -1, 1, -1, -1], [1, -1, -1, 1, 1], [1, -1, -1, 1, -1], [1, -1, -1, -1, 1], [1, -1, -1, -1, -1]] ,datatype=float[8]): n := 16; sample :=proc(sigma) local N; global n; uses Statistics; N := RandomVariable(Normal(0,sigma)): Sample(N,n),Sample(N,n); end proc: compute := proc(Y::'Vector[row]'(16,datatype=float[8]) ,n1,n2,n,vbeta) local Y1,Y2,se,se2,t1,t2,t3,t4,t,si2,e1,e2,e3,e4,Yp,s2,k; Y1 := Y + n1: Y2 := Y + n2: Yp := (Y1 + Y2)/2: e1 := add(Yp[k]*X[k,2]/(n/2),k=1..n): e2 := add(Yp[k]*X[k,3]/(n/2),k=1..n): e3 := add(Yp[k]*X[k,4]/(n/2),k=1..n): e4 := add(Yp[k]*X[k,5]/(n/2),k=1..n): si2 := 1/2*map(x->x^2, Y1-Y2): s2 := add(si2[k],k=1..n)/n: se2 := 4*s2/(2*n): se := sqrt(se2): t1 := e1/se; t2 := e2/se; t3 := e3/se; t4 := e4/se; t := 2.1199: if abs(t1)>t then vbeta[1] := 1: end if; if abs(t2)>t then vbeta[2] := 1: end if; if abs(t3)>t then vbeta[3] := 1: end if; if abs(t4)>t then vbeta[4] := 1: end if; vbeta; end proc; test := proc(beta, sigma, pocet_opakovani) local i,vyznamnost,n1,n2,Y,vbeta,vbeta0; vyznamnost := <0,0,0,0>; Y := (X.beta)^%T; vbeta0 := Vector(4, datatype=float[8]); for i from 1 to pocet_opakovani do (n1,n2) := sample(sigma); vbeta := copy(vbeta0); vyznamnost := vyznamnost + evalhf(compute(Y,n1,n2,n,vbeta)); end do; vyznamnost := map(x->x/pocet_opakovani,vyznamnost)*100; end proc: beta := Vector([11,7,5,-9,6],datatype=float[8]): sigma := 3:Profiling this we get
with(CodeTools[Profiling]): Profile(test,sample): PrintProfiles(); sample sample := proc(sigma) local N; global n; |Calls Seconds Words| PROC | 1000 2.910 7395969| 1 | 1000 0.540 674583| N := Statistics:-RandomVariable(Normal(0,sigma)); 2 | 1000 2.370 6721386| Statistics:-Sample(N,n), Statistics:-Sample(N,n) end proc test test := proc(beta, sigma, pocet_opakovani) local i, vyznamnost, n1, n2, Y, vbeta, vbeta0; |Calls Seconds Words| PROC | 1 0.440 913992| 1 | 1 0.000 351| vyznamnost := `<,>`(0,0,0,0); 2 | 1 0.070 161693| Y := (X.beta)^%T; 3 | 1 0.000 455| vbeta0 := Vector(4,datatype = float[8]); 4 | 1 0.010 0| for i to pocet_opakovani do 5 | 1000 0.010 4292| n1, n2 := sample(sigma); 6 | 1000 0.090 88718| vbeta := copy(vbeta0); 7 | 1000 0.250 648017| vyznamnost := vyznamnost+evalhf(compute(Y,n1,n2,n,vbeta)) end do; 8 | 1 0.010 10466| vyznamnost := 100*map(x -> x/pocet_opakovani,vyznamnost) end procThis runs about 3-4 times faster than the original. To significantly it improve it you might have to code your own procedures for Statistics[sample], which now use up most of the time.
faster yet
On reflection, a few more improvements can be made. The call to RandomVariable should be taken out of the outer loop. Rather than making multiple calls to Sample, we can use a very large sample to initialize a Matrix, then select one row at a time. The following code does that (compute is not changed):
test := proc(beta, sigma, pocet_opakovani) local i,vyznamnost,n1,n2,Y,vbeta,vbeta0,N,N1,N2; global n; uses Statistics; vyznamnost := <0,0,0,0>; Y := (X.beta)^%T; vbeta0 := Vector(4, datatype=float[8]); N := RandomVariable(Normal(0,sigma)); N1 := Matrix(pocet_opakovani, n, convert(Sample(N,n*pocet_opakovani),list)); N2 := Matrix(pocet_opakovani, n, convert(Sample(N,n*pocet_opakovani),list)); for i from 1 to pocet_opakovani do n1 := N1[i,1..-1]; n2 := N2[i,1..-1]; vbeta := copy(vbeta0); vyznamnost := vyznamnost + evalhf(compute(Y,n1,n2,n,vbeta)); end do; vyznamnost := map(x->x/pocet_opakovani,vyznamnost)*100; end proc: beta := Vector([11,7,5,-9,6],datatype=float[8]): sigma := 3: with(CodeTools[Profiling]): Profile(test); time(test(beta, sigma, 1000)); PrintProfiles(); test test := proc(beta, sigma, pocet_opakovani) local i, vyznamnost, n1, n2, Y, vbeta, vbeta0, N, N1, N2; global n; |Calls Seconds Words| PROC | 1 0.890 2899786| 1 | 1 0.000 186| vyznamnost := `<,>`(0,0,0,0); 2 | 1 0.060 161140| Y := (X.beta)^%T; 3 | 1 0.000 413| vbeta0 := Vector(4,datatype = float[8]); 4 | 1 0.010 50436| N := Statistics:-RandomVariable(Normal(0,sigma)); 5 | 1 0.190 657490| N1 := Matrix(pocet_opakovani,n,convert(Statistics:-Sample(N,n*pocet_opakovani),list)); 6 | 1 0.190 687860| N2 := Matrix(pocet_opakovani,n,convert(Statistics:-Sample(N,n*pocet_opakovani),list)); 7 | 1 0.000 0| for i to pocet_opakovani do 8 | 1000 0.000 69326| n1 := N1[i,1 .. -1]; 9 | 1000 0.000 50770| n2 := N2[i,1 .. -1]; 10 | 1000 0.070 104326| vbeta := copy(vbeta0); 11 | 1000 0.370 1105480| vyznamnost := vyznamnost+evalhf(compute(Y,n1,n2,n,vbeta)) end do; 12 | 1 0.000 12359| vyznamnost := 100*map(x -> x/pocet_opakovani,vyznamnost) end procThat is an improvement of over an order of magnitude from the original.
thanks
Thank you for your posts on this topic, I've found them insightful.
ArrayTools:-Reshape
You're welcome. I usually learn as much doing this as I hope others do. Here is another improvement, which doubles the speed. As alluded to in my previous post, it is a lot faster to create one large random sample in one call to Sample, then to make multiple calls. Instead of making two calls, to generate N1 and N2, it is better to generate a larger array and then subdivide. I didn't do that originally because I didn't come up with an elegant way of doing so. However, there is one and it works nicely. Use ArrayTools:-Reshape command to reshape a the large Vector returned by Sample to a multidimensional Array. This is very fast and elegant.
test := proc(beta, sigma, pocet_opakovani) local i,vyznamnost,n1,n2,Y,vbeta,vbeta0,N,V,N12; global n; uses Statistics; vyznamnost := <0,0,0,0>; Y := (X.beta)^%T; vbeta0 := Vector(4, datatype=float[8]); N := RandomVariable(Normal(0,sigma)); V := Sample(N,2*n*pocet_opakovani); N12 := ArrayTools:-Reshape(V,pocet_opakovani,n,2); for i from 1 to pocet_opakovani do n1 := N12[i,1..-1,1]; n2 := N12[i,1..-1,2]; vbeta := copy(vbeta0); vyznamnost := vyznamnost + evalhf(compute(Y,n1,n2,n,vbeta)); end do; vyznamnost := map(x->x/pocet_opakovani,vyznamnost)*100; end proc:Running Profiling on this gives
test := proc(beta, sigma, pocet_opakovani) local i, vyznamnost, n1, n2, Y, vbeta, vbeta0, N, N12, V; global n; |Calls Seconds Words| PROC | 1 0.410 1243475| 1 | 1 0.000 181| vyznamnost := `<,>`(0,0,0,0); 2 | 1 0.070 162542| Y := (X.beta)^%T; 3 | 1 0.000 497| vbeta0 := Vector(4,datatype = float[8]); 4 | 1 0.020 50321| N := Statistics:-RandomVariable(Normal(0,sigma)); 5 | 1 0.070 82392| V := Statistics:-Sample(N,2*n*pocet_opakovani); 6 | 1 0.000 67716| N12 := ArrayTools:-Reshape(V,pocet_opakovani,n,2); 7 | 1 0.000 0| for i to pocet_opakovani do 8 | 1000 0.020 84299| n1 := N12[i,1 .. -1,1]; 9 | 1000 0.000 66057| n2 := N12[i,1 .. -1,2]; 10 | 1000 0.050 120740| vbeta := copy(vbeta0); 11 | 1000 0.170 597630| vyznamnost := vyznamnost+evalhf(compute(Y,n1,n2,n,vbeta)) end do; 12 | 1 0.010 11100| vyznamnost := 100*map(x -> x/pocet_opakovani,vyznamnost) end procWe should be able, I think, to do even better by not reshaping and using ArrayTools:-Alias in the inner loop to create n1 and n2, however, not enough time can be saved there to make a significant change. Better to work on the long poles. The Matrix multiplication and transpose in line 2 (above) can be improved.
memory usage as component of efficiency
One can also consider memory usage as an important component of efficiency. Apart from wanting to keep memory allocation down for its own benefits, one can also try to keep memory use and re-use down so as to minimize garbage collection time.
Other benefits of this sort of array memory optimization can be that total memory allocation can sometimes be reduced, that large problems can become tractable, and sometimes speed is improved. The reasons for this can be complicated, but it can relate to memory fragmentation and also to the fact that garbage is not immediately freed.
For example,
Y := (X.beta)^%T;
might become
Y := X.beta;
LinearAlgebra:-Transpose(Y,inplace=true);
That should avoid producing an unnecessary object of the size of Y.
The creating of Arrays n1 and n2 by forming each N12 sub-Array might also be improved. One might be able to allocate empty n1 and n2 Vector[row]'s of the desired size and datatype, just once, and then use ArrayTools:-Copy each time through the loop so as to get the right portion of N12 into them. The key would be using the right offset and stride.
One might also be able to allocate space for Y1 and Y2 to be used in procedure `compute`, just the once. Eg, produce Y1 and Y2 as empty Vector[row]'s of the desired datatype, outside of `compute`, just once. Then, outside of `compute` but each time through the loop, use ArrayTools:-Copy to get Y into Y1. Follow that by VectorAdd(Y1,n1,inplace=true). And similarly for Y2.
Notice also that n1 and n2 might only get used in `compute` to produce Yp. So why not produce a re-uasable container for Yp just once, and never produce Y1 and Y2 at all! How about something like this,
Yp:=Vector[row](num,datatype=float); # just once
n1:=Vector[row](num,datatype=float); # just once
# and now, inside the i loop
ArrayTools:-Copy(...,N12,...n1...); # get data into n1
ArrayTools:-Copy(...,N12,...Yp...); # get n2 into Yp
VectorAdd(Yp,n1,inplace=true);
VectorScalarMultiply(Yp,0.5,inplace=true); # (n1+n2)/2
VectorAdd(Yp,Y); # (Y+n1 + Y+n2)/2 = (Y1+Y2)/2
Now consider the lines in `compute` like,
add(Yp[k]*X[k,3]/(n/2),k=1..n):
These line are the only places where Yp gets used, yes? So why not first scale Yp, inplace, by n/2 and then have those lines be like,
add(Yp[k]*X[k,3],k=1..n):
The Y1-Y2 object is just n1-n2, no? So the Y1-Y2 = n1-n2 object could also be created just once, as a re-usable Vector `Y1minusY2` outside the loop. But outside the loop, one already has n1 from above code. And n1 is no longer needed, once Yp is formed. So use it to hold Y1-Y2=n1-n2. Ie, outside the loop do,
VectorAdd(n1,n2,1,-1,inplace=true);
Hopefully I haven't made major mistakes. I'm sure that there are other improvements possible, and some of these techniques above won't make a big difference for smaller problems.
acer