A while ago, I was trying to determine the fastest way, using Maple, to compute a list of a particular sequence of integers defined by a recurrence relation. This isn't a new idea of course, but I think the exercise is a useful introduction into the various ways of coding algorithms in Maple, and why some may be better than others.

My original example was a bit esoteric, so for simplicity I've redone it using the standard example of a recurrence relation, the Fibonacci sequence. We'll fix N, the number of Fibonacci numbers to compute, at 40000.

> N := 40000:

Below, we'll try to find the fastest way, in Maple, to compute a list of these numbers from scratch. Note that the speed comparisons might be specific to this machine, OS, or Maple version, and even though the runtime of most of these implementations is about the same, the fastest method at N=40000 might not be the fastest one for larger N.

The first idea is the completely naive approach. We'll store the Fibonacci numbers in a Maple list L, step through a loop counting up the sequence, and in each loop iteration add to the list with L := [op(L), new_element].

f1 := proc(N)
    local i, L;
    L := [1, 1];
    for i from 2 to N do
        L := [op(L), L[-2] + L[-1]];
    end do;
    L
end proc:

The problem with this approach is that Maple lists are not intended to be grown in this way, and each redefinition of the list L costs O(N) time, making the whole procedure O(N).

> time(f1(N));
31.440

From now on we'll use only implementations that take O(N) time, and see how they compare.

The next implementation uses a Maple table stored as a local variable of a module. We compute the sequence and assign it to table entries, and at the end generate a Maple list from these table entries. The use of tables is advantageous because tables can be dynamically grown more efficiently than lists can, but this approach has the disdvantage that the table must be completely re-traversed in the last step to generate the list.

f2 := module()
    local ModuleApply, T;

    T := table([0=1, 1=1]);

    ModuleApply := proc(N)
        local i;
        for i from 2 to N do
            T[i] := T[i-2] + T[i-1];
        end do:
        [seq(T[i], i=0..N)]
    end proc:
end module:

You'll see that since this code avoids list-appending, it's quite a bit faster:

> time(f2(N));
1.110

The next idea is to do essentially what we did above, but using remember tables rather than built-in Maple tables. We'll pre-populate the remember table for the utility procedure p with entries for the base cases, F0 and F1, and then build our list through recursive calls.

f3 := module()
    local p, ModuleApply;

    p := proc(n) option remember;
        procname(n-2) + procname(n-1)
    end proc:

    p(0) := 1;
    p(1) := 1;
 
    ModuleApply := proc(N)
        [1, 1, seq(p(i), i=2..N)]
    end proc:
end module:

As we might expect, the computed result is pretty close (but a bit higher than) our last number.

> time(f3(N));
1.229

One disadvantage of the last two approaches is that they both rely on creating an intermediate data structure of size O(N) (the table or remember table), which is promptly discarded, since what we're interested in is a list. Instead we can use a 'cache', which is a new feature of Maple 9.5, and is essentially a remember table of bounded size.

f4 := module()
    local p, ModuleApply;

    p := proc(n) option cache;
        procname(n-2) + procname(n-1)
    end proc:

    p(0) := 1;
    p(1) := 1;

    ModuleApply := proc(N)
        [1, 1, seq(p(), i=2..N)]
    end proc:
end module:

It works out to be slightly slower than f3; this may be because of the time taken in garbage-collecting the information from the cache.

> time(f4(N));
1.459

Next, I've tried to implement something like a cache, without actually using one. Here we have a utility procedure with two counters, which represent the last and second-last Fibonacci number respectively. The procedure updates the values and returns the appropriate one. We'll build the list with recursive calls to this procedure. Like the cache, the intermediate data structures used here are of bounded size, since there are just two of them.

f5 := module()
    local p, n1, n2, ModuleApply;

    n1 := 1;
    n2 := 1;

    p := proc(n)
        (n1, n2) := (n2, n1+n2);
        n2;
    end proc:

    ModuleApply := proc(N)
        [1, 1, seq(p(), i=2..N)]
    end proc:
end module:

It's about the same speed as using the cache, which is impressive since it involves a lot more procedure calls into the procedure body than the cache example, which mostly consisted of calls into the cache.

> time(f5(N));
1.439

Finally, mostly for completeness, we try to see we get by pre-allocating an Array and adding to that. Arrays can be modified inplace in O(1) time, but cannot be dynamically grown. Here we make an array of the appropriate size, fill it, then convert it to a list.

f6 := proc(N)
    local A, i;
    A := Array(0..N, {0=1,1=1});
    for i from 2 to N do
        A[i] := A[i-2] + A[i-1];
    end do:
    convert(A, 'list')
end proc:

It's about as fast as the previous two examples, and has the advantage of simplicity of design. However, building the array and converting that to a list is another instance of using large intermediate data structures, which we avoided with the previous two examples.

> time(f6(N));
1.469

So it looks like f2 is the winner in this very specialized contest. This probably has more to do with the quirks of this example than any difference in efficiency of the code concerned. And if minimizing intermediate memory use is the goal, then f4 or f5 would be preferable.


Please Wait...