Not talking about giving wrong answer (as, for example, in A110375 thread), many Maple comands are amazingly inefficient. For example, SearchAll from ListTools pakage,
time(ListTools:-SearchAll(2,['$1..1000'$1000]));
33.977
Compare it, say, with the following obvious way of doing that:
mySearchAll:=(n,L)->select(m->L[m]=n,[seq(1..nops(L))])[]:
time(mySearchAll(2,['$1..1000'$1000]));
2.589
The answers given are identical,
is([ListTools:-SearchAll(2,['$1..100'$100])]=
[mySearchAll(2,['$1..100'$100])]);
true
With growing of the size of lists, the time increases approximately linearly in both procedures,
time(mySearchAll(2,['$1..10000'$1000]));
26.442
time(ListTools:-SearchAll(2,['$1..10000'$1000]));
302.626
Now, about effectiveness of this site. I posted that on this site about 3 years ago, during Maple 10 period. Nothing has changed - it is the same in Maple 12.
That could be my first contribution in Maple patch library (independent).
Alec
Optimizing a different Case
Consider a different case:
So, it looks like SearchAll is attempting to be output sensitive and optimal in the case when there are very few occurrences.
On the other hand, that makes this a O(n^2) worst case algorithm. Consider:
M:=['1001'$(10000)]: time(ListTools:-SearchAll(1001, M)); # 0.992 M:=['1001'$(20000)]: time(ListTools:-SearchAll(1001, M)); # 3.924 M:=['1001'$(40000)]: time(ListTools:-SearchAll(1001, M)); # 15.996 time(select(m->M[m]=1001, [seq](1..nops(M)))); 0.044Perhaps there should be flag to avoid the output sensitive case when you suspect there will be many occurences.
John
select
John,
That's an interesting example.
It also shows the weakness of select in this case. Unfortunately, select is built-in so the source code is not available.
By the way, I didn't mean to say that using select is the best way of doing that. It was just the first one that came to mind, and it worked OK in the real life example in which I needed it while SearchAll was too slow. It was a sequence of 0s and 1s, so there were many repetitions.
It can be easily improved using standard tricks - writing it as a sequence, compiling etc.
By the way, [seq](...) is slower than [seq(...)],
time([seq](1..10^7));
0.405
time([seq(1..10^7)]);
0.218
Alec
magic number
As John points out, the algorithm used in ListTools:-SearchAll is worst-case O(n^2) when the list consists of all items that are being searched.
Assume that a list of n elements has m matches, uniformly distributed. It can be shown that the algorithm is then O(k1*n + k2*m/2*(n+1)), where k1 is the coefficient for member and k2 the coefficient for allocating memory/copying (which is what L[p0..-1] must do). A simple linear search is O(k3*n). If we equate the estimates and solve for m we get
m = 2*n*(k3-k1)/k2/(n+1) ~ 2*(k3-k1)/k2
The implication is that there is a magic number m, such that if there are more matches than that in a uniformly distributed list, then it is faster to use a simple sequential search.
Here is a faster sequential searcher than Alec's proposal.
Testing on my machine indicates that, for equal time m ~ 50 while for equal memory usage m ~ 10.
when it rains, it pours
Curiously, another member of our mathematical software development group mentioned this routine to me today. The developer -- who mentioned looking at this a while ago -- said that the routine behaves something like K*m*n , and might not seq(`if`(..)..) be faster? I asked if this stemmed from the mapleprimes thread, but apparently it was independent.
I used to be more surprised by such coincidences, when things being considered in-house matched some external query or request. It's hard to always keep in mind that coincidences get noticed while the many more instances of no coincidence pass by quietly.
And then I found out that John and yet another math developer had been discussing mixed strategies. For example, starting off with one method and then possibly switching methods if the number of matches reached a key value and if there was yet some other key value of entries still left to search. (Such key values might depend on each other, and on what proportion had been searched so far.)
So, I'm pretty sure that we'll figure out a good heuristic for this. I've been reminded that ListTools is important as a lower level utility in the Math Library (eg. in `solve`, etc).
Dave Linder
Mathematical Software, Maplesoft
patch library
Just returned home from a trip to Champaign, IL (not related to Mathematica). Seems to be a nice place. Not exactly London, England, but may be better than Jackson, TN.
It is great to see that SearchAll may work better in the next Maple release. If things worked like that, the maple patch library would not be necessary (or interesting). Wiki still may be useful though. I plan to put this thread there (probably, tomorrow) in some form that doesn't infringe (is this a correct word?) with copyright - but if people posting in the thread could do that themselves, that would certainly be better (and less work for me.)
Alec