Thanks for the quick help! I didn't think of that...
But is there a reason why Maple code is often soooo much slower than a Matlab implementation (even when Maple calls the compiled NAG hardware floating point routines)???

Thanks for the quick help! I didn't think of that...
But is there a reason why Maple code is often soooo much slower than a Matlab implementation (even when Maple calls the compiled NAG hardware floating point routines)???

Sorry for being unclear. Similar to what you did, I measured the elapsed time with Maple's time command. Of course, the increase in the required time is not so big in your example... but on the other hand the procedure is anyway quite slow and I found it a rather weird behaviour that it becomes even slower. Moreover, the idea of such random matrices is, of course, that I want to use really many of them. That's how I stumbled over this behaviour. For instance:
```
> time( Feynman_random_rho(4) );
0.468
> for i from 1 to 100 do
Feynman_random_rho(4);
end do:
> time( Feynman_random_rho(4) );
0.640
```

...which is a significant (not to say annoying) increase.
I have really no idea why this occurs (and also why it is so slow anyway!)

Sorry for being unclear. Similar to what you did, I measured the elapsed time with Maple's time command. Of course, the increase in the required time is not so big in your example... but on the other hand the procedure is anyway quite slow and I found it a rather weird behaviour that it becomes even slower. Moreover, the idea of such random matrices is, of course, that I want to use really many of them. That's how I stumbled over this behaviour. For instance:
```
> time( Feynman_random_rho(4) );
0.468
> for i from 1 to 100 do
Feynman_random_rho(4);
end do:
> time( Feynman_random_rho(4) );
0.640
```

...which is a significant (not to say annoying) increase.
I have really no idea why this occurs (and also why it is so slow anyway!)

Thank you for your suggestion, acer.
Ultimately, however, I wanted to use the Optimization package for the local search phase within a _global_ optimizatio program for rather expensive "black box" functions (i.e. procedures). My early tests seem to indicate that the numerical approximation of the gradients will be too inefficient then.
Currently, it seems that the nonlinearsimplex method of the Optimization package is the best compromise in this case.
Since my experience in optimization problems is very limited: Is anybody aware of a nice simulated annealing (or something similar) code for Maple? Mathematica has something built-in. I guess if I program it myself it will become much more inefficient... :-(

Thank you for your suggestion, acer.
Ultimately, however, I wanted to use the Optimization package for the local search phase within a _global_ optimizatio program for rather expensive "black box" functions (i.e. procedures). My early tests seem to indicate that the numerical approximation of the gradients will be too inefficient then.
Currently, it seems that the nonlinearsimplex method of the Optimization package is the best compromise in this case.
Since my experience in optimization problems is very limited: Is anybody aware of a nice simulated annealing (or something similar) code for Maple? Mathematica has something built-in. I guess if I program it myself it will become much more inefficient... :-(

Well, stupid me didn't see that possibility. Actually, it is rather straightforward and it also turns out to be quite fast. This makes it a nice alternative for the above parametrization. However, I'm a bit confused about the ranges of the coefficients in the linear combination of the lambda matrices... Is there an easy way to see them? I understand that the coefficients should be somehow cyclic, so that it shouldn't take me out of the unitaries if I use too large coefficients, right?

Well, stupid me didn't see that possibility. Actually, it is rather straightforward and it also turns out to be quite fast. This makes it a nice alternative for the above parametrization. However, I'm a bit confused about the ranges of the coefficients in the linear combination of the lambda matrices... Is there an easy way to see them? I understand that the coefficients should be somehow cyclic, so that it shouldn't take me out of the unitaries if I use too large coefficients, right?

Unfortunately this is slower. I was also curious about that but my tests indicate that using shape=hermitian yields a performance penalty of roughly 15..20%.
Since profiling shows that most of the time is used for the rather low level functions MatrixMatrixMultiply() and MatrixFunction(), I guess there is not too much room for improvement left (only the experiments with datatype and shape).

Unfortunately this is slower. I was also curious about that but my tests indicate that using shape=hermitian yields a performance penalty of roughly 15..20%.
Since profiling shows that most of the time is used for the rather low level functions MatrixMatrixMultiply() and MatrixFunction(), I guess there is not too much room for improvement left (only the experiments with datatype and shape).

Thank you for your help. I learned that I should care more about the datatypes and the required conversions. However, I'm still a bit confused why Maple is so much slower than Matlab although it uses the compiled NAG routines. Anyway, at least for smaller N, the procedures are now usable.

Thank you for your help. I learned that I should care more about the datatypes and the required conversions. However, I'm still a bit confused why Maple is so much slower than Matlab although it uses the compiled NAG routines. Anyway, at least for smaller N, the procedures are now usable.

I thought about the exp(A)exp(B)=exp(A+B) story, too. But as acer said, it is a published article of respected authors, so I assume that the non-commutativity of the lambdas is the problem here. Also, some random tests seem to indicate that exp(A)exp(B)=exp(A+B) does not work here.
But anyway, if there is another way to get a _fast_ parametrization I'm of course interested! By the way, it seems not at all obvious to me what the correct parameter ranges are in order to fully cover all unitaries.
Still, I'm a little bit disappointed that the performance advantage of Matlab is really so huge here. I'm afraid putting this parametrization inside an optimization loop is prohibitive when I have to wait several seconds for every unitary matrix...

I thought about the exp(A)exp(B)=exp(A+B) story, too. But as acer said, it is a published article of respected authors, so I assume that the non-commutativity of the lambdas is the problem here. Also, some random tests seem to indicate that exp(A)exp(B)=exp(A+B) does not work here.
But anyway, if there is another way to get a _fast_ parametrization I'm of course interested! By the way, it seems not at all obvious to me what the correct parameter ranges are in order to fully cover all unitaries.
Still, I'm a little bit disappointed that the performance advantage of Matlab is really so huge here. I'm afraid putting this parametrization inside an optimization loop is prohibitive when I have to wait several seconds for every unitary matrix...

Thanks for pointing this out. Once more I realize that loops are to be avoided. Your suggestions make the creation of the generator matrices (lambdas) notably faster. However, once they are created, the problem is the repeated use of the MatrixExponential command and also the many matrix-matrix multiplications (which seem to eat up roughly the same amount of CPU time as the matrix exponentials...). Of course, I understand that Matlab is more optimized on the numerical side, but I guess one could squeeze out a bit more performance if one tells Maple how to do it, right? So does anyone have a suggestion for this?