acer


MaplePrimes Activity


These are replies submitted by acer

@erik10 The Statistics:-Tally command does that. It's easy enough to evalf the ensuing rhs's, or reformat as a Matrix.

(In other words, you don't have to create yet another discrete empirical RandomVariable from every sample you create, so as to use its ProbabilityFunction just to get the sample's frequencies. As you've noticed, that's quite a roundabout effort.)

ps. One of the reasons I suggested DocumentTools:-Tabulate in my earlier Answer was that these kinds of running details about the sample can be added as additional expressions in the Table (and would change with each iteration over the sample size N, say).

restart:

with(Statistics):

U:=(a,b)->piecewise(a=1 or b=1,-4,abs(b-a)):
all:=[seq(seq(U(i,j),i=[$1..6]),j=[$1..6])]:
M:=[op({op(all)})]:
X:=RandomVariable(EmpiricalDistribution(all)):

N:=10^4:

S:=Sample(X,N):

L:=map(t->lhs(t)=evalf[5](rhs(t)/N),
       sort(Tally(S)));

[HFloat(-4.0) = .30580, HFloat(0.0) = .13760, HFloat(1.0) = .22820, HFloat(2.0) = .16450, HFloat(3.0) = .11030, HFloat(4.0) = 0.53600e-1]

Matrix(Vector~([lhs~,rhs~](L)));

Matrix(6, 2, {(1, 1) = -4.0, (1, 2) = .30580, (2, 1) = 0., (2, 2) = .13760, (3, 1) = 1.0, (3, 2) = .22820, (4, 1) = 2.0, (4, 2) = .16450, (5, 1) = 3.0, (5, 2) = .11030, (6, 1) = 4.0, (6, 2) = 0.53600e-1})

 

Download sample_freq.mw

(or Matrix([lhs~,rhs~](L))^%T for the Matrix)

@Carl Love Regarding finite support and discreteness, I was thinking only of a minor clarification for the general reader as possibly trumping redundancy. I did not mean to imply that you had missed a necessary part of the definition, and I apologize if it came off that way.

@Carl Love That is nice.

Perhaps you meant to write something like, "independent random variables from discrete distributions" in the description, though?

An illustrative example involving dice from a non-uniform distribution might be nice.

Also, do you see a way for it to handle an RV from a discrete distribution defined only by a general symbolic ProbabilityFunction and DiscreteValueMap? (Eg, the first Example from the help page for DiscreteValueMap. Or something where PDF(r,t) were like Sum(2^(-k)*Dirac(t-3^(-k)), k = 1 .. infinity).)
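For concreteness, a test case could be constructed along these lines (just a sketch of how I'd set up such an RV via the Distribution constructor's attributes; I haven't run it against your procedure):

restart;
with(Statistics):
# Sketch of a test case: a discrete RV given only by a symbolic
# ProbabilityFunction and DiscreteValueMap, in the spirit of the
# Dirac-sum PDF mentioned above.
dist := Distribution(':-ProbabilityFunction' = (k -> 2^(-k)),
                     ':-Support' = 1 .. infinity,
                     ':-DiscreteValueMap' = (k -> 3^(-k)),
                     ':-Type' = ':-discrete'):
r := RandomVariable(dist):
ProbabilityFunction(r, k), DiscreteValueMap(r, k);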

@Carl Love It's true that it uses the GPU. But it only offers exports for a couple of BLAS operations (e.g. O(n^3) Matrix-Matrix multiplication).

I have mentally blocked it out because the current optimized float[8] BLAS used by the regular LinearAlgebra package (CPU), from the Intel MKL, is so fast  on modern CPU chipsets that it actually beats the CUDA dgemm used via Maple for most reasonably sized examples.

The problem is that the CUDA package has to send all the float[8] Matrix data to the GPU up front, and then get it back afterwards. The overhead of that transfer is on the same order as the time it now takes the CPU dgemm to compute with Matrices of the sizes one usually encounters in all but the most esoteric uses in Maple.

Hence the Maple CUDA package only offers a couple of multiplication functions and is no longer generally outperforming the regular LinearAlgebra equivalents on modern CPUs.
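For anyone who wants to check that crossover on their own hardware, a rough comparison can be set up along these lines (a sketch only; it assumes a usable CUDA device and driver, and the size at which either side wins will vary by machine):

restart;
n := 4000:   # adjust; where the CPU wins or loses depends on the machine
A := LinearAlgebra:-RandomMatrix(n, n, ':-outputoptions' = [':-datatype' = float[8]]):
B := LinearAlgebra:-RandomMatrix(n, n, ':-outputoptions' = [':-datatype' = float[8]]):

CUDA:-Enable(false):              # plain CPU dgemm (MKL)
cpu := time[real]( A . B );

CUDA:-Enable(true):               # the same product, routed through the GPU
gpu := time[real]( A . B );       # includes the up/down transfer overhead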

There are a couple of CUDA-based general LAPACK projects, but they are still maturing. I believe that, with the current model of having to transfer the data up and down again for each call, operations like SVD and eigen-solving would take enough time for the transfer overhead to be a negligible portion at large but non-enormous sizes. So those functions would be the most promising for a significant gain; LU decomposition less so. I experimented with this a couple of years ago, but the performance results were not encouraging because those functions were not yet fully optimized (and some not yet implemented) on the card.

Another possibility for the future might be something like the Jacket add-on for Matlab. In essence, all the matrix/vector data could get pushed onto the GPU for the duration of an involved multi-operation computation. It can be accessed (transferred back) on demand, but the benefit is that repeated computations with it can be done without repeated up-down transfer. In this scenario only specially treated procedures -- compiled on the GPU using a CUDA compiler -- would compute with such data. So one would need a procedure that 1) did large LinearAlgebra, 2) did enough general computation for it to be worthwhile pushing the data up/down, and 3) could be treated by an enhanced Compiler:-Compile to compile on the GPU and call the GPU LAPACK/BLAS. Items 1) and 2) restrict the use cases. Item 3) is problematic because Compiler:-Compile doesn't yet translate Matrix/Vector operations to LAPACK/BLAS calls for regular CPU operations, so getting that for the GPU is an even further goal.

Having said all that, there is a large chance that the OP is hoping for sweeping general computation for which parallelization would be done automatically under the hood by the Maple engine. That does not happen, even on the CPU. Having it happen on the GPU is more distant still.

The OP has a long history of asking bizarre questions with almost no context given.

@asa12 sometimes

@mmcdara Well, yes, that integration is (for this kind of example) an alternative to tallying as a means of computing the probabilities for each of the possible outcomes. It involves more calls to g.

But the full set of inputs -- to be passed to g -- is still produced with the nested seq. (The RVs assigned to Dice1 and Dice2 are not used.)

@erik10 I forgot to illustrate: You can also plot the CDF of the random variable W that I obtained above from an EmpiricalDistribution. I'll go back and add it in.

If you don't mind generating all the outcomes (as mmcdara has shown using a nested `seq`), then you could also proceed as follows.

(I originally assumed that you wanted Statistics to work out everything from the two Dice RandomVariables. But proceeding empirically seems like a viable alternative to wrestling with Statistics' handling of formulas combining discrete distributions.)

restart:

plots:-setoptions(gridlines=false):

with(Statistics):

U:=(a,b)->piecewise(a=1 or b=1,-4,abs(b-a)):

all:=[seq(seq(U(i,j),i=[$1..6]),j=[$1..6])];

[-4, -4, -4, -4, -4, -4, -4, 0, 1, 2, 3, 4, -4, 1, 0, 1, 2, 3, -4, 2, 1, 0, 1, 2, -4, 3, 2, 1, 0, 1, -4, 4, 3, 2, 1, 0]

M:=[op({op(all)})];

[-4, 0, 1, 2, 3, 4]

W:=RandomVariable(EmpiricalDistribution(all)):

Mean(W);

-1/9

Variance(W);

620/81

[seq(w=Probability(W=w),w=M)];

[-4 = 11/36, 0 = 5/36, 1 = 2/9, 2 = 1/6, 3 = 1/9, 4 = 1/18]

P:=unapply(ProbabilityFunction(W,x),x):

map(x->x=P(x),M);

[-4 = 11/36, 0 = 5/36, 1 = 2/9, 2 = 1/6, 3 = 1/9, 4 = 1/18]

S:=Sample(W,10^5):
Histogram(S,
          ':-binbounds'=[seq(j+1/2,j=min(S)-1..max(S))],
          ':-view'=0..12/36,
          ':-axis[2]'=[':-tickmarks'=[0=0,
                                      seq(i/36=sprintf("%a/%a",op(i/36)),i=1..12)],
                                     ':-gridlines'=[seq(i/36,i=1..11)]]);

map(P,M);
ListTools:-PartialSums(map(P,M));
L:=map(x->Probability(W<=x),M);

[11/36, 5/36, 2/9, 1/6, 1/9, 1/18]

[11/36, 4/9, 2/3, 5/6, 17/18, 1]

[11/36, 4/9, 2/3, 5/6, 17/18, 1]

LL:=map(u->u=`if`(u::fraction,sprintf("%a/%a",op(u)),u),[0,op(L)]):
plot(CDF(W,x), x=-5..5,
     filled, color="Niagara Azure", transparency=0,
     axis[2]=[tickmarks=LL, gridlines=LL],
     axes=frame, size=[410,400]);

# Or, depending on what you think the values on the axis should be...
#
LL:=map(u->u=`if`(u::fraction,sprintf("%a/%a",op(u)),u),[0,op(L)]):
plot(CDF(W,M[floor(i)]), i=0..nops(M)+1,
     filled, color="Niagara Azure", transparency=0,
     axis[1]=[tickmarks=[seq(i=M[i],i=1..nops(M))]],
     axis[2]=[tickmarks=LL, gridlines=LL],
     axes=frame, labels=["",""], size=[410,400]);

 

Download histo_fun_M4.mw

@erik10 I tried for some time last night to get ProbabilityFunction to produce something useful for this example, without success.

I tried with `Or`, as well as variations using min, etc, for the first conditional. I tried using variations using min/max/signum/abs for the fallback value. I also played with mmcdara's idea of using Heaviside.

In some variants (where I also used unknown x as the second argument) I was able to get an invalid nested piecewise thing, where the inner piecewise structures had lists (with two or no entries) as the values. And it seemed to not be inferring the right support from both random variables used in the formula. I'm going to submit a bug report. In some other variants I obtained incorrect results containing Dirac(0).

It's all the more a pity, since Mean seems to handle it. I don't know exactly how Probability or ProbabilityFunction is trying to do things differently (using sum, say). I also don't know exactly how, or whether, it considers the "constructed" formula as itself representing a discrete RV.

I'll let you know, if I discover anything useful.

For a very old version like Maple 13, and for a 3D plot, you may need to use an approach like Kitonum has done here.

@mmcdara Actually it's even less useful to utilize `if`, since that applies evalb itself.

Above, I had it in my head that you were trying piecewise rather than `if`.

For the Mean example it is possible to grind it out with magic amounts of delayed evaluation. But it gets ugly. And that kind of unevaluation-quote usage is prone to breakage, since every intermediate evaluation strips off a pair. Not recommended.

restart;

with(Statistics):

Dice1 := RandomVariable(DiscreteUniform(1, 6)):
Dice2 := RandomVariable(DiscreteUniform(1, 6)):

K := (a,b) -> piecewise(Or(a=1,b=1), -4, abs(b-a));

K := proc (a, b) options operator, arrow; piecewise(a = 1 or b = 1, -4, abs(b-a)) end proc

K(Dice1,Dice2);

piecewise(_R = 1 or _R0 = 1, -4, abs(-_R0+_R))

Mean(K(Dice1,Dice2));

-1/9

W1 := (a,b) -> piecewise(a=1 or b=1, -4, abs(b-a));

W1 := proc (a, b) options operator, arrow; piecewise(a = 1 or b = 1, -4, abs(b-a)) end proc

W1(Dice1,Dice2);

abs(-_R0+_R)

Mean(W1(Dice1,Dice2));

35/18

W2 := (a,b) -> piecewise('a=1 or b=1', -4, abs(b-a));

proc (a, b) options operator, arrow; piecewise('a = 1 or b = 1', -4, abs(b-a)) end proc

W2(Dice1,Dice2);

piecewise(_R = 1 or _R0 = 1, -4, abs(-_R0+_R))

Mean(W2(Dice1,Dice2));

-1/9

W3 := (a,b) -> `if`(a=1 or b=1, -4, abs(b-a));

proc (a, b) options operator, arrow; `if`(a = 1 or b = 1, -4, abs(b-a)) end proc

W3(Dice1,Dice2);

abs(-_R0+_R)

Mean(W3(Dice1,Dice2));

35/18

W4 := (a,b) -> '''`if`'''('a=1 or b=1', -4, abs(b-a));

proc (a, b) options operator, arrow; ('''`if`''')('a = 1 or b = 1', -4, abs(b-a)) end proc

W4(Dice1,Dice2);

(''`if`'')(_R = 1 or _R0 = 1, -4, abs(-_R0+_R))

Mean(W4(Dice1,Dice2));

-1/9

 

Download ugh.mw

@mmcdara I have to dash out the door right now, but regarding Question 1,

The lowercase `or` and `and` commands do an immediate evalb of a strict equality or strict "non-equality".

a = b or u = 5;
                             false

a = b or u < 5;
                             u < 5

b <> 4 or 5 = 7;
                              true

a = b and b <> 4;
                             false

It can sometimes work if you use a pair of non-strict inequalities (which logically imply the strict equality). Eg,

a<=b and a>=b and u<5;
                       a <= b and b <= a and u < 5

Hence my use of the capitalized `Or` command.
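(For comparison, the capitalized Or does no such immediate evalb; with unassigned names it simply returns unevaluated.)

Or(a = b, u = 5);
                        Or(a = b, u = 5)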

Here's what you were, in effect, passing to the Mean command,

with(Statistics): 	
Dice1 := RandomVariable(DiscreteUniform(1, 6)):
Dice2 := RandomVariable(DiscreteUniform(1, 6)):

K := (a,b) -> `if`(a=1 or b=1, -4, abs(b-a) ):

K(Dice1, Dice2);
                         |-_R16 + _R15|

Mean(K(Dice1, Dice2));
                               35
                               --
                               18

Mean(abs(Dice2-Dice1));
                               35
                               --
                               18

@emendes From the second bullet point of the Description in the Help page,

"The return value of the command is the identity of the inserted parent Table, as a string. The display of the returned value can be suppressed by terminating the statement with a full colon."

You could read the other bullet points in that Description, as there are ramifications as to what you can and cannot do with Tabulate.

The return value, being the identity of the parent Table, could allow you to utilize the replaceid option of the DocumentTools:-InsertContent command to redisplay entirely new content in the originally inserted Task region (even from another Execution Group). That's a very esoteric use, and I don't recall ever seeing anyone else use it.

It's easier to do running replacement from within the very same Execution Group, simply by calling Tabulate more than once. This can produce a nice effect if you want in-place overwriting of the displayed content at key points of a long computation.
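As a minimal sketch of that in-place effect (the particular loop body, table entries, and pause below are just placeholders of my own):

with(Statistics): with(DocumentTools):
X := RandomVariable(Normal(0, 1)):
for N in [10^2, 10^3, 10^4] do
    S := Sample(X, N);
    # Each call replaces the GUI Table displayed by the previous iteration,
    # so the Execution Group shows a "running" summary of the computation.
    Tabulate([sprintf("sample size N = %d", N), Histogram(S)]):
    Threads:-Sleep(1);    # pause so the intermediate displays can be seen
end do: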

But it's important to realize that, unlike plots:-display, the return value of Tabulate is not any kind of "plot" structure understood by commands from the plots package.

By the way, you can resize the GUI Table that is inserted by plots:-display. You can even right click in the Table and adjust the Table properties, to change the exterior and interior borders, width-mode, etc. But such changes are not stored in any usual programmatic structure, and are transient in the sense that repeating the plots:-display call will wipe out such manual changes.

In summary, Tabulate gives only an immediate and purely printing effect, but allows you to programmatically control more aspects of the encapsulating GUI Table than does plots:-display.

 

@Rim So, clearly you need at least some amount of trig simplification, because your examples might include the need to make, say, sin(t)^2+cos(t)^2 -> 1.

If you're building up bigger and bigger expressions within a loop, it might be best to keep the intermediate expressions "simplified" at each iteration. But you may wish to avoid everything except basic factoring/grouping and that trig simplification. That is, you may be better off with some degree of control over expression swell and trig simplification at each iteration, but without `simplify` trying its hardest to get the very "simplest" representation every time.

I'm mostly just guessing, because I don't have the full package or the complete examples.

So I'll make a few suggestions for replacing the costly (according to your analysis) call to plain `simplify`. You could try them out separately and re-profile. Perhaps one of them will be faster for your full examples.

The idea is to replace the call simplify(ee), which you've identified as a bottleneck, where ee is just the single expression argument.

simplify(ee, trig)
simplify(normal(ee), trig)
`simplify/nosize`(ee, trig)
`simplify/nosize`(normal(ee),trig)

There are other actions that might be thrown into this mix (combine, factor, frontend, etc).
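If it helps with the re-profiling, here's a rough harness for comparing those candidates (the expression assigned to ee below is just a stand-in of mine, not from your package):

ee := expand((sin(t)^2 + cos(t)^2 + sin(t)*cos(t))^8):    # stand-in expression
cands := [e -> simplify(e),
          e -> simplify(e, trig),
          e -> simplify(normal(e), trig),
          e -> `simplify/nosize`(e, trig)]:
for f in cands do
    # real time in seconds, then the length of the simplified result
    print(CodeTools:-Usage(f(ee), ':-output' = ':-realtime', ':-quiet'),
          length(f(ee)));
end do: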

What might be even more fruitful is if you could produce some enormous, "wrong" final result that has not been simplified as you need. If you can produce such, you could upload a worksheet here as an attachment (big green up-arrow).

What version are you using, BTW?

You seem to have made a typo, a leading minus sign, in your second example.

some_trig_simp.mw

@Rim Suppose that the code just calls simplify at some internal part of the code.

One possibility is that the code does this inefficiently, e.g. calling it unnecessarily and repetitively inside a loop. Presumably you are already checking for that.

Another possibility is that simplify is too costly a tool for the simplification task at hand. We might be able to help there, but it would likely require several large examples of both input expression and desired output expression. There may be a combination of lighter weight commands that produce the same simplifications.

If you repost the procedure's source code here then please provide either a URL for where you got it or the name of the author(s).
