
## Several contentious issues in the treatment of discrete random variables

Hi,

This is more of an open discussion than a real question. Perhaps it would be better placed in the Posts section?

Working with discrete random variables, I found several inconsistencies or errors.
In no particular order:

• The support of a discrete RV is not defined correctly (a real range instead of a countable set).
• The plot of the probability function (which, in my opinion, should be renamed "Probability Mass Function"; see https://en.wikipedia.org/wiki/Probability_mass_function) is not correct.
• The ProbabilityFunction of a discrete RV with an EmpiricalDistribution can be computed at any point, but its formal expression doesn't exist (or at least is not accessible).
• Defining the discrete RV "toss of a fair die" with EmpiricalDistribution and with DiscreteUniform gives different results.
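
For reference, these are the textbook definitions the points above are measured against, written for a fair six-sided die $X$. They are the standard definitions, not Maple output, and the symbols $p_X$ and $\operatorname{supp}$ are my own notation:

```latex
\[
p_X(x) = \Pr(X = x) =
\begin{cases}
  \dfrac{1}{6} & \text{if } x \in \{1, 2, 3, 4, 5, 6\}, \\[4pt]
  0            & \text{otherwise,}
\end{cases}
\qquad
\operatorname{supp}(X) = \{1, 2, 3, 4, 5, 6\}.
\]
```

In particular, $p_X$ is zero at every non-integer point and the support is a finite set, not an interval.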

The details are given in the attached file, and I hope the accompanying text is clear enough to point out the issues.
I believe there are no major issues here, only that Maple suffers from some lack of consistency in its treatment of (at least some) discrete RVs. Nothing that could not easily be fixed.

As I said above, if anyone thinks this question has no place here and ought to be moved to the Posts section, please feel free to do so.

 > restart:
 > with(Statistics):

Two alternate ways to define a discrete random variable on a finite set
of equally likely outcomes.

 > Universe    := [$1..6]:
 > toss_1_dice := RandomVariable(EmpiricalDistribution(Universe));
 > TOSS_1_DICE := RandomVariable(DiscreteUniform(1, 6));
 (1)

Let's look at the ProbabilityFunction of each RV.

 > ProbabilityFunction(toss_1_dice, x); ProbabilityFunction(TOSS_1_DICE, x);
 (2)

It looks as if the procedure ProbabilityFunction is not an attribute of an RV with an EmpiricalDistribution.
Let's verify:

 > law := [attributes(toss_1_dice)][3]: lprint(exports(law))
 Conditions, ParentName, Parameters, CDF, DiscreteValueMap, Mean, Median, Mode, ProbabilityFunction, Quantile, Specialize, Support, RandomSample, RandomVariate

Clearly ProbabilityFunction is an attribute of toss_1_dice.

In fact, the difference in behaviour appears to come from different definitions
of the set of outcomes of toss_1_dice and TOSS_1_DICE.

 > LAW := [attributes(TOSS_1_DICE)][3]: exports(LAW): law:-Conditions; LAW:-Conditions;
 (3)

From :-Conditions one can see that toss_1_dice really is a discrete RV defined on a countable set of outcomes,
but that nothing is said about the set over which TOSS_1_DICE is defined.

The truly discrete definition of toss_1_dice is confirmed here
(the second result is correct):

ProbabilityFunction(toss_1_dice, x) = piecewise(x < 1, 0, x > 6, 0, x::integer, 1/6, 0)

 > ProbabilityFunction~(toss_1_dice, Universe); ProbabilityFunction~(toss_1_dice, [seq(0..7, 1/2)]);
 (4)

One can also see that the Support of each of these RVs is wrong
(see for instance https://en.wikipedia.org/wiki/Discrete_uniform_distribution).

It should be {1, 2, 3, 4, 5, 6}, not a RealRange.

 > Support(toss_1_dice); Support(TOSS_1_DICE);
 (5)

Now, this is the surprising plot of the ProbabilityFunction of TOSS_1_DICE.
This obviously wrong result is probably linked to the weak definition of the conditions for this RV.

 > # plot(ProbabilityFunction(TOSS_1_DICE, x), x=0..7); plot(ProbabilityFunction(TOSS_1_DICE, x), x=0..7, discont=true)
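
For comparison, here is a minimal sketch of one way to draw the mass function as isolated points rather than as a curve. It assumes the fair-die variable defined above and evaluates ProbabilityFunction only at the integer outcomes; the names X and pts are illustrative, and the view and label settings are arbitrary choices of mine:

```maple
restart:
with(Statistics):

# Assumption: the same fair-die variable as in the worksheet above.
X := RandomVariable(DiscreteUniform(1, 6)):

# Evaluate the probability function only at the integer outcomes 1..6.
pts := [seq([k, ProbabilityFunction(X, k)], k = 1 .. 6)]:

# Draw isolated points instead of a continuous curve.
plots:-pointplot(pts, symbol = solidcircle, symbolsize = 15,
                 view = [0 .. 7, 0 .. 1/4], labels = ["x", "P(X = x)"]);
```

This sidesteps the problem rather than fixing it: the outcomes are enumerated by hand instead of being read from the random variable.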

These differences in treatment raise several questions:
-  Why is a DiscreteUniform RV not defined on a countable set?
-  Why does the ProbabilityFunction of an EmpiricalDistribution return no result
if its second parameter is not set to one of its outcomes?

All this without even mentioning the wrong plot shown above.
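
On the second question, a possible stopgap is to wrap the call in a procedure so that it is only ever evaluated at numbers. This is a minimal sketch, assuming (as the worksheet above suggests) that ProbabilityFunction of an EmpiricalDistribution RV does evaluate at numeric points; Y and pmf are hypothetical names of mine, not part of Statistics:

```maple
restart:
with(Statistics):

Universe := [$1 .. 6]:
Y := RandomVariable(EmpiricalDistribution(Universe)):

# Hypothetical helper: the call is deferred until t is an actual number.
pmf := t -> ProbabilityFunction(Y, t):

# Evaluate at an outcome, at a non-integer point, and outside the range.
map(pmf, [2, 5/2, 7]);
```

This still gives no closed-form expression in x, which is the actual complaint.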

I believe something that works like the module below would be much better than what is done right now.

 > EmpiricalRV := module()
   export MassDensityFunction, PlotMassDensityFunction, Support:

   # Mass function as a piecewise expression: 1/N at each sample value, 0 elsewhere
   MassDensityFunction := proc(rv, x)
     local u, v, N:
     u := [attributes(rv)][3]:
     if u:-ParentName = EmpiricalDistribution then
       v := op([1, 1], u:-Conditions);
       N := numelems(v):
       return piecewise(op(op~([seq([x=v[n], 1/N], n=1..N)])), 0)
     else
       error "The random variable does not have an EmpiricalDistribution"
     end if
   end proc:

   # One vertical spike of height 1/N at each sample value lying in [x1, x2]
   PlotMassDensityFunction := proc(rv, x1, x2)
     local u, v, a, b:
     u := [attributes(rv)][3]:
     if u:-ParentName = EmpiricalDistribution then
       v := op([1, 1], u:-Conditions);
       a := select[flatten](`>=`, v, x1);
       b := select[flatten](`<=`, a, x2);
       PLOT(seq(CURVES([[n, 0], [n, 1/numelems(v)]], COLOR(RGB, 0, 0, 1), THICKNESS(3)), n in b), VIEW(x1..x2, default))
     else
       error "The random variable does not have an EmpiricalDistribution"
     end if
   end proc:

   # Support as the set of sample values (x1 and x2 are not used)
   Support := proc(rv, x1, x2)
     local u, v, a, b:
     u := [attributes(rv)][3]:
     if u:-ParentName = EmpiricalDistribution then
       v := op([1, 1], u:-Conditions);
       return {entries(v, nolist)}
     else
       error "The random variable does not have an EmpiricalDistribution"
     end if
   end proc:

   end module:
 > EmpiricalRV:-MassDensityFunction(toss_1_dice, x);
 (6)
 > f := unapply(EmpiricalRV:-MassDensityFunction(toss_1_dice, x), x): f(2); f(5/2);
 (7)
 > EmpiricalRV:-PlotMassDensityFunction(toss_1_dice, 0, 7);