## 5635 Reputation

7 years, 188 days

## How to build a multivariate random varia...

Maple

For years I've been angry that Maple isn't capable of formally manipulating random vectors (aka multivariate random variables).
For the record Mathematica does.

The problem I'm concerned with is to create a vector W such that

`type(W, RandomVariable)`

will return true.
Of course defining W from its components w1, .., wN, where each w is a random variable is easy, even if these components are correlated or, more generally dependent ( the two concepts being equivalent iif all the w are gaussian random variables).
But one looses the property that W is no longer a (multivariate) random variable.
See a simple example here: NoRandomVectorsInMaple.mw

This is the reason why I've developped among years several pieces of code to build a few multivariate random variable (multinormal, Dirichlet, Logistic-Normal, Skew Multivariate Normal, ...).

In the framework of my activities, they are of great interest and the purpose of this post is to share what I have done on this subject by presenting the most classic example: the multivariate gaussian random variable.

My leading idea was (is) to build a package named MVStatistics on the image of the Statistics package but devoted to Multi Variate random variables.
I have already construct such a package aggregating about fifty different procedures. But this latter doesn't merit the appellation of "Maple package" because I'm not qualified to write something like this which would be at the same time perennial, robust, documented, open and conflict-free with the  Statistics package.
In case any of you are interested in pursuing this work (because I'm about to change jobs), I can provide it all the different procedures I built to construct and manipulate multivariate random variables.

To help you understand the principles I used, here is the most iconic example of a multivariate gaussian random variable.
The attached file contains the following procedures

```MVNormal
Constructs a gaussian random vector whose components can be mutually correlated
The statistics defined in Distribution are: (this list could be extended to other
statistics, provided they are "recognized" statitics, see at the end of this
post):
PDF
Mode
Mean
Variance
RandomSample

DispersionEllipse
Builds and draws the dispersion ellipses of a bivariate gaussia, random vector

DispersionEllipsoid
Builds and draws the dispersion ellipsoids of a trivariate gaussia, random vector

MVstat
Computes several statistics of a random vector (Mean, Variance, ...)

Iserlis
Computes the moments of any order of a gaussian random vector

MVCentralMoment
Computes the central moments of a gaussian random vector

Conditional
Builds the conditional random vector of a gaussian random vector wrt some of its components
the moments of any order of a gaussian random vector.
Note: the result has type RandomVariable.

MarginalizeAgainst
Builds the marginal random vector of a gaussian random vector wrt some of its components
the moments of any order of a gaussian random vector.
Note: the result has type RandomVariable.

MardiaNormalityTest
The multi-dimensional analogue of the Shapiro-Wilks normality test

HZNormalityTest
Henze-Zirkler test for Multivariate Normality

MVWaldWolfowitzTest
A multivariate version of the non-parametrix Wald-Folfowitz test
```

Do not hesitate to ask me any questions that might come to mind.
In particular, as Maple introduces limitations on the type of some attributes (for instance Mean  must be of algebraic type), I've been forced to lure it by transforming vector or matrix quantities into algebraic ones.
An example is

`Mean = add(m[k]*x[k], k=1..K)`

where m[k] is the expectation of the kth component of this random vector.
This implies using the procedure MVstat to "decode", for instance, what Mean returns and write it as a vector.

MultivariateNormal.mw

About the  statistics ths Statistics:-Distribution constructor recognizes:
To get them one can do this (the Normal distribution seems to be the continuous one with the most exhaustive list os statistics):

```restart
with(Statistics):
X := RandomVariable(Normal(a, b)):
attributes(X);
protected, RandomVariable, _ProbabilityDistribution

map(e -> printf("%a\n", e), [exports(attributes(X)[3])]):
Conditions
ParentName
Parameters
CharacteristicFunction
CDF
CGF
HodgesLehmann
Mean
Median
MGF
Mode
PDF
RousseeuwCrouxSn
StandardDeviation
Support
Variance
CDFNumeric
QuantileNumeric
RandomSample
RandomSampleSetup
RandomVariate
MaximumLikelihoodEstimate
```

Unfortunately it happens that for some unknown reason a few statistics cannot be set by the user.
This is for instance the case of Parameters serious consequences in certain situations.
Among the other statistics that cannot be set by the user one finds:

• ParentName,
• QuantileNumeric  whose role is not very clear, at least for me, but which I suspect is a procedure which "inverts" the CDF to give a numerical estimation of a quantile given its probability.
If it is so accessing  QuantileNumeric would be of great interest for distributions whose the quantiles have no closed form expressions.
• CDFNumeric  (same remark as above)

Finally, the statistics Conditions, which enables defining the conditions the elements of Parameters must verify are not at all suited for multivariate random variables.
It is for instance impossible to declare that the variance matrix (or the correlation matrix) is a square symmetric positive definite matrix).

## An existential question...

MaplePrimes

What do you think is the acceptable limit to the effort required to answer a question?

At what point does the question-and-answer game between two contributors become unreasonable?

How do you, the most highly ranked, deal with situations that last for days?

## Solving a least-squares problem with QR ...

Maple

This post is motivated by a question asked by @vs140580  ( The program is making intercept zero even though There is a intercept in regression Fit (A toy code showing the error attached) ).

The problem met by @vs140580 comes from the large magnitudes of the (two) regressors and the failure to Fit/LinearFit to find the correct solution unless an undecent value of Digits is used.
This problem has been answerd by @dharr after scaling the data (which is always, when possible, a good practice) and by
myself while using explicitely the method called "Normal Equations" (see https://en.wikipedia.org/wiki/Least_squares).

The method of "Normal Equations" relies upon the inversion of a symmetric square matrix H whose dimension is equal to the number of coefficients of the model to fit.
It's well known that this method can potentially lead to matrices H extremely ill-conditionned, and thus face severe numerical problems (the most common situation being the fit of a high degree polynomial).

• In English: http://www.math.kent.edu/~reichel/courses/intr.num.comp.1/fall09/lecture4/lecture4.pdf
• In French: https://moodle.utc.fr/pluginfile.php/24407/mod_resource/content/5/MT09-ch3_ecran.pdfI

The attached file illustrates how the QR decomposition method works.
The test case is @vs140580's.

Maybe the development team could enhance Fit/LinearFit in future versions by adding an option which specifies what method is to be used?

 > restart:
 > with(Statistics):
 > interface(version)
 (1)
 > Data := Matrix([[4.74593554708566, 11385427.62, 2735660038000], [4.58252830679671, 25469809.77, 12833885700000], [4.29311160501838, 1079325200, 11411813200000000], [4.24176959154225, 1428647556, 18918585950000000], [5.17263072694618, 1428647556, 18918585950000000], [4.39351114955735, 1877950416, 30746202150000000], [4.39599006758777, 1428647556, 18918585950000000], [5.79317412396815, 2448320309, 49065217290000000], [4.48293612651735, 2448320309, 49065217290000000], [4.19990181982522, 2448320309, 49065217290000000], [5.73518217699046, 1856333905, 30648714900000000], [4.67943831980476, 3071210420, 75995866910000000], [4.215240105336, 2390089264, 48670072110000000], [4.41566877563247, 3049877383, 75854074610000000], [4.77780395369828, 2910469403, 74061327950000000], [4.96617430604669, 1416936352, 18891734280000000], [4.36131111330988, 1416936352, 18891734280000000], [5.17783192063198, 1079325200, 11411813200000000], [4.998266287191, 1067513353, 11402362980000000], [4.23366152474871, 2389517120, 48661380410000000], [4.58252830679671, 758079709.3, 5636151969000000], [6.82390874094432, 1304393838, 14240754750000000], [4.24176959154225, 912963601.2, 8621914602000000], [4.52432881167557, 573965555.4, 3535351888000000], [4.84133601918601, 573965555.4, 3535351888000000], [6.88605664769316, 732571773.2, 5558875538000000], [4.35575841415627, 1203944381, 13430693320000000], [4.42527441640593, 955277678, 8795128298000000], [6.82390874094432, 997591754.9, 8968341995000000], [4.35144484433733, 143039477.1, 305355143300000]]):
 > # Direct use of LinearFit. # # As far as I know LinearFit is based on the resolution of the "Normal Equations" # (see further down), a system of equations that is known to be ill-conditioned # when regressors have large values (in particular when polynomial regression # is used). X := Data[.., [2, 3]]: Y := Data[.., 1]: LinearFit(C1+C2*v+C3*w, X, Y, [v, w]);
 (2)
 > # For roundoff issues the 3-by-3 matrix involved in the "Normal Equations" (NE) # appears to of rank < 3. # The rank of this matrix is rqual to 1+rank(X) and one can easily verify that # the 2 columns of X are linearly independent: LinearAlgebra:-LinearSolve(X, Vector(numelems(Y), 0)); LinearAlgebra:-Rank(X);
 (3)
 > # Solve the least squares problem by using explicitely the NE. # # To account for an intercept we augment X by a vector column of "1" # traditionally put in column one. Z := `<|>`(Vector(numelems(Y), 1), X):   A := (Z^+ . Z)^(-1) . Z^+ . Y;          # Normal Equations
 (4)
 > # What is the rank of Z? # Due to the scale of compared to "1", Rank fails to return the good value # of rank(Z), which is obviously equal to rank(X)+1. LinearAlgebra:-LinearSolve(Z, Vector(numelems(Y), 0)); LinearAlgebra:-Rank(Z);
 (5)

A WORKAROUND : SCALING THE DATA

 > model := unapply( LinearFit(C1+C2*v+C3*w, Scale(X), Scale(Y), [v, w]), [v, w] );
 (6)
 > mX, sX := (Mean, StandardDeviation)(X); mY, sY := (Mean, StandardDeviation)(Y);
 (7)
 > MODEL := model((x1-mX[1])/sX[1], (x2-mX[2])/sX[2]) * sY + mY
 (8)
 > # Check that the vector of regression coefficients is almost equal to A found above # relative error lesst than 10^(-14) A_from_scaling       := < coeffs(MODEL) >: Relative_Discrepancy := (A_from_scaling - A) /~ A
 (9)

THE QR DECOMPOSITION  (applied on raw data)

The QR decomposition, as well as Given's rotation method, are two alternatives to the the NE method
to find the vector of regression coefficients.
Both of them are known to be less sensitive to the magnitudes of the regressors and do nt require (not
always) a scaling of the data (which can be quite complex with polynomial regression or when some
transformation is used to liearize the statistical model, for instanc Y=a*exp(b*X) --> log(Y)=log(a)+b*X).

 > N := numelems(Y); P := numelems(Z[1]);
 (10)
 > # Perform the QR decomposition of Z. Q, R := LinearAlgebra:-QRDecomposition(Z, fullspan);
 (11)
 > # Let C the column vector of length P defined by: C := (Q^+ . Y)[1..P];
 (12)
 > # Then the vector of regression coefficients is given by: A_QR                 := (R[1..P, 1..P])^(-1) . C; Relative_Discrepancy := (A_QR - A) /~ A
 (13)
 > # The matrix H = Z^+ . Z writes H                    := Z^+ . Z: H_QR                 := R^+ . Q^+ . Q . R: Relative_Discrepancy := (H_QR - H) /~ H
 (14)
 > # H_QR expression is required to obtain the covariance matrix of the regression coefficients.

## What is the true equation your time and...

Maple

This post is closely related to this one Centered Divided Difference approximations whose purpose is to build Finite-Difference (FD) approxmation schemes.
In this latter post I also talked, without explicitely naming it, about Truncation Error, see for instance https://en.wikiversity.org/wiki/Numerical_Analysis/Truncation_Errors.

I am foccusing here on a less known concept named "Equivalent Equation" (sometimes named "Modified Equation").
The seminal paper (no free acccess, which is surprising since it was first published 50 years ago) is this one by Warming and Hyett https://www.sciencedirect.com/science/article/pii/0021999174900114.
For a scholar example you can see here https://guillod.org/teaching/m2-b004/TD2-solution.pdf.
An alternative method to that of Warming and Hyett to derive the Equivalent Equation is given here (in French)
http://www.numdam.org/item/M2AN_1997__31_4_459_0.pdf (Carpentier et al.)

I never heard of the concept of Modified Equation applied to elliptic PDEs ; it's main domain of application is advection PDEs (parabolic PDEs in simpler cases).

Basically, any numerical scheme for solving an ODE or a PDE has a Truncation Error: this means there is some discrepancy between the exact solution of this PDE and the solution the scheme provides.
This discrepancy depends on the truncation error, in space alone for ODEs or elliptic PDEs, or in space and time for hyperbolic or advection PDEs.

One main problem with the Truncation Error is that it doesn't enable to understand the impact it will have on the solution it returns. For instance, will this sheme introduce diffusion, antidiffusion, scattering, ...
The aim of the Modified Equation is to answer these questions by constructing the continuous equation a given numerical scheme solves with a zero truncation error.
This is the original point of view Warming and Hyett developped in their 1974 paper.

This is a subject I work on 30 years ago (Maple V), and later in 2010.
It is very easy with Maple to do the painstaking development that Warming and Hyett did by hand half a century ago. And it is even so easy that the trick used by Carpentier et al. to make the calculations easier has lost some of its interest.

Two examples are given in the attched file
EquivalentEquation.mw

If a moderator thinks that this post should be merged with Centered Divided Difference approximations, he can do it freely, I won't be offended.

## Centered Divided Difference approximatio...

Maple

This code enables building Centered Divided Difference (CDD) approximations of derivatives of a univariate function.
Depending on the stencil we choose we can get arbitrary high order approximations.

The extension to bivariate functions is based upon what is often named tensorization in numerical analysis: for instance diff(f(x, y), [x, y] is obtained this way (the description here is purely notional)

1. Let CDD_x the CDD approximation of diff( f(x), x) ) .
CDD_x is a linear combination of shifted replicates of f(x)
2. Let s one of this shifted replicates
Let CDD_y(s) the CDD approximation of diff( s(y), y) ) .
3. Replace in CDD_x all shifted replicates by their corresponding expression CDD_y(s)

REMARKS:

• When I write for instance "approximation of diff(f(x), x)", this must be intended as a short for "approximation of diff(f(x), x) at point x=a"
• I agree that a notation like, for instance, diff(f(a), a) is not rigourous and that something like a Liebnitz notation would be better. Unfortunately I don't know how to get it when I use mtaylor.

 > restart:

CDDF stands for Cendered Divided Difference Formula

 > CDDF := proc(f, A, H, n, stencil)   description "f = target function,\nA = point where the derivatives are approximated,\nH = step,\nn = order of the derivative,\nstencil = list of points for the divided differenceCDDF\n";   local tay, p, T, sol, unknown, Unknown, expr:   tay := (s, m) -> convert(                      eval(                        convert(                          taylor(op(0, f)(op(1, f)), op(1, f)=A, m),                          Diff                        ),                        op(1, f)=A+s*H),                      polynom                    ):   p   := numelems(stencil):   T   := add(alpha[i]*tay(i, p+1), i in stencil):   T   := convert(%, diff):   if p > n+1 then     sol := solve([seq(coeff(T, h, i)=0, i in subsop(n+1=NULL, [\$0..p]))], [seq(alpha[i], i in stencil)])[];   else     sol := solve([seq(coeff(T, H, i)=0, i in subsop(n+1=NULL, [\$0..n]))], [seq(alpha[i], i in stencil)])[];   end if:   if `and`(is~(rhs~(sol)=~0)[]) then     WARNING("no solution found"):     return   else     unknown := `union`(indets~(rhs~(sol))[])[];     Unknown := simplify(solve(eval(T, sol) = Diff(op(0, f)(A), A\$n), unknown));     sol     := lhs~(sol) =~ eval(rhs~(sol), unknown=Unknown);     expr    := normal(eval(add(alpha[i]*op(0, f)(A+i*H), i in stencil), sol));   end if:   return expr end proc:
 > Describe(CDDF)
 # f = target function, # A = point where the derivatives are approximated, # H = # step, # n = order of the derivative, # stencil = list of points for the divided # differenceCDDF # CDDF( f, A, H, n, stencil )
 > # 2-point approximation of diff(f(x), x) | x=a CDDF(f(x), a, h, 1, [-1, 1]); convert(simplify(mtaylor(%, h=0, 4)), Diff);
 (1)
 > # 3-point approximation of diff(f(x), x\$2) | x=a CDDF(f(x), a, h, 2, [-1, 0, 1]); convert(simplify(mtaylor(%, h=0)), Diff);
 (2)
 > # 5-point pproximation of diff(f(x), x\$2) | x=a CDDF(f(x), a, h, 2, [\$-2..2]); convert(simplify(mtaylor(%, h=0, 8)), Diff);
 (3)
 > # 7-point approximation of diff(f(x), x\$2) | x=a CDDF(f(x), a, h, 2, [\$-3..3]); # simplify(taylor(%, h=0, 10)); convert(simplify(mtaylor(%, h=0, 10)), Diff);
 (4)
 > # 4-point staggered approximation of diff(f(x), x\$3) | x=a CDDF(f(x), a, h, 3, [seq(-3/2..3/2, 1)]); convert(simplify(mtaylor(%, h=0, 6)), Diff);
 (5)
 > # 6-point staggered approximation of diff(f(x), x\$3) | x=a CDDF(f(x), a, h, 3, [seq(-5/2..5/2, 1)]); # simplify(taylor(%, h=0, 8)); convert(simplify(mtaylor(%, h=0, 8)), Diff);
 (6)
 > # 5-point approximation of diff(f(x), x\$4) | x=a CDDF(f(x), a, h, 4, [\$-2..2]); convert(simplify(mtaylor(%, h=0, 8)), Diff);
 (7)
 > # 7-point approximation of diff(f(x), x\$4) | x=a CDDF(f(x), a, h, 4, [\$-3..3]); convert(simplify(mtaylor(%, h=0, 10)), Diff);
 (8)

A FEW 2D EXTENSIONS

 > # diff(f(x, y), [x, y]) approximation over a (2 by 2)-point stencil stencil := [-1, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 1, stencil): fx  := eval(% , f=(u -> f[u](y))): ix  := [indets(fx, function)[]]: # step 2: approximate diff(g(y), y) over stencil < stencil > where #         g represents any function in fx. fxy := add(map(u -> CDDF(u, b, k, 1, stencil)*coeff(fx, u), ix)): # step 3: rewrite fxy in a more convenient form [seq(u=op([0, 0], u)(op([0, 1], u), op(1, u)), u in indets(fxy, function))]: fxy := simplify( eval(fxy, %) ); convert(mtaylor(fxy, [h=0, k=0]), Diff)
 (9)
 > # Approximation of diff(f(x, y), [x, x, y, y] a (3 by 3)-point stencil stencil := [-1, 0, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 2, stencil): fx  := eval(% , f=(u -> f[u](y))): ix  := [indets(fx, function)[]]: # step 2: approximate diff(g(y), y) over stencil < stencil > where #         g represents any function in fx. fxy := add(map(u -> CDDF(u, b, k, 2, stencil)*coeff(fx, u), ix)): # step 3: rewrite fxy in a more convenient form [seq(u=op([0, 0], u)(op([0, 1], u), op(1, u)), u in indets(fxy, function))]: fxy := simplify( eval(fxy, %) ); convert(mtaylor(fxy, [h=0, k=0], 8), Diff)
 (10)
 > # Approximation of diff(f(x, y), [x, x, y] a (3 by 2)-point stencil stencil_x := [-1, 0, 1]: stencil_y := [-1, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 2, stencil_x): fx  := eval(% , f=(u -> f[u](y))): ix  := [indets(fx, function)[]]: # step 2: approximate diff(g(y), y) over stencil < stencil > where #         g represents any function in fx. fxy := add(map(u -> CDDF(u, b, k, 1, stencil_y)*coeff(fx, u), ix)): # step 3: rewrite fxy in a more convenient form [seq(u=op([0, 0], u)(op([0, 1], u), op(1, u)), u in indets(fxy, function))]: fxy := simplify( eval(fxy, %) ); convert(mtaylor(fxy, [h=0, k=0], 6), Diff)
 (11)
 > # Approximation of the laplacian of f(x, y) stencil := [-1, 0, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 2, stencil): fy  := CDDF(f(y), b, k, 2, stencil): fxy := simplify( eval(fx, f=(u -> f(u, b))) + eval(fy, f=(u -> f(a, u))) ); convert(mtaylor(fxy, [h=0, k=0], 6), Diff)
 (12)
 >