Daniel Skoog


## Maple Assistants in MapleCloud...

Maple

As you know, the MapleCloud is a good way to share all sorts of interactive documents with others, either in private groups or publicly with everyone. We recently posted some new content that I thought people might be particularly interested in: a collection of Maple Assistants.

Up until now, Maple Assistants were only available from within Maple, but now you can take advantage of these powerful tools wherever you are, using your web browser.

Code Generation – Translate Maple code to C, Java, Python, R, and more

Scientific Constants – Explore over 20,000 values of physical constants and properties of chemical elements, including units and uncertainty values

Special Functions – Explore the properties of over 200 special functions, including the Hypergeometric, Bessel, Mathieu, Heun and Legendre families of functions.

Units Converter – Convert between over 500 units of measurement. (In addition to the standard stuff, you can find out how many fortnights old you are, or how long your commute is in furlongs!)

## ClusterAnalysis applications: Finding a ...

Maple 2017

A project that I have been working on is adding some functionality for Cluster Analysis to Maple (a small part of a much bigger project to increase Maple’s toolkit for exploratory data mining and data analysis). The launch of the MapleCloud package manager gave me a way to share my code for the project as it evolves, providing others with some useful new tools and hopefully gathering feedback (and collaborators) along the way.

At this point, there aren’t a lot of commands in the ClusterAnalysis package, but I have already hit upon several interesting applications. For example, while working on a command for plotting clusters of points, one problem I encountered was how to draw the minimum volume enclosing ellipsoid around a group (or cluster) of points. After doing some research, I stumbled upon Khachiyan’s Algorithm, which relates to solving linear programming problems with rational data. The math behind this is definitely interesting, but I’m not going to spend any time on it here. For further reading, you can explore the following:

Khachiyan’s Algorithm had previously been implemented in some other languages, but to the best of my knowledge, there was no Maple implementation. As such, the following code is an implementation of Khachiyan’s Algorithm in 2-D, which could be extended to N-dimensional space rather easily.

This routine accepts an Nx2 dataset and outputs either a plot of the minimum volume enclosing ellipsoid (MVEE) or a list of results as described in the details for the ‘output’ option below.

MVEE( X :: DataSet, optional arguments, additional arguments passed to the plotting command );

The optional arguments are as follows:

• tolerance : realcons;  specifies the convergence criterion
• maxiterations : posint; specifies the maximum number of iterations
• output : {identical(data,plot),list(identical(data,plot))}; specifies the output. If output includes plot, then a plot of the enclosing ellipsoid is returned. If output includes data, then the return also includes the matrix A, which defines the ellipsoid, the center of the ellipse, and the eigenvectors and eigenvalues, which can be used to find the semi-axis lengths and the angle of rotation, alpha, of the ellipse.
• filled : truefalse; specifies if the returned plot should be filled or not

Code:

#Minimum Volume Enclosing Ellipsoid
MVEE := proc(XY,
    {tolerance::positive:= 1e-4}, #Convergence Criterion
    {maxiterations::posint := 100},
    {output::{identical(data,plot),list(identical(data,plot))} := data},
    {filled::truefalse := false}
)

    local alpha, evalues, evectors, i, l_error, ldata, ldataext, M, maxvalindex, n, ncols, nrows, p1, semiaxes, stepsize, U, U1, x, X, y;
    local A, center, l_output; #Output

    #Normalize the output option to a list
    if hastype(output, 'list') then
        l_output := output;
    else
        l_output := [output];
    end if;

    #Allows the call to Statistics:-PreProcessData below
    kernelopts(opaquemodules=false):

    ldata := Statistics:-PreProcessData(XY, 2, 'copy');

    #Append a column of ones to the data
    nrows, ncols := upperbound(ldata);
    ldataext := Matrix([ldata, Vector[column](nrows, ':-fill' = 1)], 'datatype = float');

    if ncols <> 2 then
        error "expected 2 columns of data, got %1", ncols;
    end if;

    l_error := 1;

    #Start with equal weights on all points
    U := Vector[column](1..nrows, 'fill' = 1/nrows);

    ##Khachiyan Algorithm##
    for n to maxiterations while l_error >= tolerance do

        X := LinearAlgebra:-Transpose(ldataext) . LinearAlgebra:-DiagonalMatrix(U) . ldataext;
        M := LinearAlgebra:-Diagonal(ldataext . LinearAlgebra:-MatrixInverse(X) . LinearAlgebra:-Transpose(ldataext));
        maxvalindex := max[index](map['evalhf', 'inplace'](abs, M));
        stepsize := (M[maxvalindex] - ncols - 1)/((ncols + 1) * (M[maxvalindex] - 1));
        #Shift weight toward the point with the largest value in M
        U1 := (1 - stepsize) * U;
        U1[maxvalindex] := U1[maxvalindex] + stepsize;
        l_error := LinearAlgebra:-Norm(LinearAlgebra:-DiagonalMatrix(U1 - U));
        U := U1;

    end do;

    #Matrix, center, and axes of the ellipse
    A := (1/ncols) * LinearAlgebra:-MatrixInverse(LinearAlgebra:-Transpose(ldata) . LinearAlgebra:-DiagonalMatrix(U) . ldata - (LinearAlgebra:-Transpose(ldata) . U) . LinearAlgebra:-Transpose((LinearAlgebra:-Transpose(ldata) . U)));
    center := LinearAlgebra:-Transpose(ldata) . U;
    evalues, evectors := LinearAlgebra:-Eigenvectors(A);
    evectors := evectors(.., sort[index](1 /~ (sqrt~(Re~(evalues))), `>`, ':-output' = ':-permutation'));
    semiaxes := sort(1 /~ (sqrt~(Re~(evalues))), `>`);
    alpha := arctan(Re(evectors[2,1]) / Re(evectors[1,1]));

    if l_output = [':-data'] then
        return A, center, evectors, evalues;
    elif has( l_output, ':-plot' ) then
        #Parametric form of the rotated ellipse
        x := t -> center[1] + semiaxes[1] * cos(t) * cos(alpha) - semiaxes[2] * sin(t) * sin(alpha);
        y := t -> center[2] + semiaxes[1] * cos(t) * sin(alpha) + semiaxes[2] * sin(t) * cos(alpha);
        if filled then
            p1 := plots:-display(subs(CURVES=POLYGONS, plot([x(t), y(t), t = 0..2*Pi], ':-transparency' = 0.95, _rest)));
        else
            p1 := plot([x(t), y(t), t = 0..2*Pi], _rest);
        end if;
        return p1, `if`( has(l_output, ':-data'), op([A, center, evectors, evalues]), NULL );
    end if;

end proc:
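For readers who want the gist without the full theory, the iteration inside the loop can be summarized as below. This is read directly off the code rather than taken from a reference, so treat it as a sketch. The symbols are notation introduced here: $P$ is the $N \times d$ data matrix with rows $p_i$ (here $d = 2$, the value of `ncols`), $q_i = (p_i, 1)$ is a row of `ldataext`, $u$ is the weight vector `U`, $s$ is `stepsize`, and $e_j$ is the $j$-th unit vector.

```latex
\begin{aligned}
X &= \sum_{i=1}^{N} u_i \, q_i q_i^{T}, \qquad
M_i = q_i^{T} X^{-1} q_i, \qquad
j = \arg\max_i |M_i| \\
s &= \frac{M_j - d - 1}{(d + 1)(M_j - 1)}, \qquad
u \leftarrow (1 - s)\,u + s\,e_j \\
c &= P^{T} u, \qquad
A = \frac{1}{d}\left(P^{T} \operatorname{diag}(u)\, P - c\,c^{T}\right)^{-1}
\end{aligned}
```

The loop stops once the largest change in any single weight drops below `tolerance` (or after `maxiterations` steps). The resulting ellipse is the set of points $x$ with $(x - c)^{T} A (x - c) \le 1$, and its semi-axis lengths are $1/\sqrt{\lambda_i}$ for the eigenvalues $\lambda_i$ of $A$, which is what `semiaxes` holds.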

You can run this as follows:

M:=Matrix(10,2,rand(0..3)):

plots:-display([MVEE(M,output=plot,filled,transparency=.3),
plots:-pointplot(M, symbol=solidcircle,symbolsize=15)],
size=[0.5,"golden"]);
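As a quick sanity check (a sketch, assuming the MVEE procedure above has already been evaluated in the session), you can request output = data and confirm that every point satisfies the ellipse inequality. The names c, evecs, evals, and d are just illustrative, and the tighter tolerance and larger iteration cap are arbitrary choices:

```maple
# Assumes MVEE (defined above) has been evaluated in this session.
M := Matrix(10, 2, rand(0..3)):

# With output = data, the procedure returns A, the center, the eigenvectors, and the eigenvalues.
A, c, evecs, evals := MVEE(M, output = data, tolerance = 1e-6, maxiterations = 1000):

# Evaluate (p - c)^T . A . (p - c) for each data point p.
for i to 10 do
    d := Vector[column]([M[i, 1], M[i, 2]]) - c;
    printf("point %d: %.4f\n", i, LinearAlgebra:-Transpose(d) . A . d);
end do:
```

If the routine behaves as intended, the printed values should be close to 1 for points on the boundary of the ellipse and smaller for interior points. Values noticeably above 1 would suggest the iteration stopped before converging.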

As it stands, this is not an export from the “work in progress” ClusterAnalysis package – it’s actually just a local procedure used by the ClusterPlot command. However, it seemed like an interesting enough application that it deserved its own post (and potentially even some consideration for inclusion in some future more geometry-specific package). Here’s an example of how this routine is used from ClusterAnalysis:

with(ClusterAnalysis);

X := Import(FileTools:-JoinPath(["datasets/iris.csv"], base = datadir));

kmeans_results := KMeans(X[[`Sepal Length`, `Sepal Width`]],
clusters = 3, epsilon = 1.*10^(-7), initializationmethod = Forgy);

ClusterPlot(kmeans_results, style = ellipse);

The source code for this is stored on GitHub, here:

https://github.com/dskoog/Maple-ClusterAnalysis/blob/master/src/MVEE.mm

If you don’t have a copy of the ClusterAnalysis package, you can install it from the MapleCloud window, or by running:

PackageTools:-Install(5629844458045440);

## Finding better adjusted r-squared values...

Maple

A couple of weeks ago, I recorded a short video that discussed various applications for the Statistics:-Fit command. One of the more interesting examples examined how manually adjusting the number of parameters used for a regression model affected the resulting adjusted r-squared value.

I won’t go into detail about r-squared here, but to briefly summarize: In a linear regression model, r-squared measures the proportion of the variation in a model's dependent variable that is explained by the independent variables. Basically, r-squared gives a statistical measure of how well the regression line approximates the data. R-squared values usually range from 0 to 1, and the closer a value is to 1, the better the model is said to perform, since it accounts for a greater proportion of the variance (an r-squared value of 1 means a perfect fit of the data). When more variables are added, r-squared values typically increase. They can never decrease when a variable is added; and if the fit is not already perfect, then adding a variable that represents random data will increase the r-squared value with probability 1. The adjusted r-squared attempts to account for this phenomenon by adjusting the r-squared value based on the number of independent variables in the model.

The formula for the adjusted r-squared is:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$

Where:

n is the number of points in the data sample

k is the number of independent variables in the model excluding the constant

By taking the number of independent variables into consideration, the adjusted r-squared behaves differently from r-squared: adding more variables doesn’t necessarily produce better-fitting models. In many cases, adding variables leads to lower adjusted r-squared values. In particular, if you add a variable representing random data, the expected change in the adjusted r-squared is 0.
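To make the penalty concrete, here is a small sketch that computes the adjusted r-squared from the standard definition, 1 - (1 - R²)(n - 1)/(n - k - 1). AdjustedRSquared is a hypothetical helper written for this illustration, not a command from the Statistics package, and the input numbers are made up:

```maple
# Adjusted r-squared from r-squared R2, sample size n, and k independent
# variables (excluding the constant). Hypothetical helper, for illustration only.
AdjustedRSquared := proc(R2, n::posint, k::nonnegint)
    if n <= k + 1 then
        error "need n > k + 1, got n = %1 and k = %2", n, k;
    end if;
    return 1 - (1 - R2) * (n - 1) / (n - k - 1);
end proc:

# Made-up numbers: a 5-variable model versus a 2-variable model on 30 points.
AdjustedRSquared(0.90, 30, 5);   # approximately 0.8792
AdjustedRSquared(0.89, 30, 2);   # approximately 0.8819
```

In this made-up case, the smaller model has the lower r-squared but the higher adjusted r-squared, which is the kind of trade-off explored in the example that follows.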

As such, the adjusted r-squared has a slightly different interpretation than the r-squared. While r-squared is perceived to give an indication of the measure of fit for a chosen regression model, the adjusted r-squared is perceived more as a comparative tool that can be useful for picking variables and designing models that may require fewer predictors than other models. The science of “gaming” models is a broad topic, so I won’t go into any more detail here, but there’s lots of great information out there if you are looking to learn more (here’s a good place to start).

The following example adjusts a fitted model by adding or removing variables in order to find better adjusted r-squared values.

with(Statistics):

The Import command reads a datafile into a new DataFrame.

ExperimentalData := Import(FileTools:-JoinPath(["Excel", "ExperimentalData.xls"], base = datadir));

The dataset has seven variables: time and experimental readings for six different concentrations. The convert command converts the values in the DataFrame to a Matrix, and we keep only columns 2 to 7, which removes “time” from our variable set.

ExMat := convert( ExperimentalData, Matrix )[..,2..7];

We start by fitting a model that includes predicting variables for each of the columns of data. We mark “Concentration A” as our dependent variable.

Fit( C + C2*v + C3*w + C4*x + C5*y + C6*z, ExMat[..,2..6], ExMat[..,1], [v,w,x,y,z], summarize=embed ):

From the above, we can observe that both the r-squared and adjusted r-squared are reasonably high; however, only one of the coefficients, C3, has a significant p-value.

Note: Maple shows all p-values less than 0.05 in bold.

Let's try to fit the data again, this time keeping the two coefficients with the lowest p-values and the intercept.

Fit( C + C3*v + C5*w, ExMat[..,[3,5]], ExMat[..,1], [v,w], summarize=embed ):

From the above, we can see that the r-squared value does go down; however, the adjusted r-squared goes up! Let's fit the model one last time to see if removing C5 increases or decreases the adjusted r-squared.

Fit( C + C3*v, ExMat[..,3], ExMat[..,1], [v], summarize=embed ):

We can see that the final adjusted r-squared value is lower than the previous two, so we are probably better off keeping the additional C5 term.

You can see this example as well as a couple of other examples of using the Fit command in the following video:

## Geometry of the Canada 150 logo maple le...

Maple

When we first started trying to use Maple to create a maple leaf like the one in the Canada 150 logo, we couldn’t find any references online to the exact geometry, so we went back to basics. With our trusty ruler and protractor, we mapped out the geometry of the maple leaf logo by hand.

Our first observation was that the maple leaf could be viewed as being made up of 9 kites. You can read more about the meaning of these shapes on the Canada 150 site (where they refer to the shapes as diamonds).

We also observed that the individual kites had slightly different scales from one another. The largest kites were numbers 3, 5 and 7; we represented their length as 1 unit of length. Also, each of the kites seemed to have its base at the origin, but was rotated about the origin by a certain angle from the y-axis.

As such, we found the kites to have the following scales and rotations from the vertical axis:

Kites:

1, 9: 0.81 at ±π/2

2, 8: 0.77 at ±2π/5

3, 5, 7: 1 at ±π/4, 0

4, 6: 0.93 at ±π/8

This can be visualized as follows:

To draw this in Maple we put together a simple procedure to draw each of the kites:

# Make a kite shape centred at the origin.
opts := thickness=4, color="#DC2828":
MakeKite := proc({scale := 1, rotation := 0})
    local t, p, pts, x;

    t := 0.267*scale;
    pts := [[0, 0], [t, t], [0, scale], [-t, t], [0, 0]]:
    p := plot(pts, opts);
    if rotation<>0.0 then
        p := plottools:-rotate(p, rotation);
    end if;
    return p;
end proc:

The main idea of this procedure is that we draw a kite using a standard list of points, which are scaled and rotated. Then to generate the sequence of plots:

shapes := MakeKite(rotation=-Pi/4),
    MakeKite(scale=0.77, rotation=-2*Pi/5),
    MakeKite(scale=0.81, rotation=-Pi/2),
    MakeKite(scale=0.93, rotation=-Pi/8),
    MakeKite(),
    MakeKite(scale=0.93, rotation=Pi/8),
    MakeKite(scale=0.81, rotation=Pi/2),
    MakeKite(scale=0.77, rotation=2*Pi/5),
    MakeKite(rotation=Pi/4),
    plot([[0,-0.5], [0,0]], opts): #Add in a section for the maple leaf stem
plots:-display(shapes, scaling=constrained, view=[-1..1, -0.75..1.25], axes=box, size=[800,800]);
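The same figure can also be built from a list of [scale, rotation] pairs, which keeps the measured values in one place. This is only a sketch of an alternative layout, assuming MakeKite and opts have been evaluated as above; kiteSpecs, kites, and stem are illustrative names:

```maple
# Assumes MakeKite and opts (defined above) have been evaluated.
# Each entry is [scale, rotation], in the same order as the calls above.
kiteSpecs := [[1, -Pi/4], [0.77, -2*Pi/5], [0.81, -Pi/2], [0.93, -Pi/8], [1, 0],
              [0.93, Pi/8], [0.81, Pi/2], [0.77, 2*Pi/5], [1, Pi/4]]:

# Build one kite plot per entry, then add the stem.
kites := seq(MakeKite(scale = s[1], rotation = s[2]), s in kiteSpecs):
stem := plot([[0, -0.5], [0, 0]], opts):

plots:-display(kites, stem, scaling = constrained, view = [-1..1, -0.75..1.25], axes = box, size = [800, 800]);
```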

This looked pretty similar to the original logo; however, kites 2, 4, 6, and 8 all needed to be moved behind the other kites. This proved somewhat tricky, so we simply turned on the point probe in Maple and drew in connecting lines for the visible parts of these kites.

shapes := MakeKite(rotation=-Pi/4),
    plot([[-.55,.095],[-.733,.236],[-.49,.245]],opts),
    MakeKite(scale=0.81, rotation=-Pi/2),
    plot([[-.342,.536],[-.355,.859],[-.138,.622]],opts),
    MakeKite(),
    plot([[.342,.536],[.355,.859],[.138,.622]],opts),
    MakeKite(scale=0.81, rotation=Pi/2),
    plot([[.55,.095],[.733,.236],[.49,.245]],opts),
    MakeKite(rotation=Pi/4),
    plot([[0,-0.5], [0,0]], opts):
plots:-display(shapes, scaling=constrained, view=[-1..1, -0.75..1.25], axes=box, size=[800,800]);
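To see numerically what the scale and rotation do to a kite, here is a minimal sketch that computes the rotated vertices directly instead of rotating a plot. KiteVertices is a hypothetical helper, and it assumes the usual convention that a positive angle means a counterclockwise rotation about the origin:

```maple
# Vertices of a kite with the given scale, rotated counterclockwise by theta
# about the origin. Hypothetical helper, for illustration only.
KiteVertices := proc(scale, theta)
    local t, pts;
    t := 0.267 * scale;
    pts := [[0, 0], [t, t], [0, scale], [-t, t], [0, 0]];
    # Apply the 2-D rotation (x, y) -> (x cos(theta) - y sin(theta), x sin(theta) + y cos(theta))
    return map(p -> evalf([p[1]*cos(theta) - p[2]*sin(theta),
                           p[1]*sin(theta) + p[2]*cos(theta)]), pts);
end proc:

KiteVertices(0.81, Pi/2);
```

For a scale of 0.81 and a rotation of Pi/2, the tip of the kite lands at (-0.81, 0) and the two side vertices at roughly (-0.216, 0.216) and (-0.216, -0.216), which is the horizontal kite on the left-hand side of the leaf.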

## Fun with the Maple Leaf – Celebrate Cana...

Maple 2017

This Saturday is Canada’s 150th birthday. As you can imagine, the country has been paying a lot more attention to this year’s anniversary than our usual low-key approach, and as a Canadian company, we at Maplesoft decided to join in the fun.

And what better way for Maplesoft to celebrate Canada’s birthday than to create a maple leaf in Maple!

So here is a maple leaf inspired by the Canada 150 logo, which was created by Ariana Cuvin, a student at the University of Waterloo and former co-op here at Maplesoft:

Here’s the code to reproduce this plot (more details can be found in this follow-up post):

p:=thickness=5,color="#DC2828":
plots:-display(
    plot([[-.216,-.216],[0,0],[-.216,.216],[-.81,0],[-.216,-.216]],p),
    plot([[-.55,.095],[-.733,.236],[-.49,.245]],p),
    plot([[-.376,0],[0,0],[0,.376],[-.705,.705],[-.376,0]],p),
    plot([[-.342,.536],[-.355,.859],[-.138,.622]],p),
    plot([[-.267,.267],[0,0],[.267,.267],[0,1],[-.267,.267]],p),
    plot([[.342,.536],[.355,.859],[.138,.622]],p),
    plot([[0.,.376],[0,0],[.376,0],[.705,.705],[0.,.376]],p),
    plot([[.55,.095],[.733,.236],[.49,.245]],p),
    plot([[.216,.216],[0,0],[.216,-.216],[.81,0],[.216,.216]],p),
    plot([[0,-.5],[0,0]],p),
    scaling=constrained,view=[-1..1,-.75..1.25],axes=box);

Know other ways to plot a maple leaf in Maple?  If so, please share them below - we’d love to see them!
