Featured Post

A project that I have been working on is adding some functionality for Cluster Analysis to Maple (a small part of a much bigger project to increase Maple’s toolkit for exploratory data mining and data analysis). The launch of the MapleCloud package manager gave me a way to share my code for the project as it evolves, providing others with some useful new tools and hopefully gathering feedback (and collaborators) along the way.

At this point, there aren’t a lot of commands in the ClusterAnalysis package, but I have already hit upon several interesting applications. For example, while working on a command for plotting clusters of points, one problem I encountered was how to draw the minimal volume enclosing ellipsoid around a group (or cluster) of points. After doing some research, I stumbled upon Khachiyan’s Algorithm, which related to solving linear programming problems with rational data. The math behind this is definitely interesting, but I’m not going to spend any time on it here. For further reading, you can explore the following:

Khachiyan’s Algorithm had previously been applied in some other languages, but to the best of my knowledge, did not have any Maple implementations. As such, the following code is an implementation of Khachiyan’s Algorithm in 2-D, which could be extended to N-dimensional space rather easily.

This routine accepts an Nx2 dataset and outputs either a plot of the minimum volume enclosing ellipsoid (MVEE) or a list of results as described in the details for the ‘output’ option below.

MVEE( X :: DataSet, optional arguments, additional arguments passed to the plotting command );

The optional arguments are as follows:

  • tolerance : realcons;  specifies the convergence criterion
  • maxiterations : posint; specifies the maximum number of iterations
  • output : {identical(data,plot),list(identical(data,plot))}; specifies the output. If output includes plot, then a plot of the enclosing ellipsoid is returned. If output includes data, then the return includes is a list containing the matrix A, which defines the ellipsoid, the center of the ellipse, and the eigenvalues and eigenvectors that can be used to find the semi-axis coordinates and the angle of rotation, alpha, for the ellipse.
  • filled : truefalse; specifies if the returned plot should be filled or not

Code:

#Minimum Volume Enclosing Ellipsoid
MVEE := proc(XY, 
              {tolerance::positive:= 1e-4}, #Convergence Criterion
              {maxiterations::posint := 100},
              {output::{identical(data,plot),list(identical(data,plot))} := data},
              {filled::truefalse := false} 
            )

    local alpha, evalues, evectors, i, l_error, ldata, ldataext, M, maxvalindex, n, ncols, nrows, p1, semiaxes, stepsize, U, U1, x, X, y;
    local A, center, l_output; #Output

    if hastype(output, 'list') then
        l_output := output;
    else
        l_output := [output];
    end if;

    kernelopts(opaquemodules=false):

    ldata := Statistics:-PreProcessData(XY, 2, 'copy');

    nrows, ncols := upperbound(ldata);
    ldataext := Matrix([ldata, Vector[column](nrows, ':-fill' = 1)], 'datatype = float');

    if ncols <> 2 then
        error "expected 2 columns of data, got %1", ncols;
    end if;

    l_error := 1;

    U := Vector[column](1..nrows, 'fill' = 1/nrows);

    ##Khachiyan Algorithm##
    for n to maxiterations while l_error >= tolerance do

        X := LinearAlgebra:-Transpose(ldataext) . LinearAlgebra:-DiagonalMatrix(U) . ldataext;
        M := LinearAlgebra:-Diagonal(ldataext . LinearAlgebra:-MatrixInverse(X) . LinearAlgebra:-Transpose(ldataext));
        maxvalindex := max[index](map['evalhf', 'inplace'](abs, M));
        stepsize := (M[maxvalindex] - ncols - 1)/((ncols + 1) * (M[maxvalindex] - 1));
        U1 := (1 - stepsize) * U;
        U1[maxvalindex] := U1[maxvalindex] + stepsize;
        l_error := LinearAlgebra:-Norm(LinearAlgebra:-DiagonalMatrix(U1 - U));
        U := U1;

    end do;

    A := (1/ncols) * LinearAlgebra:-MatrixInverse(LinearAlgebra:-Transpose(ldata) . LinearAlgebra:-DiagonalMatrix(U) . ldata - (LinearAlgebra:-Transpose(ldata) . U) . LinearAlgebra:-Transpose((LinearAlgebra:-Transpose(ldata) . U)));
    center := LinearAlgebra:-Transpose(ldata) . U;
    evalues, evectors := LinearAlgebra:-Eigenvectors(A);
    evectors := evectors(.., sort[index](1 /~ (sqrt~(Re~(evalues))), `>`, ':-output' = ':-permutation'));
    semiaxes := sort(1 /~ (sqrt~(Re~(evalues))), `>`);
    alpha := arctan(Re(evectors[2,1]) / Re(evectors[1,1]));

    if l_output = [':-data'] then
        return A, center, evectors, evalues;
    elif has( l_output, ':-plot' ) then
            x := t -> center[1] + semiaxes[1] * cos(t) * cos(alpha) - semiaxes[2] * sin(t) * sin(alpha);
            y := t -> center[2] + semiaxes[1] * cos(t) * sin(alpha) + semiaxes[2] * sin(t) * cos(alpha);
            if filled then
                p1 := plots:-display(subs(CURVES=POLYGONS, plot([x(t), y(t), t = 0..2*Pi], ':-transparency' = 0.95, _rest)));
            else
                p1 := plot([x(t), y(t), t = 0..2*Pi], _rest);
            end if;
        return p1, `if`( has(l_output, ':-data'), op([A, center, evectors, evalues]), NULL );
    end if;

end proc:

 

You can run this as follows:

M:=Matrix(10,2,rand(0..3)):

plots:-display([MVEE(M,output=plot,filled,transparency=.3),
                plots:-pointplot(M, symbol=solidcircle,symbolsize=15)],
size=[0.5,"golden"]);

 

 

As it stands, this is not an export from the “work in progress” ClusterAnalysis package – it’s actually just a local procedure used by the ClusterPlot command. However, it seemed like an interesting enough application that it deserved its own post (and potentially even some consideration for inclusion in some future more geometry-specific package). Here’s an example of how this routine is used from ClusterAnalysis:

with(ClusterAnalysis);

X := Import(FileTools:-JoinPath(["datasets/iris.csv"], base = datadir));

kmeans_results := KMeans(X[[`Sepal Length`, `Sepal Width`]],
    clusters = 3, epsilon = 1.*10^(-7), initializationmethod = Forgy);

ClusterPlot(kmeans_results, style = ellipse);

 

 

The source code for this is stored on GitHub, here:

https://github.com/dskoog/Maple-ClusterAnalysis/blob/master/src/MVEE.mm

Comments and suggestions are welcomed.

 

If you don’t have a copy of the ClusterAnalysis package, you can install it from the MapleCloud window, or by running:

PackageTools:-Install(5629844458045440);

 

Featured Post

So I have recently finished up a project that took different sounds found in nature, and through the Spectrogram command, plotted the frequency of each sound over time with some really cool results!

https://www.maplesoft.com/applications/view.aspx?SID=154346 

The contrast between sounds produced by the weather such as tornadoes, thunder, and hail versus something as innocuous as a buzzing bee, a chorus of crickets, or a purring cat really shows the variance in the different sounds we hear in our day to day life, while also creating some very interesting imagery.

My personal favourite was the cricket chorus, producing a very ordered image with some really cool spikes through many different frequencies as the crickets chirped, as shown here:

Using this plot, we can do some interesting things, like count the number of chirps in 8 seconds, which turns out to be 18-18.5. Why Is this important? Well, there’s a law known as Dolbear’s Law(shown here: https://en.wikipedia.org/wiki/Dolbear%27s_law)  which uses the number of chirps in a minute for Fahrenheit, or 8 seconds for Celsius to calculate the temperature. Celsius is very simple, and just requires adding 5 to the number of chirps in 8 seconds, which gets us a temperature of 23C.

Tc= 5 + N8

For Fahrenheit, it’s a bit more complicated, as we need the chirps in a minute. This is around 132 chirps in our case. Putting that into the formula:

TF= 50 +((N60 – 40)/4)

Which gets us 73F, or 22.7C, so you can see that it works pretty well! Pretty cool, huh?

 

There was also some really cool images that were produced, like the thunder plot:

Which I personally really like due to the contrasting black and yellow spike that occurs. Overall this was a very fun project to do, getting to tweak the different colours and scales of each spectrogram, creating a story out of a sound. Hope you all enjoy it!



Error in DEplot3D

Maple asked by Fata123 10 Today