Markiyan Hirnyk

Markiyan Hirnyk
9 years, 30 days


These are Posts that have been published by Markiyan Hirnyk

I'd like to pay attention to the article of David Austin "How to Grow and Prune a Classification Tree"

Here is its introduction:

"

It's easy to collect data these days; making sense of it is more work. This article explains a construction in machine learning and data mining called a classification tree. Let's consider an example.

In the late 1970's, researchers at the University of California, San Diego Medical Center performed a study in which they monitored 215 patients following a heart attack. For each patient, 19 variables, such as age and blood pressure, were recorded. Patients were then grouped into two classes, depending on whether or not they survived more than 30 days following the heart attack.

Assuming the patients studied were representative of the more general population of heart attack patients, the researchers aimed to distill all this data into a simple test to identify new patients at risk of dying within 30 days of a heart attack.

By applying the algorithm described here, Breiman, Freidman, Olshen, and Stone, created a test consisting of only three questions---what is the patient's minimum systolic blood pressure within 24 hours of being admitted to the hospital, what is the patient's age, does the patient exhibit sinus tachycardia---to identify patients at risk. In spite of its simplicity, this test proved to be more accurate than any other known test. In addition, the importance of these three questions indicate that, among all 19 variables, these three factors play an important role in determining a patient's chance of surviving.

Besides medicine, these ideas are applicable to a wide range of problems, such as identifying which loan applicants are likely to default and which voters are likely to vote for a particular political party.

In what follows, we will describe the work of Breiman and his colleagues as set out in their seminal book Classification and Regression Trees. Theirs is a very rich story, and we will concentrate on only the essential ideas"

It would be interesting to compare this approach with discriminant analysis. Hope somebody of  the Maple developers will give a concrete example on this theme with Maple.

What is Groebner? That was asked in different forms several times in MaplePrimes and MathStackExchange (for example, see http://math.stackexchange.com/questions/3550/using-gr?bner-bases-for-solving-polynomial-equations ). In view of this I think the presented post on Groebner basis will be useful. This post consists of two parts: its mathematical background and examples of solutions of polynomial systems by hand and with Maple.

Let us start. Up to Wiki http://en.wikipedia.org/wiki/Gr%C3%B6bner_basis ,Groebner basis computation can be seen as a multivariate, non-linear generalization of both Euclid's algorithm for computing polynomial greatest common divisors, and Gaussian elimination for linear systems. This is implemented in Maple trough the Groebner package.
The simplest introduction to the topic I know is a well-written book of Ivan Arzhantsev (https://zbmath.org/?q=an:05864974) which includes the proofs of all the claimed theorems and the solutions of all the exercises. Here is its digest groebner.pdf done by me (The reader is assumed to be familiar with the ideal notion and ring notion (one may refresh her/his knowledge, looking in http://en.wikipedia.org/wiki/Ideal_%28ring_theory%29)). It should be noted that there is no easy reading about this serious matter.
Referring to the digest as appropriate, we solve the system
S:={a*b = c^2+c, a^2 =a+ b*c, a*c = b^2+b} by hand and with the Groebner package.
For the order a > b > c we construct its ideal
J(S):=<f1 = a*b-c^2-c,f2 = a^2-a-b*c, f3 = a*c-b^2-b>.
The link between f1 and f2 gives
f1*a-f2*b = (-c^2-c)*a + (a + b*c)*b = a*b -a*c + b^2*c -
a*c^2 =f4.
The reduction with f1 produces
f4 ->-a*c^2- a*c + b^2*c + c^2 +c =: f4.
Now the reduction with f3 produces
f4 -> -b^2- b - b*c +c^2 + c =:f4.
The link between f2 and f3 gives:
f2*c - f3*a = a*b +a*b^2 -a*c -b*c^2 = f5.
The reduction with f1 produces
f5 -> -a*c + c*b +c^2 +c =:f5.
The reduction with f3 produces
f5 -> -b^2 -b + c*b +c^2 +c =:f5.
The reduction with f4 produces
f5 -> 2b*c =: f5.
The link between f1 and f3
f1*c - f3*b = b^3 + b^2 -c^3 -c^2=:f6.
The reduction with f4 produces
f6 -> 2b*c + 2b*c^2 -2c^3 -2c^2=:f6.
At last, we reduce f6 by f5, obtaining f6:= -2c^3 -2c^2.
We see the minimal reduced Groebner basis of S consists of
a^2 -a -b*c, -b^2 -b- b*c +c^2 +c, -2c^3 - 2c^2.
Now we find the solution set of the system under consideration. The equation -2c^3 - 2c^2 = 0 implies
c=0, c=0, c=-1. The the equation -b^2 - b - b*c +c^2 + c = 0 gives
b = 0 , b = -1, b = 0, b = -1, b = 0, b = 0 respectively.
At last, knowing b and c, we find a from a^2 -a -b*c = 0.
Hence,
[{a = 0, b = 0, c = 0}, {a = 1, b = 0, c = 0}, {a = 0, b = -1, c = 0}], [{a = 0, b = 0, c = 0}, {a = 1, b = 0, c = 0}, {a = 0, b = -1, c = 0}], [{a = 0, b = 0, c = -1}].
The solution of the system under consideration by the Groebner package is somewhat different because Maple does not find the minimal reduced Groebner basis directly.

 

with(Groebner):

[b*c, a*c-c^2-c, b^2-c^2+b-c, a*b-c^2-c, a^2-a, c^3+c^2]

(1)

GB2 := remove(has, GB1, a);

[b*c, b^2-c^2+b-c, c^3+c^2]

(2)

GB3 := Basis(GB2, lexdeg([b, c]))

[b*c, b^2-c^2+b-c, c^3+c^2]

(3)

op(remove(has, GB3, {b}))

c^3+c^2

(4)

solc := solve(c^3+c^2);

-1, 0, 0

(5)

solb := [seq(op(map(`union`, [solve(eval(GB3, c = i), {b})], {c = i})), i = solc)]

[{b = 0, c = -1}, {b = -1, c = 0}, {b = 0, c = 0}, {b = -1, c = 0}, {b = 0, c = 0}]

(6)

sol := seq(op(map(`union`, [solve(eval(GB1, i))], i)), i = solb)

{a = 0, b = 0, c = -1}, {a = 0, b = -1, c = 0}, {a = 0, b = 0, c = 0}, {a = 1, b = 0, c = 0}, {a = 0, b = -1, c = 0}, {a = 0, b = 0, c = 0}, {a = 1, b = 0, c = 0}

(7)

NULL

S2 := {a*c-b^2-b, a*b-c^2-c, a^2-b*c+a}:

GB1 := Basis(S2, lexdeg([a, b, c]))

[a*c+b*c+c^2+c, b^2+b*c+c^2+b+c, a*b-c^2-c, a^2-b*c+a]

(8)

GB2 := remove(has, GB1, a)

[b^2+b*c+c^2+b+c]

(9)

sol1 := solve(GB2, b)

{b = -(1/2)*c-1/2+(1/2)*(-3*c^2-2*c+1)^(1/2)}, {b = -(1/2)*c-1/2-(1/2)*(-3*c^2-2*c+1)^(1/2)}

(10)

map(proc (c) options operator, arrow; `union`(c, sol1[1]) end proc, map(proc (C) options operator, arrow; solve(C, {a}) end proc, eval(S2, sol1[1])))

{{a = 2*c*(c+1)/(-c-1+(-3*c^2-2*c+1)^(1/2)), b = -(1/2)*c-1/2+(1/2)*(-3*c^2-2*c+1)^(1/2)}, {a = -1/2-(1/2)*(1+2*c*(-3*c^2-2*c+1)^(1/2)-2*c^2-2*c)^(1/2), b = -(1/2)*c-1/2+(1/2)*(-3*c^2-2*c+1)^(1/2)}, {a = -1/2+(1/2)*(1+2*c*(-3*c^2-2*c+1)^(1/2)-2*c^2-2*c)^(1/2), b = -(1/2)*c-1/2+(1/2)*(-3*c^2-2*c+1)^(1/2)}, {a = -(1/2)*c-1/2-(1/2)*(-3*c^2-2*c+1)^(1/2), b = -(1/2)*c-1/2+(1/2)*(-3*c^2-2*c+1)^(1/2)}}

(11)

``

 

Download groebner.mw

I'd like to pay attention to the recent article "The Misfortunes of a Trio of Mathematicians Using Computer Algebra Systems. Can We Trust in Them?"

In particular, the authors consider the integral

int(abs(exp(2*Pi*Ix)+exp(2*Pi*I*y)),[x=0..1,y=0..1]),

stating "Both Mathematica and Maple return zero as the answer to this calculation. Yet this cannot be correct, because the integrand is clearly positive and nonzero in the indicated region". Unfortunately, they give only the Mathematica command to this end.

Of course, the integral under consideration is complicated so the the simple-minded trials

int(evalc(abs(exp((2*Pi*I)*x)+exp((2*Pi*I)*y))), [x = 0 .. 1, y = 0 .. 1]);

and

VectorCalculus:-int(evalc(abs(exp((2*Pi*I)*x)+exp((2*Pi*I)*y))), [x,y]=Rectangle( 0 .. 1, 0 .. 1));

fail. However,this can be found with Maple (I think with Mathematica too.) in such a way.

 

A := evalc(abs(exp((2*Pi*I)*x)+exp((2*Pi*I)*y)))

((cos(2*Pi*x)+cos(2*Pi*y))^2+(sin(2*Pi*x)+sin(2*Pi*y))^2)^(1/2)

(1)

NULL

B := simplify(A, trig)

(2*cos(2*Pi*x)*cos(2*Pi*y)+2+2*sin(2*Pi*x)*sin(2*Pi*y))^(1/2)

(2)

op(B)[1]

2*cos(2*Pi*x)*cos(2*Pi*y)+2+2*sin(2*Pi*x)*sin(2*Pi*y)

(3)

combine(op(B)[1], x)

2*cos(2*Pi*x-2*Pi*y)+2

(4)

C := eval(B, op(B)[1] = combine(op(B)[1], x))

(2*cos(2*Pi*x-2*Pi*y)+2)^(1/2)

(5)

int(C, [x = 0 .. 1, y = 0 .. 1])

4/Pi

(6)

``

 

Download int.mw

 

 

I would like to pay attention to a series of applications by Samir Khan
http://www.maplesoft.com/applications/view.aspx?SID=153600
http://www.maplesoft.com/applications/view.aspx?SID=153599
http://www.maplesoft.com/applications/view.aspx?SID=153596
http://www.maplesoft.com/applications/view.aspx?SID=153598
My congratulations to the author on his work well done. New capacities of Global Optimization Toolbox are spectacular. For example, in the first application  an optimization
problem in 101 variables under 5050 nonlinear  constraints
(other than 202 bounds) is solved.
I think it requires a very powerful comp and much time.
I tried that  problem for n=20 with the good old DirectSearch
on my comp (4 GB RAM, Pentium Dual-Core CPU E5700@3GHz) by

soln2 := DirectSearch:-GlobalSearch(rc, {cons1, cons2, rc >= 0,
seq(`and`(vars[i] >= -70, vars[i] <= 70), i = 1 .. 2*n), rc <= 70},
variables = vars, method = quadratic, number = 140, solutions = 1,
evaluationlimit = 20000)

and obtained not so bad rc=69.9609360106765 (whereas www.packomania.com gives rc=58.4005674790451137175957) in about one hour.

Packing_by_DS.mw
For n=50 the memory of my comp cannot allocate calculations or the obtained result by the Search command is far away from the one in packomania.

 

1 2 3 4 5 6 7 Page 1 of 8