I'd like to pay attention to the article of David Austin "How to Grow and Prune a Classification Tree"

Here is its introduction:

"

It's easy to collect data these days; making sense of it is more work. This article explains a construction in machine learning and data mining called a classification tree. Let's consider an example.

In the late 1970's, researchers at the University of California, San Diego Medical Center performed a study in which they monitored 215 patients following a heart attack. For each patient, 19 variables, such as age and blood pressure, were recorded. Patients were then grouped into two classes, depending on whether or not they survived more than 30 days following the heart attack.

Assuming the patients studied were representative of the more general population of heart attack patients, the researchers aimed to distill all this data into a simple test to identify new patients at risk of dying within 30 days of a heart attack.

By applying the algorithm described here, Breiman, Freidman, Olshen, and Stone, created a test consisting of only three questions---what is the patient's minimum systolic blood pressure within 24 hours of being admitted to the hospital, what is the patient's age, does the patient exhibit sinus tachycardia---to identify patients at risk. In spite of its simplicity, this test proved to be more accurate than any other known test. In addition, the importance of these three questions indicate that, among all 19 variables, these three factors play an important role in determining a patient's chance of surviving.

Besides medicine, these ideas are applicable to a wide range of problems, such as identifying which loan applicants are likely to default and which voters are likely to vote for a particular political party.

In what follows, we will describe the work of Breiman and his colleagues as set out in their seminal book Classification and Regression Trees. Theirs is a very rich story, and we will concentrate on only the essential ideas"

It would be interesting to compare this approach with discriminant analysis. Hope somebody of  the Maple developers will give a concrete example on this theme with Maple.


Please Wait...