acer


MaplePrimes Activity


These are replies submitted by acer

The term least squares refers to a method for solving a variety of problems. Roughly, it means minimizing a sum of squares (usually of differences).

In this case, you indicated that you wanted to use it as a method for finding a line of best fit. The two routines that I showed can both serve this purpose of fitting a line to data. The results they returned are both equations of a line, i.e. p*t+q, which is the form you requested. (I couldn't make it p*x+q because you had already assigned to the name x.)

But there is also, for example, least squares as a means of solving an overdetermined system of linear equations. Indeed, this can be how the above-mentioned fitting computation is done behind the scenes. If you really wanted to, you could figure out how to use your data to construct such an overdetermined linear system, then call Optimization:-LSSolve on it, and then re-interpret the result to get the equation of the line. I guessed that you'd prefer having one of those two fitting routines do all that bookkeeping for you.
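If you're curious, here is a rough sketch of that "by hand" route, with made-up data standing in for yours (tdata and ydata are just placeholders):

> tdata := [1, 2, 3, 4, 5]:                 # made-up data
> ydata := [2.1, 3.9, 6.2, 7.8, 10.1]:
> # Each data point contributes one equation p*t+q = y; together they form an
> # overdetermined linear system, expressed here as a list of residuals.
> sol := Optimization:-LSSolve([seq(p*tdata[i] + q - ydata[i], i = 1 .. 5)]):
> # Re-interpret the solution point as the equation of a line.
> eval(p*t + q, sol[2]);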

acer


As already mentioned above, the lowess option of ScatterPlot does a form of weighted least squares. And a Vector of weights may be provided to NonlinearFit. It may be useful to think about the differences between these two approaches. An interesting issue is the possible availability of the fitted function and all its computed parameter values.

The way to supply weights to NonlinearFit is clear from its help-page, which describes the weights option for this. I don't quite understand how those weights are then used, since weights don't seem to be an option for Optimization:-LSSolve. I understand that in weighted least squares problems with data errors it is usual for such weights to be derived from the variance of the data. But I don't know exactly how the Maple solver works here. What I suspect is that the xerrors and yerrors optional parameters of ScatterPlot may be used to compute weights that are then passed on to NonlinearFit. I haven't confirmed this.
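For what it's worth, the mechanics of supplying weights are simple enough. Here's a small sketch with made-up data, using the common weight = 1/variance convention (whether ScatterPlot itself forms weights this way is exactly the part I haven't confirmed):

> X := Vector([1.0, 2.0, 3.0, 4.0, 5.0]):        # made-up data
> Y := Vector([1.2, 4.1, 8.8, 16.5, 24.7]):
> yerr := Vector([0.1, 0.3, 0.2, 0.4, 0.2]):     # made-up y uncertainties
> W := Vector(5, i -> 1/yerr[i]^2):              # weight = 1/variance (an assumption)
> Statistics:-NonlinearFit(a*x^b, X, Y, x, weights = W,
>                          initialvalues = [a = 1, b = 2]);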

It's not clear from the ScatterPlot help-page exactly how the weights for lowess smoothing are chosen. Its three options related to lowess smoothing are degree, robust, and lowess. It's also not clear from that help-page in what way (if any) the xerrors or yerrors options tie into weighting. I suspect that they don't relate at all. And then there is the question of whether a formulaic fitting result is wanted, since the lowess method will not make that available. The lowess method uses a series of weighted least squares fits at different points, where the weights modify the influence of nearby points (rather than correct for measurement uncertainty directly). I now believe that this is not what the original poster wants.

So here's a question. When xerrors and yerrors data are passed to ScatterPlot along with the fit option, is the estimated variance of that extra data used to produce the weights that are then passed along to NonlinearFit? Tracing the Maple computation in the debugger might show whether this is true. If it is, then it may be possible to extract the method for doing it "by hand". In such a way, it may be possible to extract the parameter values that result from the nonlinear fit.
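A rough sketch of that check, where X, Y, EX, and EY stand for whatever data and error Vectors are at hand:

> stopat(Statistics:-NonlinearFit):     # break as soon as ScatterPlot calls the fitter
> Statistics:-ScatterPlot(X, Y, xerrors = EX, yerrors = EY, fit = [a*x + b, x]);
> # Once inside the debugger, examining the passed arguments should show
> # whether a weights option was constructed from EX and EY.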

I know that, when calling ScatterPlot with the fit option, Statistics:-NonlinearFit is called, and that Optimization:-LSSolve is also called. It remains to figure out exactly how xerrors and yerrors are used, and whether they modify the above to produce weights for NonlinearFit.

acer


I see what you are after, now. As far as I know the x- and y-errors are not used in the fitting calculation, even when using the lowess (weighted least squares) smoothing. But it seems (now, to me) that you are after a statistical (or stochastic) model, and not the sort of deterministic formulaic model that NonlinearFit gives.

The sort of regression analysis of time series data that you describe (and which was hinted at in the image URL you posted) isn't implemented directly in Maple as far as I know. If you have access to a numeric library like NAG then you might be able to get what you are after using a GARCH process or similar from their g13 routines.

Do you have a URL for that Origin software? I am curious about what they might document, for any routine of theirs which does what you describe.

acer


No. evalf(3/4) will give as many zeros as makes sense at the current Digits setting.

I suspect that your fundamental difficulty lies in thinking that 0.75 is somehow the best (exact, judging by your followup) floating-point representation of the exact rational 3/4. What I tried to explain earlier is that 0.75 is merely one of many possible representations of an approximation to an exact value. It is not, in itself, exact.

What I tried to argue was that in some sense the number of trailing zeros is an indicator of how accurately the system knows the floating-point value. I'm not actually saying that this is why Maple behaves this way. (It isn't, really. That's why the explanation breaks down for 3/4. as your first example. To get such careful accuracy and error handling one would have to go to a special package such as the two I mentioned above.) But this behaviour for conversion (approximation) of exact rationals via evalf can be somewhat useful, because it has a somewhat natural interpretation in terms of accuracy.

The idea is that when you write 0.75000 you are claiming something about the accuracy of the approximation -- namely that it is accurate to within 0.000005 (or half an ulp). Similarly, writing 0.7500000000 makes an even stronger claim about the accuracy. So, if you start off with the exact value 3/4, how many zeros should its floating-point approximation get? There's not much sense in giving it more zeros than is justified by the current working precision, and so Maple gives a number of trailing zeros that reflects the current value of Digits (depending on how many nonzero leading digits precede them, of course).
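A quick illustration:

> Digits := 10: evalf(3/4);
                                 0.7500000000

> Digits := 20: evalf(3/4);
                           0.75000000000000000000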

acer


The comments so far are very welcome.

I'll add one or two myself.

I was originally thinking just of a quick data sheet that could assist people who are acquiring systems for running Maple (some individual machines, some for Maple computer labs, etc). Comments by Roman and Jacques bring home the point that a good performance suite would allow tracking of Maple's performance across releases. That could be useful information.

A few more subtleties of a benchmark suite: Some parts could have tasks split by restart (or, maybe better, by wholly fresh session), to minimize interference amongst tasks as memory allocation grew and memory management costs took effect.
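As a sketch of what one such isolated task might look like (the routine and the sizes here are just placeholders):

> restart:                        # wholly fresh kernel state for this task
> st := time():  ba := kernelopts(bytesalloc):
> M := LinearAlgebra:-RandomMatrix(1500, 1500, outputoptions = [datatype = float[8]]):
> v := LinearAlgebra:-RandomVector(1500, outputoptions = [datatype = float[8]]):
> LinearAlgebra:-LinearSolve(M, v):
> 'cpu_seconds' = time() - st, 'bytes_allocated' = kernelopts(bytesalloc) - ba;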

But some other parts might deliberately involve lots of tasks together, because that might get closer to typical usage.

There is also the question of Maple performance over very long continuous durations, possibly with large memory allocation. There's an active thread in comp.soft-sys.math.maple related to this.

Speaking of memory, there is also the question of memory fragmentation. Maple does not seem to do best when contiguous-memory hardware rtables are allocated and then unreferenced (i.e. when they become collectible as garbage). The collected memory is not always made available again in large free blocks, in current Maple, due to fragmentation of the freed space. I have heard proposals that garbage collection in Maple might be altered so as to also move memory chunks that are still in use. Such memory packing might release larger free contiguous blocks. The final effect (if not the means) of such memory consolidation would be somewhat similar to that of Matlab's `pack` command.

The more I think about it, the more I see that the two purposes would be better served by very different and completely distinct sources: one simple set of codes to show the relative performance of Maple across OS/hardware, and another more sophisticated suite for long-term measurement of Maple as it develops.

acer

Hi Bill,

Estimated relative performance of Maple across different operating systems and architectures is one of the ideas behind this post about a possible Maple benchmark. The question that I mostly had in mind when posting that was: which platform does Maple perform best upon?

Others noted that there could be other good benefits too. It might illustrate how different releases of Maple performed, on the same configuration. That could lead to insight about what's improved, what's deteriorated, and where subsequent improvement efforts for Maple itself could be well spent.

So maybe it would help you a bit, if we could summarize some of the differences in Maple on various platforms.

trunc(evalhf(Digits)) is 14 on MS-Windows, and 15 elsewhere. That's the cut-off value of the Digits Maple environment variable, above which quite a bit of modern Maple will use software floats. Below that threshold those parts of Maple (LinearAlgebra, Optimization, Statistics, evalf/Int) can use hardware double-precision floats and compute much faster (and without producing so much software garbage, which must be managed). It's also the working precision at which Maple's faster floating-point `evalhf` interpreter operates. So, on MS-Windows, Maple's cut-off for these things is 1 decimal digit of precision less. This cut-off value doesn't really affect exact symbolic computations, though.
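A quick way to see the relevant values on any particular installation:

> kernelopts(platform), trunc(evalhf(Digits));   # e.g. "unix", 15  (14 on MS-Windows)
> # Hardware double-precision (and the evalhf interpreter) is used only while
> # Digits <= trunc(evalhf(Digits)), and provided UseHardwareFloats is not false.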

I have heard reports, but never seen hard figures, that MS-Windows is by far the leading platform for sales of Maple. This makes sense to me, especially with respect to my subjective experiences of slightly superior "look & feel" of Maple on Windows.

Some high-difficulty (non-officially supported) tweaking of Maple is easier on Linux. See here, and here.

This is an appropriate moment to mention that Maple's Classic graphical user interface is not available on OSX for Maple 10 & 11. It is not officially supported on 64bit Linux either, but here is a post that shows how it can be done.

You might also be interested in this post, which briefly discusses some performance differences between 32bit and 64bit Maple running on the same Linux machine and OS. That also arose briefly in this thread, and is something else into which a good Maple benchmark suite might provide insight.

I've noticed that running multiple invocations of Maple (all of Maple, separately and concurrently, not just multiple worksheets or Documents) is handled very much better by Linux than by Windows. Also, I have seen Windows and OSX machines suffer more when hit hard by highly resource-intensive Maple calculations. For example, when running on a network, the Windows and OSX machines seem much more likely to lose remote drive mounts and network services. Those are operating system distinctions, and not aspects of Maple implementation. They may, or may not, matter to you.

I'll state my subjective preference: a 64bit Linux distribution that is supported by Maplesoft and from which one can also install optional 32bit runtime OS components. For Maple 11, the 64bit SuSE 10 distribution might be OK, though I have not used it. On the hardware side, I'd go for a machine that could run such an OS, either a multi-core Athlon64 (X2) or an Intel Core2 Duo.

acer

There is actually such a software-float backup to STATS_MaplePoissonRandomSample_HW. But it may not work very quickly.

> MPFloat("libstatsmp.so");
                                     true

> a2 := define_external("STATS_MaplePoissonRandomSample_MP", MAPLE,
                        LIB = "libstatsmp.so");

  a2 := proc()
            option call_external,
                   define_external("STATS_MaplePoissonRandomSample_MP",
                                   MAPLE, LIB = "libstatsmp.so");
            call_external(0, 182955163792, 0, args)
        end proc

In another reply below, I showed how some answer might be obtained even when the parameter was greater than this cut-off point, by setting UseHardwareFloats=false or by having Digits be greater than trunc(evalhf(Digits)). But it works slowly then, and only seven digits of information are ever returned (which isn't so useful, I think).
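A sketch of that workaround (the lambda value here is just a placeholder, assumed to lie beyond the hardware cut-off):

> X := Statistics:-RandomVariable(Poisson(10.0^18)):   # placeholder lambda
> UseHardwareFloats := false:    # or: Digits := trunc(evalhf(Digits)) + 1:
> Statistics:-Sample(X, 3);      # slow, and only about seven digits of information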

 

acer


If I'm not interpreting the PLOT structures incorrectly, then what happens is this: The feasible region is initially constructed as a POLYGON structure that fills the whole space (and is then colored). This gets used as the background. Then the complement of what is actually feasible is formed into (at least) one POLYGON structure, and is used as the foreground. I.e., the complement is laid over the feasible region, and so acts as a mask (so that only the actual feasible region shows through).
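One quick way to check that interpretation is to look at the structure directly, for example:

> p1 := plots:-inequal({x + y < 4, x - y < 2}, x = 0 .. 5, y = 0 .. 5):
> map(P -> op(0, P), [op(p1)]);    # names of the plot primitives (POLYGON, etc.)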

If it were done the opposite way then wouldn't it be very easy to merge such inequal plots? That is to say: create a POLYGON which fills the whole space as background, then form a POLYGON of only the actual feasible region and color it. Make the feasible region the foreground, not the background.

If done that way, then each `inequal` plot produces a background (filling) POLYGON as well as a foreground feasible POLYGON. To display more than one such plot, one only has to remove all but the first background filler. E.g. with p1 and p2, one simply uses both op(p1) and op(p2), where the background POLYGON is removed from op(p2). Probably display(op(p1), op([2..-1], p2)) would then suffice.

Overlaps would occur naturally, in any order of the arguments to `display`.

Please pardon me, if I'm wrong.

acer


Here's a kludge, for just this one example.

It works by substituting the name `2` for the number 2 in the two fractions 1/2 and -1/2. (That's the non-technical explanation. Experts may nit-pick...) The 2D fraction layout works better for pure names in the denominator. Conversion back can be accomplished using expand().

> restart:

> sol:=solve({x^2+b*x+c},{x});

     /                        (1/2)\    /                        (1/2)\ 
     |      1     1 / 2      \     |    |      1     1 / 2      \     | 
    < x = - - b + - \b  - 4 c/      >, < x = - - b - - \b  - 4 c/      >
     |      2     2                |    |      2     2                | 
     \                             /    \                             / 


> op(simplify(subs(-1/2=-1/`2`,1/2=1/`2`,[sol])));

           /                    /1\\    /                    /1\\ 
           |                    |-||    |                    |-|| 
           |                    \2/|    |                    \2/| 
          <           / 2      \    >  <           / 2      \    >
           |      b - \b  - 4 c/   |    |      b + \b  - 4 c/   | 
           |x = - -----------------| ,  |x = - -----------------| 
           \              2        /    \              2        / 


> expand(%);

                          /                    /1\\ 
                          |                    |-|| 
                          |                    \2/| 
                         <           / 2      \    >
                          |      b   \b  - 4 c/   | 
                          |x = - - + -------------| 
                          \      2         2      / 


Notice that it also makes the sqrt display as a power of 1/2, which you may not enjoy. That too could be side-stepped, but the code might then look too involved.

acer
