sand15

1329 Reputation

15 Badges

11 years, 19 days

MaplePrimes Activity


These are replies submitted by sand15

@C_R 

Concerning the "model selection" stuff: I'm working on it.
Same thing for a detailed comment about regression/interpolation and the different associated points of view.

It will take me a little while to finalize this; see you soon.

@C_R 

You write "My requirement: An algorithm (not me) that searches for a model that fits a given data set best."

I understand what you likely have in mind when saying this, but I believe a little clarification is needed.

Indeed the term fit is unclear.
If you consider @Ronan's proposal, CurveFitting will give you a continuous function which passes exactly through your (101) points, thus giving a perfect representation of the data.
If you consider @dharr's proposal instead, a statistical fit (linear or not) will give you a continuous function which is close, in some sense (often in the least squares sense, but other norms, or metrics, can be used), to your data points (note that any statistical linear model with 101 parameters passes exactly through your 101 points).

This poses the question of what a good fit is.
From the representation point of view, interpolation is perfect because the sum of the residuals "data minus model values" is 0, but classical interpolation methods generally have no generalization capabilities. This means that they may be completely incapable of predicting what the experimental output would be for a new input, even if this input is close to an input in the data set used to build the interpolation model.

The only notable exception is Gaussian Process Emulation, aka Kriging, which is a statistical method that Maple, unfortunately, locates in the CurveFitting package.
This method enables controlling the prediction error outside the points used to construct (fit) the model.

The least squares point of view is a strategy where we content ourselves with a model which simply passes reasonably close to the data points. As we do not force the model to "bend" itself to pass through these points, we generally obtain smoother graphs than with interpolation (let's say with much smaller overshoots).
But this is to the detriment of fidelity to the data, for the residual sum of squares is no longer equal to zero as it was with interpolation models.
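To make this contrast concrete, here is a small plain-Python sketch (Python rather than Maple, and a made-up toy data set of a noisy line, just to illustrate the idea): the interpolant has zero residuals at the data points, while the least-squares line sacrifices those residuals for smoothness and safer predictions at new inputs.

```python
# Toy illustration (assumed data, not the data set discussed above):
# exact interpolation vs a least-squares fit.
import random

random.seed(1)

# Noisy samples of an underlying line y = 2x + 1.
xs = [i / 10 for i in range(11)]
ys = [2 * x + 1 + random.gauss(0, 0.2) for x in xs]

def lagrange(x, xs, ys):
    """Evaluate the unique degree-10 polynomial passing exactly through all 11 points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def least_squares_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (yv - my) for x, yv in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

a, b = least_squares_line(xs, ys)

# Interpolation: residuals at the data points are exactly zero.
interp_resid = max(abs(lagrange(x, xs, ys) - y) for x, y in zip(xs, ys))

# Least squares: nonzero residuals at the data points, but the fitted line
# stays close to the true process at a new input outside the sampled grid
# (the interpolant may or may not, depending on the noise draw).
x_new = 1.05
err_lsq = abs(a * x_new + b - (2 * x_new + 1))
```

This is only a caricature of the discussion above, but it shows both faces of the trade-off: perfect representation of the data on one side, controlled behavior at new inputs on the other.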

Advanced least squares approaches try to strike a balance between fidelity to the data and a contained generalization error (something Kriging does naturally... provided it is correctly tuned, which is another issue). Stepwise regression and LASSO regression (you can look those terms up on Wikipedia if you are interested) are two methods among many to achieve this balance.

So the core question is indeed "What is a good fit?"

_________________________________________________________________________

Let's skip this philosophical question and come back to your problem.

You wrote "My best guess for the dataset above is a periodic function with a modulated argument.".

Does this idea of a function with a modulated argument come from your knowledge of the physical process which generates your data, or is it something which came to you by observing a scatterplot of these data?
This question does matter, because in the first case we must try and fit a function of some given structure, while in the second case one may feel free to propose something else (let's say, my best guess couldn't be yours).

In the attached file I propose fitting, in the least squares sense, rational fractions to your data.
Indeed those models have two very interesting properties: they are quite versatile, and they are linear under a proper reparameterization.
In the attached file the rational fraction R(x) to fit is coded [P, Q] where:


  • If P (resp. Q) is a nonnegative integer, then numer(R(x)) (resp. denom(R(x))) is a dense polynomial of degree P (resp. Q) with indeterminate x.

  • If P (resp. Q) is a list of nonnegative integers, then numer(R(x)) (resp. denom(R(x))) is a sparse polynomial whose monomials have degrees in P (resp. Q).


You will see these rational fractions can fit your data reasonably well provided the degrees P and Q are high enough.

RationalFractionFit.mw
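For the curious, the "linear under a proper reparameterization" property can be illustrated outside Maple with a few lines of plain Python (the fraction (2 + 3x)/(1 + 0.5x) and the three sample points below are assumptions for illustration only, not your data): multiplying R(x_i) = y_i through by the denominator turns the fit into a linear system.

```python
# Assumed toy fraction for illustration: R(x) = (2 + 3x)/(1 + 0.5x).
# Writing R(x_i) = y_i as  p0 + p1*x_i - y_i*q1*x_i = y_i  makes the
# unknowns (p0, p1, q1) appear linearly, so no nonlinear solver is needed.

def solve3(A, b):
    """Tiny Gaussian elimination with partial pivoting for a 3x3 system."""
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

R = lambda x: (2 + 3 * x) / (1 + 0.5 * x)
xs = [0.0, 1.0, 2.0]          # three points determine the three parameters
ys = [R(x) for x in xs]

# One row per data point of the linearized system.
A = [[1.0, x, -y * x] for x, y in zip(xs, ys)]
p0, p1, q1 = solve3(A, ys)    # recovers p0 = 2, p1 = 3, q1 = 0.5
```

With noisy data one would simply take more points and solve the corresponding overdetermined system in the least squares sense, which is what the attached worksheet does in Maple.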

 


 

@dharr 

I missed the fundamental point that the equations are evidently not independent.

@salim-barzani

Apologies for what I wrote in my initial answer: I should have paid more attention to the way your 17 equations were built and not spoken from a general point of view, as @dharr commented.

Both of you: here is a little attempt to provide an analysis of the solutions @dharr got. A few points:

  • All the 28 solutions contain at least one equality of the form 'p = p', where p is one of the 8 parameters.
  • All the 28 solutions contain at least one equality of the form 'p = 0', where p is one of the 8 parameters.
  • All the 28 solutions depend on epsilon, k and h.
  • I suggest dropping the 'explicit' option in solve to get more synthetic representations of the solutions.

analysis_sand15.mw

Read carefully the attached file plume_work_sand15.mw to see what questions your problem raises and understand why pdsolve returns an error.

This said, my bet is that your problem constitutes a model for atmospheric pollution due to the release of a pollutant plume.
Am I right?
If so, those models are generally transient ones, whereas yours, assuming X and Y are space coordinates, is a stationary one... and this is the main thing I don't understand.
See for instance this paper.

If you took your equations from some article, can you provide us with it?

If you solve a stationary problem then, obviously, your boundary conditions are not correct (read the attached file).

@dharr 

I hadn't gone that far.
You should convert your comment into an answer so that I can vote for it.

@WD0HHU 

Thank you for your comment.

@WD0HHU  @salim-barzani @acer

In case you would be interested in plotting Re(solnum) or Im(solnum) the same way I did, here is a worksheet that could suit you (please adapt it using @acer's correction for Maple 2025).

plot-help_sand15_(2).mw

Here is the pattern which replicates periodically in the z direction (explanations within the attached worksheet).

And its z-periodization

@acer By the way, I realized that abs is not mandatory.

@WD0HHU 

I use Maple 2015 and it is very likely that some functions do not operate identically in Maple 2025.
Could you send me your Maple 2025 worksheet so that I can try to fix these issues?

More specifically, I want a worksheet which contains the outputs of these commands:

core  := indets(solnum, function)[];

acore := expand(core) assuming y::real, z::real;

indets(acore, function);

test := select(has, %, cos);

T1   := op(test);
T2   := test[];

U1   := op(T1);
U2   := op(T2);

coeff(U1, z);

It's clear that if z_period is not a numeric value, all that follows will generate one error or another.

simplify/trig worked with Maple 2015.

restart;

kernelopts(version);

        Maple 2015.2, APPLE UNIVERSAL OSX, Dec 20 2015, Build ID 1097895

A := sqrt(1-cos(x)^2);

        (1-cos(x)^2)^(1/2)

simplify(A, trig);

        (sin(x)^2)^(1/2)
 

 

Download simplify.mw

By the way, the csgn(sin(x))*sin(x) result is obtained with Maple 2015 as the output of

simplify(convert(A, sin))

@acer 

You're absolutely right about the remember option.

I had thought of using it but finally avoided the question, because I don't really understand what a procedure actually "remembers" when this option is used.
Anyway, the results you get using it are impressive. I will treasure your program.
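For what it's worth, the caching idea behind option remember can be mimicked in plain Python (an analogy only, not Maple's actual mechanism): functools.lru_cache plays the role of the remember table, storing each result keyed by the arguments so the procedure body never runs twice for the same input.

```python
# Analogy only: Python's lru_cache as a stand-in for Maple's remember table.
import functools

calls = 0  # counts how many times the function body actually runs

@functools.lru_cache(maxsize=None)
def fib(n):
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
# Without the cache this recursion would run the body over a million times;
# with it, each argument 0..30 is computed exactly once (31 calls).
```

This is presumably why the timings with remember were so impressive: repeated evaluations of the numerical solution at the same arguments become simple table lookups.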

If you don't mind me using some of your time, I have two related questions to ask you:

  1. Why is the parameters option not allowed for BVPs: is it a technical impossibility or simply something that has never been coded?
  2. A lot of people use the output=listprocedure option instead of the default one.
    What is the advantage of using it beyond the fact that more quantities are directly available?

@KIRAN SAJJAN

Don't you think the least you could have done would have been to tell me if you were satisfied with what I did?
That said, you are accustomed to this type of attitude.

@Kitonum 

I vote up.
