**Why this post**

This work was intended to be a simple reply to a question asked a few days ago.

At some point, I realised that the approach I was using could be of more general interest and, in my opinion, was worth a post.

In a few words, this post is about solving an algebra problem using a method originally designed to tackle statistical problems.

**The Context**

Recently @raj2018 submitted a question that I'm going to summarize this way:

Let **S(phi; beta, f)** be a function of **phi** parameterized by **beta** and **f**.

Here is the graph of **S(phi; 0.449, 0.19)** that @raj2018 provided:

@raj2018 then asked how we can find other values **(A, B)** of **(beta, f)** such that the graph of **S(phi; A, B)** has the same aspect as the graph above.

More precisely, let **phi_0** be the largest strictly negative value of **phi** such that **S(phi_0; A, B) = 0**.

Then **S(phi; A, B)** must be negative (strictly negative?) in the open interval **(phi_0, 0)** and must have exactly 3 extrema in this range.

I will say that the point **(A, B)** is admissible if **S(phi; A, B)** verifies these conditions.

The expression of **S(phi; A, B)** is so complex that it is likely impossible to find one (several? all?) admissible point(s) using analytic developments.

**The approach**

When I began thinking about this problem, I soon wondered whether finding the entire domain of admissible points was possible, at least with some reasonable accuracy.

Quite rapidly I drew an analogy with another type of problem whose solution is part of my job: the approximate construction of the probability density function (PDF) of multivariate random variables (obviously this implies that no analytical expression of this PDF is available). This is a very classical problem in Bayesian statistics, for instance when we have to construct an approximation of a posterior PDF.

To stick with this example and put aside the rare situations where this PDF can be derived analytically, building a posterior PDF largely relies on specific numerical methods.

The iconic one is known under the generic name **MCMC**, which stands for **Markov Chain Monte Carlo**.

Why am I speaking about **MCMC**, PDFs, or even random variables?

Let us consider some multivariate random variable **R** whose PDF is constant on some bounded domain **D** and equal to 0 elsewhere. **R** is then a uniform random variable with support **supp(R) = D**.

Assuming the domain **Adm** of admissible **(beta, f)** is bounded, we may think of it as the support of some uniform random variable. Following this analogy, we may expect to use some **MCMC** method to "__build the PDF of the bivariate random variable __**(beta, f)**", otherwise stated "__to capture the boundary of __**Adm**".

The simplest **MCMC** method is the Metropolis-Hastings algorithm (**MH**).

In a few simple words, **MH** builds a Markov chain this way (a minimal code sketch is given after the list):

- Let us assume that the chain already contains the elements **e**_{1}, ..., **e**_{n}. Let *f* be some suitable "fitness" function (whose nature is of no importance right now).
- A potential new element **c** is randomly picked in some neighborhood of **e**_{n}.
- If the ratio *f*(**c**) / *f*(**e**_{n}) is larger than 1, we decide to put **c** into the chain (thus **e**_{n+1} = **c**); otherwise we leave it to chance to decide whether or not **c** is put into the chain. If chance decides the contrary, then **e**_{n} is duplicated (thus **e**_{n+1} = **e**_{n}).
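Here is a minimal Maple sketch of this scheme. The names `MH`, `fitness`, and `Neighbor` are mine, not those of the attached file: `fitness` plays the role of *f* and `Neighbor` stands for any procedure that picks a random point close to its argument.

```
# Minimal sketch of the Metropolis-Hastings scheme described above.
MH := proc(e1, fitness, Neighbor, N)
  local chain, n, c, ratio;
  chain := Array(1 .. N);
  chain[1] := e1;
  for n to N - 1 do
    c := Neighbor(chain[n]);                 # candidate element
    ratio := fitness(c) / fitness(chain[n]);
    if ratio >= 1 or RandomTools:-Generate(float(range = 0 .. 1)) < ratio then
      chain[n + 1] := c;                     # accept: c enters the chain
    else
      chain[n + 1] := chain[n];              # reject: e_n is duplicated
    end if;
  end do;
  return chain;
end proc:
```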

**MH** is not the most efficient **MCMC** algorithm but it is efficient enough for what we want to achieve.

The main difficulty here is that there is no natural way to build the fitness function *f* , mainly because the equivalent random variable I talked about is a purely abstract construction.

A preliminary observation is that if **S(phi; beta, f) < 0** for all **phi in (phi_0, 0)**, then **S** has an odd number of extrema in **(phi_0, 0)**. The simplest way to find these extrema is to search for the zeros of the derivative **S'** of **S** with respect to **phi**, while discarding the "false" extrema, that is, the points where both **S'** and the second derivative **S''** are null (__I emphasize this last point because I didn't account for it in the attached file__).

The algorithm designed in this file probably misses a few points by not checking whether **S'' = 0**, but it is important to keep in mind that we don't want a complete identification of **Adm**, just the capture of its boundary.

Unless we are extremely unlucky, there is only a very small chance that omitting this check will deeply modify this boundary.
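To fix ideas, here is a rough sketch of how the number **M** of negative extrema can be counted; it is only my own reconstruction (the procedure Extrema of the attached file, with its early-stopping strategy, is more elaborate), and the tolerance `1e-9` is an arbitrary choice:

```
# Count the negative extrema of S on (phi0, 0): scan for sign changes
# of S', solve for the zero, and discard "false" extrema where S'' = 0.
CountExtrema := proc(S, phi0, K := 1000)
  local t, dS, d2S, h, x, r, M, k;
  dS  := unapply(diff(S(t), t), t);
  d2S := unapply(diff(S(t), t $ 2), t);
  h := -phi0 / K;                            # phi0 < 0, so h > 0
  M := 0;
  for k from 0 to K - 1 do
    x := phi0 + k * h;
    if evalf(dS(x)) * evalf(dS(x + h)) < 0 then     # sign change of S'
      r := fsolve(dS, x .. x + h);
      # keep only genuine negative extrema: S(r) < 0 and S''(r) <> 0
      if type(r, numeric) and evalf(S(r)) < 0 and abs(evalf(d2S(r))) > 1e-9 then
        M := M + 1;
      end if;
    end if;
  end do;
  return M;
end proc:
```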

**How to define the function** *f*?

What we want is for *f*(**c**) / *f*(**e**_{n}) to represent the probability of deciding whether or not **c** is an admissible point. In a Markov chain this ratio represents how much better or worse **c** is relative to **e**_{n}, and this is essential for the chain to be a true Markov chain.

But as our aim is not to build a **true Markov chain** but simply a chain which looks like one, we can take some liberties and replace *f*(**c**) / *f*(**e**_{n}) by some function *g*(**c**) which quantifies the probability for **c** to be an admissible couple. So we want *g*(**c**) = 1 if **S(phi; c)** has exactly **M = 3** negative extrema and *g*(**c**) < 1 if **M <> 3**.

The previous algorithm then becomes (again, a sketch is given after the list):

- Let us assume that the chain already contains the elements **e**_{1}, ..., **e**_{n}. Let *g* be a function which gives the probability that an element is admissible.
- A potential new element **c** is randomly picked in some neighborhood of **e**_{n}.
- If *g*(**c**) is equal to 1 (its largest possible value), we decide to put **c** into the chain (thus **e**_{n+1} = **c**); otherwise we leave it to chance to decide whether or not **c** is put into the chain. If chance decides the contrary, then **e**_{n} is duplicated (thus **e**_{n+1} = **e**_{n}).
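Relative to the previous sketch, only the acceptance test changes: a candidate with *g*(**c**) = 1 is always accepted, and any other candidate is accepted with probability *g*(**c**). Again, the names below are mine, not those of the attached file.

```
# Same chain-building loop as before, but the acceptance test now
# uses g(c) directly instead of the ratio fitness(c)/fitness(e_n).
MH_like := proc(e1, g, Neighbor, N)
  local chain, n, c;
  chain := Array(1 .. N);
  chain[1] := e1;
  for n to N - 1 do
    c := Neighbor(chain[n]);
    # a uniform draw <= g(c) accepts c; g(c) = 1 is accepted for sure
    if RandomTools:-Generate(float(range = 0 .. 1)) <= g(c) then
      chain[n + 1] := c;
    else
      chain[n + 1] := chain[n];
    end if;
  end do;
  return chain;
end proc:
```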

This algorithm can also be seen as a kind of genetic algorithm.

A possible choice is *g*(**c**) **= exp(-|3-M|)**.

In the attached file I use instead the expression *g*(**c**) **= (M + 1) / 4** for several reasons:

- It is less sharp at **M = 3** and thus enables **c** to be put into the chain more often, which increases the chain's exploratory capabilities.
- The case **M > 3**, which no preliminary investigation was able to uncover, is by construction eliminated in the procedure Extrema, which uses an early-stopping strategy (as soon as more than **M = 3** negative extrema are found, the procedure stops).
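For concreteness, and assuming a procedure `M` that returns the number of negative extrema of **S(phi; c)** (the quantity Extrema computes in the attached file), the two candidate fitness functions read:

```
# Two candidate fitness functions; M(c) is assumed to return the
# number of negative extrema of S(phi; c).
g_sharp := c -> exp(-abs(3 - M(c))):   # equals 1 only when M = 3
g_flat  := c -> (M(c) + 1) / 4:        # the choice used in the attached file
# Note: M > 3 never occurs thanks to the early stopping in Extrema,
# so g_flat never exceeds 1.
```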

The algorithm I designed basically relies upon two stages:

- The first stage aims to construct a "long" Markov-like chain ("long" in quotes because true Markov chains are usually much longer than those I use). There are two goals here:
  - Check whether **Adm** is simply connected or not (whether it has holes or not).
  - Find a first set of admissible points that can be used as starting points for subsequent chains.
- The second stage runs several independent Markov-like chains from a reduced set of admissible points (see the sketch below). The way this reduced set is constructed depends on the goal to achieve:
  - one may think of adding points among those already known in order to assess the connectivity of **Adm**,
  - or of refining the boundary of **Adm**.

These two concurrent objectives are mixed in an ad hoc way depending on the observation of the results already in hand.
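Schematically, this second stage might look like the lines below, where `Starts` is a reduced set of admissible points selected from the first chain (the values shown are purely illustrative) and `MH_like`, `g_flat`, and `Neighbor` are the placeholders introduced in the sketches above:

```
# Run several independent Markov-like chains, each starting from
# one point of the reduced set Starts.
Starts := [[-0.8, 0.19], [-0.6, 0.21]]:      # illustrative values only
chains := [seq(MH_like(Starts[k], g_flat, Neighbor, 200),
               k = 1 .. numelems(Starts))]:
```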

We point out here an important feature of **MCMC** methods: behind their apparent algorithmic simplicity, it is common that high-quality results can be obtained efficiently only at the cost of problem-dependent tuning.

A last word to say that, after several trials and failures, I found it simpler to reparameterize the problem in terms of **(phi_0, f)** instead of **(beta, f)**.

**Codes and results**

**Choice** *g*(**c**) **= (M + 1) / 4**

The code: Extrema_and_MCMC.mw

To access the full results I got, load this **m file** (don't mind its extension; Mapleprimes doesn't allow uploading **m** files): MCMC_20231209_160249.mw (save it and change its extension to **m** instead of **mw**).

**EDITED: choice** *g*(**c**) **= exp(-|3-M|)**

Here are the files containing the code and the results:

Extrema_and_MCMC_g2.mw

MCMC_20231211_084053.mw

To ease the comparison of the two sets of results, I used the same random seeds in both codes.

Comparing the results obtained around the first admissible point is straightforward.

It's more complex for @raj2018's solution because the first step of the algorithm (drawing of a single chain of length 1000) finds six times more admissible points with *g*(**c**) **= exp(-|3-M|)** than with *g*(**c**) **= (M + 1) / 4**.