This post stems from this Question to which the author has never taken the time to give any answer whatsoever.

To help the reader understand what this is all about, I reproduce an abridged version of this question:

I have the following data ... [and I want to]  create a cumulative histogram with corresponding polygon employing this same information...

The data the author refers to is a collection of decimal numbers.

The term "histogram" has a well-established meaning in Statistics. Without entering into technical details, let us say a histogram is an estimator of a Probability Density Function (continuous random variable) or of a probability mass function (discrete random variable); see for instance Freedman & Diaconis.

The expression "cumulative histogram" is more recent; see for instance Wiki for a quick explanation. In short, a cumulative histogram can be seen as an approximation of the Cumulative Distribution Function (CDF) of the random variable from which the sample at hand is drawn.

In fact, there exists an alternative concept named ECDF (Empirical Cumulative Distribution Function), which has been around for a long time and which is already an estimator of the CDF.
Personally, I am always surprised when someone wants to draw a cumulative histogram, given the many parameters it depends upon (anchors, number of bins, binwidth selection method, ...). Why not draw the ECDF instead? It is a more objective estimator, it is even simpler to build than the cumulative histogram, and it does not use any parameter (parameters that people often tune to get a pretty image instead of a reliable estimator).
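
To make the comparison concrete, here is a minimal Python sketch (plain Python, not the Maple procedure discussed in this post). The function names ecdf and cumulative_histogram, the sample values and the bin edges are all made up for illustration. It is meant to show that the ECDF needs nothing but the sample, whereas the cumulative histogram changes with the chosen bin edges.

```python
from bisect import bisect_left, bisect_right

def ecdf(sample):
    """Return the ECDF as a function: F(t) = (number of points <= t) / n."""
    xs = sorted(sample)
    n = len(xs)
    def F(t):
        return bisect_right(xs, t) / n
    return F

def cumulative_histogram(sample, edges):
    """Cumulative relative frequency at the right edge of each bin.

    Bins are half-open [a, b), except the last one which is closed.
    Points outside [edges[0], edges[-1]] are ignored, so the edges
    are assumed to cover the whole sample.
    """
    xs = sorted(sample)
    n = len(xs)
    below = bisect_left(xs, edges[0])  # points left of the first edge
    heights = [(bisect_left(xs, right) - below) / n for right in edges[1:-1]]
    heights.append((bisect_right(xs, edges[-1]) - below) / n)
    return heights

# Hypothetical sample of decimal numbers
data = [0.3, 1.2, 1.9, 2.4, 2.6, 3.1, 3.8, 4.5]

F = ecdf(data)                                           # no tuning parameter
coarse = cumulative_histogram(data, [0, 2.5, 5])         # 2 bins
fine = cumulative_histogram(data, [0, 1, 2, 3, 4, 5])    # 5 bins
```

With these made-up values, coarse and fine give two different staircases for the same sample, while F is fully determined by the data.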

Anyway, I have done a little bit of work around the OP's question, and it ended in a procedure named Hodgepodge (surely not a very explicit name, but I was lacking inspiration) which enables plotting, if asked, several pieces of information in addition to the required cumulative histogram:

  • The histogram of the raw data for the same list of bin bounds.
  • The kernel density estimator of this raw-data-histogram.
  • The ECDF of the data.
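
For readers who want to see what the first two items involve, here is a rough Python sketch. It is not the Hodgepodge procedure itself (which is written in Maple); the names histogram_density and gaussian_kde, the bandwidth h = 0.5 and the sample are hypothetical, and the kernel estimate is assumed here to be a Gaussian one computed from the raw data.

```python
import math

def histogram_density(sample, edges):
    """Density-scaled histogram: count / (n * bin width) for each bin.

    Bins are half-open [a, b), except the last one which is closed.
    """
    n = len(sample)
    last = len(edges) - 2
    dens = []
    for k in range(len(edges) - 1):
        a, b = edges[k], edges[k + 1]
        if k == last:
            count = sum(1 for v in sample if a <= v <= b)
        else:
            count = sum(1 for v in sample if a <= v < b)
        dens.append(count / (n * (b - a)))
    return dens

def gaussian_kde(sample, h):
    """Return a Gaussian kernel density estimate with bandwidth h."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def f(t):
        return norm * sum(math.exp(-0.5 * ((t - v) / h) ** 2) for v in sample)
    return f

# Hypothetical sample and bin bounds (the same bounds would be reused
# for the cumulative histogram)
data = [0.3, 1.2, 1.9, 2.4, 2.6, 3.1, 3.8, 4.5]
edges = [0, 1, 2, 3, 4, 5]

dens = histogram_density(data, edges)
kde = gaussian_kde(data, h=0.5)   # h = 0.5 is an arbitrary choice
```

Both dens and kde estimate the same density; the bandwidth h plays for the kernel estimate the role that the bin width plays for the histogram.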

Here is an example of data

and here is what the procedure in Hodgepodge.mw can display when all the graphics are requested

