Aggregate statistics are calculated by splitting the rows of a DataFrame by each factor in a given column into subsets and computing summary statistics for each of these subsets.

The following is a short example of how the Aggregate command is used to compute aggregate statistics for a DataFrame with housing data:

To begin, we construct a DataFrame with housing data: The first column has number of bedrooms, the second has the area in square feet, the third has price.

bedrooms := <3, 4, 2, 4, 3, 2, 2, 3, 4, 4, 2, 4, 4, 3, 3>:

area := <1130, 1123, 1049, 1527, 907, 580, 878, 1075,

1040, 1295, 1100, 995, 908, 853, 856>:

price := <114700, 125200, 81600, 127400, 88500, 59500, 96500, 113300,

104400, 136600, 80100, 128000, 115700, 94700, 89400>:

HouseSalesData := DataFrame([bedrooms, area, price], columns = [Bedrooms, Area, Price]);

Note that the Bedrooms column has three distinct levels: 2, 3, and 4.

convert(HouseSalesData[Bedrooms], set);

The following returns the mean of all other columns for each distinct level in the column, **Bedrooms**:

Aggregate(HouseSalesData, Bedrooms);

Adding the **columns** option controls which columns are returned.

Aggregate(HouseSalesData, Bedrooms, **columns** = [Price])

Additionally, the **tally** option returns a tally for each of the levels.

Aggregate(HouseSalesData, Bedrooms, **tally**)

The **function **option allows for the specification of any command that can be applied to a DataSeries. For example, the Statistics:-Median command computes the median for each of the levels of Bedrooms.

Aggregate(HouseSalesData, Bedrooms, **function** = Statistics:-Median);

By default, Aggregate uses the SplitByColumn command to creates a separate sub-DataFrame for every discrete level in the column given by bycolumn.

with(Statistics);

ByRooms := SplitByColumn(HouseSalesData, Bedrooms);

We can create box plots of the price for subgroups of sales defined by number of bedrooms.

BoxPlot( map( (m)->m[Price], ByRooms),

deciles=false,

datasetlabels=["2 bdrms", "3 bdrms", "4 bdrms"],

color=["Red", "Purple", "Blue"]);

I have recorded a short video that walks through this example here: https://youtu.be/e0pqCMyO3ks

The worksheet for this example can be downloaded here: Aggregate.mw