209. Cumulative

Average of every observation since the start of the series. The expanding-window endpoint of the moving-averages axis.

Forecast: $\hat{y}_{t+1} = \bar{y}_t = \frac{1}{t}\sum_{i=1}^{t} y_i$, that is, predict the long-run average.

209.0.1. Behavior

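The cumulative average $\bar{y}_t$ of the first $t$ observations $y_1, \dots, y_t$ can also be written recursively:

$$\bar{y}_t = \frac{t-1}{t}\,\bar{y}_{t-1} + \frac{1}{t}\,y_t = \bar{y}_{t-1} + \frac{1}{t}\,(y_t - \bar{y}_{t-1})$$

The second form makes the dynamics clear: it is the simple exponential smoothing update with a time-varying smoothing parameter,

$$\alpha_t = \frac{1}{t}$$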

So the effective smoothing parameter of the cumulative average is decreasing over time. As more data arrives, the cumulative average becomes increasingly insensitive to new observations.
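The recursion can be checked numerically. What follows is a minimal Python sketch rather than a reference implementation: the function name `cumulative_average` is a hypothetical helper, and the sample values are the ones used in the example at the end of this section. It applies the update with step size 1/t and compares each result with the batch mean of the same prefix.

```python
from statistics import fmean


def cumulative_average(ys):
    """Return the running means of ys, computed with the recursive update.

    Hypothetical helper for illustration: each step moves the previous
    mean toward the new observation by a fraction alpha_t = 1/t.
    """
    means = []
    mean = 0.0
    for t, y in enumerate(ys, start=1):
        alpha = 1.0 / t                   # effective smoothing parameter
        mean = mean + alpha * (y - mean)  # old mean plus a share of the error
        means.append(mean)
    return means


ys = [10, 20, 30, 20, 12, 24]
recursive = cumulative_average(ys)
batch = [fmean(ys[:t]) for t in range(1, len(ys) + 1)]

for t, (r, b) in enumerate(zip(recursive, batch), start=1):
    print(f"t={t}  alpha={1 / t:.2f}  recursive={r:.2f}  batch={b:.2f}")
```

The recursive form needs only the previous mean and the observation count, so it suits streaming data where storing the full history is impractical.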

209.0.2. Comparison to other moving averages

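The relationship between window length $k$, smoothing parameter $\alpha$, and effective memory: the cumulative average is the limiting case in which the window grows with the series ($k = t$) and the effective smoothing parameter shrinks as $\alpha_t = 1/t$, so its memory is the entire history.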
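One way to put numbers on "effective memory" is the average age of the data each method weights. The Python sketch below is illustrative only: measuring memory by average age is an assumption made here, the helper names are hypothetical, and the settings ($k = 3$, $\alpha = 0.5$, $t = 100$) are arbitrary choices. Lag 0 denotes the most recent observation.

```python
def sma_weights(k):
    """Simple moving average: equal weight on the last k observations."""
    return [1.0 / k] * k


def ses_weights(alpha, n_lags):
    """Simple exponential smoothing: geometric weights, truncated at n_lags."""
    return [alpha * (1 - alpha) ** j for j in range(n_lags)]


def cumulative_weights(t):
    """Cumulative average at time t: equal weight on all t observations."""
    return [1.0 / t] * t


def average_age(weights):
    """Weighted mean lag; the weights are normalized before averaging."""
    total = sum(weights)
    return sum(j * w for j, w in enumerate(weights)) / total


print("SMA, k=3:         ", average_age(sma_weights(3)))
print("SES, alpha=0.5:   ", average_age(ses_weights(0.5, 200)))
print("Cumulative, t=100:", average_age(cumulative_weights(100)))
```

Under this measure a window of $k$ points has average age $(k-1)/2$ and SES has average age $(1-\alpha)/\alpha$, so $k = 3$ and $\alpha = 0.5$ remember about equally far back, while the cumulative average's memory keeps growing with $t$.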

209.0.3. When to use

Don’t use it as a forecast for a non-stationary series. The cumulative average gives data from years ago the same weight as data from yesterday, which makes it useless once the level has shifted.

209.0.4. Connection to MLE

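The cumulative average is the maximum likelihood estimator of the population mean when the data are i.i.d. Gaussian (the same holds for several other common families, such as Poisson and Bernoulli). So if you genuinely believe the data are i.i.d. with a constant mean, the cumulative average is the estimator to use; under Gaussian noise no unbiased estimator has lower variance.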
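A short derivation for the Gaussian case shows where this comes from. Assume $y_1, \dots, y_t$ are i.i.d. $\mathcal{N}(\mu, \sigma^2)$ with $\sigma^2$ known; this assumption and the symbols $\mu$, $\sigma^2$, and $\ell$ are introduced here only to keep the algebra short.

```latex
\begin{align}
\ell(\mu) &= -\frac{t}{2}\log\left(2\pi\sigma^2\right)
             - \frac{1}{2\sigma^2}\sum_{i=1}^{t}\left(y_i - \mu\right)^2 \\
\frac{d\ell}{d\mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{t}\left(y_i - \mu\right) = 0
  \quad\Longrightarrow\quad
  \hat{\mu} = \frac{1}{t}\sum_{i=1}^{t} y_i \\
\frac{d^2\ell}{d\mu^2} &= -\frac{t}{\sigma^2} < 0
\end{align}
```

The negative second derivative confirms a maximum, so the maximizer is the cumulative average. For other noise distributions the answer can differ: with Laplace noise, for example, the maximum likelihood estimate of the location is the sample median.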

If you don’t believe the data are i.i.d. (and almost no real time series is), use SES, ETS, or ARIMA instead.

Example

Given:

| $t$ | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| $y_t$ | 10 | 20 | 30 | 20 | 12 | 24 |

Iterate:

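| $t$ | $y_t$ | Cumulative average $\bar{y}_t$ | Effective $\alpha_t = 1/t$ |
|---|---|---|---|
| 1 | 10 | 10/1 = 10.00 | 1.00 |
| 2 | 20 | (10+20)/2 = 15.00 | 0.50 |
| 3 | 30 | (10+20+30)/3 = 20.00 | 0.33 |
| 4 | 20 | (10+20+30+20)/4 = 20.00 | 0.25 |
| 5 | 12 | (10+20+30+20+12)/5 = 18.40 | 0.20 |
| 6 | 24 | (10+20+30+20+12+24)/6 = 19.33 | 0.17 |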

Notice:

  • $\bar{y}_t$ settles in the 18–20 range from $t = 3$ onward, even as new observations swing between 12 and 30.
  • The effective $\alpha_t$ shrinks: by $t = 6$, the cumulative average reacts to a new observation with weight only 0.17. By $t = 100$, the weight is 0.01, so the average is essentially frozen.
  • This means the cumulative average is a lagging indicator. If the underlying mean shifts after many observations have accumulated, it will take many more before $\bar{y}_t$ catches up.
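The lag in the last bullet can be made concrete with a small Python sketch. The scenario is an assumption chosen for illustration, not taken from the text: a noiseless series that sits at 10 for 100 observations and then jumps to 20. The helper name `steps_to_reach` is hypothetical.

```python
def steps_to_reach(target, old_level=10.0, new_level=20.0, shift_at=100,
                   max_steps=100_000):
    """Count post-shift observations until the cumulative average >= target.

    Illustrative scenario only: the series equals old_level for the first
    shift_at observations and new_level afterwards, with no noise.
    """
    total = old_level * shift_at  # running sum just before the shift
    n = shift_at                  # number of observations so far
    for step in range(1, max_steps + 1):
        total += new_level
        n += 1
        if total / n >= target:
            return step
    return None                   # target not reached within max_steps


halfway = steps_to_reach(15.0)  # half of the gap closed
mostly = steps_to_reach(19.0)   # 90% of the gap closed
print(halfway, mostly)          # prints: 100 900
```

In this scenario the cumulative average needs as many post-shift observations as there were pre-shift ones to close half the gap, and nine times as many to close 90% of it.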