Granular detail from a summary

We had an interesting question on AskTom a few days ago.  Given a set of 12 values (forecasts in this case), one for each month of the year, can we manufacture a set of weekly forecasts for the same period?  Now it is perhaps a little dubious to “invent” detailed data out of summarised data, but we can come up with a reasonable algorithm for doing so.

We could simply divide each monthly figure by 4, because there are roughly 4 weeks per month, but then we face the challenge of mapping 48 values onto the 52/53 weeks of the year.

We could also sum all the months and carve the total into 52 equal weekly chunks, but in doing so we lose the peaks and troughs that a monthly forecast is meant to represent.
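For illustration, here is a sketch of both naive approaches, written against the forecast table T that we create below (so the column names match what follows):


-- Naive approach 1: divide each monthly figure by 4, yielding only
-- 48 "weekly" values for a year that has 52 or 53 weeks.
select mth, round(fcast/4,2) wk_fcast
from   t
order  by mth;

-- Naive approach 2: one flat weekly figure for the entire year,
-- which discards the monthly peaks and troughs altogether.
select round(sum(fcast)/52,2) flat_wk_fcast
from   t;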

So can we do better?

Let’s assume that the forecast for a month implies an even distribution of that value over each day of that month.  Once we take the forecast down to the daily level, we should then be able to aggregate it into weeks without too much trouble.  So here’s how we did it in SQL.

Firstly, our monthly forecasts:


SQL> create table t ( mth date, fcast int );

Table created.

SQL> insert into t
  2  select add_months(date '2016-01-01',rownum-1), dbms_random.value(1000,5000)
  3  from dual
  4  connect by level <= 12;

12 rows created.

SQL> select * from t order by 1;

MTH            FCAST
--------- ----------
01-JAN-16       2241
01-FEB-16       1903
01-MAR-16       4153
01-APR-16       4466
01-MAY-16       3039
01-JUN-16       2914
01-JUL-16       4751
01-AUG-16       2925
01-SEP-16       1369
01-OCT-16       3330
01-NOV-16       3747
01-DEC-16       4236

12 rows selected.

To generate every day of the year, we can use the usual CONNECT BY “trick” with DUAL.  The ‘DDD’ format mask returns the day-of-year number, so applying it to 31-Dec gives the number of days in the year (366 for the 2016 leap year):



SQL> with
  2  every_day as
  3    ( select date '2016-01-01'+rownum-1 d
  4      from dual
  5      connect by level <= to_number(to_char(date '2016-12-31','DDD'))
  6    )
  7  select * from every_day;

D
---------
01-JAN-16
02-JAN-16
03-JAN-16
04-JAN-16
05-JAN-16
06-JAN-16
...
24-DEC-16
25-DEC-16
26-DEC-16
27-DEC-16
28-DEC-16
29-DEC-16
30-DEC-16
31-DEC-16

366 rows selected.
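If the ‘DDD’ format mask feels obscure, a sketch of an equivalent generator uses plain date arithmetic, since subtracting two Oracle dates yields the number of days between them:


with every_day as
  ( select date '2016-01-01' + rownum - 1 d
    from   dual
    connect by level <= date '2017-01-01' - date '2016-01-01'  -- 366 days
  )
select count(*) row_count
from   every_day;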

We can use standard date functions to determine how many days are in each month:


SQL> with
  2  every_day as
  3    ( select date '2016-01-01'+rownum-1 d
  4      from dual
  5      connect by level <= to_number(to_char(date '2016-12-31','DDD'))
  6    ),
  7  date_ranges as
  8    ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
  9      from t
 10    )
 11  select * from date_ranges
 12  order by 1;

MTH       MTH_END   NO_OF_DAYS      FCAST
--------- --------- ---------- ----------
01-JAN-16 31-JAN-16         31       2241
01-FEB-16 29-FEB-16         29       1903
01-MAR-16 31-MAR-16         31       4153
01-APR-16 30-APR-16         30       4466
01-MAY-16 31-MAY-16         31       3039
01-JUN-16 30-JUN-16         30       2914
01-JUL-16 31-JUL-16         31       4751
01-AUG-16 31-AUG-16         31       2925
01-SEP-16 30-SEP-16         30       1369
01-OCT-16 31-OCT-16         31       3330
01-NOV-16 30-NOV-16         30       3747
01-DEC-16 31-DEC-16         31       4236

12 rows selected.
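As a side note, TO_NUMBER(TO_CHAR(...,'DD')) works but round-trips through a string; here is a sketch that gets the same result via date arithmetic:


select mth,
       last_day(mth)           mth_end,
       last_day(mth) - mth + 1 no_of_days,  -- date subtraction, no string conversion
       fcast
from   t
order  by mth;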

And with that, we can now generate a daily forecast:


SQL>
SQL> with
  2  every_day as
  3    ( select date '2016-01-01'+rownum-1 d
  4      from dual
  5      connect by level <= to_number(to_char(date '2016-12-31','DDD'))
  6    ),
  7  date_ranges as
  8    ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
  9      from t
 10    ),
 11  apportioned_forecast as
 12   (  select every_day.d, date_ranges.fcast, date_ranges.no_of_days, date_ranges.fcast / date_ranges.no_of_days daily_amount
 13      from  every_day,
 14            date_ranges
 15       where every_day.d between date_ranges.mth and date_ranges.mth_end
 16   )
 17  select * from  apportioned_forecast
 18  order by 1;

D              FCAST NO_OF_DAYS DAILY_AMOUNT
--------- ---------- ---------- ------------
01-JAN-16       2241         31   72.2903226
02-JAN-16       2241         31   72.2903226
03-JAN-16       2241         31   72.2903226
04-JAN-16       2241         31   72.2903226
05-JAN-16       2241         31   72.2903226
06-JAN-16       2241         31   72.2903226
07-JAN-16       2241         31   72.2903226
08-JAN-16       2241         31   72.2903226
09-JAN-16       2241         31   72.2903226
10-JAN-16       2241         31   72.2903226
11-JAN-16       2241         31   72.2903226
12-JAN-16       2241         31   72.2903226
13-JAN-16       2241         31   72.2903226
14-JAN-16       2241         31   72.2903226
15-JAN-16       2241         31   72.2903226
16-JAN-16       2241         31   72.2903226
17-JAN-16       2241         31   72.2903226
...
21-DEC-16       4236         31   136.645161
22-DEC-16       4236         31   136.645161
23-DEC-16       4236         31   136.645161
24-DEC-16       4236         31   136.645161
25-DEC-16       4236         31   136.645161
26-DEC-16       4236         31   136.645161
27-DEC-16       4236         31   136.645161
28-DEC-16       4236         31   136.645161
29-DEC-16       4236         31   136.645161
30-DEC-16       4236         31   136.645161
31-DEC-16       4236         31   136.645161

366 rows selected.
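Before rolling up to weeks, a quick sanity check is worthwhile (a sketch reusing the same CTEs): summing the daily amounts back up by month should reproduce each original forecast, give or take rounding:


with
every_day as
  ( select date '2016-01-01'+rownum-1 d
    from dual
    connect by level <= to_number(to_char(date '2016-12-31','DDD'))
  ),
date_ranges as
  ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
    from t
  ),
apportioned_forecast as
  ( select every_day.d, date_ranges.fcast / date_ranges.no_of_days daily_amount
    from  every_day,
          date_ranges
    where every_day.d between date_ranges.mth and date_ranges.mth_end
  )
-- Each month's daily amounts should sum back to its original forecast.
select trunc(d,'MM') mth, round(sum(daily_amount)) mth_total
from   apportioned_forecast
group  by trunc(d,'MM')
order  by 1;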

With our forecast data now at daily granularity, we simply sum it by week.  TRUNC(d,'IW') truncates each date to the Monday of its ISO week, which is why the first week below starts on 28-DEC-15:


SQL> with
  2  every_day as
  3    ( select date '2016-01-01'+rownum-1 d
  4      from dual
  5      connect by level <= to_number(to_char(date '2016-12-31','DDD'))
  6    ),
  7  date_ranges as
  8    ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
  9      from t
 10    ),
 11  apportioned_forecast as
 12   (  select every_day.d, date_ranges.fcast, date_ranges.no_of_days
 13      from  every_day,
 14            date_ranges
 15       where every_day.d between date_ranges.mth and date_ranges.mth_end
 16   )
 17  select trunc(d,'IW') wk, round(sum(fcast/no_of_days),2) wk_fcast
 18  from   apportioned_forecast
 19  group by trunc(d,'IW')
 20  order by 1;

WK          WK_FCAST
--------- ----------
28-DEC-15     216.87
04-JAN-16     506.03
11-JAN-16     506.03
18-JAN-16     506.03
25-JAN-16     506.03
01-FEB-16     459.34
08-FEB-16     459.34
15-FEB-16     459.34
22-FEB-16     459.34
29-FEB-16     869.43
07-MAR-16     937.77
14-MAR-16     937.77
21-MAR-16     937.77
28-MAR-16     982.47
04-APR-16    1042.07
11-APR-16    1042.07
18-APR-16    1042.07
25-APR-16     991.23
02-MAY-16     686.23
09-MAY-16     686.23
16-MAY-16     686.23
23-MAY-16     686.23
30-MAY-16     681.73
06-JUN-16     679.93
13-JUN-16     679.93
20-JUN-16     679.93
27-JUN-16     848.31
04-JUL-16    1072.81
11-JUL-16    1072.81
18-JUL-16    1072.81
25-JUL-16    1072.81
01-AUG-16     660.48
08-AUG-16     660.48
15-AUG-16     660.48
22-AUG-16     660.48
29-AUG-16      465.6
05-SEP-16     319.43
12-SEP-16     319.43
19-SEP-16     319.43
26-SEP-16     443.01
03-OCT-16     751.94
10-OCT-16     751.94
17-OCT-16     751.94
24-OCT-16     751.94
31-OCT-16     856.82
07-NOV-16      874.3
14-NOV-16      874.3
21-NOV-16      874.3
28-NOV-16     921.28
05-DEC-16     956.52
12-DEC-16     956.52
19-DEC-16     956.52
26-DEC-16     819.87

53 rows selected.

SQL>
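As a final cross-check (again, just a sketch), the weekly figures are a pure redistribution of the monthly ones, so both should sum to the same annual total, modulo rounding:


with
every_day as
  ( select date '2016-01-01'+rownum-1 d
    from dual
    connect by level <= to_number(to_char(date '2016-12-31','DDD'))
  ),
date_ranges as
  ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
    from t
  ),
apportioned_forecast as
  ( select every_day.d, date_ranges.fcast, date_ranges.no_of_days
    from  every_day,
          date_ranges
    where every_day.d between date_ranges.mth and date_ranges.mth_end
  )
-- Both totals should agree, since the apportionment only redistributes.
select (select sum(fcast) from t)      annual_from_months,
       round(sum(fcast/no_of_days),2)  annual_from_days
from   apportioned_forecast;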

One comment

  1. “Let’s assume that the forecast for a month implies an even distribution of that value over each day in that month.”

    That’s a HUGE assumption. And probably not typical, as it also implies that changes to the daily average occur all at once at the month boundaries.

    Realistically (where I like to live), it might be the best one can do.
