We had an interesting question on AskTom a few days ago. Given a set of 12 values (forecasts in this case), one for each month of the year, can we manufacture a set of weekly forecasts for the same period? Now, it is perhaps a little dubious to “invent” detailed data out of summarised data, but we can come up with a reasonable algorithm for doing so.
We could simply divide each monthly value by 4, because there are roughly four weeks per month, but we would still face the challenge of mapping those 48 values onto the 52 or 53 weeks of the year.
We could also sum all twelve months and carve the total into 52 equal weekly chunks, but in doing so we would lose the peaks and troughs that a monthly forecast is meant to represent.
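As a point of reference, that flat carve-up is just a one-liner against the monthly table T we are about to create below; every week receives the same value, which is exactly how the seasonal shape gets lost:

-- naive flat carve-up: the same value for all 52 weeks
select round(sum(fcast)/52,2) flat_wk_fcast
from   t;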
So, can we do better?
Let’s assume that the forecast for a month implies an even distribution of that value across each day of that month. Once we take the forecast down to the daily level, we should then be able to aggregate it into weeks without too much trouble. So here’s how we did it in SQL.
Firstly, our monthly forecasts (generated here with DBMS_RANDOM, so your values will differ if you run this yourself):
SQL> create table t ( mth date, fcast int );
Table created.
SQL> insert into t
2 select add_months(date '2016-01-01',rownum-1), dbms_random.value(1000,5000)
3 from dual
4 connect by level <= 12;
12 rows created.
SQL> select * from t order by 1;
MTH FCAST
--------- ----------
01-JAN-16 2241
01-FEB-16 1903
01-MAR-16 4153
01-APR-16 4466
01-MAY-16 3039
01-JUN-16 2914
01-JUL-16 4751
01-AUG-16 2925
01-SEP-16 1369
01-OCT-16 3330
01-NOV-16 3747
01-DEC-16 4236
12 rows selected.
To generate every day of the year, we can use the usual row-generator “trick” with DUAL:
SQL> with
2 every_day as
3 ( select date '2016-01-01'+rownum-1 d
4 from dual
5 connect by level <= to_number(to_char(date '2016-12-31','DDD'))
6 )
7 select * from every_day;
D
---------
01-JAN-16
02-JAN-16
03-JAN-16
04-JAN-16
05-JAN-16
06-JAN-16
...
24-DEC-16
25-DEC-16
26-DEC-16
27-DEC-16
28-DEC-16
29-DEC-16
30-DEC-16
31-DEC-16
366 rows selected.
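As an aside, the ‘DDD’ format mask works because the day-of-year number of 31st December is, by definition, the number of days in the year. If that feels obscure, plain date arithmetic gives the same row count, since subtracting one Oracle date from another yields the number of days between them. A sketch:

-- 366 rows for 2016, via date subtraction rather than a format mask
select date '2016-01-01'+rownum-1 d
from   dual
connect by level <= date '2017-01-01' - date '2016-01-01';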
We can use standard date functions to determine how many days are in each month:
SQL> with
2 every_day as
3 ( select date '2016-01-01'+rownum-1 d
4 from dual
5 connect by level <= to_number(to_char(date '2016-12-31','DDD'))
6 ),
7 date_ranges as
8 ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
9 from t
10 )
11 select * from date_ranges
12 order by 1;
MTH MTH_END NO_OF_DAYS FCAST
--------- --------- ---------- ----------
01-JAN-16 31-JAN-16 31 2241
01-FEB-16 29-FEB-16 29 1903
01-MAR-16 31-MAR-16 31 4153
01-APR-16 30-APR-16 30 4466
01-MAY-16 31-MAY-16 31 3039
01-JUN-16 30-JUN-16 30 2914
01-JUL-16 31-JUL-16 31 4751
01-AUG-16 31-AUG-16 31 2925
01-SEP-16 30-SEP-16 30 1369
01-OCT-16 31-OCT-16 31 3330
01-NOV-16 30-NOV-16 30 3747
01-DEC-16 31-DEC-16 31 4236
12 rows selected.
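Similarly, the TO_CHAR/TO_NUMBER round trip for the day count could be replaced with date arithmetic, because LAST_DAY returns a date in the same month. Just a stylistic alternative, not a correction:

-- days in month = last day minus first day, plus one
select mth, last_day(mth) mth_end, last_day(mth) - mth + 1 no_of_days, fcast
from   t;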
And with that, we can now generate a daily forecast:
SQL> with
2 every_day as
3 ( select date '2016-01-01'+rownum-1 d
4 from dual
5 connect by level <= to_number(to_char(date '2016-12-31','DDD'))
6 ),
7 date_ranges as
8 ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
9 from t
10 ),
11 apportioned_forecast as
12 ( select every_day.d, date_ranges.fcast, date_ranges.no_of_days, date_ranges.fcast / date_ranges.no_of_days daily_amount
13 from every_day,
14 date_ranges
15 where every_day.d between date_ranges.mth and date_ranges.mth_end
16 )
17 select * from apportioned_forecast
18 order by 1;
D FCAST NO_OF_DAYS DAILY_AMOUNT
--------- ---------- ---------- ------------
01-JAN-16 2241 31 72.2903226
02-JAN-16 2241 31 72.2903226
03-JAN-16 2241 31 72.2903226
04-JAN-16 2241 31 72.2903226
05-JAN-16 2241 31 72.2903226
06-JAN-16 2241 31 72.2903226
07-JAN-16 2241 31 72.2903226
08-JAN-16 2241 31 72.2903226
09-JAN-16 2241 31 72.2903226
10-JAN-16 2241 31 72.2903226
11-JAN-16 2241 31 72.2903226
12-JAN-16 2241 31 72.2903226
13-JAN-16 2241 31 72.2903226
14-JAN-16 2241 31 72.2903226
15-JAN-16 2241 31 72.2903226
16-JAN-16 2241 31 72.2903226
17-JAN-16 2241 31 72.2903226
...
21-DEC-16 4236 31 136.645161
22-DEC-16 4236 31 136.645161
23-DEC-16 4236 31 136.645161
24-DEC-16 4236 31 136.645161
25-DEC-16 4236 31 136.645161
26-DEC-16 4236 31 136.645161
27-DEC-16 4236 31 136.645161
28-DEC-16 4236 31 136.645161
29-DEC-16 4236 31 136.645161
30-DEC-16 4236 31 136.645161
31-DEC-16 4236 31 136.645161
366 rows selected.
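Before rolling up to weeks, a quick sanity check never hurts: re-aggregating the daily amounts by month should give us back the original forecasts. This sketch simply reuses the same subquery factoring clauses as above:

with
every_day as
 ( select date '2016-01-01'+rownum-1 d
   from dual
   connect by level <= to_number(to_char(date '2016-12-31','DDD'))
 ),
date_ranges as
 ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
   from t
 ),
apportioned_forecast as
 ( select every_day.d, date_ranges.fcast, date_ranges.no_of_days
   from every_day,
        date_ranges
   where every_day.d between date_ranges.mth and date_ranges.mth_end
 )
-- should return the 12 original monthly forecasts
select trunc(d,'MM') mth, round(sum(fcast/no_of_days)) fcast
from apportioned_forecast
group by trunc(d,'MM')
order by 1;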
With our forecast data now at a day-based granularity, we simply sum the data by week, using TRUNC(d,'IW') to snap each day to the Monday of its ISO week:
SQL> with
2 every_day as
3 ( select date '2016-01-01'+rownum-1 d
4 from dual
5 connect by level <= to_number(to_char(date '2016-12-31','DDD'))
6 ),
7 date_ranges as
8 ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
9 from t
10 ),
11 apportioned_forecast as
12 ( select every_day.d, date_ranges.fcast, date_ranges.no_of_days
13 from every_day,
14 date_ranges
15 where every_day.d between date_ranges.mth and date_ranges.mth_end
16 )
17 select trunc(d,'IW') wk, round(sum(fcast/no_of_days),2) wk_fcast
18 from apportioned_forecast
19 group by trunc(d,'IW')
20 order by 1;
WK WK_FCAST
--------- ----------
28-DEC-15 216.87
04-JAN-16 506.03
11-JAN-16 506.03
18-JAN-16 506.03
25-JAN-16 506.03
01-FEB-16 459.34
08-FEB-16 459.34
15-FEB-16 459.34
22-FEB-16 459.34
29-FEB-16 869.43
07-MAR-16 937.77
14-MAR-16 937.77
21-MAR-16 937.77
28-MAR-16 982.47
04-APR-16 1042.07
11-APR-16 1042.07
18-APR-16 1042.07
25-APR-16 991.23
02-MAY-16 686.23
09-MAY-16 686.23
16-MAY-16 686.23
23-MAY-16 686.23
30-MAY-16 681.73
06-JUN-16 679.93
13-JUN-16 679.93
20-JUN-16 679.93
27-JUN-16 848.31
04-JUL-16 1072.81
11-JUL-16 1072.81
18-JUL-16 1072.81
25-JUL-16 1072.81
01-AUG-16 660.48
08-AUG-16 660.48
15-AUG-16 660.48
22-AUG-16 660.48
29-AUG-16 465.6
05-SEP-16 319.43
12-SEP-16 319.43
19-SEP-16 319.43
26-SEP-16 443.01
03-OCT-16 751.94
10-OCT-16 751.94
17-OCT-16 751.94
24-OCT-16 751.94
31-OCT-16 856.82
07-NOV-16 874.3
14-NOV-16 874.3
21-NOV-16 874.3
28-NOV-16 921.28
05-DEC-16 956.52
12-DEC-16 956.52
19-DEC-16 956.52
26-DEC-16 819.87
53 rows selected.
Note the partial weeks at either end: the 28-DEC-15 bucket holds only the first three days of January (3 x 72.29 = 216.87), and the 26-DEC-16 bucket holds just the last six days of December, which is why we end up with 53 rows rather than 52.
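And as a final check (again just a sketch over the same data), the apportioned daily amounts, and hence the 53 weekly buckets built from them, should sum back to the same annual total as the 12 monthly values, give or take rounding:

with
every_day as
 ( select date '2016-01-01'+rownum-1 d
   from dual
   connect by level <= to_number(to_char(date '2016-12-31','DDD'))
 ),
date_ranges as
 ( select mth, last_day(mth) mth_end, to_number(to_char(last_day(mth),'DD')) no_of_days, fcast
   from t
 ),
apportioned_forecast as
 ( select every_day.d, date_ranges.fcast, date_ranges.no_of_days
   from every_day,
        date_ranges
   where every_day.d between date_ranges.mth and date_ranges.mth_end
 )
-- the two totals should match
select round(sum(fcast/no_of_days)) apportioned_total,
       (select sum(fcast) from t)   monthly_total
from apportioned_forecast;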
“Let’s assume that the forecast for a month implies an even distribution of that value over each day in that month.”
That’s a HUGE assumption. And probably not a typical one, as it also implies that changes to the daily average occur all at once at the month boundaries.
Realistically though (which is where I like to live), it might be the best one can do.