Whether you call it “Kris Kringle” or “Secret Santa“, there’s much pleasure in giving small gifts to people over the Christmas period. So rather than fully fledged blog posts, I’ll wrap up this year with a series of daily quick tips on the Oracle Database. Ho Ho Ho !
If you are ever looking for a quick and easy way to get an assessment of the distribution of data in a column, a useful function is the WIDTH_BUCKET function
SQL> select width_bucket(amount_sold,20,500,10), amount_sold
2 from sh.sales
3 where rownum <= 20;
WIDTH_BUCKET(AMOUNT_SOLD,20,500,10) AMOUNT_SOLD
----------------------------------- -----------
4 198.97
0 13.95
3 143.18
1 61.22
3 160.37
0 10.15
11 632.19
1 32.4
0 9.54
1 51.96
1 30.53
2 112.8
3 117.02
2 99.52
6 288.16
1 55.87
0 7.13
11 628.89
2 95.33
2 95.66
Lets explore the parameters:
WIDTH_BUCKET(amount_sold,20,500,10) means
- Look at AMOUNT_SOLD values
- Divide those values into 10 buckets, ranging from 20 to 500, so in this case, 10 buckets each of size 48
- For amounts between 20 and 20+48, this is bucket 1
- For amounts between 20+48 and 20+48*2, this is bucket 2, and so forth
- Anything less than 20, gets assigned bucket 0
- Anything greater than 500, gets assigned bucket 11 (being the number of allowed buckets + 1 )
Typically, you will see WIDTH_BUCKET used to do some aggregation, for example:
SQL> select width_bucket(amount_sold,20,500,10) bucket, count(*)
2 from sh.sales
3 group by bucket
4 order by 1;
BUCKET COUNT(*)
---------- ----------
0 252397
1 503825
2 37715
3 29381
4 23127
5 7123
6 6394
7 534
10 1241
11 57106
10 rows selected.
which gives us nice information about the distribution of the data, eg, ~252,000 values below 20, ~57,000 values above 500 and so on.
In case you are wondering how I managed to do a GROUP BY using the column alias, then 23ai brings that nice feature into play. More details on that in the video below
An alternative height balanced histogram of the data could be found using NTILE if that is more your preference
SQL> select decile, min(amount_sold), max(amount_sold), count(*)
2 from (
3 select ntile(10) over (order by amount_sold) as decile , amount_sold
4 from sh.sales
5 )
6 group by decile
7 order by 1;
DECILE MIN(AMOUNT_SOLD) MAX(AMOUNT_SOLD) COUNT(*)
---------- ---------------- ---------------- ----------
1 6.4 9.85 91885
2 9.85 12.95 91885
3 12.95 21.23 91885
4 21.23 26.45 91884
5 26.45 34.24 91884
6 34.24 46.72 91884
7 46.72 50.79 91884
8 50.79 62.45 91884
9 62.45 176.86 91884
10 176.86 1782.72 91884
10 rows selected.
Ho Ho Ho… Merry Christmas!




Got some thoughts? Leave a comment