I’ve been doing a series on Analytics here but if you need to group data into ranges, you don’t necessarily need a fancy analytic. The function WIDTH_BUCKET can easily determine histogram groupings within a table.
SQL> SELECT CUST_ID, CUST_LAST_NAME, CUST_CREDIT_LIMIT,
2 WIDTH_BUCKET(CUST_CREDIT_LIMIT, 100, 5000, 10) credit_bucket
3 FROM sh.customers
4 WHERE country_id = 52787
5 ORDER BY CUST_CREDIT_LIMIT;
CUST_ID CUST_LAST_NAME CUST_CREDIT_LIMIT CREDIT_BUCKET
---------- -------------------- ----------------- -------------
44282 Justice 1500 3
50671 Sandoval 1500 3
26284 Lotto 1500 3
4827 Kessel 1500 3
27671 Lin 1500 3
17284 Fellows 1500 3
9671 Rittenour 1500 3
35283 Gatewood 1500 3
9670 Oppy 1500 3
35284 Thomas 1500 3
18670 Callihan 1500 3
18671 Sager 1500 3
102218 Hornick 1500 3
26283 Geiss 1500 3
45827 Rowe 1500 3
42898 Haske 3000 6
5519 Ballenger 3000 6
33898 Batterton 3000 6
46518 Orm 3000 6
24899 Lightfoot 3000 6
47211 Cartwright 3000 6
20744 Gravel 3000 6
102723 Myczkowski 3000 6
101478 Bishop 3000 6
1050 Overton 3000 6
700 Burnns 3000 6
1023 Newcomer 3000 6
38744 Tazelar 3000 6
29744 Durby 3000 6
41514 Burgess 5000 11
45828 Wood 5000 11
102129 Cay 5000 11
40128 Cain 5000 11
23515 Figgens 5000 11
32514 Lengel 5000 11
101613 Spivak 7000 11
470 Sandstrum 7000 11
763 Dutton 7000 11
36667 Capps 7000 11
446 Jeffreys 7000 11
42302 Everrett 7000 11
46519 East 7000 11
708 Door 7000 11
102178 Bloom 7000 11
123 Rockwell 7000 11
104403 Fenton 7000 11
13133 Colven 7000 11
27666 Remler 7000 11
4133 Carr 9000 11
10362 Tate 9000 11
19362 Rosenblum 9000 11
28362 Titus 9000 11
28458 Kohler 9000 11
6208 Vail 9000 11
2749 Kimball 9000 11
15208 Trimmer 9000 11
11748 Stokley 9000 11
24208 Baley 9000 11
26975 Baer 9000 11
35975 Grubb 9000 11
44974 Grier 9000 11
529 Barone 9000 11
2750 Jansen 10000 11
103571 Kennedy 10000 11
346 Elliott 10000 11
47209 Lieberman 10000 11
34590 Barden 10000 11
28469 Adams 10000 11
6209 Crocker 10000 11
43589 Eppling 10000 11
100761 Zwolinsky 10000 11
15209 Fernandez 10000 11
100824 Roy 11000 11
103845 Moy 15000 11
35402 Elkin 15000 11
75 rows selected.
where the 4 arguments are:
- Expression to be evaluated
- Minimum (start) value
- Maximum (end) value
- Number of buckets in the distribution
Notice that if you extend outside the min and max value, the width bucket returns a value one less or one more then the bucket count.
Technically I’d expect data to be grouped in 10 groups (4th param). I understand it indicates “maximum buckets”, but it would be interesting to see why Oracle decided to group them in bucket# 3, 6, 11 and not others.