Work colleague Chris Saxon published a post about GROUP BY not guaranteeing that a sort would occur, and concluded (correctly) that the only way to ensure that a query result will be sorted is to have an ORDER BY.
Modern versions of the database will use a hashing mechanism to perform a GROUP BY. If you want a more detailed dive into that, check out this video
In the comments section of Chris’s post, someone mentioned “in a prior release a sort would be issued” which was a reference to a time before we started using the GROUP BY HASH operation. However, even without a hash operation, the statement is incorrect.
It would have been correct if we swapped “would” for “could”, namely, that in most situations a sort would be performed. The problem was, developers back in the day would assume that the sort was hence mandatory and thus use a GROUP BY in lieu of a sort which is when all the trouble started.
But “most” is not the same as “always”. Lets look at a couple of examples, which go all the way back to Oracle 7 thus proving you could never blindly assume a sort would occur.
Example 1: We GROUP BY but do not sort.
SQL> create table tab ( pk1 int,
2 pk2 int,
3 c1 int,
4 c2 int,
5 primary key(pk1,pk2) );
Table created.
SQL>
SQL> insert into tab values (1, 1, 2, 2 );
1 row created.
SQL> insert into tab values (2, 1, 1, 1 );
1 row created.
SQL>
SQL> create index tab_idx on tab(pk1,c1,pk2,c2);
Index created.
SQL>
SQL> set autotrace traceonly explain
SQL> select c1, c2, pk1, pk2, count(*)
2 from tab
3 group by c1, c2, pk1, pk2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1837474081
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 104 | 1 (0)| 00:00:01 |
| 1 | SORT GROUP BY NOSORT| | 2 | 104 | 1 (0)| 00:00:01 |
| 2 | INDEX FULL SCAN | TAB_IDX | 2 | 104 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
You can see the existence an operation called GROUP BY NOSORT, which as the name suggests…did not do a sort.
Examples 2: We don’t even do a GROUP BY !
SQL> create table tab ( pk1 int,
2 c1 int,
3 c2 int,
4 primary key(pk1) );
Table created.
SQL>
SQL> insert into tab values (1, 2, 2 );
1 row created.
SQL> insert into tab values (2, 1, 1 );
1 row created.
SQL> create index tab_idx on tab(pk1,c1,c2);
Index created.
SQL>
SQL> set autotrace traceonly explain
SQL> select c1, c2, pk1, count(*)
2 from tab
3 group by c1, c2, pk1;
Execution Plan
----------------------------------------------------------
Plan hash value: 41970856
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 78 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | TAB_IDX | 2 | 78 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------
Bottom Line – If you want something sorted, you must have an ORDER BY
Ho Ho Ho … Merry Christmas!




Leave a reply to Connor McDonald Cancel reply