It’s a simple requirement. We want to transform this:
SQL> select deptno, ename
2 from emp
3 order by 1,2;
DEPTNO ENAME
---------- ----------
10 CLARK
10 KING
10 MILLER
20 ADAMS
20 FORD
20 JONES
20 SCOTT
20 SMITH
30 ALLEN
30 BLAKE
30 JAMES
30 MARTIN
30 TURNER
30 WARD
into this:
DEPTNO MEMBERS
---------- -------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
Dinosaurs like myself remember the terrible trouble we used to go to in order to solve this seemingly simple problem. We could use the MODEL clause,
SQL> select deptno , rtrim(ename,',') enames
2 from ( select deptno,ename,rn
3 from emp
4 model
5 partition by (deptno)
6 dimension by (
7 row_number() over
8 (partition by deptno order by ename) rn
9 )
10 measures (cast(ename as varchar2(40)) ename)
11 rules
12 ( ename[any]
13 order by rn desc = ename[cv()]||','||ename[cv()+1])
14 )
15 where rn = 1
16 order by deptno;
DEPTNO ENAMES
---------- ----------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
or we could use hierarchical trickery,
SQL> select deptno,
2 substr(max(sys_connect_by_path(ename, ',')), 2) members
3 from (select deptno, ename,
4 row_number ()
5 over (partition by deptno order by empno) rn
6 from emp)
7 start with rn = 1
8 connect by prior rn = rn - 1
9 and prior deptno = deptno
10 group by deptno
11 /
DEPTNO MEMBERS
---------- ---------------------------------------------------------
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
20 SMITH,JONES,SCOTT,ADAMS,FORD
10 CLARK,KING,MILLER
or we could build our own aggregation routine from the ground up,
SQL> create or replace type string_agg_type as object
2 (
3 total varchar2(4000),
4
5 static function
6 ODCIAggregateInitialize(sctx IN OUT string_agg_type )
7 return number,
8
9 member function
10 ODCIAggregateIterate(self IN OUT string_agg_type ,
11 value IN varchar2 )
12 return number,
13
14 member function
15 ODCIAggregateTerminate(self IN string_agg_type,
16 returnValue OUT varchar2,
17 flags IN number)
18 return number,
19
20 member function
21 ODCIAggregateMerge(self IN OUT string_agg_type,
22 ctx2 IN string_agg_type)
23 return number
24 );
25 /
Or we had some sort of personality disorder then we could resort to manipulating some XML via XMLDB.
SQL> select deptno,
2 xmltransform
3 ( sys_xmlagg
4 ( sys_xmlgen(ename)
5 ),
6 xmltype
7 (
8 '<?xml version="1.0"?><xsl:stylesheet version="1.0"
9 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
10 <xsl:template match="/">
11 <xsl:for-each select="/ROWSET/ENAME">
12 <xsl:value-of select="text()"/>,</xsl:for-each>
13 </xsl:template>
14 </xsl:stylesheet>'
15 )
16 ).getstringval() members
17 from emp
18 group by deptno;
DEPTNO MEMBERS
---------- --------------------------------------------------------
10 CLARK,MILLER,KING,
20 SMITH,FORD,ADAMS,SCOTT,JONES,
30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD,
Thankfully all of these were solved once we made it to 11g with the simple LISTAGG function
SQL> select deptno,
2 listagg( ename, ',')
3 within group (order by empno) members
4 from emp
5 group by deptno;
DEPTNO MEMBERS
---------- -----------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
I’ve always liked LISTAGG because the function reads like the business requirement we are trying to meet:
“AGG-regate a LIST of ENAMEs in sequence of EMPNO within the grouping of DEPTNO”
But there has been once problem with LISTAGG since it’s arrival in 11g, and that is what to do with duplicate data. Duplicates can either just look messy, for example, when I swap out ENAME for JOB:
SQL> select deptno
2 listagg(job,',') within group ( order by job) as jobs
3 from scott.emp
4 group by deptno
5 order by 1;
DEPTNO JOBS
--------- --------------------------------------------------
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,ANALYST,CLERK,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN,SALESMAN,SALESMAN,SALESMAN
Or they could be even worse in the situation where the number of duplicates results in data exceeding allowing database limits. In the example below, there are hundreds of objects each with the same object type within a schema, and hence the aggregation blows the length limit for a varchar2.
SQL> select owner
2 listagg(object_type,',') within group
3 ( order by object_id ) as types
4 from all_ojects
5 group by owner
6 order by 1;
ERROR:
ORA-01499: result of string concatenation is too long
With 19c, our (technical) prayers have been answered with the long awaited arrival of the DISTINCT extension to the syntax.
SQL> select deptno,
2 listagg(distinct job,',') within group ( order by job ) as jobs
3 from scott.emp
4 group by deptno
5 order by 1;
DEPTNO JOBS
---------- ------------------------------------------------------------
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN
3 rows selected.
SQL> select owner,
2 listagg(distinct object_type,',') within group ( order by object_type ) as types
3 from all_objects
4 group by owner
5 order by 1;
OWNER
------------------------------
TYPES
------------------------------------------------------------------------------------------------------------
------------
APPQOSSYS
SYNONYM,TABLE
AUDSYS
INDEX PARTITION,LIBRARY,PACKAGE,PACKAGE BODY,TABLE,TABLE PARTITION,VIEW
CTXSYS
FUNCTION,INDEX,INDEXTYPE,LIBRARY,OPERATOR,PACKAGE,PACKAGE BODY,PROCEDURE,SEQUENCE,TABLE,TYPE,TYPE BODY,VIEW
etc
Now only if LISTAGG could be extended to support generating CLOBs…
Hello Connor,
Here are 3 pre-19c solutions for “LISTAGG DISTINCT”:
select deptno, listagg(job, ‘,’) within group (order by job) jobs
from (
select deptno,
case row_number() over (partition by deptno, job order by 1)
when 1 then job
end job
from scott.emp
)
group by deptno
order by deptno
/
DEPTNO JOBS
———————————
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN
3 rows selected.
select deptno,
( select listagg(column_value,’,’) within group (order by column_value)
from table(v.t_jobs)
) jobs
from (
select deptno,
collect(distinct job) t_jobs
from scott.emp
group by deptno
) v
order by deptno
/
DEPTNO JOBS
———————————
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN
3 rows selected.
select deptno,
regexp_replace(listagg(job,’,’) within group (order by job), ‘([^,]+)(,\1)+’, ‘\1’) jobs
from scott.emp
group by deptno
order by deptno
/
DEPTNO JOBS
———————————
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN
3 rows selected.
At least for self-brain-development, it sometimes help to be a “slower-upgrader” 🙂
Cheers & Best Regards,
Iudith Mentzel
Hi,
https://community.oracle.com/ideas/12533
As the last entries suggest there are still some rough spots :
I.e. handling (possibly with an exception) the case where the DISTINCT column(s) aren’t the leading ORDER BY ones.
regards,
Hi,
About my previous comment (although I can’t see it yet?) :
I just checked it and LISTAGG-DISTINCT works Ok with non-matching ORDER BY :
https://livesql.oracle.com/apex/livesql/s/hxn75ii2xjqfqhn0c1wxpu2zx
regards,
The “can’t see it yet” is because WordPress seems to randomly pick comments and says they need moderation from me. I’ve not yet worked out the rhyme or reason why
Hi,
Wild guess : could be web-links in the body?
regards,
Hello,
Just as a curiosity, it also works without a WITHIN GROUP clause at all,
and the results appear ordered by the DISTINCT column:
select d.dname,
listagg (DISTINCT e.job,’, ‘ on overflow truncate with count) jobs
from scott.dept d, scott.emp e
where d.deptno = e.deptno
group by d.dname
/
DNAME JOBS
————————————————-
ACCOUNTING CLERK, MANAGER, PRESIDENT
RESEARCH ANALYST, CLERK, MANAGER
SALES CLERK, MANAGER, SALESMAN
3 rows selected.
I don’t think that is related to DISTINCT. I think in 18 we just allowed an optional WITHIN GROUP which effectively means you don’t care about the ordering of elements, eg
SQL> select banner from v$version where rownum = 1;
BANNER
——————————————————————————
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 – 64bit Production
SQL> select d.dname,
2 listagg(e.job,’,’) jobs
3 from scott.dept d, scott.emp e
4 where d.deptno = e.deptno
5 group by d.dname
6 /
listagg (e.job,’,’) jobs
*
ERROR at line 2:
ORA-02000: missing WITHIN keyword
SQL> select banner from v$version where rownum = 1;
BANNER
———————————————————————-
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 – Production
SQL> select d.dname,
2 listagg (e.job,’,’) jobs
3 from scott.dept d, scott.emp e
4 where d.deptno = e.deptno
5 group by d.dname
6 /
DNAME JOBS
————– ——————————————————-
RESEARCH MANAGER,ANALYST,CLERK,CLERK,ANALYST
SALES SALESMAN,SALESMAN,SALESMAN,CLERK,MANAGER,SALESMAN
ACCOUNTING MANAGER,PRESIDENT,CLERK
Hi,
Seems to be unsorted with or without DISTINCT (with none or mismatched ORDER BY) which is Ok. Probably the same algorithm as GROUP BY which stopped sorting decades ago 😉
Would need a lot of meticulous experimentation to see if Oracle simply ignores mismatched ORDER BYs or tries to honor them as much as possible.
regards,
Hello,
I don’t think that having a DISTINCT with a mismatched ORDER BY does make too much sense logically speaking.
For example, in a SELECT DISTINCT you cannot order by something that is NOT part
of the SELECT list.
So, even it technically works and even it Oracle is picking up the distinct values in the mismatched ORDER that you specify, that ORDER becomes irrelevant for the final output.
Cheers & Best Regards,
Iudith
Having Distinct is really a cool feature. But can we get rid of this error in 19c?
ERROR:
ORA-01499: result of string concatenation is too long
I’ve never really understood the requirement. Why do I want >32k of data on a single “line”. Seems odd to me