# LISTAGG hits prime time

Posted by

It’s a simple requirement. We want to transform this:

```
SQL> select deptno, ename
2  from   emp
3  order by 1,2;

DEPTNO ENAME
---------- ----------
10 CLARK
10 KING
10 MILLER
20 FORD
20 JONES
20 SCOTT
20 SMITH
30 ALLEN
30 BLAKE
30 JAMES
30 MARTIN
30 TURNER
30 WARD
```

into this:

```
DEPTNO MEMBERS
---------- -------------------------------------
10 CLARK,KING,MILLER
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
```

Dinosaurs like myself remember the terrible trouble we used to go to in order to solve this seemingly simple problem. We could use the MODEL clause,

```
SQL> select deptno , rtrim(ename,',') enames
2  from ( select deptno,ename,rn
3         from emp
4         model
5         partition by (deptno)
6         dimension by (
7            row_number() over
8             (partition by deptno order by ename) rn
9               )
10         measures (cast(ename as varchar2(40)) ename)
11         rules
12         ( ename[any]
13             order by rn desc = ename[cv()]||','||ename[cv()+1])
14         )
15   where rn = 1
16   order by deptno;

DEPTNO ENAMES
---------- ----------------------------------------
10 CLARK,KING,MILLER
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
```

or we could use hierarchical trickery,

```
SQL> select deptno,
2         substr(max(sys_connect_by_path(ename, ',')), 2) members
3  from (select deptno, ename,
4               row_number ()
5                   over (partition by deptno order by empno) rn
6        from emp)
8  connect by prior rn = rn - 1
9  and prior deptno = deptno
10  group by deptno
11  /

DEPTNO MEMBERS
---------- ---------------------------------------------------------
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
10 CLARK,KING,MILLER
```

or we could build our own aggregation routine from the ground up,

```
SQL> create or replace type string_agg_type as object
2  (
3     total varchar2(4000),
4
5     static function
6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
7          return number,
8
9     member function
10          ODCIAggregateIterate(self IN OUT string_agg_type ,
11                               value IN varchar2 )
12          return number,
13
14     member function
15          ODCIAggregateTerminate(self IN string_agg_type,
16                                 returnValue OUT  varchar2,
17                                 flags IN number)
18          return number,
19
20     member function
21          ODCIAggregateMerge(self IN OUT string_agg_type,
22                             ctx2 IN string_agg_type)
23          return number
24  );
25  /
```

Or we had some sort of personality disorder then we could resort to manipulating some XML via XMLDB.

```
SQL> select deptno,
2     xmltransform
3     ( sys_xmlagg
4        ( sys_xmlgen(ename)
5        ),
6       xmltype
7       (
8         '<?xml version="1.0"?><xsl:stylesheet version="1.0"
9           xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
10            <xsl:template match="/">
11              <xsl:for-each select="/ROWSET/ENAME">
12                <xsl:value-of select="text()"/>,</xsl:for-each>
13            </xsl:template>
14          </xsl:stylesheet>'
15       )
16    ).getstringval() members
17  from emp
18  group by deptno;

DEPTNO MEMBERS
---------- --------------------------------------------------------
10 CLARK,MILLER,KING,
30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD,

```

Thankfully all of these were solved once we made it to 11g with the simple LISTAGG function

```
SQL> select deptno,
2         listagg( ename, ',')
3              within group (order by empno) members
4  from   emp
5  group  by deptno;

DEPTNO MEMBERS
---------- -----------------------------------------
10 CLARK,KING,MILLER
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
```

I’ve always liked LISTAGG because the function reads like the business requirement we are trying to meet:

“AGG-regate a LIST of ENAMEs in sequence of EMPNO within the grouping of DEPTNO”

But there has been once problem with LISTAGG since it’s arrival in 11g, and that is what to do with duplicate data. Duplicates can either just look messy, for example, when I swap out ENAME for JOB:

```
SQL> select deptno
2          listagg(job,',') within group ( order by job) as jobs
3   from   scott.emp
4   group by deptno
5   order by 1;

DEPTNO JOBS
--------- --------------------------------------------------
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,ANALYST,CLERK,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN,SALESMAN,SALESMAN,SALESMAN
```

Or they could be even worse in the situation where the number of duplicates results in data exceeding allowing database limits. In the example below, there are hundreds of objects each with the same object type within a schema, and hence the aggregation blows the length limit for a varchar2.

```
SQL> select owner
2          listagg(object_type,',') within group
3              ( order by object_id ) as types
4   from   all_ojects
5   group by owner
6   order by 1;
ERROR:
ORA-01499: result of string concatenation is too long

```

With 19c, our (technical) prayers have been answered with the long awaited arrival of the DISTINCT extension to the syntax.

```
SQL> select deptno,
2        listagg(distinct job,',') within group ( order by job ) as jobs
3  from   scott.emp
4  group by deptno
5  order by 1;

DEPTNO JOBS
---------- ------------------------------------------------------------
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN

3 rows selected.

SQL> select owner,
2        listagg(distinct object_type,',') within group ( order by object_type ) as types
3  from   all_objects
4  group by owner
5  order by 1;

OWNER
------------------------------
TYPES
------------------------------------------------------------------------------------------------------------
------------
APPQOSSYS
SYNONYM,TABLE

AUDSYS
INDEX PARTITION,LIBRARY,PACKAGE,PACKAGE BODY,TABLE,TABLE PARTITION,VIEW

CTXSYS
FUNCTION,INDEX,INDEXTYPE,LIBRARY,OPERATOR,PACKAGE,PACKAGE BODY,PROCEDURE,SEQUENCE,TABLE,TYPE,TYPE BODY,VIEW

etc

```

1. Rumpi Gravenstein says:

Now only if LISTAGG could be extended to support generating CLOBs…

2. Mentzel Iudith says:

Hello Connor,

Here are 3 pre-19c solutions for “LISTAGG DISTINCT”:

select deptno, listagg(job, ‘,’) within group (order by job) jobs
from (
select deptno,
case row_number() over (partition by deptno, job order by 1)
when 1 then job
end job
from scott.emp
)
group by deptno
order by deptno
/

DEPTNO JOBS
———————————
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN

3 rows selected.

select deptno,
( select listagg(column_value,’,’) within group (order by column_value)
from table(v.t_jobs)
) jobs
from (
select deptno,
collect(distinct job) t_jobs
from scott.emp
group by deptno
) v
order by deptno
/

DEPTNO JOBS
———————————
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN

3 rows selected.

select deptno,
regexp_replace(listagg(job,’,’) within group (order by job), ‘([^,]+)(,\1)+’, ‘\1’) jobs
from scott.emp
group by deptno
order by deptno
/

DEPTNO JOBS
———————————
10 CLERK,MANAGER,PRESIDENT
20 ANALYST,CLERK,MANAGER
30 CLERK,MANAGER,SALESMAN

3 rows selected.

At least for self-brain-development, it sometimes help to be a “slower-upgrader” ðŸ™‚

Cheers & Best Regards,
Iudith Mentzel

3. Racer I. says:

Hi,

https://community.oracle.com/ideas/12533

As the last entries suggest there are still some rough spots :

I.e. handling (possibly with an exception) the case where the DISTINCT column(s) aren’t the leading ORDER BY ones.

regards,

4. Racer I. says:

Hi,

About my previous comment (although I can’t see it yet?) :

I just checked it and LISTAGG-DISTINCT works Ok with non-matching ORDER BY :

regards,

1. The “can’t see it yet” is because WordPress seems to randomly pick comments and says they need moderation from me. I’ve not yet worked out the rhyme or reason why

1. Racer I. says:

Hi,

Wild guess : could be web-links in the body?

regards,

2. Mentzel Iudith says:

Hello,
Just as a curiosity, it also works without a WITHIN GROUP clause at all,
and the results appear ordered by the DISTINCT column:

select d.dname,
listagg (DISTINCT e.job,’, ‘ on overflow truncate with count) jobs
from scott.dept d, scott.emp e
where d.deptno = e.deptno
group by d.dname
/

DNAME JOBS
————————————————-
ACCOUNTING CLERK, MANAGER, PRESIDENT
RESEARCH ANALYST, CLERK, MANAGER
SALES CLERK, MANAGER, SALESMAN

3 rows selected.

1. I don’t think that is related to DISTINCT. I think in 18 we just allowed an optional WITHIN GROUP which effectively means you don’t care about the ordering of elements, eg

SQL> select banner from v\$version where rownum = 1;

BANNER
——————————————————————————
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 – 64bit Production

SQL> select d.dname,
2 listagg(e.job,’,’) jobs
3 from scott.dept d, scott.emp e
4 where d.deptno = e.deptno
5 group by d.dname
6 /
listagg (e.job,’,’) jobs
*
ERROR at line 2:
ORA-02000: missing WITHIN keyword

SQL> select banner from v\$version where rownum = 1;

BANNER
———————————————————————-
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 – Production

SQL> select d.dname,
2 listagg (e.job,’,’) jobs
3 from scott.dept d, scott.emp e
4 where d.deptno = e.deptno
5 group by d.dname
6 /

DNAME JOBS
————– ——————————————————-
RESEARCH MANAGER,ANALYST,CLERK,CLERK,ANALYST
SALES SALESMAN,SALESMAN,SALESMAN,CLERK,MANAGER,SALESMAN
ACCOUNTING MANAGER,PRESIDENT,CLERK

1. Racer I. says:

Hi,

Seems to be unsorted with or without DISTINCT (with none or mismatched ORDER BY) which is Ok. Probably the same algorithm as GROUP BY which stopped sorting decades ago ðŸ˜‰
Would need a lot of meticulous experimentation to see if Oracle simply ignores mismatched ORDER BYs or tries to honor them as much as possible.

regards,

1. Mentzel Iudith says:

Hello,
I don’t think that having a DISTINCT with a mismatched ORDER BY does make too much sense logically speaking.
For example, in a SELECT DISTINCT you cannot order by something that is NOT part
of the SELECT list.
So, even it technically works and even it Oracle is picking up the distinct values in the mismatched ORDER that you specify, that ORDER becomes irrelevant for the final output.
Cheers & Best Regards,
Iudith

5. Having Distinct is really a cool feature. But can we get rid of this error in 19c?

ERROR:
ORA-01499: result of string concatenation is too long

1. I’ve never really understood the requirement. Why do I want >32k of data on a single “line”. Seems odd to me

This site uses Akismet to reduce spam. Learn how your comment data is processed.