Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1373037
  • 博文数量: 172
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3831
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

About me:Oracle ACE pro,optimistic,passionate and harmonious. Focus on ORACLE,MySQL and other database programming,peformance tuning,db design, j2ee,Linux/AIX,Architecture tech,etc

文章分类

全部博文(172)

文章存档

2024年(27)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

分类: Oracle

2020-06-30 09:11:46

1. ORACLE 19C LISTAGG函数功能增强


1.1 LISTAGG函数简介

LISTAGG函数是11g R2引入用于合并字符串(列转行),可替换wmsys.wm_concat功能,而且效率更好wm_concat函数是自定义聚集函数,而且是undocument的,从12C开始已经去掉,从11g R2开始,有字符串合并的需求,最好使用LISTAGG替代,以获得更好的性能。

 

但是,LISTAGG函数在11g R2中有个缺点,就是不能直接DISTINCT,在ORACLE 19C中,ORACLE给它增加了DISTINCT功能,这样可以剔除重复的字符串合并19C的完整LISTAGG语法如下:



 从语法图上可以看出,LISTAGG函数可以是普通的组函数,也可以用于分析函数,并且12C开始增加了OVERFLOW语法,用于字符串过长的处理。

 

  

1.2 使用LISTAGG函数实现字符串合并

 

如下例所示:

 

需求:对emp表,按照部门分组,按逗号合并部门员工名。

使用LISTAGG实现如下:

 

 

select deptno, listagg(ename,',') within group(order by deptno) as enames

  from emp

  group by deptno

order by deptno;

 

DEPTNO ENAMES

------------------------------------------------------------------------------

    10 CLARK,KING,MILLER

    20 ADAMS,FORD,JONES,SCOTT,SMITH

    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

下面给DEPTNO=10的部门插入2个重复名字的员工,如下所示:

 

INSERT INTO emp VALUES (8000,'KING','ANALYST',7782,to_date('1983-1-1','yyyy-mm-dd'),2000,NULL,10);

INSERT INTO emp VALUES (9000,'KING','MANADER',7782,to_date('1984-5-1','yyyy-mm-dd'),2500,NULL,10);

COMMIT;

 

再次查询,发现有重复的数据:

SQL> select deptno, listagg(ename,',') within group(order by deptno) as enames

  2    from emp

  3    group by deptno

  4  order by deptno;

 

DEPTNO ENAMES

------ --------------------------------------------------------------------------------

    10 CLARK,KING,KING,KING,MILLER

    20 ADAMS,FORD,JONES,SCOTT,SMITH

    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

 
 
一般遇到这种有重复数据的情况,需要剔除重复数据,在19C之前,需要先剔除重复数据:

 

--红色部分使用分析函数剔重,当然这里直接distinct也可以,结果与最前面的一致

select deptno, listagg(ename, ',') within group(order by deptno) as enames

  from (select deptno,

               ename,

               row_number() over(partition by deptno, ename order by empno) rn

          from emp)

 where rn = 1

 group by deptno

 order by deptno;

 

或者直接DISTINCT :

select deptno, listagg(ename, ',') within group(order by deptno) as enames

  from (select distinct deptno,

               ename

          from emp)

 group by deptno

 order by deptno;

 

DEPTNO ENAMES

-------------------------------------------------------------------------------

10 CLARK,KING,MILLER

20 ADAMS,FORD,JONES,SCOTT,SMITH

30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

 

 

  到了19CLISTAGG提供了直接DISTINCT的功能,可以很简单地实现剔除重复数据,然后合并,如下所示:

 

select deptno, listagg(distinct ename,',') within group(order by deptno) as enames

  from emp

 group by deptno

 order by deptno;

 

DEPTNO ENAMES

-------------------------------------------------------------------------------

10 CLARK,KING,MILLER

20 ADAMS,FORD,JONES,SCOTT,SMITH

30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

      

  默认情况下包含所有数据(ALL不用写默认),也就是LISTAGG(ALL column),如下所示:

select deptno, listagg(all ename,',') within group(order by deptno) as enames

  from emp

 group by deptno

 order by deptno;

 

1.3 WM_CONCAT函数与LISTAGG函数比较

如果使用wm_concat,则是:

select deptno, wm_concat(distinct ename) as enames

  from emp

 group by deptno

 order by deptno;

 

DEPTNO ENAMES

------ --------------------------------------------------------------------------------

    10 CLARK,KING,MILLER

    20 ADAMS,FORD,JONES,SCOTT,SMITH

    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

wm_concat这个undocument函数可以直接distinct,但是它与LISTAGG比缺点很明显,一是性能差,而是undocument函数,比较如下:

 

1)wm_concat可以distinct去除重复的(用于分析函数,distinct不能带order by),19C之前的listagg不可以,可以先剔除重复,然后做listagg,19C之后的listagg可以distinct。

2)wm_concat不能保证排序,listagg可以保证排序.

3.WMSYS.WM_CONCAT是undocument的函数,最好不要使用,要使用也用自定义分析函数

 

1.4  使用ON OVERFLOW处理字符串长度溢出问题

LISTAGG函数在12.2开始,合并长度最多32767字节,依赖于MAX_STRING_SIZE参数,如下所示:

1)  如果MAX_STRING_SIZE=EXTEND,则对于VARCHAR2RAW类型,最多返回32767字节

2)  如果MAX_STRING_SIZE=STANDARD,则对于VARCHAR2最多4000字节,对于RAW类型最多2000字节

 

那么在合并的字符串超过限制,溢出时,默认报错,但是12.2引入了ON OVERFLOW可以截断处理,如下:

 

ON OVERFLOW TRUNCATE默认对溢出数据后面用…(count),如下所示:

 

SELECT deptno, LISTAGG(ename, ','  ON OVERFLOW TRUNCATE) WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

DEPTNO ENAMES

------ ----------------------------------------------------------------------------------------------------

    10 CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLAR

       省略

       ARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,

       CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,...(4334)

 

    20 ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAM

       S,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,AD

       AMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,

       省略

       ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,...(4334)

 

    30 ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLE

       N,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,AL

       LEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,

      省略

       LEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,

       ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,...(5333)

 

其他用法如下:

--1.改变省略的格式,换成~~~

SELECT deptno, LISTAGG(ename, ',' ON OVERFLOW TRUNCATE '~~~') WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

--2.通过WINTHOUT COUNT省略掉计数

SELECT deptno, LISTAGG(ename, ',' ON OVERFLOW TRUNCATE WITHOUT COUNT) WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

--3.默认格式ON OVERFLOW ERROW,溢出则报错ORA-01489: result of string concatenation is too long

 

SELECT deptno, LISTAGG(ename, ',' ON OVERFLOW ERROR) WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

1.5 使用XMLAGG返回CLOB处理字符串合并超长需求

针对超过字符串长度溢出的问题,如果要完整显示,可以使用如下方式:

1) 自定义wm_concat返回值类型为CLOB

2) 使用XMLAGG函数获取CLOB值

下面使用XMLAGG函数处理超长合并字符串问题,如下所示:

 

SELECT deptno,  RTRIM(xmlagg(xmlelement(c, ename || ',')

ORDER BY deptno).extract('//text()').getclobval(),

             ',') AS enames

FROM  (select deptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

总结:ORACLE 19C针对LISTAGG增加了很多改进,在以后有字符串合并需求(列转行)的时候,要优先使用LISTAGG,而不是WM_CONCAT,对于自定义聚集函数,性能较差,而且功能也没有LISTAGG强大。

阅读(3262) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~