GaussDB(DWS)迁移 – oracle兼容 — CONNECT BY迁移

CONNECT BY是Oracle实现递归处理的SQL写法,DWS不支持CONNECT BY语法,但是有类似的WITH RECURSIVE语法可以实现递归逻辑,下面通过几个case描述CONNECT BY迁移到DWS的改写语法。

注意:

  1. case4中涉及connect_by_root、sys_connect_by_path、伪劣level相关逻辑
  2. case4中涉及CONNECT BY前有WHERE条件的场景,此场景及其特殊,请详细查看解析信息

0. 预置对象定义

DROP TABLE test_cb;

CREATE TABLE test_cb(id int,parent_id int);
INSERT INTO test_cb VALUES(0,'');
INSERT INTO test_cb VALUES(1,0);
INSERT INTO test_cb VALUES(11,1);
INSERT INTO test_cb VALUES(12,1);
INSERT INTO test_cb VALUES(111,11);
INSERT INTO test_cb VALUES(2,0);
INSERT INTO test_cb VALUES(21,2);
INSERT INTO test_cb VALUES(22,2);
INSERT INTO test_cb VALUES(222,22);
INSERT INTO test_cb VALUES(211,21);
INSERT INTO test_cb VALUES(212,21);

1. CONNECT BY PRIOR

PRIOR在等值递归条件左侧,等值条件左侧的作为驱动递归的条件

1.1 Oracle原始语句

SELECT
    id, parent_id,level lvl 
FROM test_cb 
CONNECT BY PRIOR id=parent_id 
START WITH parent_id IS NULL 
ORDER BY id;

1.2 DWS改写语句

WITH RECURSIVE tmp_cb AS(
    SELECT 
        id,parent_id,1 AS level 
    FROM test_cb
    WHERE parent_id IS NULL

    UNION ALL

    SELECT 
        a.id, a.parent_id, b.level+1 AS level 
    FROM test_cb a,tmp_cb b 
    WHERE a.parent_id=b.id
)
SELECT * 
FROM tmp_cb
ORDER BY id;

2. CONNECT BY .. PRIOR

PRIOR在等值递归条件右侧,等值条件右侧的作为驱动递归的条件。

2.1 oracle原始语句

SELECT
    id, parent_id,level lvl 
FROM test_cb 
CONNECT BY id = PRIOR parent_id 
START WITH parent_id =22 
ORDER BY id;

2.2 DWS改写语句

WITH RECURSIVE tmp_cb AS(
    SELECT 
        id,parent_id,1 AS level 
    FROM test_cb
    WHERE parent_id = 22

    UNION ALL

    SELECT 
        a.id, a.parent_id, b.level+1 AS level 
    FROM test_cb a,tmp_cb b 
    WHERE a.id = b.parent_id
)
SELECT * 
FROM tmp_cb
ORDER BY id;

3. PRIOR DBMS_RANDOM.VALUE

3.1 oracle原始语句

-- 因为自身与自身递归,CONNECT BY PRIOR ID=ID会报循环错误,因此为了欺骗ORACLE,我每次递归的条件是
-- 没有循环的,增加PRIOR DBMS_RANDOM.VALUE IS NOT NULL,如下:
WITH t AS (
  SELECT 1 id,5 times FROM dual 
  UNION ALL
  SELECT 2,3 FROM dual
 )
SELECT id, level FROM t
CONNECT BY PRIOR id=id AND LEVEL<=times AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
ORDER BY id, level;

3.2 DWS改写语句

WITH RECURSIVE t_recur AS(
    WITH t AS (
        SELECT 1 id,5 times FROM dual 
        UNION ALL
        SELECT 2,3 FROM dual
    )

    SELECT id, times, 1 AS level FROM t
    UNION ALL
    SELECT
       r.id,
       r.times,
       (r.level + 1) AS level
    FROM t_recur r
    INNER JOIN t ON r.id = t.id AND r.level + 1 <= r.times
)

SELECT id, level from t_recur
ORDER BY id, level;

4. CONNECT BY NOCYCLE

在递归语法中,单递归条件成环时,执行会陷入死循环,导致验证的执行问题。针对这种场景,在Oracle语法中当指定了NOCYCLE关键字时,可以自动识别这种场景打破这种循环。与此相关的还有一个伪劣CONNECT_BY_ISCYCLE,当伪劣CONNECT_BY_ISCYCLE值为1时,表示当前节点再往下递归就会触发死循环。

INSERT INTO test_cb VALUES(0, 211);
INSERT INTO test_cb VALUES(2, 212);

4.1 oracle原始语句

SELECT
    connect_by_root(id) id_root,
    id,parent_id,
    sys_connect_by_path(i.id, '->') as id_path,
    sys_connect_by_path(i.parent_id, '->') as parent_id_path,  
    level AS instrument_level
FROM test_cb i
WHERE id >= 0 AND parent_id >= 0
CONNECT BY nocycle PRIOR id = parent_id
START WITH id = 2 AND parent_id = 0
ORDER BY instrument_level, parent_id, id 
;

oracle中的执行计划

GaussDB(DWS)迁移 - oracle兼容 -- CONNECT BY迁移

关键点:

1)语句特征:SQL语句中WHERE字句包含非关联条件

2)执行特征:WHERE字句中的非关联条件计算的优先级低于CONNECT BY,即CONNECT BY执行之后才会进行这些条件的过滤,如上id=2的FILTER条件

4.2 DWS等价改写逻辑

WITH RECURSIVE t_recur AS (
    SELECT
        id AS id_root,
        id,parent_id,
        '->' || i.id AS id_path,
        '->' || i.parent_id AS parent_id_path,
        1 AS instrument_level,
        '#' || id || '#' AS check_val 
    FROM test_cb i
    WHERE id = 2 AND parent_id = 0

    UNION ALL

    SELECT
        r.id_root,
        i.id, i.parent_id,
        r.id_path || '->' || i.id AS id_path,
        r.parent_id_path || '->' || i.parent_id AS parent_id_path,
        (r.instrument_level + 1) AS instrument_level,
        check_val || i.id || '#'  AS check_val
    FROM t_recur r
    INNER JOIN test_cb i ON r.id = i.parent_id AND position('#' || i.id || '#' in check_val) = 0
)

SELECT 
    t.id_root, t.id, t.parent_id, t.id_path, t.parent_id_path, t.instrument_level
FROM t_recur t
WHERE t.id >= 0 AND t.parent_id >= 0
ORDER BY t.instrument_level, t.parent_id, t.id;

5. 伪劣CONNECT_BY_ISLEAF

5.1 oracle原始语句

SELECT
    connect_by_root(id) id_root,
    id,parent_id,
    sys_connect_by_path(i.id, '->') as id_path,
    sys_connect_by_path(i.parent_id, '->') as parent_id_path,  
    level AS instrument_level,
    connect_by_isleaf isleaf
FROM test_cb i
WHERE id >= 0 AND parent_id >= 0
CONNECT BY nocycle PRIOR id = parent_id
START WITH id = 2 AND parent_id = 0
ORDER BY instrument_level, parent_id, id 
;

5.2 DWS等价改写逻辑

WITH RECURSIVE t_recur AS (
    SELECT
        id AS id_root,
        id,parent_id,
        '->' || i.id AS id_path,
        '->' || i.parent_id AS parent_id_path,
        1 AS instrument_level,
        '#' || id || '#' AS check_val
    FROM test_cb i
    WHERE id = 2 AND parent_id = 0

    UNION ALL

    SELECT
        r.id_root,
        i.id, i.parent_id,
        r.id_path || '->' || i.id AS id_path,
        r.parent_id_path || '->' || i.parent_id AS parent_id_path,
        (r.instrument_level + 1) AS instrument_level,
        check_val || i.id || '#'  AS check_val
    FROM t_recur r
    INNER JOIN test_cb i ON r.id = i.parent_id AND position('#' || i.id || '#' in check_val) = 0
)

SELECT 
    t.id_root, t.id, t.parent_id, t.id_path, t.parent_id_path, t.instrument_level,
    (position(reverse('#' || id || '#') in reverse(check_val)) = 1 -- 当前只出现在check_val的尾部
      AND NOT EXISTS(SELECT 1 FROM test_cb i WHERE t.id = i.parent_id)  -- 没有父节点
    )::int AS isleaf
FROM t_recur t
WHERE t.id >= 0 AND t.parent_id >= 0
ORDER BY t.instrument_level, t.parent_id, t.id;

原创文章,作者:奋斗,如若转载,请注明出处:https://blog.ytso.com/tech/bigdata/317456.html

(0)
上一篇 43秒前
下一篇 2021年11月14日 18:13

相关推荐

发表回复

登录后才能评论