本篇内容介绍了“PostgreSQL中Review subquery_planner函数的实现逻辑是什么”的有关知识,在实际案例的操作过程中,不少人都会遇到这样的困境,接下来就让小编带领大家学习一下如何处理这些情况吧!希望大家仔细阅读,能够学有所成!
一、源码解读
subquery_planner函数由函数standard_planner调用,生成最终的结果Relation(成本最低),其输出作为生成实际执行计划的输入,在此函数中会调用grouping_planner执行主要的计划过程
/*-------------------- * subquery_planner * Invokes the planner on a subquery. We recurse to here for each * sub-SELECT found in the query tree. * 对子查询进行执行规划。对于查询树中的每个子查询(sub-SELECT),都会递归此处理过程。 * * glob is the global state for the current planner run. * parse is the querytree produced by the parser & rewriter. * parent_root is the immediate parent Query's info (NULL at the top level). * hasRecursion is true if this is a recursive WITH query. * tuple_fraction is the fraction of tuples we expect will be retrieved. * tuple_fraction is interpreted as explained for grouping_planner, below. * glob-当前计划器运行的全局状态。 * parse-由解析器和重写器生成的查询树querytree。 * parent_root是父查询的信息(如为顶层则为空)。 * hasRecursion-如果这是一个带查询的递归,值为T。 * tuple_fraction-扫描元组的比例。tuple_fraction在grouping_planner中详细解释。 * * Basically, this routine does the stuff that should only be done once * per Query object. It then calls grouping_planner. At one time, * grouping_planner could be invoked recursively on the same Query object; * that's not currently true, but we keep the separation between the two * routines anyway, in case we need it again someday. * 基本上,这个函数包含完成了每个Query只需要执行一次的任务。 * 该函数调用grouping_planner一次。在同一个Query上,每次递归grouping_planner都调用一次; * 当然,这不是通常的情况,但我们仍然保持这两个例程(subquery_planner和grouping_planner)之间的分离, * 以防有一天我们再次需要它。 * * subquery_planner will be called recursively to handle sub-Query nodes * found within the query's expressions and rangetable. * 函数subquery_planner将被递归调用,以处理表达式和RTE中的子查询节点。 * * Returns the PlannerInfo struct ("root") that contains all data generated * while planning the subquery. In particular, the Path(s) attached to * the (UPPERREL_FINAL, NULL) upperrel represent our conclusions about the * cheapest way(s) to implement the query. The top level will select the * best Path and pass it through createplan.c to produce a finished Plan. * 返回PlannerInfo struct(“root”),它包含在计划子查询时生成的所有数据。 * 特别地,访问路径附加到(UPPERREL_FINAL, NULL) 上层关系中,以代表优化器已找到查询成本最低的方法. * 顶层将选择最佳路径并将其通过createplan.c传递以制定一个已完成的计划。 *-------------------- */ /* 输入: glob-PlannerGlobal parse-Query结构体指针 parent_root-父PlannerInfo Root节点 hasRecursion-是否递归? tuple_fraction-扫描Tuple比例 输出: PlannerInfo指针 */ PlannerInfo * subquery_planner(PlannerGlobal *glob, Query *parse, PlannerInfo *parent_root, bool hasRecursion, double tuple_fraction) { PlannerInfo *root;//返回值 List *newWithCheckOptions;// List *newHaving;//Having子句 bool hasOuterJoins;//是否存在Outer Join? RelOptInfo *final_rel;// ListCell *l;//临时变量 /* Create a PlannerInfo data structure for this subquery */ //创建一个规划器数据结构:PlannerInfo root = makeNode(PlannerInfo);//构造返回值 root->parse = parse; root->glob = glob; root->query_level = parent_root ? parent_root->query_level + 1 : 1; root->parent_root = parent_root; root->plan_params = NIL; root->outer_params = NULL; root->planner_cxt = CurrentMemoryContext; root->init_plans = NIL; root->cte_plan_ids = NIL; root->multiexpr_params = NIL; root->eq_classes = NIL; root->append_rel_list = NIL; root->rowMarks = NIL; memset(root->upper_rels, 0, sizeof(root->upper_rels)); memset(root->upper_targets, 0, sizeof(root->upper_targets)); root->processed_tlist = NIL; root->grouping_map = NULL; root->minmax_aggs = NIL; root->qual_security_level = 0; root->inhTargetKind = INHKIND_NONE; root->hasRecursion = hasRecursion; if (hasRecursion) root->wt_param_id = SS_assign_special_param(root); else root->wt_param_id = -1; root->non_recursive_path = NULL; root->partColsUpdated = false; /* * If there is a WITH list, process each WITH query and build an initplan * SubPlan structure for it. * 如果有一个WITH链表,使用查询处理每个链表,并为其构建一个initplan子计划结构。 */ if (parse->cteList) SS_process_ctes(root);//处理With 语句 /* * Look for ANY and EXISTS SubLinks in WHERE and JOIN/ON clauses, and try * to transform them into joins. Note that this step does not descend * into subqueries; if we pull up any subqueries below, their SubLinks are * processed just before pulling them up. * 查找WHERE和JOIN/ON子句中的ANY/EXISTS子句,并尝试将它们转换为JOIN。 * 注意,此步骤不会下降为子查询;如果我们上拉子查询,它们的SubLinks将在调出它们上拉前被处理。 */ if (parse->hasSubLinks) pull_up_sublinks(root); //上拉子链接 /* * Scan the rangetable for set-returning functions, and inline them if * possible (producing subqueries that might get pulled up next). * Recursion issues here are handled in the same way as for SubLinks. * 扫描RTE中的set-returning函数, * 如果可能,内联它们(生成下一个可能被上拉的子查询)。 * 这里递归问题的处理方式与SubLinks相同。 */ inline_set_returning_functions(root);// /* * Check to see if any subqueries in the jointree can be merged into this * query. * 检查连接树中的子查询是否可以合并到该查询中(上拉子查询) */ pull_up_subqueries(root);//上拉子查询 /* * If this is a simple UNION ALL query, flatten it into an appendrel. We * do this now because it requires applying pull_up_subqueries to the leaf * queries of the UNION ALL, which weren't touched above because they * weren't referenced by the jointree (they will be after we do this). * 如果这是一个简单的UNION ALL查询,则将其ftatten为appendrel结构。 * 我们现在这样做是因为它需要对UNION ALL的叶子查询应用pull_up_subqueries, * 上面没有涉及到这些查询,因为它们没有被jointree引用(在我们这样做之后它们将被引用)。 */ if (parse->setOperations) flatten_simple_union_all(root);//扁平化处理UNION ALL /* * Detect whether any rangetable entries are RTE_JOIN kind; if not, we can * avoid the expense of doing flatten_join_alias_vars(). Also check for * outer joins --- if none, we can skip reduce_outer_joins(). And check * for LATERAL RTEs, too. This must be done after we have done * pull_up_subqueries(), of course. * 检测是否有任何RTE中的元素是RTE_JOIN类型;如果没有,可以避免执行refin_join_alias_vars()的开销。 * 检查外部连接——如果没有,可以跳过reduce_outer_join()函数。同样的,我们会检查LATERAL RTEs。 * 当然,这必须在我们完成pull_up_subqueries()调用之后完成。 */ //判断RTE中是否存在RTE_JOIN? root->hasJoinRTEs = false; root->hasLateralRTEs = false; hasOuterJoins = false; foreach(l, parse->rtable) { RangeTblEntry *rte = lfirst_node(RangeTblEntry, l); if (rte->rtekind == RTE_JOIN) { root->hasJoinRTEs = true; if (IS_OUTER_JOIN(rte->jointype)) hasOuterJoins = true; } if (rte->lateral) root->hasLateralRTEs = true; } /* * Preprocess RowMark information. We need to do this after subquery * pullup (so that all non-inherited RTEs are present) and before * inheritance expansion (so that the info is available for * expand_inherited_tables to examine and modify). * 预处理RowMark信息。 * 我们需要在子查询上拉(以便所有非继承的RTEs都存在)和继承展开之后完成 * (以便expand_inherited_tables可以使用这个信息来检查和修改)。 */ //预处理RowMark信息 preprocess_rowmarks(root); /* * Expand any rangetable entries that are inheritance sets into "append * relations". This can add entries to the rangetable, but they must be * plain base relations not joins, so it's OK (and marginally more * efficient) to do it after checking for join RTEs. We must do it after * pulling up subqueries, else we'd fail to handle inherited tables in * subqueries. * 将继承集的任何可范围条目展开为“append relations”。 * 将相关的relation添加到RTE中,但它们必须是纯基础关系而不是连接, * 因此在检查连接RTEs之后执行它是可以的(而且更有效)。 * 我们必须在启动子查询后执行,否则我们将无法在子查询中处理继承表。 */ //展开继承表 expand_inherited_tables(root); /* * Set hasHavingQual to remember if HAVING clause is present. Needed * because preprocess_expression will reduce a constant-true condition to * an empty qual list ... but "HAVING TRUE" is not a semantic no-op. * 如果存在HAVING子句,则务必设置hasHavingQual属性。 * 因为preprocess_expression将把constant-true条件减少为空的条件qual列表… * 但是,“HAVING TRUE”并没有语义错误。 */ //是否存在Having表达式 root->hasHavingQual = (parse->havingQual != NULL); /* Clear this flag; might get set in distribute_qual_to_rels */ //清除hasPseudoConstantQuals标记,该标记可能在distribute_qual_to_rels函数中设置 root->hasPseudoConstantQuals = false; /* * Do expression preprocessing on targetlist and quals, as well as other * random expressions in the querytree. Note that we do not need to * handle sort/group expressions explicitly, because they are actually * part of the targetlist. * 对targetlist和quals以及querytree中的其他随机表达式进行表达式预处理。 * 注意,我们不需要显式地处理sort/group表达式,因为它们实际上是targetlist的一部分。 */ //预处理表达式:targetList(投影列) parse->targetList = (List *) preprocess_expression(root, (Node *) parse->targetList, EXPRKIND_TARGET); /* Constant-folding might have removed all set-returning functions */ //Constant-folding 可能已经把set-returning函数去掉 if (parse->hasTargetSRFs) parse->hasTargetSRFs = expression_returns_set((Node *) parse->targetList); newWithCheckOptions = NIL; foreach(l, parse->withCheckOptions)//witch Check Options { WithCheckOption *wco = lfirst_node(WithCheckOption, l); wco->qual = preprocess_expression(root, wco->qual, EXPRKIND_QUAL); if (wco->qual != NULL) newWithCheckOptions = lappend(newWithCheckOptions, wco); } parse->withCheckOptions = newWithCheckOptions; //返回列信息returningList parse->returningList = (List *) preprocess_expression(root, (Node *) parse->returningList, EXPRKIND_TARGET); //预处理条件表达式 preprocess_qual_conditions(root, (Node *) parse->jointree); //预处理Having表达式 parse->havingQual = preprocess_expression(root, parse->havingQual, EXPRKIND_QUAL); //窗口函数 foreach(l, parse->windowClause) { WindowClause *wc = lfirst_node(WindowClause, l); /* partitionClause/orderClause are sort/group expressions */ wc->startOffset = preprocess_expression(root, wc->startOffset, EXPRKIND_LIMIT); wc->endOffset = preprocess_expression(root, wc->endOffset, EXPRKIND_LIMIT); } //Limit子句 parse->limitOffset = preprocess_expression(root, parse->limitOffset, EXPRKIND_LIMIT); parse->limitCount = preprocess_expression(root, parse->limitCount, EXPRKIND_LIMIT); //On Conflict子句 if (parse->onConflict) { parse->onConflict->arbiterElems = (List *) preprocess_expression(root, (Node *) parse->onConflict->arbiterElems, EXPRKIND_ARBITER_ELEM); parse->onConflict->arbiterWhere = preprocess_expression(root, parse->onConflict->arbiterWhere, EXPRKIND_QUAL); parse->onConflict->onConflictSet = (List *) preprocess_expression(root, (Node *) parse->onConflict->onConflictSet, EXPRKIND_TARGET); parse->onConflict->onConflictWhere = preprocess_expression(root, parse->onConflict->onConflictWhere, EXPRKIND_QUAL); /* exclRelTlist contains only Vars, so no preprocessing needed */ } //集合操作(AppendRelInfo) root->append_rel_list = (List *) preprocess_expression(root, (Node *) root->append_rel_list, EXPRKIND_APPINFO); //RTE /* Also need to preprocess expressions within RTEs */ foreach(l, parse->rtable) { RangeTblEntry *rte = lfirst_node(RangeTblEntry, l); int kind; ListCell *lcsq; if (rte->rtekind == RTE_RELATION) { if (rte->tablesample) rte->tablesample = (TableSampleClause *) preprocess_expression(root, (Node *) rte->tablesample, EXPRKIND_TABLESAMPLE);//数据表采样语句 } else if (rte->rtekind == RTE_SUBQUERY)//子查询 { /* * We don't want to do all preprocessing yet on the subquery's * expressions, since that will happen when we plan it. But if it * contains any join aliases of our level, those have to get * expanded now, because planning of the subquery won't do it. * That's only possible if the subquery is LATERAL. * 我们还不想对子查询的表达式进行预处理,因为这将在计划时发生。 * 但是,如果它包含当前级别的任何连接别名,那么现在就必须扩展这些别名, * 因为子查询的计划无法做到这一点。只有在子查询是LATERAL的情况下才有可能。 */ if (rte->lateral && root->hasJoinRTEs) rte->subquery = (Query *) flatten_join_alias_vars(root, (Node *) rte->subquery); } else if (rte->rtekind == RTE_FUNCTION)//函数 { /* Preprocess the function expression(s) fully */ //预处理函数表达式 kind = rte->lateral ? EXPRKIND_RTFUNC_LATERAL : EXPRKIND_RTFUNC; rte->functions = (List *) preprocess_expression(root, (Node *) rte->functions, kind); } else if (rte->rtekind == RTE_TABLEFUNC)//TABLE FUNC { /* Preprocess the function expression(s) fully */ kind = rte->lateral ? EXPRKIND_TABLEFUNC_LATERAL : EXPRKIND_TABLEFUNC; rte->tablefunc = (TableFunc *) preprocess_expression(root, (Node *) rte->tablefunc, kind); } else if (rte->rtekind == RTE_VALUES)//VALUES子句 { /* Preprocess the values lists fully */ kind = rte->lateral ? EXPRKIND_VALUES_LATERAL : EXPRKIND_VALUES; rte->values_lists = (List *) preprocess_expression(root, (Node *) rte->values_lists, kind); } /* * Process each element of the securityQuals list as if it were a * separate qual expression (as indeed it is). We need to do it this * way to get proper canonicalization of AND/OR structure. Note that * this converts each element into an implicit-AND sublist. * 处理securityQuals列表的每个元素,就好像它是一个单独的qual表达式(事实也是如此)。 * 之所以这样做,是因为需要获得适当的规范化AND/OR结构。 * 注意,这将把每个元素转换为隐含的子列表。 */ foreach(lcsq, rte->securityQuals) { lfirst(lcsq) = preprocess_expression(root, (Node *) lfirst(lcsq), EXPRKIND_QUAL); } } /* * Now that we are done preprocessing expressions, and in particular done * flattening join alias variables, get rid of the joinaliasvars lists. * They no longer match what expressions in the rest of the tree look * like, because we have not preprocessed expressions in those lists (and * do not want to; for example, expanding a SubLink there would result in * a useless unreferenced subplan). Leaving them in place simply creates * a hazard for later scans of the tree. We could try to prevent that by * using QTW_IGNORE_JOINALIASES in every tree scan done after this point, * but that doesn't sound very reliable. * 现在,已经完成了预处理表达式,特别是扁平化连接别名变量,现在可以去掉joinaliasvars链表了。 * 它们不再匹配树中其他部分中的表达式,因为我们没有在那些链表中预处理表达式 * (而且是不希望这样做,例如,在那里展开一个SubLink将导致无用的未引用的子计划)。 * 把它们放在链表中只会给以后扫描树造成问题。 * 我们可以在这之后的每一次树扫描中使用QTW_IGNORE_JOINALIASES来防止这种情况,虽然这听起来不太可靠。 */ if (root->hasJoinRTEs) { foreach(l, parse->rtable) { RangeTblEntry *rte = lfirst_node(RangeTblEntry, l); rte->joinaliasvars = NIL; } } /* * In some cases we may want to transfer a HAVING clause into WHERE. We * cannot do so if the HAVING clause contains aggregates (obviously) or * volatile functions (since a HAVING clause is supposed to be executed * only once per group). We also can't do this if there are any nonempty * grouping sets; moving such a clause into WHERE would potentially change * the results, if any referenced column isn't present in all the grouping * sets. (If there are only empty grouping sets, then the HAVING clause * must be degenerate as discussed below.) * 在某些情况下,我们可能想把“HAVING”条件转移到WHERE子句中。 * 如果HAVING子句包含聚合(显式的)或易变volatile函数(因为每个GROUP只执行一次HAVING子句),就不能这样做。 * 如果有任何非空GROUPING SET,也不能这样做; * 如果在所有GROUPING SET中没有出现任何引用列,将这样的子句移动到WHERE可能会改变结果。 * (如果只有空的GROUP SET分组集,则可以按照下面讨论的那样简化HAVING子句->WHERE中。) * * Also, it may be that the clause is so expensive to execute that we're * better off doing it only once per group, despite the loss of * selectivity. This is hard to estimate short of doing the entire * planning process twice, so we use a heuristic: clauses containing * subplans are left in HAVING. Otherwise, we move or copy the HAVING * clause into WHERE, in hopes of eliminating tuples before aggregation * instead of after. * 而且,执行子句的成本非常高,所以最好每组只执行一次,尽管这样会导致选择性selectivity。 * 如果不把整个规划过程重复一遍,这是很难估计的,因此我们使用启发式的方法: * 包含子计划的条款在HAVING的后面。 * 否则,我们将把HAVING子句移动到WHERE中,希望在聚合之前而不是聚合之后消除元组。 * * If the query has explicit grouping then we can simply move such a * clause into WHERE; any group that fails the clause will not be in the * output because none of its tuples will reach the grouping or * aggregation stage. Otherwise we must have a degenerate (variable-free) * HAVING clause, which we put in WHERE so that query_planner() can use it * in a gating Result node, but also keep in HAVING to ensure that we * don't emit a bogus aggregated row. (This could be done better, but it * seems not worth optimizing.) * 如果查询有显式分组,那么可以简单地将这样的子句移动到WHERE中; * 任何失败的GROUP子句都不会出现在输出中,因为它的元组不会到达分组或聚合阶段。 * 否则,我们必须有一个退化的(无变量的)HAVING子句,把它放在WHERE中, * 以便query_planner()可以在一个控制结果节点中使用它,但同时还要确保不会发出一个伪造的聚合行。 * (这本来可以做得更好,但似乎不值得继续深入优化。) * * Note that both havingQual and parse->jointree->quals are in * implicitly-ANDed-list form at this point, even though they are declared * as Node *. * 请注意,现在不管是qual还是parse->jointree->quals,即使它们被声明为节点 *, * 但它们在这个点上都是都是隐式的链表形式。 */ newHaving = NIL; foreach(l, (List *) parse->havingQual) { Node *havingclause = (Node *) lfirst(l); if ((parse->groupClause && parse->groupingSets) || contain_agg_clause(havingclause) || contain_volatile_functions(havingclause) || contain_subplans(havingclause)) { /* keep it in HAVING */ newHaving = lappend(newHaving, havingclause); } else if (parse->groupClause && !parse->groupingSets) { /* move it to WHERE */ parse->jointree->quals = (Node *) lappend((List *) parse->jointree->quals, havingclause); } else { /* put a copy in WHERE, keep it in HAVING */ parse->jointree->quals = (Node *) lappend((List *) parse->jointree->quals, copyObject(havingclause)); newHaving = lappend(newHaving, havingclause); } } parse->havingQual = (Node *) newHaving; /* Remove any redundant GROUP BY columns */ //移除多余的GROUP BY 列 remove_useless_groupby_columns(root); /* * If we have any outer joins, try to reduce them to plain inner joins. * This step is most easily done after we've done expression * preprocessing. * 如果存在外连接,则尝试将它们转换为普通的内部连接。 * 在我们完成表达式预处理之后,这个步骤相对容易完成。 */ if (hasOuterJoins) reduce_outer_joins(root); /* * Do the main planning. If we have an inherited target relation, that * needs special processing, else go straight to grouping_planner. * 执行主要的计划过程。 * 如果存在继承的目标关系,则需要特殊处理,否则直接执行grouping_planner。 */ if (parse->resultRelation && rt_fetch(parse->resultRelation, parse->rtable)->inh) inheritance_planner(root); else grouping_planner(root, false, tuple_fraction); /* * Capture the set of outer-level param IDs we have access to, for use in * extParam/allParam calculations later. * 获取我们可以访问的outer-level的参数IDs,以便稍后在extParam/allParam计算中使用。 */ SS_identify_outer_params(root); /* * If any initPlans were created in this query level, adjust the surviving * Paths' costs and parallel-safety flags to account for them. The * initPlans won't actually get attached to the plan tree till * create_plan() runs, but we must include their effects now. * 如果在此查询级别中创建了initplan,则调整现存的访问路径成本和并行安全标志,以反映这些成本。 * 在create_plan()运行之前,initPlans实际上不会被附加到计划树中,但是我们现在必须包含它们的效果。 */ final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL); SS_charge_for_initplans(root, final_rel); /* * Make sure we've identified the cheapest Path for the final rel. (By * doing this here not in grouping_planner, we include initPlan costs in * the decision, though it's unlikely that will change anything.) * 确保我们已经为最终的关系确定了成本最低的路径 * (我们没有在grouping_planner中这样做,而是在最终决定中加入了initPlan的成本,尽管这不太可能改变任何事情)。 */ set_cheapest(final_rel); return root; }
“PostgreSQL中Review subquery_planner函数的实现逻辑是什么”的内容就介绍到这里了,感谢大家的阅读。如果想了解更多行业相关的知识可以关注亿速云网站,小编将为大家输出更多高质量的实用文章!
原创文章,作者:kepupublish,如若转载,请注明出处:https://blog.ytso.com/205055.html