ok
Direktori : /opt/alt/postgresql11/usr/share/doc/alt-postgresql11-9.2.24/html/ |
Current File : //opt/alt/postgresql11/usr/share/doc/alt-postgresql11-9.2.24/html/explicit-joins.html |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <HTML ><HEAD ><TITLE >Controlling the Planner with Explicit JOIN Clauses</TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK REV="MADE" HREF="mailto:pgsql-docs@postgresql.org"><LINK REL="HOME" TITLE="PostgreSQL 9.2.24 Documentation" HREF="index.html"><LINK REL="UP" TITLE="Performance Tips" HREF="performance-tips.html"><LINK REL="PREVIOUS" TITLE="Statistics Used by the Planner" HREF="planner-stats.html"><LINK REL="NEXT" TITLE="Populating a Database" HREF="populate.html"><LINK REL="STYLESHEET" TYPE="text/css" HREF="stylesheet.css"><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"><META NAME="creation" CONTENT="2017-11-06T22:43:11"></HEAD ><BODY CLASS="SECT1" ><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="5" ALIGN="center" VALIGN="bottom" ><A HREF="index.html" >PostgreSQL 9.2.24 Documentation</A ></TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="top" ><A TITLE="Statistics Used by the Planner" HREF="planner-stats.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="10%" ALIGN="left" VALIGN="top" ><A HREF="performance-tips.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="60%" ALIGN="center" VALIGN="bottom" >Chapter 14. Performance Tips</TD ><TD WIDTH="20%" ALIGN="right" VALIGN="top" ><A TITLE="Populating a Database" HREF="populate.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV ><DIV CLASS="SECT1" ><H1 CLASS="SECT1" ><A NAME="EXPLICIT-JOINS" >14.3. Controlling the Planner with Explicit <TT CLASS="LITERAL" >JOIN</TT > Clauses</A ></H1 ><P > It is possible to control the query planner to some extent by using the explicit <TT CLASS="LITERAL" >JOIN</TT > syntax. To see why this matters, we first need some background. </P ><P > In a simple join query, such as: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;</PRE ><P> the planner is free to join the given tables in any order. For example, it could generate a query plan that joins A to B, using the <TT CLASS="LITERAL" >WHERE</TT > condition <TT CLASS="LITERAL" >a.id = b.id</TT >, and then joins C to this joined table, using the other <TT CLASS="LITERAL" >WHERE</TT > condition. Or it could join B to C and then join A to that result. Or it could join A to C and then join them with B — but that would be inefficient, since the full Cartesian product of A and C would have to be formed, there being no applicable condition in the <TT CLASS="LITERAL" >WHERE</TT > clause to allow optimization of the join. (All joins in the <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > executor happen between two input tables, so it's necessary to build up the result in one or another of these fashions.) The important point is that these different join possibilities give semantically equivalent results but might have hugely different execution costs. Therefore, the planner will explore all of them to try to find the most efficient query plan. </P ><P > When a query only involves two or three tables, there aren't many join orders to worry about. But the number of possible join orders grows exponentially as the number of tables expands. Beyond ten or so input tables it's no longer practical to do an exhaustive search of all the possibilities, and even for six or seven tables planning might take an annoyingly long time. When there are too many input tables, the <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > planner will switch from exhaustive search to a <I CLASS="FIRSTTERM" >genetic</I > probabilistic search through a limited number of possibilities. (The switch-over threshold is set by the <A HREF="runtime-config-query.html#GUC-GEQO-THRESHOLD" >geqo_threshold</A > run-time parameter.) The genetic search takes less time, but it won't necessarily find the best possible plan. </P ><P > When the query involves outer joins, the planner has less freedom than it does for plain (inner) joins. For example, consider: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);</PRE ><P> Although this query's restrictions are superficially similar to the previous example, the semantics are different because a row must be emitted for each row of A that has no matching row in the join of B and C. Therefore the planner has no choice of join order here: it must join B to C and then join A to that result. Accordingly, this query takes less time to plan than the previous query. In other cases, the planner might be able to determine that more than one join order is safe. For example, given: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);</PRE ><P> it is valid to join A to either B or C first. Currently, only <TT CLASS="LITERAL" >FULL JOIN</TT > completely constrains the join order. Most practical cases involving <TT CLASS="LITERAL" >LEFT JOIN</TT > or <TT CLASS="LITERAL" >RIGHT JOIN</TT > can be rearranged to some extent. </P ><P > Explicit inner join syntax (<TT CLASS="LITERAL" >INNER JOIN</TT >, <TT CLASS="LITERAL" >CROSS JOIN</TT >, or unadorned <TT CLASS="LITERAL" >JOIN</TT >) is semantically the same as listing the input relations in <TT CLASS="LITERAL" >FROM</TT >, so it does not constrain the join order. </P ><P > Even though most kinds of <TT CLASS="LITERAL" >JOIN</TT > don't completely constrain the join order, it is possible to instruct the <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > query planner to treat all <TT CLASS="LITERAL" >JOIN</TT > clauses as constraining the join order anyway. For example, these three queries are logically equivalent: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id; SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id; SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);</PRE ><P> But if we tell the planner to honor the <TT CLASS="LITERAL" >JOIN</TT > order, the second and third take less time to plan than the first. This effect is not worth worrying about for only three tables, but it can be a lifesaver with many tables. </P ><P > To force the planner to follow the join order laid out by explicit <TT CLASS="LITERAL" >JOIN</TT >s, set the <A HREF="runtime-config-query.html#GUC-JOIN-COLLAPSE-LIMIT" >join_collapse_limit</A > run-time parameter to 1. (Other possible values are discussed below.) </P ><P > You do not need to constrain the join order completely in order to cut search time, because it's OK to use <TT CLASS="LITERAL" >JOIN</TT > operators within items of a plain <TT CLASS="LITERAL" >FROM</TT > list. For example, consider: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;</PRE ><P> With <TT CLASS="VARNAME" >join_collapse_limit</TT > = 1, this forces the planner to join A to B before joining them to other tables, but doesn't constrain its choices otherwise. In this example, the number of possible join orders is reduced by a factor of 5. </P ><P > Constraining the planner's search in this way is a useful technique both for reducing planning time and for directing the planner to a good query plan. If the planner chooses a bad join order by default, you can force it to choose a better order via <TT CLASS="LITERAL" >JOIN</TT > syntax — assuming that you know of a better order, that is. Experimentation is recommended. </P ><P > A closely related issue that affects planning time is collapsing of subqueries into their parent query. For example, consider: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM x, y, (SELECT * FROM a, b, c WHERE something) AS ss WHERE somethingelse;</PRE ><P> This situation might arise from use of a view that contains a join; the view's <TT CLASS="LITERAL" >SELECT</TT > rule will be inserted in place of the view reference, yielding a query much like the above. Normally, the planner will try to collapse the subquery into the parent, yielding: </P><PRE CLASS="PROGRAMLISTING" >SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;</PRE ><P> This usually results in a better plan than planning the subquery separately. (For example, the outer <TT CLASS="LITERAL" >WHERE</TT > conditions might be such that joining X to A first eliminates many rows of A, thus avoiding the need to form the full logical output of the subquery.) But at the same time, we have increased the planning time; here, we have a five-way join problem replacing two separate three-way join problems. Because of the exponential growth of the number of possibilities, this makes a big difference. The planner tries to avoid getting stuck in huge join search problems by not collapsing a subquery if more than <TT CLASS="VARNAME" >from_collapse_limit</TT > <TT CLASS="LITERAL" >FROM</TT > items would result in the parent query. You can trade off planning time against quality of plan by adjusting this run-time parameter up or down. </P ><P > <A HREF="runtime-config-query.html#GUC-FROM-COLLAPSE-LIMIT" >from_collapse_limit</A > and <A HREF="runtime-config-query.html#GUC-JOIN-COLLAPSE-LIMIT" >join_collapse_limit</A > are similarly named because they do almost the same thing: one controls when the planner will <SPAN CLASS="QUOTE" >"flatten out"</SPAN > subqueries, and the other controls when it will flatten out explicit joins. Typically you would either set <TT CLASS="VARNAME" >join_collapse_limit</TT > equal to <TT CLASS="VARNAME" >from_collapse_limit</TT > (so that explicit joins and subqueries act similarly) or set <TT CLASS="VARNAME" >join_collapse_limit</TT > to 1 (if you want to control join order with explicit joins). But you might set them differently if you are trying to fine-tune the trade-off between planning time and run time. </P ></DIV ><DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="planner-stats.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="populate.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >Statistics Used by the Planner</TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="performance-tips.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >Populating a Database</TD ></TR ></TABLE ></DIV ></BODY ></HTML >