Chapter 7 explains the SQL query plan

This chapter describes the language and terminology used in the InterSystems SQL query access plan generated by ShowPlan.

Table stored in the map

SQL tables are stored as a set of mappings. Each table has a master map that contains all the data in the table; Tables can also have other mappings, such as index maps and bitmaps. Each map can be portrayed as a multidimensional global, with some fields having data in one or more subscripts and the rest stored in node values. Subscripts control the data to be accessed.

  • For the master mapping,RowIDorIDKEYFields are often used as mapping subscripts.
  • For index maps, it is common to use other fields as leading subscripts, which willRowID/IDKEYField is used as an additional lower level index.
  • For bitmaps, think of the bit layer as an additional RowID subscript level. However, bitmaps can only be used for integers that are positiveRowID.

Development plan

Compiling an SQL query generates a set of instructions to access and return the data specified by the query. These instructions are represented as ObjectScript code in.int routines.

Instructions and the order in which they are executed are influenced by the DATA in the SQL compiler about the structure and contents of the tables involved in the query. The compiler tries to use information such as table sizes and available indexes to make the instruction set as efficient as possible.

Query access Plan (ShowPlan) is a readable translation of the resulting instruction set. The query author can use this query access plan to see how the data will be accessed. While the SQL compiler tries to make the most efficient use of the data specified by the query, sometimes the author of the query has a much clearer understanding of certain aspects of the stored data than the compiler. In this case, the author can use the query plan to modify the original query to provide more information or more guidance to the query compiler.

Reading program

The result of “ShowPlan” is a series of statements about the processing of accessing and displaying the data specified in the query. Information on how to interpret the ShowPlan statement is provided below.

Access to the map

A query plan can access multiple tables. When accessing a table, a plan can access a single map (index or primary map), two maps (index map followed by primary map), or, in the case of a multi-index plan, multiple maps.

The plan indicates the subscript used when accessing data through a map. It also indicates what the actual subscript values are: a given value, a given set of values, a range of values, or all the values that the subscript displays in the table. Which one you choose depends on the conditions specified in the query. Obviously, accessing a single or several subscript values is much faster than accessing all values at that subscript level.

Conditions and expressions

When the query runs, the various conditions specified by the query are tested. The ShowPlan output does not explicitly indicate the tests of the conditions, except for certain conditions that restrict subscripts mentioned earlier. It is always best to test conditions early. The best place to test the various conditions can be inferred from the planning details.

Similarly, ShowPlan does not detail the evaluation of expressions and subexpressions. In addition to simplicity, the main reason is that in most database environments, table and index access constitutes a more important aspect of processing; The cost of the key table data dominates the overall query cost because disk access is still several orders of magnitude slower than CPU processing.

cycle

When accessing data in a table, it is often necessary to iterate through multiple rows. Such access is indicated by a loop. Each instruction passed for execution is called the body of the loop. They can be visually shown by indenting. Database access involving multiple tables usually requires a loop within a loop. In this case, each loop level is represented by further indented compared to the previous level.

The temporary file

define

The query plan may also indicate that intermediate temp-files need to be built and used. This is the “temporary” area of the local array. It is used to save temporary results for various purposes, such as sorting. Just like a mapping, a temporary file has one or more subscripts and possibly node data.

use

Some temporary files contain data that works with a single table. In this case, you can think of building the temporary file as preprocessing the data in the table. After reading such a temporary file, the primary mapping of the source table can be accessed or not accessed. In other cases, temporary files may contain the results of processing multiple tables. In other cases, temporary files are used to store aggregation values of groups, check DISTINCT, and so on.

The module

The building of temporary files, as well as other processing, can be delegated to a separate unit of work called a module. Each module is named. When individual modules are listed, the plan specifies where each module is invoked. When module execution ends, processing continues in the next statement after the module invocation.

Queries sent to processing

For external tables linked through ODBC or JDBC gateway connections, the plan displays the query text sent to the remote SQL Gateway Connection to retrieve the requested data from the remote table.

For parallel query processing and shards, the plan shows the various queries sent to parallel processing or processed on shards. The plan for each query is also displayed.

Subqueries, joins, and unions

Some subqueries (and views) within a given query can also be processed separately. Their plans are specified in a separate subquery section. The exact location where the subquery part is called is not specified in the plan. This is because they are often called as part of condition or expression processing.

For queries specifying OUTER JOIN, if no matching rows are found, the plan may indicate the null rows that may be generated to satisfy the requirements of OUTER JOIN semantics.

For unions, this plan might indicate that result rows from different UNION subqueries are grouped into a single module where they can be further processed.

Project analysis

When analyzing the plan for a given query, an application developer may sometimes feel that a different plan would be more efficient. Application developers have several ways to influence the plan.

First, the plan is affected by running the tuning tables correctly in an environment that contains real application data. Manually defining some of the values that Tune Tables normally calculate in the class source definition — such as Table EXTENTSIZE, field SELECTIVITY, and mapping BlockCount — can also be used to implement the desired plan.

In addition, the analysis plan may indicate that some changes to the class definition may lead to a more effective plan, such as:

Adding an index

In some cases (though not always), preprocessing with a temporary file may mean that adding an index to the original table with the same or similar structure as the temporary file will eliminate the need to build the temporary file. Removing this processing step from the query plan obviously makes the query run faster, but this must be balanced against the amount of effort required to maintain the index while updating the table.

Add fields to index data

When the plan shows the index in use, followed by access to the primary map, this means that adding the primary map field used in the query to the inode data may produce a faster plan for that query. Again, this must be balanced against additional update time and additional time added to processing other queries that use the index because the index will be larger and therefore require more read time.

Adding a join index

When the plan shows that two tables are joined in a particular order (for example, t1 is retrieved first and then joined to T2 using the join condition T1.a = T2.b), it is possible that the reverse table order will result in a faster plan. For example, if T2 has additional conditions, you can significantly limit the number of rows that qualify. In this case, add a T1 index on T1. A will allow such a join order.