After being inundated with a frightening amount of email from readers, I have begun to recognize the strong resonance of this column in the SAP user community. I will try to answer your queries either in this column or by email; however, my time and resources are limited, so it may take a while. I hope to use this column frequently as a forum to answer as many of your questions as space will allow.
Glenn Oldfield of Tower Records, USA asked:
I've always wondered one thing about SQL: Why don't the database developers provide an optional parameter on the SELECT statement that would allow programmers to force the use of a particular index? SQL was obviously created for end users who may not have any idea of how they should be accessing the data. In that scenario, an optimizer makes sense. But for programmers, the idea that a database engine with limited intelligence decides how to access the data is a real annoyance. An optional parameter would allow complex WHERE clauses to be processed correctly. As you said in "Rules for Better SELECTS" (Vol. 1, No.2), a complex WHERE clause can really confuse the optimizer, and the usual result is a full table scan. What's your opinion on this matter?
Answer: It's true that in many cases the programmer knows exactly which index works best, which is one reason why different database system vendors provide so-called "optimizer hints." By using an optimizer hint, you can influence the optimizer's decision. At first glance, it seems that optimizer hints could make the database optimizer obsolete. But a closer look shows that in order to select an optimal search method and index, the programmer needs a good idea of the value distribution of the related tables. Because R/3 is a highly configurable meta-application used in many industries and countries by many customers, you can find very differently scaled customer-order and order-product relationships. As a consequence, the programmer can't always predict the value distribution of tables. With R/3, the selection of an optimal index can be industry- or even customer-specific.
In contrast, a modern cost-based optimizer uses table statistics to decide a SQL statement's execution plan. Updated table statistics represent actual knowledge about an installation's value distribution of database tables. Based on such profound information, an optimizer's decision should be sound.
On the other hand, no one can deny that most database optimizers sometimes display their limited intelligence. After all, we don't live in a perfect world! If the optimizer fails to find a sound execution plan in a customer's installation - and only in this case - optimizer hints can be the right remedy. In other words, the usual suspect behind a SELECT statement's poor performance is a missing index that supports the query, not the optimizer. Take the following query that returns all Lufthansa flights starting at Frankfurt:
DATA: xcarrid LIKE spfli-carrid, xconnid LIKE spfli-connid, xcityfrom LIKE spfli-cityfrom. SELECT carrid connid cityfrom FROM spfli INTO (xcarrid, xconnid, xcityfrom) WHERE carrid = 'LH ' AND cityfrom ='FRANKFURT'. WRITE: / xcarrid, xconnid, xcityfrom. ENDSELECT.Copy this ABAP code into a test program, and we'll take a look at SQL Trace - an important tool for programmers and system administrators. You'll need to follow these steps:
Start transaction ST05. On the initial screen, select the trace functions you want to switch on (SQL Trace). To switch on the trace under your user name, choose Trace on.
Run the test program you just created. Select the trace functions you want to switch off. Choose Trace off. Now the system has recorded all the SQL calls being executed during this time. Choose Trace list to get a list of these calls. (See Figure 1.)
Figure 1: SQL trace list
If you've never tried out this function, you'll probably experience some confusion:
The trace list obviously contains many lines that are not related to the SELECT statement in the ABAP program. Explanation: The execution of any ABAP program requires additional administrative SQL calls. To restrict the list output, you can use the filter introducing the trace list.
The trace list contains different SQL statements simultaneously related to the one SELECT statement in the ABAP program. Explanation: The R/3 Database Interface - a sophisticated component of the R/3 Application Server - maps every Open SQL statement to one or a series of physical database calls and brings it to execution. This mapping, crucial to R/3s performance, depends on the particular call and database system. For example, the SELECT-ENDSELECT loop on the SPFLI table in our test program is mapped to a sequence PREPARE-OPEN-FETCH of physical calls in an Oracle environment.
The WHERE clause in the trace list's SQL statement is different from the WHERE clause in the ABAP statement. Explanation: In an R/3 system, a client is a self-contained unit with separate master records and its own set of table data (in commercial, organizational, and technical terms). With ABAP, every Open SQL statement automatically executes within the correct client environment. For this reason, a condition with the actual client code (here MANDT = '000') is added to every WHERE clause if a client field is a component of the searched table.
To see a statement's execution plan, just position the cursor on the PREPARE statement and choose Explain SQL. A detailed explanation of the execution plan depends on the database system and, therefore, is beyond the scope of this column. (Interested readers should consult their reference manuals or the SAP Library for more information on this topic. If you don't have time to consult the reference manuals, don't worry; you'll find that the information presented on the Explain screen is mostly self-explanatory.)
The flight schedule table SPFLI is actually a table with a small amount of data, and - nothing ventured, nothing gained - all optimizers will correctly choose FULL TABLE SCAN as the least costly search method. Don't be confused if your optimizer uses TABLE ACCESS FULL or something else instead. Terms are not standardized; every database system is different. (And if you find something completely different, please don't send me an email! I'm not trying to present a real-world situation or discuss the "correct" optimizer strategy; I'm just showing how ABAP supports optimizer hints.)
Assume that SPFLI is a large table, and you want to make sure the optimizer uses a certain index to access the requested data. To find a candidate, you need to know:
Figure 2: Displaying information about an index.
Which indexes are actually available How they are denoted Which index can be used with a given SELECT statement. To tell the optimizer which index to use, you need to:
Start transaction SE11.
On the initial screen of the ABAP Dictionary, enter SPFLI in the Database table field and choose Display. (See Figure 2.) On the maintenance screen of this table, choose Indexes to get a list of all identifiers of SPFLI's secondary indexes. To obtain a description of the related index, position the cursor on the selected line and click Choose. (I know this sounds ridiculous, but don't blame me!)
For example, the identifier 001 represents a non-unique secondary index comprising the table columns CITYFROM and CITYTO. The index name on the database adheres to the convention
~ (SPFLI~001, in our example). Because the convention for defining the index name in the database has changed several times, I strongly recommended you check out the valid name in this way. To find appropriate candidates for index access, start with the set of columns referenced in the WHERE clause of the SELECT statement (in our example, MANDT CARRID CITYFROM). Candidates are indexes that match up some of their leading columns. I'd like to clear up a common misunderstanding: The sequence of the fields in the WHERE clause doesn't influence the optimizer's index selection. Regarding the list of secondary indexes, SPFLI~001 clearly is the only candidate. The situation becomes a little more complicated if you use comparison operators other than "=" or logical operators other than AND.
Another candidate for index access is the primary index consisting of the table fields MANDT CARRID CONNID. For the primary index, 0 is reserved as the index identifier. But again, always check out the valid name:
On the maintenance screen of the table SPFLI, choose Database Utility in the Utilities menu.
On the utility screen of this table choose Indexes to get a list of the identifiers of all SPFLI indexes. To obtain the primary index's database name, position the cursor into the selected line and choose Ok.
To sum up, the database system can alternatively use FULL TABLE SCAN, the primary key index SPFLI_0, or the secondary index SPFLI~001 to access the requested data. But how can you tell the optimizer which one to use?
GIVE ME A HINT
Since release 4.5, you can provide optimizer hints for SELECT statements using the %_HINTS parameter that enables emergency repairs on the spot. Of course, this parameter only works for those database systems that support optimizer hints. Because optimizer hints are not covered by SQL standards, each database vendor is free to provide them. Note that parameters starting with "%_" are never published in the documentation and should be used with special care only in an emergency. This introduction is very simplified; for a detailed description, please refer to Note 129385 in OSS.
Using the %_HINTS parameter, the example becomes:
SELECT carrid connid cityfrom FROM spfli INTO (xcarrid, xconnid, xcityfrom) WHERE carrid = 'LH ' AND cityfrom = 'FRANKFURT' %_HINTS ORACLE 'INDEX("SPFLI" "SPFLI~001")'. WRITE: / xcarrid, xconnid, xcityfrom. ENDSELECT.
Figure3: Execution plan for a SQL statement.
As you can see, the optimizer's execution plan for the query has changed. Instead of performing a full table scan, the optimizer will perform an index range scan, which is almost always a more efficient operation, unless the table has a small number of rows. (See Figure 3.) If you specify hints incorrectly, ABAP ignores them but doesn't return a syntax or runtime error. This lack of notification is the reason why you have to make sure that you specified the index's correct denotation. Now you should know how to perform optimizer hints. If you aren't careful to use them sparingly, you'll only have yourself to blame when things go wrong. o