Query Requires More Time to Execute from an Instant Client Connection

12 08 2010

August 12, 2010

An interesting thread appeared on the OTN forums a couple of days ago where the original poster demonstrated that a simple query that executed in 0.12 seconds from a SQL*Plus session on the server required 1 minute and 54.19 seconds on a client that was connected using Oracle’s Instant Client.  The simple query looked like this:

select mod(lastinstmessagesequence, 1000000000) LastInstIDSeqNo from tibex_msgseqbyuseralias where useralias='2221';

How would you troubleshoot this performance problem?  What would you do and not do, and in what order:

  • Create a Statspack or AWR report.
  • Review an ADDM report.
  • Execute the SQL statement multiple times in each environment to limit differences due to the effects of buffer caching.
  • Execute the SQL statement from another computer that is also using the Oracle Instance Client.
  • Execute the SQL statement from another computer that is using the regular Oracle Client.
  • Perform an AUTOTRACE to see the execution plan along with the number of reads from the buffer cache and the number of physical reads.
  • Execute the query at the client and server, specifying the GATHER_PLAN_STATISTICS hint and then examine the execution plan using DBMS_XPLAN.DISPLAY_CURSOR.
  • Set the STATISTICS_LEVEL to ALL for both client and display an execution plan with DBMS_XPLAN.DISPLAY_CURSOR.
  • Enable a 10046 trace at level 12 and process the trace file using TKPROF.
  • Enable a 10046 trace at level 12 and manually review the trace file.
  • Review a 10053 trace for the SQL statement executed on the client-side and server-side.
  • Verify that the NLS settings are the same for the client-side and server-side sessions.
  • Review the data captured by ASH.
  • Investigate the network connection.
  • Pass along the problem to someone more familiar with troubleshooting performance problems.
  • Say “that’s just the way it works,” then duck and cover.

——

What would you do if AUTOTRACE and DBMS_XPLAN show the same execution plan, yet the predicate information section is different.  For example:

Autotrace:

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("LASTINSTUSERALIAS"='2221')
   8 - access("LASTINSTUSERALIAS"='2221')
  10 - filter("LASTINSTUSERALIAS" IS NOT NULL AND "LASTINSTUSERALIAS"='2221')
  12 - filter("LASTINSTUSERALIAS" IS NOT NULL AND "LASTINSTUSERALIAS"='2221')
  14 - filter("LASTINSTUSERALIAS" IS NOT NULL AND "LASTINSTUSERALIAS"='2221')
  16 - filter("USERALIAS"='2221')
  18 - filter("USERALIAS"='2221')
  20 - access("USERALIAS"='2221')
  22 - access("USERALIAS"='2221')
  24 - access("USERALIAS"='2221')
  26 - filter("USERALIAS" IS NOT NULL AND "USERALIAS"='2221')
  28 - access("USERALIAS"='2221')
  30 - access("USERALIAS"='2221')
  32 - access("USERALIAS"='2221')
  34 - access("USERALIAS"='2221')
  36 - access("USERALIAS"='2221')
  38 - access("USERALIAS"='2221')
  40 - access("USERALIAS"='2221')
  42 - access("USERALIAS"='2221')
  44 - filter("USERALIAS"='2221')
  46 - filter("USERALIAS"='2221')

DBMS_XPLAN:

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("LASTINSTUSERALIAS"=:SYS_B_1)
   8 - access("LASTINSTUSERALIAS"=:SYS_B_1)
  10 - filter(("LASTINSTUSERALIAS" IS NOT NULL AND "LASTINSTUSERALIAS"=:SYS_B_1))
  12 - filter(("LASTINSTUSERALIAS" IS NOT NULL AND "LASTINSTUSERALIAS"=:SYS_B_1))
  14 - filter(("LASTINSTUSERALIAS" IS NOT NULL AND "LASTINSTUSERALIAS"=:SYS_B_1))
  16 - filter("USERALIAS"=:SYS_B_1)
  18 - filter("USERALIAS"=:SYS_B_1)
  20 - access("USERALIAS"=:SYS_B_1)
  22 - access("USERALIAS"=:SYS_B_1)
  24 - access("USERALIAS"=:SYS_B_1)
  26 - filter(("USERALIAS" IS NOT NULL AND "USERALIAS"=:SYS_B_1))
  28 - access("USERALIAS"=:SYS_B_1)
  30 - access("USERALIAS"=:SYS_B_1)
  32 - access("USERALIAS"=:SYS_B_1)
  34 - access("USERALIAS"=:SYS_B_1)
  36 - access("USERALIAS"=:SYS_B_1)
  38 - access("USERALIAS"=:SYS_B_1)
  40 - access("USERALIAS"=:SYS_B_1)
  42 - access("USERALIAS"=:SYS_B_1)
  44 - filter("USERALIAS"=:SYS_B_1)
  46 - filter("USERALIAS"=:SYS_B_1)

——

What if you executed the following script on the client-side and server-side and found that the SQL_ID and or CHILD_NUMBER displayed were different for the client and server executions:

ALTER SESSION SET STATISTICS_LEVEL='TYPICAL';
SET AUTOTRACE OFF
SET SERVEROUTPUT OFF
SET TIMING OFF
SET LINESIZE 160
SET PAGESIZE 1000

select /*+ GATHER_PLAN_STATISTICS */ mod(lastinstmessagesequence, 1000000000) LastInstIDSeqNo from tibex_msgseqbyuseralias where useralias='2221';

select /*+ GATHER_PLAN_STATISTICS */ mod(lastinstmessagesequence, 1000000000) LastInstIDSeqNo from tibex_msgseqbyuseralias where useralias='2221';

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR('NULL,NULL,'ALLSTATS LAST'));

——

There are several interesting replies in the OTN message thread, including a very helpful response by Randolf Geist.


Actions

Information

2 responses

12 08 2010
Centinul

Note, I’ve intentionally tried to avoid the thread to try and not spoil the fun of this exercise.

Question 1:

– “How would you troubleshoot this performance problem? What would you do and not do, and in what order:”

Answer 1:

I think my first step would be to try and retrieve the actual executions plans for the query that runs well and the query that runs badly. I personally wouldn’t vote for using AUTOTRACE as it can report inaccurate results. If I had access to both machines the easiest way to probably do this, as long as the problem is repeatable would probably be to use the following method you identified:

– Execute the query at the client and server, specifying the GATHER_PLAN_STATISTICS hint and then examine the execution plan using DBMS_XPLAN.DISPLAY_CURSOR.

Setting STATISTICS_LEVEL to ALL would probably only be useful in situations where you couldn’t get access to and/or run the code manually.

If I didn’t have access to these clients my next step would be to try and probe the V$ views to try and identify the SQL_ID, and CHILD_NUMBER and use DBMS_XPLAN.DISPLAY_CURSOR. If this fails then I would use the following method:

– Enable a 10046 trace at level 12 and process the trace file using TKPROF.

I don’t think a 10046 trace at level 12 and manually reviewing the trace file is useful at the moment because I’m just after the execution plans. If I need to dig into the weeds this may be useful later.

I don’t think generating a STATSPACK/AWR report is useful at the moment because based on the problem description this seems like an issue in particular with the SQL statement and not an instance wide issue that may show up in a STATSPACK/AWR report. There is no complaint of general slowness of the entire system in the problem statement either.

I probably wouldn’t execute the queries on another computer regardless of what client is installed because chances are the client settings on the machine I test on will not be exactly the same as those on the two given in the problem statement.

The 10053 trace may be useful to use if further execution plan analysis is required. Using DBMS_XPLAN.DISPLAY_CURSOR will ALLSTATS generates a lot of information and I may only dig into the 10053 trace file if the execution plans are different and the A-Rows vs E-Rows given in DBMS_XPLAN.DISPLAY_CURSOR didn’t lead me down another path.

I think the network would only be worth investigating if there were significant wait times related to the network displayed in the TKPROF file.

Question 2:

– What would you do if AUTOTRACE and DBMS_XPLAN show the same execution plan, yet the predicate information section is different. For example:

Answer 2:

The immediate tip off here is the existence of “:SYS_B_01.” This tells me that there is most likely a difference in the client configurations between the two, probably related to the CURSOR_SHARING parameter.

This may be a good time to gather as much information about the client configurations as possible to see if there are any other differences. The 10053 trace file may be useful in this case because it may highlight some parameter differences related to the CBO.

Ultimately the issue could be like the Tom Kyte story, “the database is slow when it rains on .” Bind peeking and data distribution issues may be the culprit.

Question 3:

– What if you executed the following script on the client-side and server-side and found that the SQL_ID and or CHILD_NUMBER displayed were different for the client and server executions:

Answer 3:

Typically I wouldn’t expect the SQL_ID to be different in this case because the query text is exactly the same. If the CHILD_NUMBER is different then it may be beneficial to compare the PLAN_HASH_VALUE to determine if the execution plans were indeed the same. If they are different it could be do to environmental differences as mentioned above. The slow one may not be using the optimal plan that the well running query was using. If they have the same plan it may be worth digging deeper into the V$SQL_SHARED_CURSOR view to determine why they weren’t shared and continue the investigation there.

If after all this analysis a conclusion isn’t reached it may be worthwhile to backup a level and look at either an ASH report, or even a AWR/STATSPACK report to determine if there were any instance level issues that could have made the execution times different.

12 08 2010
Charles Hooper

Centinul,

For not having visited the OTN thread prior to posting your comment, I would say that you did an excellent job of analyzing the information that I provided and offering a well reasoned method for investigating the problem.

Regarding your answer for #2, I think that my blog article might have been slightly confusing. I intended to state that from within the client session the OP was seeing a descrepany between the predicate information section of the execution plan displayed by AUTOTRACE and the predicate information section of the execution plan displayed by DBMS_XPLAN when also displayed within the client session (the same behavior was noted on in the server-side session also). I believe that you are correct regarding the reason for the difference in the Predicate Information section – CURSOR_SHARING=FORCE in the database instance’s initialization parameters.

In the thread you may note that I suggested a 10046 trace in addition to DBMS_XPLAN. The OP provided an update to the thread shortly after I published this blog article. The update to the thread demonstrates why someone might move quickly to a 10046 trace with TKPROF analysis and then manual analysis.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 142 other followers

%d bloggers like this: