SQL – Reformatting to Improve Performance 10

4 12 2009

December 4, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

Some time ago the following question appeared in a forum:
http://groups.google.com/group/oracle_dba_experts/browse_thread/thread/8176f2571da1da3d

I wanted to tune the below query.Any help would be of great help. Below is the query

--Find Year to Generate Risk Score Starts here
select to_char(stmtdt, 'yyyy') as ID, to_char(stmtdt, 'yyyy') as year
 from statement
 where borrid= &var_borrId and user_id= &var_user_id and coaid = &var_mdlid
 and to_char(stmtdt, 'yyyy') in
 (
   select os.year from
   (
     select borrid, year, count(*) as cntActuals
     from subjective_prm_trans
     where borrid= &var_borrId and user_id= &var_user_id and mdlid = &var_mdlid and endnodeflag = 'E'
     group by year, borrid
   ) os,
   (
     select j.borrid, j.year, j.cntVariable + k.cntFixed as cntMdlTotals
     from
     (
       select a.borrid, a.year , count(*) as cntVariable
       from subjective_prm_trans a, mdl_Parameter_Tree m
       where
       a.prmid = m.parentid and a.mdlid = m.mdlid and
       a.endnodeflag = 'N' and a.value between 0.0001 and 1 and
       a.prmid not in (select r.prmid from mdl_parameter_tree r
    where
       trim(lower(r.prmname)) = 'project risk' and r.mdlid= &var_mdlid ) and
       a.borrid= &var_borrId and a.mdlid= &var_mdlid and a.user_id= &var_user_id
       group by a.borrid , a.year
     ) j,
     (
       select count(*) as cntFixed
         from mdl_parameter_tree u
         where u.prmid not in
        (
          select t.prmid
          from mdl_parameter_tree t
          where t.rootnode in
          (
           select b.rootnode
           from subjective_prm_trans a, mdl_parameter_tree b
           where
                a.mdlid = b.mdlid and a.prmid = b.prmid and
                a.endnodeflag = 'N' and a.value between 0.0001 and 1
                and a.prmid not in (select r.prmid from mdl_parameter_tree r
                  where trim(lower(r.prmname)) = 'project risk' and r.mdlid= &var_mdlid ) and
                a.borrid= &var_borrId and a.mdlid= &var_mdlid and
                a.user_id= &var_user_id ) and
          t.endnodeflag = 'E' and parametertype = 'S' and mdlid= &var_mdlid
        ) and
         u.endnodeflag ='E' and
         u.parametertype = 'S' and
         u.mdlid= &var_mdlid
     ) k
   ) om
   where
   os.borrid = om.borrid and
   os.year = om.year and
   os.cntActuals  = om.cntMdlTotals
 )
 order by year desc

Well below is the explain plan:

Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=ALL_ROWS (Cost=28 Card=1 Bytes=61 )
1 0    SORT (ORDER BY) (Cost=28 Card=1 Bytes=61)
2 1     HASH JOIN (SEMI) (Cost=26 Card=1 Bytes=61)
3 2      TABLE ACCESS (FULL) OF 'STATEMENT' (Cost=2 Card=1 Bytes=48)
4 2      VIEW OF 'VW_NSO_1' (Cost=23 Card=1 Bytes=13)
5 4       NESTED LOOPS (Cost=23 Card=1 Bytes=91)
6 5        HASH JOIN (Cost=13 Card=1 Bytes=78)
7 6         VIEW (Cost=4 Card=1 Bytes=39)
8 7          SORT (GROUP BY) (Cost=4 Card=1 Bytes=55)
9 8           TABLE ACCESS (FULL) OF 'SUBJECTIVE_PRM_TRANS' (Cost=2 Card=1 Bytes=55)
10 6        VIEW (Cost=9 Card=1 Bytes=39)
11 10        SORT (GROUP BY) (Cost=7 Card=1 Bytes=107)
12 11         FILTER
13 12          HASH JOIN (Cost=5 Card=1 Bytes=107)
14 13           TABLE ACCESS (FULL) OF 'SUBJECTIVE_PRM_TRANS' (Cost=2 Card=1 Bytes=81)
15 13           TABLE ACCESS (FULL) OF 'MDL_PARAMETER_TREE' (Cost=2 Card=142 Bytes=3692)
16 12          TABLE ACCESS (FULL) OF 'MDL_PARAMETER_TREE' (Cost=2 Card=1 Bytes=155)
17 5       VIEW
18 17       SORT (AGGREGATE)
19 18        FILTER
20 19         TABLE ACCESS (FULL) OF 'MDL_PARAMETER_TREE' (Cost=2 Card=1 Bytes=284)
21 19         FILTER
22 21          NESTED LOOPS (Cost=6 Card=1 Bytes=636)
23 22           MERGE JOIN (CARTESIAN) (Cost=4 Card=1 By tes=481)
24 23            TABLE ACCESS (FULL) OF 'SUBJECTIVE_PRM_TRANS' (Cost=2 Card=1 Bytes=68)
25 23            BUFFER (SORT) (Cost=2 Card=1 Bytes=413)
26 25             TABLE ACCESS (FULL) OF 'MDL_PARAMETER_TREE' (Cost=2 Card=1 Bytes=413)
27 22           TABLE ACCESS (FULL) OF 'MDL_PARAMETER_TREE' (Cost=2 Card=18 Bytes=2790)
28 21          TABLE ACCESS (FULL) OF 'MDL_PARAMETER_TREE' (Cost=2 Card=1 Bytes=155)

 

Statistics
----------------------------------------------------------
18 recursive calls
0 db block gets
387229 consistent gets
306954 physical reads
0 redo size
432 bytes sent via SQL*Net to client
584 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
114 sorts (memory)
0 sorts (disk)
1 rows processed

Any help would help to resolve the issue

———-
David has already mentioned that your use of “and to_char(stmtdt, ‘yyyy’)” makes it impossible to use a regular index on the stmtdt column, if one exists.

Looking at the plan, you may notice that there is not a single index access, the cardinality numbers are low (possibily indicating that it has been a while since statistics were gathered), there is a merge join Cartesian which can kill performance with large numbers of rows, the formatting without spaces makes it impossible to see how the data was retrieved in order, and it appears that you are using a version of Oracle prior to 10g R1 based on the presence of “SORT (GROUP BY)”. That may be an indication that Oracle is converting or is not converting some/all of the NOT IN subqueries as necessary.

Looking at the query, slightly reformatted:

SELECT
  TO_CHAR(STMTDT, 'YYYY') AS ID, TO_CHAR(STMTDT, 'YYYY') AS YEAR
FROM
  STATEMENT
WHERE
  BORRID= &VAR_BORRID
  AND USER_ID= &VAR_USER_ID
  AND COAID = &VAR_MDLID
  AND TO_CHAR(STMTDT, 'YYYY') IN
  (
   SELECT
     OS.YEAR
   FROM
   (
     SELECT
       BORRID,
       YEAR,
       COUNT(*) AS CNTACTUALS
     FROM
       SUBJECTIVE_PRM_TRANS
     WHERE
       BORRID= &VAR_BORRID
       AND USER_ID= &VAR_USER_ID
       AND MDLID = &VAR_MDLID
       AND ENDNODEFLAG = 'E'
     GROUP BY
       YEAR,
       BORRID) OS,
   (
     SELECT
       J.BORRID,
       J.YEAR,
       J.CNTVARIABLE + K.CNTFIXED AS CNTMDLTOTALS
     FROM
     (
       SELECT
         A.BORRID,
         A.YEAR ,
         COUNT(*) AS CNTVARIABLE
       FROM
         SUBJECTIVE_PRM_TRANS A,
         MDL_PARAMETER_TREE M
       WHERE
         A.PRMID = M.PARENTID
         AND A.MDLID = M.MDLID
         AND A.ENDNODEFLAG = 'N'
         AND A.VALUE BETWEEN 0.0001 AND 1
         AND A.PRMID NOT IN
           (SELECT
             R.PRMID
           FROM
             MDL_PARAMETER_TREE R
           WHERE
             TRIM(LOWER(R.PRMNAME)) = 'PROJECT RISK'
             AND R.MDLID= &VAR_MDLID )
         AND A.BORRID= &VAR_BORRID
         AND A.MDLID= &VAR_MDLID
         AND A.USER_ID= &VAR_USER_ID
       GROUP BY
         A.BORRID ,
         A.YEAR ) J,
     (
       SELECT
         COUNT(*) AS CNTFIXED
       FROM
         MDL_PARAMETER_TREE U
       WHERE
         U.PRMID NOT IN
           (SELECT
             T.PRMID
           FROM
             MDL_PARAMETER_TREE T
           WHERE
             T.ROOTNODE IN
               (SELECT
                 B.ROOTNODE
               FROM
                 SUBJECTIVE_PRM_TRANS A,
                 MDL_PARAMETER_TREE B
               WHERE
                 A.MDLID = B.MDLID
                 AND A.PRMID = B.PRMID
                 AND A.ENDNODEFLAG = 'N' AND A.VALUE BETWEEN 0.0001 AND 1
                 AND A.PRMID NOT IN
                   (SELECT
                     R.PRMID
                   FROM
                     MDL_PARAMETER_TREE R
                   WHERE
                     TRIM(LOWER(R.PRMNAME)) = 'PROJECT RISK'
                    AND R.MDLID= &VAR_MDLID )
             AND A.BORRID= &VAR_BORRID
             AND A.MDLID= &VAR_MDLID
             AND A.USER_ID= &VAR_USER_ID
          )
         AND T.ENDNODEFLAG = 'E'
         AND PARAMETERTYPE = 'S'
         AND MDLID= &VAR_MDLID
        )
       AND U.ENDNODEFLAG ='E'
       AND U.PARAMETERTYPE = 'S'
       AND U.MDLID= &VAR_MDLID
     ) K
   ) OM
   WHERE
     OS.BORRID = OM.BORRID
     AND OS.YEAR = OM.YEAR
     AND OS.CNTACTUALS  = OM.CNTMDLTOTALS
 )
ORDER BY
  YEAR DESC

You seem to be repeating the same inline view SQL in a couple places, you seem to be performing a COUNT in two inline views so that you can perform “AND OS.CNTACTUALS  = OM.CNTMDLTOTALS” – there might be a better way.  Where you have “SELECT OS.YEAR” – you may want to convert this to 2 date columns with Jan 1 of the year in the first date column and Dec 31 of the year in the second – doing that would allow you to change “TO_CHAR(STMTDT, ‘YYYY’) IN” into a statement that does not require TO_CHAR.

It will take someone familiar with the data to tell you if some of the extra work in the inline views may be eliminated.

A follow-up email to the OP:
There are some basic problems that I found in the query and the explain plan that should be fixed before trying David’s suggestions for a materialed view.  I suggested the following:
* There are no indexes used in the query, which may be part or most of the performance problem – verify that indexes exist and statistics are gathered for those indexes.
* The cardinality numbers are low (possibily indicating that it has been a while since statistics were gathered) – make certain that statistics are gathered with DBMS_STATS.
* There is a merge join Cartesian in the plan which can severely hurt performance with large numbers of rows – you may need a hint to fix this.
* The formatting of the plan without spaces makes it impossible to see how the data was retrieved by the plan.
* You seem to be using a version of Oracle prior to 10g R1, and those versions may not convert/change/transform some/all of the NOT IN subqueries as necessary for best performance.
* You seem to be repeating the same inline view SQL in a couple places.
* You seem to be performing a COUNT in two inline views so that you can perform “AND OS.CNTACTUALS  = OM.CNTMDLTOTALS” – there might be a better way.
* Where you have “SELECT OS.YEAR” – you may want to convert this to 2 date columns with Jan 1 of the year in the first date column and Dec 31 of the year in the second – doing that would allow you to change “TO_CHAR(STMTDT, ‘YYYY’) IN” into a statement that does not require TO_CHAR.
* It will take someone familiar with the data to tell you if some of the extra work in the inline views may be eliminated.


Actions

Information

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 142 other followers

%d bloggers like this: