Monday, November 19, 2012

Cardinality Estimate

When it comes to SQL Tuning, It always starts with an execution plan, likewise this post is no different. I normally suggest to avoid SQL's and advice to write them only when required. The best way to tune an SQL  is to avoid writing it, but in reality this can't happen.

Lets see how we can avoid writing SQL rather saying it as avoid executing SQL.

The below is a small piece which i have done in my recent tuning projects  

SQL> CREATE TABLE tblchk
  2  (
  3    id number,
  4    id_val varchar2(50)
  5  );

Table created.

Elapsed: 00:00:00.30
SQL> 
SQL> 
SQL> declare
  2  type t1 is table of tblchk%rowtype;
  3  v_t1 t1:=t1();
  4  x number:= 65;
  5  begin
  6  
  7  for i in 1..20000 loop
  8  v_t1.extend;
  9  v_t1(i).id:=i;
 10  v_t1(i).id_val:=chr(x);
 11  x:=x+1;
 12  if x=68 then
 13  x:=65;
 14  end if;
 15  end loop;
 16  
 17  forall j in v_t1.first..v_t1.last
 18  insert into tblchk values v_t1(j);
 19  
 20  commit;
 21  end;
 22  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.27
SQL> 
SQL> exec dbms_stats.gather_table_stats('HARI','TBLCHK');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.51
SQL> 
SQL> select count(*) from tblchk;

  COUNT(*)                                               
----------                                               
     20000                                               

Elapsed: 00:00:00.00

Lets query the table
SQL> alter system flush shared_pool;

System altered.

Elapsed: 00:00:00.34
SQL> 
SQL> variable a varchar2(10);
SQL> 
SQL> exec :a:='ORACLE';
SQL> set autotrace traceonly
SQL>
SQL> 
SQL> select * from tblchk where id_val = :a;

no rows selected

Elapsed: 00:00:00.11

Execution Plan
----------------------------------------------------------                  
Plan hash value: 729812915                                                  
                                                                            
----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |  6667 | 40002 |    11   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TBLCHK |  6667 | 40002 |    11   (0)| 00:00:01 |
----------------------------------------------------------------------------
                                                                            
Predicate Information (identified by operation id):                         
---------------------------------------------------                         
                                                                            
   1 - filter("ID_VAL"=:A)                                                  


Statistics
----------------------------------------------------
        111  recursive calls                        
          0  db block gets                          
         51  consistent gets                        
         37  physical reads                         
          0  redo size                              
        342  bytes sent via SQL*Net to client       
        408  bytes received via SQL*Net from client 
          1  SQL*Net roundtrips to/from client      
          0  sorts (memory)                         
          0  sorts (disk)                           
          0  rows processed                         


"No rows selected" but why is the Rows column in execution plan showing it as 6667. Although we can get this fixed with a much similar approach as what we have seen in my earlier article on extended stats but now am looking for other options, 

Lets start with TKprof. 
select * 
from
 tblchk where id_val = :a1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.07          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.12         36         38          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.19         36         38          0           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 85  

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  TABLE ACCESS FULL TBLCHK (cr=38 pr=36 pw=0 time=0 us cost=11 size=40002 card=6667)


As expected the row source operation shows a cardinality as 6667. So how does oracle estimate the cardinality as 6667


SQL> select round(count(*)/count(distinct id_val)) ROWS_VAL from tblchk;

  ROWS_VAL
----------
      6667

Elapsed: 00:00:00.02
SQL>

Well, you can also cross check the actual's with the estimated rows


SQL> set linesize 150
SQL> set pagesize 150
SQL> 
SQL> SELECT * FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('96vjj2ncju5p5',null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT                                                                               
------------------------------------------------------------------------------------------------
SQL_ID  96vjj2ncju5p5, child number 0                                                           
-------------------------------------                                                           
select * from tblchk where id_val = :a1                                                         
                                                                                                
Plan hash value: 729812915                                                                      
                                                                                                
----------------------------------------------------------------------------------------------- 
| Id  | Operation         | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | 
----------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT  |        |      1 |        |      0 |00:00:00.03 |      38 |     36 | 
|*  1 |  TABLE ACCESS FULL| TBLCHK |      1 |   6667 |      0 |00:00:00.03 |      38 |     36 | 
----------------------------------------------------------------------------------------------- 
                                                                                                
Predicate Information (identified by operation id):                                             
---------------------------------------------------                                             
                                                                                                
   1 - filter("ID_VAL"=:A1)                                                                     


Lets provide more information to the optimizer so as to have a better cardinality estimate.


SQL>
SQL> alter table tblchk add CONSTRAINT check_idval CHECK (id_val in ('A','B','C'
));

Table altered.

Elapsed: 00:00:00.91
SQL>
SQL>
SQL> exec dbms_stats.gather_table_stats('HARI','TBLCHK');

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.27
SQL>
Will check constraint help me to find the correct cardinality ?, Soon will post more interesting facts on the same