Tuesday, August 16, 2016

Date Filters and Their Impact on Index Usage

I was discussing date filters with a friend, and how they impact performance, and thought the topic was worth blogging.

 In tables holding historic data you will often see columns such as an effective date or a termination date, which logically mark a record as inactive and also help to pick the current active record. Some designs store NULL in these columns, and others a far-future date, to indicate active status. From this perspective, let us see how the way a date filter is written can influence the optimizer's decisions.
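To make the pattern concrete, here is a minimal sketch of such a table. Only tbl_objects, effective_end_ts and idx_eff_ts are names that appear in this post; the other columns, the data types and the sample rows are assumptions made up for illustration.

```sql
-- Hypothetical layout: object_id, object_name and effective_start_ts
-- are invented here; the real table definition is not shown in the post.
create table tbl_objects (
  object_id          number    not null,
  object_name        varchar2(128),
  effective_start_ts timestamp not null,
  effective_end_ts   timestamp not null
);

-- Assumed to be a single-column index on the end timestamp.
create index idx_eff_ts on tbl_objects (effective_end_ts);

-- An active row carries the far-future end timestamp.
insert into tbl_objects
values (1, 'TRADE_A', timestamp '2016-01-01 00:00:00',
        timestamp '9999-01-01 00:00:00');

-- A closed (historic) row carries its real end timestamp.
insert into tbl_objects
values (2, 'TRADE_B', timestamp '2015-01-01 00:00:00',
        timestamp '2015-12-31 23:59:59');
```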

  Let us walk through a set of SQL statements that use date filters.
select * from tbl_objects where effective_end_ts > systimestamp
  
 Plan hash value: 2402742835
  
 ---------------------------------------------------------------------------------
 | Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT  |             |       |       |   563 (100)|          |
 |*  1 |  TABLE ACCESS FULL| TBL_OBJECTS |  4454 |   695K|   563   (1)| 00:00:01 |
 ---------------------------------------------------------------------------------
  At first glance I was not convinced by the plan, because I know the table holds 26721 active records, not the 4454 the optimizer estimated. So let us compare the estimated and the actual row counts.
 select /*+ gather_plan_statistics */ * from tbl_objects where 
 effective_end_ts > systimestamp
  
 Plan hash value: 2402742835
  
 -------------------------------------------------------------------------------------------
 | Id  | Operation         | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
 -------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT  |             |      1 |        |  26721 |00:00:00.14 |    2539 |
 |*  1 |  TABLE ACCESS FULL| TBL_OBJECTS |      1 |   4454 |  26721 |00:00:00.14 |    2539 |
 -------------------------------------------------------------------------------------------
 Yes, the actual cardinality is 26721. I am not going to reach for SQL Profiles to validate the plan further, because my first bet is always on the data and the application design. The first question is what this SQL is for, and why it uses the '>' operator.
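For anyone who wants to reproduce the estimated-versus-actual comparison, this is roughly how such a plan can be pulled. It is a sketch, assuming both statements run in the same session and that no other statement (for example a dbms_output call issued by the client) executes in between.

```sql
-- Run the statement with rowsource statistics enabled.
select /*+ gather_plan_statistics */ *
  from tbl_objects
 where effective_end_ts > systimestamp;

-- Show the last executed cursor of this session, with estimated (E-Rows)
-- and actual (A-Rows) row counts side by side.
select *
  from table(dbms_xplan.display_cursor(
         sql_id          => null,
         cursor_child_no => null,
         format          => 'ALLSTATS LAST'));
```

A large gap between E-Rows and A-Rows on the same plan line is the signal to look at the data and the statistics before anything else.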

         The intent of the SQL is to pick active trades, whose end timestamp is 01-01-9999. So let us tell Oracle exactly that, and see how it avoids the trouble created by the operator '> systimestamp', where the sky is the limit :)

Now with the effective end timestamp value given explicitly:
select * from tbl_objects where effective_end_ts = timestamp 
'9999-01-01 00:00:00'
 
Plan hash value: 1709553517
 
---------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TBL_OBJECTS |     1 |   160 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IDX_EFF_TS  |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
    Though the cardinality is still not right (the plan estimates 1 row), we can see a change in the access method: the index is now used.
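One possible reason for an estimate this far off is that the optimizer has no histogram on a heavily skewed column. That is an assumption, since the post does not show the statistics. A sketch of how it could be checked, assuming the table lives in the current schema:

```sql
-- How skewed is the column? If most active rows share one end
-- timestamp, that value should dominate this list.
select effective_end_ts, count(*) as row_count
  from tbl_objects
 group by effective_end_ts
 order by row_count desc
 fetch first 10 rows only;

-- What does the optimizer currently know about the column?
select column_name, num_distinct, num_nulls, histogram, last_analyzed
  from user_tab_col_statistics
 where table_name  = 'TBL_OBJECTS'
   and column_name = 'EFFECTIVE_END_TS';

-- If HISTOGRAM shows NONE, one option (an assumption, not something
-- tested in this post) is to regather statistics with a histogram
-- on that column.
begin
  dbms_stats.gather_table_stats(
    ownname    => user,
    tabname    => 'TBL_OBJECTS',
    method_opt => 'FOR COLUMNS EFFECTIVE_END_TS SIZE 254');
end;
/
```

Whether a better estimate would keep the index plan or move back to a full scan depends on how large a share of the table the active rows are.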

Now the same query with a different date format:
select * from tbl_objects where effective_end_ts = 
 (to_timestamp('01/01/9999 00:00:00','dd/mm/yyyy hh24:mi:ss'))
  
 Plan hash value: 1709553517
  
 ---------------------------------------------------------------------------------------------------
 | Id  | Operation                           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
 ---------------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT                    |             |       |       |     2 (100)|          |
 |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TBL_OBJECTS |     1 |   160 |     2   (0)| 00:00:01 |
 |*  2 |   INDEX RANGE SCAN                  | IDX_EFF_TS  |     1 |       |     1   (0)| 00:00:01 |
 ---------------------------------------------------------------------------------------------------
    Both the timestamp literal and to_timestamp yield the same result in terms of access path. So avoid the sky-is-the-limit filter (> systimestamp) in your SQL if you know exactly what value you are looking for. Do SQL statements like these impact partitions? Watch out for the next post.
