wtorek, 10 kwietnia 2018

Exadata Smart Scan feature and fast full scan on indexes

I spent some time analyzing the problem, why some queries, despite a good execution plan and use of the Exadata (smart scan) functionality, are slow.

In execution plan ( what is most important for database) we see that full scan table WITH SMART SCAN functionality is used on table TABLE and for rest is used join with indexes (with index fast full scan).

------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                     |       |       |       |   933K(100)|          |
|   1 |  SORT ORDER BY                   |                     |    22M|   917M|  1113M|   933K (38)| 00:05:02 |
|   2 |   HASH JOIN RIGHT SEMI           |                     |    22M|   917M|    23M|   720K (45)| 00:03:53 |
|   3 |    VIEW                          | ViEW_1              |  1316K|  9000K|       |   201K (49)| 00:01:05 |
|   4 |     HASH JOIN                    |                     |  1316K|    77M|    84M|   201K (49)| 00:01:05 |
|   5 |      TABLE ACCESS STORAGE FULL   | TABLE_1             |  1316K|    69M|       |   182K (48)| 00:00:59 |
|   6 |      INDEX STORAGE FAST FULL SCAN| PK_TABLE_2          |    28M|   191M|       |  5398  (83)| 00:00:02 |
|   7 |    INDEX FAST FULL SCAN          | NPI_INC_COMPRESS_1  |   488M|    16G|       |   162K (63)| 00:00:53 |
------------------------------------------------------------------------------------------------------------

Reality looks a bit different. During execution of this query with default execution plan (with smart scan) I saw that smart scan is not used!!!
---------------------------------------------------------------------------------------------
Active% | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------------------
    58% | 2rvn92mjxr7pk   | 0         | ON CPU                              | ON CPU
    42% | 2rvn92mjxr7pk   | 0         | cell multiblock physical read       | User I/O


Lack of WAIT cell smart table scan

Why? If in the execution plan is smart scan but smart scan is not used!
This part of my analysis took most of the time: reason of this behavior

When I forced this query to use only full scans on all tables (without indexes) then on session I saw :
1558, SYS       , TIME, DB CPU                                                    ,       3798473,   759.69ms,    76.0%, [@@@@@@@@  ],          ,           ,
   1558, SYS       , TIME, sql execute elapsed time                                  ,       4004145,   800.83ms,    80.1%, [########  ],          ,           ,
   1558, SYS       , TIME, DB time                                                   ,       4004145,   800.83ms,    80.1%, [########  ],          ,           ,
   1558, SYS       , WAIT, cell smart table scan                                     ,        244890,    48.98ms,     4.9%, [W         ],       443,       88.6,    552.8us

And query finished 4 times faster than using default execution plan.
The reason of this behavior is little tricky. I found that if in the execution plan is used (INDEX FAST FULL SCAN) on compressed index then the smart scan cannot be used in all of this query!!!  Also if in execution plan database show that smart scan is used:
Smart Scans On Compressed Indexes (Doc ID 1561260.1)

And we have this situation for problematic query. Index NPI_INC_COMPRESS_1 is a compressed index and database in execution plan decide to use INDEX FAST FULL SCAN on this index and after that FULL_SCAN (SMART SCAN) on TABLE_1 is not used but execution plan shows that is .

Solution for that is add hints FULL on all tables used in this query or change this index to uncompressed . After that smart scan feature is used.


What is really interesting that if index fast full scan is used then is  more than 10 conditions when smart scan is not really used if the even execution plan says otherwise (details in 1561260.1)




piątek, 30 marca 2018

RMAN - transfer archivelog from primary database to standby

rman target /@rman_prm catalog [catalog]@rmancat auxiliary /@rman_stb

backup as copy archivelog sequence between  5643  and 5683   thread 2  auxiliary format '+RECOC1';

środa, 28 lutego 2018

Checking most heavy queries every snapshot and system statistics

Most important system statiscts/performance overview:

SELECT begin_time,
  CASE METRIC_NAME
    WHEN 'SQL Service Response Time'
    THEN 'SQL Service Response
Time (secs)'
    WHEN 'Response Time Per Txn'
    THEN 'Response Time Per Txn
(secs)'
    ELSE METRIC_NAME
  END METRIC_NAME,
  CASE METRIC_NAME
    WHEN 'SQL Service Response Time'
    THEN ROUND((MINVAL / 100),2)
    WHEN 'Response Time Per Txn'
    THEN ROUND((MINVAL / 100),2)
    ELSE MINVAL
  END MININUM,
  CASE METRIC_NAME
    WHEN 'SQL Service Response Time'
    THEN ROUND((MAXVAL / 100),2)
    WHEN 'Response Time Per Txn'
    THEN ROUND((MAXVAL / 100),2)
    ELSE MAXVAL
  END MAXIMUM,
  CASE METRIC_NAME
    WHEN 'SQL Service Response Time'
    THEN ROUND((AVERAGE / 100),2)
    WHEN 'Response Time Per Txn'
    THEN ROUND((AVERAGE / 100),2)
    ELSE AVERAGE
  END AVERAGE
FROM SYS.DBA_HIST_SYSMETRIC_SUMMARY
WHERE METRIC_NAME IN ('CPU Usage Per Sec', 'CPU Usage Per Txn', 'Database CPU Time Ratio', 'Database Wait Time Ratio', 'Executions Per Sec', 'Executions Per Txn', 'Response Time Per Txn', 'SQL Service Response Time', 'User Transaction Per Sec','User Commits Per Sec')
AND BEGIN_TIME BETWEEN sysdate -1 and sysdate
ORDER BY 1;


Output:

Most heavy queries:

SELECT t.sql_id,
  dbms_lob.substr(q.SQL_TEXT,100,1),
  t.PARSING_SCHEMA_NAME username,
  t.executions_delta exec_count,
  begin_interval_time,
  ROUND(SUM(t.elapsed_time_delta/1000000)/SUM(t.executions_delta),4) time_exec
FROM dba_hist_sqlstat t,
  dba_hist_snapshot s,
  DBA_HIST_SQLTEXT q
WHERE t.snap_id           = s.snap_id
AND t.dbid                = s.dbid
AND q.sql_id              =t.sql_id
AND t.instance_number     = s.instance_number
AND t.executions_delta   IS NOT NULL
AND t.elapsed_time_delta IS NOT NULL
AND t.executions_delta    > 0
AND s.begin_interval_time BETWEEN TRUNC(sysdate)-1 AND TRUNC(sysdate)
AND t.PARSING_SCHEMA_NAME NOT                  IN ('SYS','SYSTEM','DBSNMP') -- yesterday's stats
GROUP BY t.sql_id,
  dbms_lob.substr(q.SQL_TEXT,100,1),
  PARSING_SCHEMA_NAME,
  t.executions_delta,
  s.begin_interval_time
ORDER BY 5,6 DESC;


Output:
 

środa, 22 lutego 2017

RMAN-06004: ORACLE error from recovery catalog database: ORA-20999: internal error: found non-null and null site name


In metalink is only one solution. Recreate controlfile and this issue occurs for database with long db_unique_name ( 30 chars in length) and when catalog is in use.

RMAN-06004: ORACLE error from recovery catalog database: ORA-20999: internal error: found non-null and null site name

I found another solution. I set debug option for rman and in trace file was information:

DBGSQL:           RCVCAT> select count(*) into cnt from db
DBGSQL:              sqlcode = 905
DBGSQL:           error: ORA-00905: missing keyword (krmkosqlerr)


So. It's looks like that rman wrongly generate code. After that I decided to use:

upgrade catalog 

and now everything works ...

Probably is a small difference in catalog (for example after PSU) and all packages must be recreated.

What is strange that command "upgrade catalog" had been executed on newly created catalog... 

środa, 27 lipca 2016

resmgr:pq queued

When trying to connect to database (11g,12c)  I  see wait events on "resmgr:pq queued" indefinitely.
What can be done to resolve the resmgr:pq queued wait message?


By changing the following parameter, this issue was resolved :

alter system set "_parallel_statement_queuing"=FALSE scope=both;

czwartek, 21 lipca 2016

Clusterware out of sync

I had strange inconsistency in Clusterware 12c.

crsctl stat res -t -w "NAME = ora.dbname.db"

return information that db is configured on:
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dbname.db
      1        ONLINE  ONLINE       node04              Open,Readonly,STABLE
      2        ONLINE  UNKNOWN      node01              Abnormal Termination
                                                             ,STABLE
--------------------------------------------------------------------------------

but when I checked configuration for this database:

srvctl config database -d dbname
...
Database instances: dbname1,dbname1
Configured nodes: node03,node04

 In documentation is a info that crsctl has a option "relocate resource" but dosn't work :/

To relocate resource I had to use:
srvctl modify instance -d dbname -i dbname1 -n node03


In one of document on metalink I found that only modify option force clusterware to synchronize information about resources

środa, 11 maja 2016

KTBCONVERTREDO

ORA-600 [KTBCONVERTREDO_1]


After ora-4030 errors and sequence of ora-600 ktbConvertRedo_1
and  led to an instance crash. Restarting database didn't help
(database was up ~ 5minutes).
We tried also switchover  to physical standby but unfortunately
we received the same error. 
Smon generated trace and if you carefully look in trace there is
 a object_id. When we tried 
run queries on this object then we received the same error. 
Bingo!!! Object had been corrupted.

luckily that was table with logs from application and we can
 easy drop table and recreate empty structure.   
 
In metalink I found information that  event 10153 should be set
 to see object_id but without that in alertlog this information was.