You know what? How to debug RMAN !!


Most of cases of RMAN might be critical situation except daily or regular backup operation. Ok let's look at when you need to use RMAN for restore or recover data from backup. We can guess that RMAN seems not doing anything but ..... you don't know the inside of RMAN command execution.

Then there is a way to check it out.
DEBUG or TRACE for RMAN process or command. Let's give it a go for it.

First of all, there are 2 ways of generating log file from RMAN command.

From RMAN prompt;
RMAN> spool log to '/tmp/rman_logtest.log';
RMAN> "run-rman-command"
RMAN> spool log off;

From the command prompt;
$rman target catalog log=/tmp/rman_logtest.log
RMAN> "run-rman-command"
RMAN> exit;

In cases of RMAN operation failure, debug trace is good to collect debugging information that can help troubleshoot the underlying causes. You can use below syntax to generate debug trace:

$rman taret / catalog debug trace=/tmp/rman_debugtest.trc log=/tmp/rman_debugtest.log
RMAN> "run-rman-command"
RMAN> exit;

But sometime you need to collect more information about RMAN issue including performance as well. Let's see what we need to do.

1. RMAN debug output
2. Channel debug and trace output trace files
3. AWR/STATSPACK reports
4. 10046 event trace files/tkprof report
5. OSB or 3rd party media issue information

RMAN debug output

This will always provide information to help the issue if RMAN is a fault. Oracle will produce a useful error as "sqlcode=" or "rc=" in the debug. If you do not see any errors and the channel released due to errors it is likely the failure occurred outside of RMAN and failed in Oracle (channels) or the device output used for RMAN operation.

RMAN> DEBUG ON
RMAN> "run-rman-command"
RMAN> DEBUG OFF

Channel debug/trace output trace files

trace 1: general trace information for skg and sbt
trace 5: verbose channel tracing of skg and sbt calls

CONFIGURE CHANNEL DEVICE TYPE DISK DEBUG=5 TRACE 1;
or
CONFIGURE CHANNEL DEVICE TYPE SBT_TAPE PARMS "END=(....) DEBUG=5 TRACE 5;

* In 12c you need to specify "Highest/High/medium" for DEBUG, older version can use number as above.
RMAN> CONFIGURE CHANNEL DEVICE TYPE DISK DEBUG "high" TRACE 2;
new RMAN configuration parameters:
CONFIGURE CHANNEL DEVICE TYPE DISK DEBUG 'high' TRACE 2;
new RMAN configuration parameters are successfully stored


AWR reports Analysis


You can check the Top events and CPU time for RMAN session from AWR reports

                                                                % of  % of Total
Statistic Name                                       Time (s) DB Time   CPU Time
------------------------------------------ ------------------ ------- ----------
sql execute elapsed time                             67,099.7    96.4
DB CPU                                               30,753.9    44.2       68.8
PL/SQL execution elapsed time                         4,109.4     5.9
Java execution elapsed time                           3,521.3     5.1
RMAN cpu time (backup/restore)                        1,842.1     2.6        4.1
parse time elapsed                                      182.1      .3
inbound PL/SQL rpc elapsed time                         125.9      .2
hard parse elapsed time                                  54.9      .1
repeated bind elapsed time                               34.8      .0
connection management call elapsed time                  20.1      .0
hard parse (sharing criteria) elapsed time                0.9      .0
PL/SQL compilation elapsed time                           0.3      .0
failed parse elapsed time                                 0.1      .0
sequence load elapsed time                                0.1      .0
hard parse (bind mismatch) elapsed time                   0.0      .0
DB time                                              69,617.2
background elapsed time                              26,906.1
background cpu time                                  13,939.9               31.2
total CPU time                                       44,693.8


Background Wait Events            
                                                Total        Avg
                                       %Time     Wait       wait    Waits   % bg
Event                            Waits -outs Time (s)       (ms)     /txn   time
-------------------------- ----------- ----- -------- ---------- -------- ------
RMAN backup & recovery I/O     426,685     0    8,439      19.78      0.3   31.4
cell single block physical  10,975,008     0    3,469       0.32      8.1   12.9
enq: CF - contention             6,664    20    1,042     156.40      0.0    3.9
control file sequential re   2,332,054     0      930       0.40      1.7    3.5

10046 Event Trace files/Tkprof Report


RMAN is running slow then you need to check the all SQLs from RMAN metadata and 10046 event will give you more clear view on this detail information.

RMAN> sql "alter session set events ''10046 trace name context forever, level 12''";

tkprof testdb_ora_9605.trc chk_list_backupofdatabase_10046_rman.txt

[sample output]
SQL ID: ggw6zw74vrg8t Plan Hash: 480089779

select db.name, dbinc.resetlogs_time, dbinc.resetlogs_change#, db.dbid,
  rpad(db.name, 8, 'x'), dbinc.prior_resetlogs_time,
  dbinc.prior_resetlogs_change#, primary_db_unique_name into :dbname, :rltime,
   :rlscn, :fhdbi, :pdbname, :parent_rltime, :parent_rlscn, :prim_dbuname
from
 v$database db, v$database_incarnation dbinc where dbinc.status='CURRENT'

OSB or 3rd Party Media issue

You can use obtool or 3rd party tool to check the TAPE library issue.

So we can set debug/Trace for RMAN session and get more information from trace files. From 12.1 debug option has been changed to use "Highest/High/Medium" and trace is same as before.


allocate channel c1 device type disk debug "highest/high"/medium";
or
alter system set events 'trace[krb.*] disk highest/high/medium'

That's for Today. See you next time.

Comments

Popular posts from this blog

You know what? How to debug RMAN !!

You know what? - DIAGWAIT

You know what " TRACE for EXPDP/IMPDP"