Figuring Out What EXPDP and IMPDP are Doing Right Now

Starting Oracle 10g, the old export/import tools were replaced by a newer, more sophisticated tool: the data pump.
This new tool had some advantages over the older tool – it was able to run in parallel, it has a better interface and it can be ran from PL/SQL code easily. It can even be restarted or resumed if needed. On the con side, it uses the database DIRECTORY object which means the file we’re exporting to or importing from must reside on the server.

The problem start when we try to diagnose what is going on when there is a performance issue. Since the process we use for exporting might be forked into multiple processes, we can’t really know what is going on behind the scenes.

So, how can we resolve that?

OS Level


First level: how to we recognize the processes that are related to the expdp command?

Answer:

/app01/oracle 17:54> expdp zelkayam2 directory=HISTORY_DIR dumpfile=zelkayam%u.dmp full=y parallel=2
/app01/oracle 17:54> ps -ef | grep expdp

oracle 27854 27567 0 17:54 pts/0 00:00:00 expdp directory=HISTORY_DIR dumpfile=zelkayam%u.dmp full=y parallel=2

Session Level


Level two: How do recognize the sessions within the database?

Answer 1: getting the current jobs – the DATAPUMP_SESSIONS and ATTACHED_SESSIONS columns will show us the current used SIDs:

select * from DBA_DATAPUMP_JOBS;

Answer 2: if we actually want to see the current processes and session information, we can join the DBA_DATAPUMP_JOBS view to the V$SESSION and V$PROCCESS views like this:

select to_char(sysdate,'YYYY-MM-DD HH24:MI:SS') "DATE", s.program, s.sid,
s.status, s.username, d.job_name, p.spid, s.serial#, p.pid
from v$session s, v$process p, dba_datapump_sessions d
where p.addr=s.paddr and s.saddr=d.saddr;

DATE                PROGRAM                                    SID STATUS   USERNAME   JOB_NAME                       SPID    SERIAL#     PID
------------------- -------------------------------------- ------- -------- ---------- ------------------------------ ------- ------- -------
2010-11-11 17:54:45 [email protected] (TNS V1-V3)                        40 ACTIVE   ZELKAYAM2  SYS_EXPORT_FULL_01             28906     38890      26
2010-11-11 17:54:45 [email protected] (DM00)                          37 ACTIVE   ZELKAYAM2  SYS_EXPORT_FULL_01             28908      1939      30
2010-11-11 17:54:45 [email protected] (DW01)                          80 ACTIVE   ZELKAYAM2  SYS_EXPORT_FULL_01             28911     19847      31
2010-11-11 17:54:45 [email protected] (DW02)                         126 ACTIVE   ZELKAYAM2  SYS_EXPORT_FULL_01             28915      3684      12

Tracing the Sessions


Level 3: How do we trace the actual session?

Okay, so this is a bit tricky. Let’s say we have an export that is running a bit slow and we want to trace the session. We know that the regular trace won’t really help us. We need to start the trace the moment the session start – but we can’t really control this since the export coordinator is doing that for us. How do we trace the sessions?

The answer is that the expdp and impdp has a built in mechanism to trace executions. We can add the (hidden) expdp parameter “trace” so we can mark the command to output a useful trace file. Unfortunately, this parameter is hidden – meaning it does not appear in the expdp help=yes and we still need to figure out how to use it.

The trace parameter is being set using an hexadecimal number with 7 digits. There is no need to add 0x at the beginning of the string – it will not accept decimal numbers or binary inputs. The number must be written with a lower case. We will also need the privileges to run trace on the session, and obviously, the permissions to export or import.

Once we figure out all of these, we can use it to run the traced execution:

/app01/oracle 18:19> expdp zelkayam2 directory=HISTORY_DIR dumpfile=zelkayam2%u.dmp full=y parallel=2 trace=480300

There are 2 kinds of trace files that will be created in our background dump dest (diag/rdbms/trace):

  1. One trace file for the master processes – file names SID>_dm_.trc>
  2. One or multiple trace files for the worker processes – file names SID>_dw_.trc>

How do we set what to trace?


In order to know what and how to trace, we can use this helpful table. If we want to add more than one trace levels, we need to add the trace levels together (make sure you use hex calculator if you do – since 40300 + 80300 in hex is not the same for dec):

-- Summary of Data Pump trace levels:
-- ==================================

  Trace   DM   DW  ORA  Lines
  level  trc  trc  trc     in
  (hex) file file file  trace                                         Purpose
------- ---- ---- ---- ------ -----------------------------------------------
  10300    x    x    x  SHDW: To trace the Shadow process (API) (expdp/impdp)
  20300    x    x    x  KUPV: To trace Fixed table
  40300    x    x    x  'div' To trace Process services
  80300    x            KUPM: To trace Master Control Process (MCP)      (DM)
 100300    x    x       KUPF: To trace File Manager
 200300    x    x    x  KUPC: To trace Queue services
 400300         x       KUPW: To trace Worker process(es)                (DW)
 800300         x       KUPD: To trace Data Package
1000300         x       META: To trace Metadata Package
--- +
1FF0300    x    x    x  'all' To trace all components          (full tracing)

A common trace level is 480300 which monitors both the master and the workers.
I hope to write more about this (if I’ll see a demand for that) – in the meantime, you can read the MOS note that explains how to read the trace here: note 286496.1.

2 replies
  1. Pini Dibask
    Pini Dibask says:

    Great post Zohar (as always).
    Another option that I like is to use the ATTACH parameter when running the IMPDP command. It enables to see all the running worker processes, and their current status (including % progress, completed objects, total objects, current object).

    Thanks Zohar!

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply