I was working with the data warehouse team at a customer site and at some point we realized that some parallel executions are not getting enough resources (downgraded).
Not getting enough parallel processes in such a complex environment is really bad. That means that since everybody is hogging the CPU, some sessions will not be able to complete inside the night ETL time frame. If that happens – some ETLs will go on into the day providing wrong data to the customer in even worse performance for the morning shift. Another aspect is the memory usage for large sorts or hash joins. Using less processes will mean some of the data will not reside in the memory and will need to be allocated in the temporary tablespace.
The customer asked how can he find those downgraded (meaning, not getting enough parallel processes) at real time. This is the script we created for that.
First of all, we need to distinguish between two types of downgrades.
The first one is downgrading the query so it will receive partial resources. The query itself looks like it is running in parallel – but it is only getting some of the processes it needs/expects. For example, if the ETL process expect 8-12 parallel processes, and the session got 4-6 – that is considered disastrous to the customer.
In order to locate those queries, we used this query. This query find the parent PX coordinator, the requested degree and the actual degree.
SELECT a.sid parent_sid, a.logon_time, a.program, a.osuser, a.username, a.sql_id, ps.req_degree, ps.got_degree FROM (SELECT s.sid, s.osuser, s.USERNAME, s.status, s.sql_id, s.logon_time, s.last_call_et, UPPER(s.program) program FROM v$session s WHERE ((s.username IS NOT NULL) AND (NVL(s.osuser, 'x') <> 'SYSTEM') AND (s.TYPE <> 'BACKGROUND'))) a JOIN (SELECT DISTINCT qcsid, CASE WHEN req_degree > degree THEN 1 ELSE 0 END is_downgrade, MAX(req_degree) req_degree, MAX(degree) got_degree FROM v$px_session GROUP BY qcsid, CASE WHEN req_degree > degree THEN 1 ELSE 0 END) ps ON a.sid = ps.qcsid WHERE ps.is_downgrade = 1
This query assumes the query got some kind of parallel execution – but that not always the case. That brings us to the second type of downgrade.
In the following query we tried to detect the queries that did not get any parallel processes and were downgraded to serial.
SELECT sess.sid parent_sid, logon_time, sess.program, sess.osuser, sess.username, sql_id, TRUNC (last_call_et / 60) dur_minutes FROM v$sesstat sesstat, v$sysstat sysstat, v$session sess WHERE sesstat.statistic# = sysstat.statistic# AND sesstat.sid = sess.sid AND name = 'Parallel operations downgraded to serial' AND sesstat.VALUE > 0 AND sess.status = 'ACTIVE'