How to investigate real time over all performance bottleneck (enq: TX – row lock contention) using OS tools like top/sar/vmstat,ASH,AWR,ORATOP and OEM
I will simulate “row lock contention” using following procedure( I will resue my test case mentioned in Test case 4:-Now I will run update in parallel 10 sessions of https://clouddba.co/test-case-when-index-size-can-grow-than-column-size-and-effect-of-rebuild-index/)
CREATE OR REPLACE procedure TEST.manyupdates as
v_m number;
begin
for i in 1..100000 loop
select round(dbms_random.value() * 44444444444) + 1 into v_m from dual t;
update t set id=v_m/15*i where rownum<10;
commit;
end loop;
end;
/
CREATE OR REPLACE procedure TEST.manysessions_update as
v_jobno number:=0;
begin
FOR i in 1..10 LOOP
dbms_job.submit(v_jobno,’manyupdates;’, sysdate);
END LOOP;
commit;
end;
/
Exec TEST.manysessions_update;
As we are running this procedure parallel with 10 sessions,it will definitely create row lock contention even in very short time.
Now application team will raise concern that system is very slow and you as DBA start to investigate.
A.real time performance tuning using OS tools
Tool1:-top utility
You should check %CPU section to identify top CPU consuming processes
You will notice load average is almost 5.It means at least 5-6 processes are running at a time.You can check in TOP with column S where it shows ‘R’ means running and ‘S’ means sleeping.
Also note 72% idle and 22% CPU busy.It is just a indication that your system is bit busy with CPU load.
%Cpu(s): 22.6 us, 4.6 sy, 0.0 ni, 72.3 id, 0.1 wa, 0.0 hi, 0.5 si, 0.0 st
If you want to investigate which process takes more memory,you need to sort the output on basis of “RES”
Tool 2:-SAR utility with -U option to check CPU and IO bottleneck
sar -u 10 8
%user: This shows the total time that the processor is spending on different process .
%iowait: the name iowait itself suggests that its the time spend by processor waiting for devices(input and output)
%nice: Most of you guys must be knowing that a user can change the priority of a linux process. This table shows the time spend by CPU for process whose nice value has been changed.
%steal: This column shows the amount to time spend by a CPU (which is virtualized), for resources from the physical CPU
%idle: This suggests the idle time spend by the processor.
A low %idle time could point to which processes are using the most CPU . Use the ps or top command to find a CPU-intensive job. A poorly written query requiring a large amount of disk access can also cause a large amount of CPU usage as well.
In the above sar output, the cause for concern is the large values being returned for %user means application is using more cpu.%iowait (waiting for block I/O) is low here so we are not having disk contention.
We need to make below consideration while checking sar report of CPU
- Low CPU idle times.
- High percentage of time spent waiting on I/O or %iowait> 10.
- Bottlenecks with %system> 15, which could indicate that swapping, paging, or backups are causing a bottleneck.
- Abnormally high %user, which could be due to applications not being tuned properly or CPU over utilization.This is our case.
Tool 3:vmstat report
vmstat 1 10
r column is runnable processes.So 7 processes are currently running.IDLE percent is overall 70%.
Any non-zero value of “SO” column leads to probable memory leak in SWAP which needs to be analyzed immediately.
For more info about vmstat:-
Linux vmstat Command – Tool to Report Virtual Memory Statistics
Tool 4:To identify disk bottleneck
sar -d 5 2
tps: tps stands for transfer per second, so it shows the transfer per second to that particular device
rd_sec/s: this shows you the total number of sectors on that device which is being read
wr_sec/s: if rd_sec/s is sectors being read per second then obviously wr_sec is sectors being written per second.
await: this shows the total number of time that the processor waited for requests regarding IO
%util on a device that is greater than 50 percent denotes disk bottleneck.In our case it is normal for highly utilized disk.
If await is greater than svctm due to Unbalanced disk I/O load.
Please note disk dev129-160 is most busiest disk.The tps is 100 almost.
Tool 5:-sar -b to report disk usage
- tps – Transactions per second (this includes both read and write)
- rtps – Read transactions per second
- wtps – Write transactions per second
- bread/s – Bytes read per second
- bwrtn/s – Bytes written per second
The sar –b command reports on the system’s buffer cache activities (not the Oracle buffer cache). It gives you the number of transfers per second between system buffers and block devices.
Tool 6:-sar -q to find processes under queue.We need to look blocked section
Tool 7:-To identify memory usage using sar -r
B.real time performance tuning using ORATOP report
./oratop -f -d -i 10 / as sysdba
ASW:-Active session waiting
Event:-Enq:TX – row lock contention is top wait event.AVG_MS is almost 0.It will be big concern if AVG_MS is very large which means your session is being blocked long time.
SQLID/BLOCKER:-You will get which session is blocking and which sessions are blocked.
Other statistics are also will be useful for more investigation.
Please refer for more info on oratop :-
C.real time performance tuning using OEM monitoring:-
Form OEM console you can check how many active sessions are waiting and RED means concurrency wait.You can also find SQL_ID responsible.
D.real time performance tuning using AWR report:-
You may look below section for AWR.The most important top 10 foreground events .42% DB time has been contributed by enqTX – row lock contention.
This section segments by row lock waits show which objects are getting locked.
E.Real time performance tuning using ASH report:-
ASH report is very handy for generating which SQL are responsible for locking.
Please check top sessions with SID and SERIAL for deeper details.Please check %event column to check what are most eventful wait events.
F.Real time performance tuning using SQL Developer Please refer below:-