High waits on enq: US - contention and row cache lock were seen during a load test on a two-node cluster. In the statspack reports these two wait events were among the top 5 wait events.
Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
enq: US - contention                             7,498       4,035    538   21.9
row cache lock                                   8,486       1,240    146    6.7

During normal operation there were no waits on enq: US - contention, and row cache lock waits were between 0 and 4. The high waits only appear during the load test, when the system is stressed.
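Outside of a statspack snapshot, the same two events can be checked from the cumulative wait statistics. The query below is a minimal sketch and was not part of the original investigation; it assumes select access to gv$system_event. The figures are cumulative since instance startup, so two samples have to be compared to get the waits for a given period.

```sql
-- Cumulative waits per instance since startup for the two events of interest.
-- time_waited_micro is reported in microseconds.
select inst_id,
       event,
       total_waits,
       round(time_waited_micro / 1000000)                       as time_waited_s,
       round(time_waited_micro / 1000 / nullif(total_waits, 0)) as avg_wait_ms
  from gv$system_event
 where event in ('enq: US - contention', 'row cache lock')
 order by inst_id, event;
```

An instance that has never waited on an event returns no row for it, which is consistent with the "no waits during normal operation" observation.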
The first peaks on the following graphs correspond to the high waits on the above events observed during the initial load test.
[Graph: enq: US - contention]
[Graph: row cache lock waits]
Note 1332738.1 suggests this is related to undo segments and would show up in the dc_rollback_segments dictionary cache statistics. However, there was not much difference in this metric between the problem period and a good period. Below is the problem period:

                                   Get    Pct    Scan   Pct      Mod      Final
Cache                         Requests   Miss    Reqs  Miss     Reqs      Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_objects                     105,703    0.1       0              0      5,389
dc_rollback_segments            38,780    0.3       0            250        514
dc_segments                      4,234    5.0       0             12      2,719
dc_tablespaces                 165,248    0.0       0              0         21
dc_users                       178,080    0.0       0              0        222

The good period:
                                   Get    Pct    Scan   Pct      Mod      Final
Cache                         Requests   Miss    Reqs  Miss     Reqs      Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_objects                     284,414    0.5       0              8      2,753
dc_rollback_segments            22,307    0.0       0              0        515
dc_segments                     17,724    7.9       0             10      1,790
dc_tablespaces                 142,346    0.0       0              0         21
dc_users                       158,440    0.0       0              0        116

Comparing the above two, there is only a slight difference, but the GES statistics for the problem period show the following:
                                   GES          GES          GES
Cache                         Requests    Conflicts     Releases
------------------------- ------------ ------------ ------------
dc_objects                          84            2            0
dc_rollback_segments               511          133            0
dc_segments                        352            6            0

There were no GES requests or conflicts for dc_rollback_segments during the good period.
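The dictionary cache and GES counters that statspack reports can also be read directly. The following is a sketch only, assuming select access to gv$rowcache; it is not taken from the original analysis. The counters are cumulative since startup, so the difference between two samples is what corresponds to a statspack interval.

```sql
-- Dictionary cache and GES (DLM) counters for the undo segment cache, per instance.
-- A parameter can have several rows in v$rowcache, hence the sums.
select inst_id,
       parameter,
       sum(gets)          as get_requests,
       sum(getmisses)     as get_misses,
       sum(modifications) as mod_reqs,
       sum(dlm_requests)  as ges_requests,
       sum(dlm_conflicts) as ges_conflicts,
       sum(dlm_releases)  as ges_releases
  from gv$rowcache
 where parameter = 'dc_rollback_segments'
 group by inst_id, parameter
 order by inst_id;
```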
The datafiles assigned to the undo tablespaces have autoextend on, and there is enough free space on the disk to extend them. Therefore 420525.1 and 413732.1 weren't much help.
742035.1 and 7291739.8 mention bug 7291739, which manifests as high contention on the above two wait events when auto-tuned undo retention is in use. Therefore the patch for bug 7291739 was applied and the parameter _first_spare_parameter was set to the run length of the longest-running query found in v$undostat (this is 11.1; other versions may require _HIGHTHRESHOLD_UNDORETENTION, refer to the notes mentioned above). Running the load test again didn't show any improvement, and the high waits could still be seen (second peak on the above graphs).
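The longest-running query length mentioned above can be read from v$undostat along these lines. This is a sketch rather than the exact statement used at the time. v$undostat is per instance, so on a cluster it would be run on each node (or gv$undostat used instead), and it only covers the history the view still retains.

```sql
-- Longest-running query (seconds) and the highest auto-tuned undo retention
-- (seconds) recorded in the retained v$undostat history.
select max(maxquerylen)         as longest_query_s,
       max(tuned_undoretention) as max_tuned_retention_s
  from v$undostat;
```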
Raised an SR. Oracle couldn't determine why the patch was not effective in reducing the high wait events and suggested another hidden parameter, _rollback_segment_count (also mentioned as a workaround in 1332738.1). It was recommended to set this parameter to 1.5 times the number of online undo segments.
SQL> select TABLESPACE_NAME,count(*) from DBA_ROLLBACK_SEGS where status='ONLINE' group by tablespace_name;

TABLESPACE_NAME   COUNT(*)
--------------- ----------
UNDOTBS1               323
SYSTEM                   1
UNDOTBS2               300

According to Oracle the value set is for the "entire instance not for undo tablespace", which I would imagine means per database and not per instance. This value acts as the "lower limit for the number of undo segments online at a given time". Setting it doesn't make the database proactively bring the specified number of undo segments online. It is the minimum number of undo segments to be kept online, and it only comes into play once the number of online undo segments goes beyond the value specified. So going by the above statistics the value to set would be (323 + 300) x 1.5 = 935 (934.5 rounded up). One more thing: this value is not dynamic and requires a restart (not an ideal workaround for a busy production system).
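The sizing arithmetic and the parameter change could be scripted roughly as follows. This is a sketch under stated assumptions, not the statements Oracle Support supplied: it assumes the SYSTEM rollback segment is excluded (as in the 323 + 300 sum), that the result is rounded up, that the hidden parameter is written with its leading underscore as "_rollback_segment_count", and that sid = '*' is wanted so both cluster instances pick up the value.

```sql
-- Online undo segments across the undo tablespaces (SYSTEM excluded),
-- multiplied by 1.5 and rounded up.
select count(*)             as online_undo_segments,
       ceil(count(*) * 1.5) as suggested_value
  from dba_rollback_segs
 where status = 'ONLINE'
   and tablespace_name <> 'SYSTEM';

-- The parameter is not dynamic: write it to the spfile, then restart.
-- Hidden parameter names must be double-quoted; 935 comes from the counts above.
alter system set "_rollback_segment_count" = 935 scope = spfile sid = '*';
```

With the counts listed above, the first query returns 623 online undo segments and a suggested value of 935.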
After the above value was set, running the load test did not result in any enq: US - contention waits. It should be noted that the patch was still in place when this parameter was set, but it is highly unlikely that it contributed to resolving the high waits. It is possible that _rollback_segment_count alone is responsible for reducing the high waits. This will be verified once the patch is rolled back later on.
Useful metalink notes
Full UNDO Tablespace In 10gR2 [ID 413732.1]
Contention Under Auto-Tuned Undo Retention [ID 742035.1]
Automatic Tuning of Undo_retention Causes Space Problems [ID 420525.1]
How to correct performance issues with enq: US - contention related to undo segments [ID 1332738.1]
Bug 7291739 - Contention with auto-tuned undo retention or high TUNED_UNDORETENTION [ID 7291739.8]