Latch, mutex and beyond

July 30, 2012

Mutex waits. Part III. Contemporary Oracle wait schemes diversity.

Filed under: 11.2,Contention,DTrace,Instrumentation,Mutex,OS tuning,Patches,Spinlock — andreynikolaev @ 12:21 pm

Several months have passed since my previous “mutex wait” post. I was so busy with work and conference presentations. Thanks to all my listeners at UKOUG2011, Hotsos2012 and Medias2012 conferences and several seminars for inspiring questions and conversations.

I. Unexpected change.

Now it is time to discuss how contemporary Oracle waits for mutexes. My previous posts described evolution of “invisible and aggressive” 10.2-11.1 mutex waits into fully accounted and less aggressive 11gR2 mutexes. Surprisingly Oracle (or PSU2) appeared in April 2011 demonstrated almost negligible CPU consumption during mutex waits. (more…)

October 25, 2011

Mutex waits. Part II. “Cursor: Pin S” in Oracle 11.2 _mutex_wait_scheme=0. Steps out of shadow.

Filed under: 11.2,Contention,DTrace,Instrumentation,Mutex,OS tuning,Patches — andreynikolaev @ 4:23 pm

I would like to describe how Oracle versions waited for mutexes. This algorithm also appears to be used in post- PSUs and new patchset as _mutex_wait_scheme=0.

My previous post demonstrated that before version 11.2:

  • “Cursor: pin S” was pure wait for CPU. Long “cursor: pin S” waits indicated CPU starvation.
  • Mutex contention was almost invisible to Oracle Wait Interface
  • Spin time to acquire mutex was accounted as CPU time. It was service time, not waiting time.

Things changed. Mutex waits in Oracle 11.2 significantly differ from previous versions. Contemporary mutex waits are not CPU aggressive anymore, completely visible to Oracle Wait Interface and highly tunable.

July 9, 2011

Mutex waits. Part 1. “Cursor: Pin S” in Oracle 10.2-11.1. Invisible and aggressive.

Filed under: Contention,History,Instrumentation,Mutex,Patches,Spinlock — andreynikolaev @ 1:18 pm

Oracle KGX mutexes appeared more than 7 years ago. However, mutex waits are still obscure. Oracle Documentation provided only brief description of mutex wait events without any information about wait durations and timeouts.
Look at the following timeline:


April 22, 2011

“Cursor: pin S” mutex contention testcase and diagnostics tools.

Filed under: 11.2,Instrumentation,Mutex — andreynikolaev @ 4:33 pm

I like testcases. One testcase results in more understanding than ten page article or weeks of data collection. This is why we need reproducible testcases if we want to explore mutex contention. Testcases will also give me a possibility to demonstrate how to use mutex contention diagnostics tools embedded in Oracle. I will use Oracle for Linux X86 32bit on my Dual-Core laptop in this posts. Your numbers for other Oracle versions and platforms may vary.

I. “Cursor: pin S” contention testcase:

Each time the session execute SQL operator, it needs to ‘pin’ the cursor in library cache using mutex. True mutex contention arises when the same SQL operator executes concurrently at high frequency. Therefore the simplest testcase for “Cursor: pin S” contention should look like:

for i in 1..1000000
   execute immediate 'select 1 from dual where 1=2';
end loop;


February 23, 2011

Latch statistics

Filed under: DTrace,Instrumentation,Latch,Latch Statistics,Theory,Uncategorized — andreynikolaev @ 10:36 am

In previous posts, I investigated how the Oracle process spins and waits for the latch. Now we need the tool to estimate when the latch acquisition works efficiently and when we need to tune it. This tool is the latch statistics. Contemporary Oracle documentation describes v$latch statistics columns as:

Statistic: x$ksllt column Documentation description: When and how it changed:
GETS kslltwgt
“wait gets”
Number of times the latch was requested in willing-to-wait mode Incremented by one after latch acquisition. Therefore protected by latch
MISSES kslltwff
“wait fails”
Number of times the latch was requested in willing-to-wait mode and the requestor had to wait Incremented by one after latch acquisition if miss occured
SLEEPS kslltwsl
“wait sleeps”
Number of times a willing-to-wait latch request resulted in a session sleeping while waiting for the latch Incremented by number of times process slept during latch acquisition
SPIN_GETS ksllthst0 Willing-to-wait latch requests which missed the first try but succeeded while spinning Incremented by one after latch acquisition if miss but not sleep occured. Counts only the first spin
WAIT_TIME kslltwtt
“wait time”
Elapsed time spent waiting for the latch (in microseconds) Incremented by wait time spent during latch acquisition.
“nowait gets”
Number of times a latch was requested in no-wait mode Incremented by one after each no-wait latch get. May not be protected by latch
“nowait fails”
Number of times a no-wait latch request did not succeed Incremented by one after unsuccessful no-wait latch get. Not protected by latch


July 11, 2010

Latch get and spin instrumentation. The unknown knowns. V2

Filed under: Instrumentation,Latch — andreynikolaev @ 12:11 am
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total     Waits Time (s)
Event                                               Waits    Time (s) Ela Time   per sec  per sec
-------------------------------------------- ------------ ----------- -------- --------- --------
enqueue                                         1,801,215   3,281,392    59.82     499.9   910.74
buffer busy waits                               1,984,703   1,235,865    22.53     550.8   343.01
latch free                                      6,425,043     847,386    15.45   1,783.2   235.19
SQL*Net break/reset to client                      50,394      35,937      .66      14.0     9.97
CPU time                                                       23,828      .43               6.61

This is the statspack report for instance suffered from heavy latch contention.
Every time I saw such CPU bound Oracle instance with latch contention, I asked myself. Which part of this CPU power is currently burned for useless latch spin attempts? How many processes spin for the latch? How can we estimate this?

Unfortunately I still do not have contemporary answer yet. But in this post I would like to show that we had had such estimations before 11g.

We all do know that latch wait is instrumented well in Oracle wait interface. Oracle 11.2 has 32 specific latch wait events and one general ‘latch free’. But all these events are only for latch sleeps. Oracle Wait Interface don’t know anything about latch gets and spins.

It occurs that Oracle had instrumented the latch acquisition also, and even documented it. I do not know why it is not popular enough. This instrumentation resides in process array v$process. The fixed table behind v$process view is x$ksupr.

Of course, my post is about v$process.latchwait and v$process.latchspin. (more…)

Create a free website or blog at