Latch, mutex and beyond

March 19, 2011

11.2.0.2 is the right patchset for mutexes

Filed under: 11.2,Mutex,Patches — andreynikolaev @ 10:05 am

Many people asked me about the second part of my blog title – the mutex. This is the first post about it. Mutexes is another Oracle spinlock, which was appeared in version 10.2.0.2. Despite being known since 2005, Oracle mutex internals is still Terra incognita.

This post is inspired by several recent escalations due to mutex contention. It occurs that 11.2.0.2 patchset contains extraordinary number of mutex related changes. Some enhancements like 10411618 exist only for 11.2.0.2. The following patches even changed the mutex architecture: (more…)

March 15, 2011

Hotsos 2011 Symposium follow-up

Filed under: Uncategorized — andreynikolaev @ 9:58 pm

Just returned from the Hotsos 2011 Symposium. It was an exciting experience to be speaker at this legendary conference. Many thanks to the Hotsos team for this opportunity.
The Symposium and the presentations were inspiring. I hope my presentation was interesting too. It had to be concise because of one hour time limit. Usually these topics discussed during almost half of my training day. I will cover the topics more detaily in future posts.

February 23, 2011

Latch statistics

Filed under: DTrace,Instrumentation,Latch,Latch Statistics,Theory,Uncategorized — andreynikolaev @ 10:36 am

In previous posts, I investigated how the Oracle process spins and waits for the latch. Now we need the tool to estimate when the latch acquisition works efficiently and when we need to tune it. This tool is the latch statistics. Contemporary Oracle documentation describes v$latch statistics columns as:

Statistic: x$ksllt column Documentation description: When and how it changed:
GETS kslltwgt
“wait gets”
Number of times the latch was requested in willing-to-wait mode Incremented by one after latch acquisition. Therefore protected by latch
MISSES kslltwff
“wait fails”
Number of times the latch was requested in willing-to-wait mode and the requestor had to wait Incremented by one after latch acquisition if miss occured
SLEEPS kslltwsl
“wait sleeps”
Number of times a willing-to-wait latch request resulted in a session sleeping while waiting for the latch Incremented by number of times process slept during latch acquisition
SPIN_GETS ksllthst0 Willing-to-wait latch requests which missed the first try but succeeded while spinning Incremented by one after latch acquisition if miss but not sleep occured. Counts only the first spin
WAIT_TIME kslltwtt
“wait time”
Elapsed time spent waiting for the latch (in microseconds) Incremented by wait time spent during latch acquisition.
IMMEDIATE_GETS kslltngt
“nowait gets”
Number of times a latch was requested in no-wait mode Incremented by one after each no-wait latch get. May not be protected by latch
IMMEDIATE_MISSES kslltnfa
“nowait fails”
Number of times a no-wait latch request did not succeed Incremented by one after unsuccessful no-wait latch get. Not protected by latch

(more…)

January 18, 2011

Spin tales: Part 3. Non-standard latch classes in Oracle 9.2-11g

Filed under: DTrace,Latch,shared latch,_spin_count — andreynikolaev @ 8:20 pm

In version 9.2 Oracle introduced new possibilities for fine grain latch tuning. Latch can be assigned to one of eight classes. Each class can have a different spin and wait policy. In addition, exclusive and shared latches behave differently.

By default, all the latches except “process allocation latch” belong to the standard class 0. In my previous posts, I discussed how the standard class exclusive and shared latches spin and wait. Now, it is the time to explore the non-standard class latch behaviors.
(more…)

January 14, 2011

Spin tales: Part 2. Shared latches in Oracle 9.2-11g

Filed under: DTrace,Latch,shared latch,_spin_count — andreynikolaev @ 6:23 pm

My previous experiments demonstrated that, opposite to common belief, spin count for exclusive latches in Oracle 9.2-11g cannot be tuned dynamically. The _spin_count parameter is effectively static for exclusive latches. This seems to disagree with the well-known study “Using _spin_count to reduce latch contention in 11g” by Guy Harrison. The study explored how dynamic tuning of _spin_count influenced latch waits, CPU consumption and throughput. I think that there is no contradiction. Probably Guy Harrison’s experiments have been performed with the cache buffers chains latch contention. This is the shared latch.

We already know that exclusive latch in Oracle 9.2-11g uses static spin value from x$ksllclass fixed table. This spin can be adjusted by _latch_class_0 parameter. By default, the exclusive latch spins up to twenty thousand cycles.

This post will show that shared latch in Oracle 9.2-11g is governed by _spin_count value and spins upto four thousand cycles by default.
(more…)

January 6, 2011

Spin tales: Part 1. Exclusive latches in Oracle 9.2-11g

Filed under: DTrace,Latch,Spinlock — andreynikolaev @ 9:59 pm

How does Oracle process spin for a latch? How many times does it check the latch before going to sleep? Anyone knows. This is the _spin_count=2000. Two thousand cycles by default. Oracle established this value long ago in version 6 at the beginning of 90s. However, let me count.

My previous investigation showed that latch wait was cardinally changed in Oracle 9.2. At that time, the exponential backoff disappeared. The latches have been using completely new wait posting since 2002. We may expect that latch spin have been changed too. Controversial results of _spin_count tuning in Oracle 9.2 confirm this also. In this series of posts, I will explore how the contemporary Oracle latches spin. The first post is about exclusive latches that form the majority of Oracle latches. For example, 460 out of 551 latches are exclusive in Oracle 11.2.0.2.

I will demonstrate that exclusive latches spin 10 times more than we expected. The _spin_count occurred to be effectively static for exclusive latches, and there is a big difference between not setting _spin_count and setting it to 2000.
(more…)

December 16, 2010

Hotsos Symposium 2011

Filed under: Uncategorized — andreynikolaev @ 9:33 pm

I will be speaking at Hotsos Symposium 2011 which will be held on March 6 — 10, 2011 in Dallas.

Of course, I will speak about “Contemporary Latch Internals“. I would like to post my abstract here:

“Latches in Oracle Database were radically changed during the last 10 years to achieve higher levels of concurrency and performance. Using oradebug and DTrace, this presentation explores how the latch works. Contemporary latch does not use exponential backoff, most of them spin 10 times more then expected, and they may wait for an infinite time in a queue.

This presentation will show how Oracle instruments the latch operations, discuss parameters and statistics related to latch performance diagnostics, and fine grain tuning. It will also explore the long-standing question: when and how to tune the “_spin_count” and “_latch_classes“. DTrace provides additional capabilities to measure effectiveness and statistical properties of latch in production.”

Some parts of this abstract were appeared in my blog and discussed in seminars. There will be many new topics!
The Hotsos Symposium is the unique conference dedicated to Oracle performance. Hopefully to see you there!

Hidden latch wait revolution that we missed

Filed under: History,Latch,shared latch — andreynikolaev @ 8:25 pm

The way an Oracle process waits for a latch. This seems trivial. Oracle 11.2 documentation explicitly states about “latch free” wait event:
… The wait time increases exponentially and does not include spinning on the latch (active waiting). The maximum wait time also depends on the number of latches that the process is holding. There is an incremental wait of up to 2 seconds…

In this post, I would like to show that this exponential backoff was obsoleted in Oracle 9.2. Since that time, all the Oracle versions use completely new latch wait posting mechanism and FIFO queuing for all, except one latch.
Oracle no longer uses “repeatedly spin and sleep” approach. The process spins and waits only once. The pseudo code for contemporary latch acquisition should looks like:

  Immediate latch get 
    Spin latch get 
       Add the process to the queue of latch waiters
          Sleep until posted

(more…)

November 23, 2010

Shared latches by Oracle version

Filed under: 11.2,Latch,shared latch,Summary tables — andreynikolaev @ 6:13 pm

As I described in my previous post, Shared and Exclusive Oracle latches differ significantly. Shared latch behaves like enqueue. It has S and X incompatible modes. Moreover  X mode serializes the shared latch. The contention for shared and exclusive latches has different patterns. This leads to different methods to tune such contentions.

But we do not know which latches are shared. Oracle never published the list of shared latches. Every time looking in the AWR or Statspack report we had to guess the type of contending latch. We only know that “cache buffer chains” latches became shared since in 9.2.

Oracle executable internally determines that latch is shared using flag hidden somewhere in x$kslld (v$latchname) structure. Google search shows that  KSLLD means  [K]ernel [S]ervice [L]ock [L]atch [D]escriptor. Unfortunately this shared flag was not externalized to SQL. It is possible to check the flag manually using oradebug peek or DTrace. But the flag offset is version and platform dependent. We need more systematic way to determine the latch type.
(more…)

November 17, 2010

Shared latch behaves like enqueue

Filed under: DTrace,Latch,shared latch — andreynikolaev @ 10:16 pm

We know a lot about the exclusive latches. This is Oracle realization of TTS spinlocks. Only one process at a time can hold the exclusive latch. Other processes compete for the exclusive latch by spinning. If process can not get the latch by spinning, it will wait until the latch becomes free.

But since the version 8.0 Oracle had another spinlock – shared latch. This is a realization of “Read-Write” spinlocks. Such spinlocks can be held by several “reader” processes simultaneously in SHARED mode. But if the process needs to write to the protected structure it must acquire RW spinlock in EXCLUSIVE mode. This mode prevents any concurrent access to the latch.

From version to version the number of shared latch increased. Several widely known latches like “session idle bit”, “In memory undo latch”. “Result Cache: RC Latch”, “resmgr group change latch” are shared.

Famous “cache buffers chains” latch was also became shared in Oracle 9.2. We usually react on “cache buffers chains” latch contention finding ineffective SQL plans and “hot blocks”. Recently Kyle Hailey posted an excellent graphical explanation of Oracle mechanics related to “latch: cache buffers chains” contention. But it always was a mystery to me why the sessions have to wait for SHARED latch during READ operations like searching the hash chains. Other busy shared latches like “session idle bit” do not experience such contention. This is why I would like to dive deeper into shared latch internals.
(more…)

« Previous PageNext Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 29 other followers