In these pre-OOW 2013 days, everyone is excited about the new features of Oracle 12c. This post describes a particularly interesting one: IO outlier profiling in V$KERNEL_IO_OUTLIER.
According to the Oracle documentation and Glenn Fawcett's blog, V$KERNEL_IO_OUTLIER uses DTrace to collect information about the amount of time an IO spends in operating system components. DTrace is the Solaris dynamic tracing tool, which makes it possible to observe all OS activity.
V$KERNEL_IO_OUTLIER should provide an operating system-based latency breakdown of IOs that take an excessive amount of time to complete. This will be an essential diagnostic feature. However, many people have complained that V$KERNEL_IO_OUTLIER is always empty. This was enough for me to start investigating it.
I will discuss the view, when and how it works, the underlying DTrace script, and how it can be used with pre-12c Oracle.
Several months have passed since my previous “mutex wait” post. I was very busy with work and conference presentations. Thanks to all my listeners at the UKOUG2011, Hotsos2012 and Medias2012 conferences and at several seminars for their inspiring questions and conversations.
I. Unexpected change.
Now it is time to discuss how contemporary Oracle waits for mutexes. My previous posts described the evolution of “invisible and aggressive” 10.2-11.1 mutex waits into fully accounted and less aggressive 11gR2 mutexes. Surprisingly, Oracle 11.2.0.2.2 (or 11.2.0.2 PSU2), which appeared in April 2011, demonstrated almost negligible CPU consumption during mutex waits.
I would like to describe how Oracle versions 11.2.0.1-11.2.0.2.1 waited for mutexes. This algorithm also appears to be available in post-11.2.0.2.2 PSUs and the new 11.2.0.3 patchset as _mutex_wait_scheme=0.
My previous post demonstrated that before version 11.2:
- “Cursor: pin S” was a pure wait for CPU. Long “cursor: pin S” waits indicated CPU starvation.
- Mutex contention was almost invisible to the Oracle Wait Interface.
- Spin time to acquire a mutex was accounted as CPU time. It was service time, not waiting time.
Things changed. Mutex waits in Oracle 11.2 differ significantly from previous versions. Contemporary mutex waits are no longer CPU aggressive; they are completely visible to the Oracle Wait Interface and highly tunable.
A week ago I returned from the MEDIAS-2011 conference, which was held in Limassol (Cyprus). It was an exciting experience to speak at a general Computer Science conference. It was also an opportunity to discuss topics beyond the usual scope of Oracle conferences and to see a point of view that is not addicted to Oracle.
As you may expect, my presentation was entitled “Exploring the Oracle latches”. You can download it here. The presentation contains more math and less X$ material than usual. I also added several introductory slides about Oracle, its performance and tuning.
And, of course, Cyprus is a great place!
Thanks to Professor S.V. Klimenko for kindly inviting me to the MEDIAS 2011 conference.
Thanks to RDTEX CEO I.G. Kunitsky for financial support.
Thanks to RDTEX Technical Support Centre Director S.P. Misiura for years of encouragement and support of my investigations.
In previous posts, I investigated how the Oracle process spins and waits for a latch. Now we need a tool to estimate when latch acquisition works efficiently and when we need to tune it. This tool is the latch statistics. Contemporary Oracle documentation describes the v$latch statistics columns as follows:
|Statistic|Documentation description|When and how it changed|
|GETS|Number of times the latch was requested in willing-to-wait mode|Incremented by one after latch acquisition. Therefore protected by the latch|
|MISSES|Number of times the latch was requested in willing-to-wait mode and the requestor had to wait|Incremented by one after latch acquisition if a miss occurred|
|SLEEPS|Number of times a willing-to-wait latch request resulted in a session sleeping while waiting for the latch|Incremented by the number of times the process slept during latch acquisition|
|SPIN_GETS|Willing-to-wait latch requests which missed the first try but succeeded while spinning|Incremented by one after latch acquisition if a miss but no sleep occurred. Counts only the first spin|
|WAIT_TIME|Elapsed time spent waiting for the latch (in microseconds)|Incremented by the wait time spent during latch acquisition|
|IMMEDIATE_GETS|Number of times a latch was requested in no-wait mode|Incremented by one after each no-wait latch get. May not be protected by the latch|
|IMMEDIATE_MISSES|Number of times a no-wait latch request did not succeed|Incremented by one after an unsuccessful no-wait latch get. Not protected by the latch|
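To make the counters more tangible, here is a minimal Python sketch that derives a few ratios from the willing-to-wait v$latch counters (GETS, MISSES, SLEEPS, SPIN_GETS). The class name, function name and the chosen ratios are my own illustration and the sample numbers are invented; this is not a formula taken from the Oracle documentation.

```python
from dataclasses import dataclass


@dataclass
class LatchStats:
    """Willing-to-wait counters of one v$latch row (hypothetical container)."""
    gets: int
    misses: int
    sleeps: int
    spin_gets: int


def latch_ratios(s: LatchStats) -> dict:
    """Derive simple ratios from the raw counters; zero counters give 0.0."""
    return {
        # share of gets that did not succeed on the first try
        "miss_ratio": s.misses / s.gets if s.gets else 0.0,
        # share of misses resolved by the first spin, without sleeping
        "spin_efficiency": s.spin_gets / s.misses if s.misses else 0.0,
        # average number of sleeps per miss
        "sleeps_per_miss": s.sleeps / s.misses if s.misses else 0.0,
    }


# Invented sample values, for illustration only
sample = LatchStats(gets=1_000_000, misses=10_000, sleeps=2_500, spin_gets=8_000)
print(latch_ratios(sample))
```

In practice such ratios would probably be computed from the difference between two snapshots of the counters, because the values are cumulative since instance startup.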
In version 9.2 Oracle introduced new possibilities for fine-grained latch tuning. A latch can be assigned to one of eight classes. Each class can have a different spin and wait policy. In addition, exclusive and shared latches behave differently.
By default, all latches except the “process allocation latch” belong to the standard class 0. In my previous posts, I discussed how standard class exclusive and shared latches spin and wait. Now it is time to explore the behavior of non-standard class latches.
My previous experiments demonstrated that, contrary to common belief, the spin count for exclusive latches in Oracle 9.2-11g cannot be tuned dynamically. The _spin_count parameter is effectively static for exclusive latches. This seems to disagree with the well-known study “Using _spin_count to reduce latch contention in 11g” by Guy Harrison. The study explored how dynamic tuning of _spin_count influenced latch waits, CPU consumption and throughput. I think that there is no contradiction. Guy Harrison’s experiments were probably performed under cache buffers chains latch contention, and this is a shared latch.
We already know that the exclusive latch in Oracle 9.2-11g uses a static spin value from the x$ksllclass fixed table. This spin can be adjusted by the _latch_class_0 parameter. By default, the exclusive latch spins up to twenty thousand cycles.
This post will show that the shared latch in Oracle 9.2-11g is governed by the _spin_count value and spins up to four thousand cycles by default.
How does an Oracle process spin for a latch? How many times does it check the latch before going to sleep? Everyone knows. It is _spin_count=2000: two thousand cycles by default. Oracle established this value long ago, in version 6 at the beginning of the 1990s. However, let me count.
My previous investigation showed that the latch wait was radically changed in Oracle 9.2. At that time, the exponential backoff disappeared. The latches have been using completely new wait posting since 2002. We may expect that the latch spin has changed too. The controversial results of _spin_count tuning in Oracle 9.2 also confirm this. In this series of posts, I will explore how contemporary Oracle latches spin. The first post is about exclusive latches, which form the majority of Oracle latches. For example, 460 out of 551 latches are exclusive in Oracle 188.8.131.52.
I will demonstrate that exclusive latches spin 10 times more than we expected. The _spin_count turned out to be effectively static for exclusive latches, and there is a big difference between not setting _spin_count and setting it to 2000.
We know a lot about exclusive latches. They are Oracle's implementation of TTS (test-and-test-and-set) spinlocks. Only one process at a time can hold an exclusive latch. Other processes compete for the exclusive latch by spinning. If a process cannot get the latch by spinning, it will wait until the latch becomes free.
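The spin phase described above can be sketched in Python as a toy test-and-test-and-set loop. This is only an illustration of the general TTS technique, not Oracle's code: the class and method names are hypothetical, and a non-blocking `threading.Lock` stands in for the atomic latch word.

```python
import threading


class ExclusiveLatch:
    """Toy TTS spinlock; an illustration only, not Oracle's implementation."""

    def __init__(self):
        # Stands in for the latch word that a real spinlock sets atomically
        self._flag = threading.Lock()

    def try_get(self, spin_count: int) -> bool:
        """Spin up to spin_count times; True means the latch was acquired."""
        for _ in range(spin_count):
            # "Test": a cheap read of the latch state, no atomic write
            if self._flag.locked():
                continue
            # "Test-and-set": the atomic attempt, which may still lose a race
            if self._flag.acquire(blocking=False):
                return True
        # Spin budget exhausted: a real process would now go to wait
        return False

    def free(self) -> None:
        self._flag.release()


latch = ExclusiveLatch()
print(latch.try_get(2000))   # latch is free, acquired on the first iteration
print(latch.try_get(2000))   # latch is held, the whole spin budget is spent
latch.free()
```

The point of the first, read-only test is that spinning processes mostly read the latch state and attempt the expensive atomic operation only when the latch looks free.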
But since version 8.0 Oracle has had another spinlock: the shared latch. This is an implementation of “Read-Write” spinlocks. Such spinlocks can be held by several “reader” processes simultaneously in SHARED mode. But if a process needs to write to the protected structure, it must acquire the RW spinlock in EXCLUSIVE mode. This mode prevents any concurrent access to the latch.
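The compatibility rules of such a read-write spinlock can be written down as a small single-threaded Python state model. It is a sketch of the generic rules stated above only: the names and the "S"/"X" mode codes are hypothetical, and the real shared latch may well behave differently in details that this model ignores, such as spinning, waiting and queuing.

```python
class SharedLatch:
    """Toy state model of a read-write latch; not Oracle's implementation."""

    def __init__(self):
        self.readers = 0         # number of current SHARED holders
        self.exclusive = False   # True while held in EXCLUSIVE mode

    def try_get(self, mode: str) -> bool:
        """Try to get the latch in mode "S" (shared) or "X" (exclusive)."""
        if mode == "S":
            if self.exclusive:
                return False     # a writer blocks all readers
            self.readers += 1
            return True
        if mode == "X":
            if self.exclusive or self.readers > 0:
                return False     # any holder blocks a writer
            self.exclusive = True
            return True
        raise ValueError("mode must be 'S' or 'X'")

    def free(self, mode: str) -> None:
        """Release one hold previously obtained in the given mode."""
        if mode == "S" and self.readers > 0:
            self.readers -= 1
        elif mode == "X" and self.exclusive:
            self.exclusive = False
        else:
            raise ValueError("latch is not held in mode " + repr(mode))


latch = SharedLatch()
print(latch.try_get("S"), latch.try_get("S"))   # two readers coexist
print(latch.try_get("X"))                       # the writer must wait for them
```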
From version to version the number of shared latches has increased. Several widely known latches, like “session idle bit”, “In memory undo latch”, “Result Cache: RC Latch” and “resmgr group change latch”, are shared.
The famous “cache buffers chains” latch also became shared in Oracle 9.2. We usually react to “cache buffers chains” latch contention by looking for inefficient SQL plans and “hot blocks”. Recently Kyle Hailey posted an excellent graphical explanation of the Oracle mechanics related to “latch: cache buffers chains” contention. But it was always a mystery to me why sessions have to wait for a SHARED latch during READ operations like searching the hash chains. Other busy shared latches like “session idle bit” do not experience such contention. This is why I would like to dive deeper into shared latch internals.
To discover how the Oracle latch works, we need a tool. The Oracle Wait Interface allows us to explore the waits only. Oracle X$/V$ tables instrument the latch acquisition and give us performance counters. To see how a latch works through time and to observe short duration events, we need something like a stroboscope in physics. Luckily, such a tool exists in Oracle Solaris: DTrace, the Solaris 10 Dynamic Tracing framework!
Here I would like to give a brief, Oracle DBA-oriented introduction to some DTrace topics. Tanel Poder, James Morle and Doug Burns have used DTrace for performance diagnostics for years. But it is still not as popular as it should be in our DBA community. One of the problems is another “language”. The best DTrace presentations talk about “probes”, “actions”, unfamiliar Solaris kernel structures, etc. With apologies to the DTrace inventors, I will use more database-like terminology here.