Oracle KGX mutexes appeared more than 7 years ago. However, mutex waits are still obscure. The Oracle documentation provides only brief descriptions of mutex wait events, without any information about wait durations and timeouts.
Look at the following timeline:
A week ago I returned from the MEDIAS-2011 conference, which was held in Limassol (Cyprus). It was an exciting experience to speak at a general Computer Science conference. This was also an opportunity to discuss topics beyond the usual scope of Oracle conferences and to hear the point of view of people who are not addicted to Oracle.
As you may expect, my presentation was entitled “Exploring the Oracle latches”. You can download it here. The presentation contains more math and less X$ material than usual. Also, I added several introductory slides about Oracle, its performance and tuning.
And, of course, Cyprus is a great place!
Thanks to Professor S.V. Klimenko for kindly inviting me to the MEDIAS 2011 conference.
Thanks to RDTEX CEO I.G. Kunitsky for financial support.
Thanks to RDTEX Technical Support Centre Director S.P. Misiura for years of encouragement and support of my investigations.
Oracle 184.108.40.206 contains enhancements 9282521 and 9239863 named “Library cache: mutex X” for objects highly contended for. Part I and II. These enhancements introduce new interesting possibilities to tune some types of the mutex contention.
Contention for heavily accessed objects can now be divided between multiple copies of the object in the library cache. According to notes 9282521.8 and 9239863.8 describing the patches, the enhancements should be used:
When there is true contention on a specific library cache object….
Let me investigate this deeper. I will use Oracle 220.127.116.11.2 for Solaris SPARC (64-bit) on 8Cores/32Threads Sun Fire T2000. I chose this platform in order to emphasize how the enhancements work.
I like testcases. One testcase results in more understanding than a ten-page article or weeks of data collection. This is why we need reproducible testcases if we want to explore mutex contention. Testcases will also give me an opportunity to demonstrate how to use the mutex contention diagnostic tools embedded in Oracle. I will use Oracle 18.104.22.168 for Linux X86 32bit on my Dual-Core laptop in these posts. Your numbers for other Oracle versions and platforms may vary.
I. “Cursor: pin S” contention testcase:
Each time a session executes a SQL statement, it needs to ‘pin’ the cursor in the library cache using a mutex. True mutex contention arises when the same SQL statement is executed concurrently at high frequency. Therefore the simplest testcase for “Cursor: pin S” contention should look like:
begin
  for i in 1..1000000 loop  -- the same statement is re-executed on every iteration
    execute immediate 'select 1 from dual where 1=2';
  end loop; end;
Many people have asked me about the second part of my blog title: the mutex. This is the first post about it. The mutex is another Oracle spinlock, which appeared in version 10.2.0.2. Despite being known since 2005, Oracle mutex internals are still terra incognita.
This post is inspired by several recent escalations due to mutex contention. It turns out that the 22.214.171.124 patchset contains an extraordinary number of mutex-related changes. Some enhancements, like 10411618, exist only for 126.96.36.199. The following patches even changed the mutex architecture:
I have just returned from the Hotsos 2011 Symposium. It was an exciting experience to be a speaker at this legendary conference. Many thanks to the Hotsos team for this opportunity.
The Symposium and the presentations were inspiring. I hope my presentation was interesting too. It had to be concise because of the one-hour time limit. Usually these topics take up almost half of my training day. I will cover them in more detail in future posts.
In previous posts, I investigated how the Oracle process spins and waits for the latch. Now we need a tool to estimate when latch acquisition works efficiently and when we need to tune it. This tool is the latch statistics. Contemporary Oracle documentation describes the v$latch statistics columns as:
Statistic | Documentation description | When and how it changed
GETS | Number of times the latch was requested in willing-to-wait mode | Incremented by one after latch acquisition. Therefore protected by latch
MISSES | Number of times the latch was requested in willing-to-wait mode and the requestor had to wait | Incremented by one after latch acquisition if a miss occurred
SLEEPS | Number of times a willing-to-wait latch request resulted in a session sleeping while waiting for the latch | Incremented by the number of times the process slept during latch acquisition
SPIN_GETS | Willing-to-wait latch requests which missed the first try but succeeded while spinning | Incremented by one after latch acquisition if a miss but no sleep occurred. Counts only the first spin
WAIT_TIME | Elapsed time spent waiting for the latch (in microseconds) | Incremented by the wait time spent during latch acquisition
IMMEDIATE_GETS | Number of times a latch was requested in no-wait mode | Incremented by one after each no-wait latch get. May not be protected by latch
IMMEDIATE_MISSES | Number of times a no-wait latch request did not succeed | Incremented by one after an unsuccessful no-wait latch get. Not protected by latch
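These counters are usually read as ratios rather than as raw numbers. The following is only a minimal sketch in Python of that arithmetic, based on the descriptions above. The helper name `latch_ratios` and the sample counter values are hypothetical illustrations, not measurements from any Oracle instance.

```python
def latch_ratios(gets, misses, sleeps, spin_gets):
    """Derive simple ratios from v$latch-style counters (illustrative helper)."""
    # Fraction of willing-to-wait gets that missed on the first try.
    miss_ratio = misses / gets if gets else 0.0
    # Fraction of misses resolved by the first spin, without sleeping.
    spin_success = spin_gets / misses if misses else 0.0
    # Average number of sleeps per missed get.
    sleeps_per_miss = sleeps / misses if misses else 0.0
    return {
        "miss_ratio": miss_ratio,
        "spin_success": spin_success,
        "sleeps_per_miss": sleeps_per_miss,
    }


# Hypothetical counter values, chosen only to make the arithmetic easy to follow.
sample = latch_ratios(gets=1_000_000, misses=10_000, sleeps=2_500, spin_gets=8_000)
for name, value in sample.items():
    print(f"{name}: {value:.4f}")
```

With real data, the inputs would be the differences between two snapshots of the counters, since the values in v$latch are cumulative.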
In version 9.2 Oracle introduced new possibilities for fine-grained latch tuning. A latch can be assigned to one of eight classes. Each class can have a different spin and wait policy. In addition, exclusive and shared latches behave differently.
By default, all the latches except “process allocation latch” belong to the standard class 0. In my previous posts, I discussed how the standard class exclusive and shared latches spin and wait. Now it is time to explore the behavior of non-standard class latches.
My previous experiments demonstrated that, contrary to common belief, the spin count for exclusive latches in Oracle 9.2-11g cannot be tuned dynamically. The _spin_count parameter is effectively static for exclusive latches. This seems to disagree with the well-known study “Using _spin_count to reduce latch contention in 11g” by Guy Harrison. The study explored how dynamic tuning of _spin_count influenced latch waits, CPU consumption and throughput. I think that there is no contradiction. Guy Harrison’s experiments were probably performed with cache buffers chains latch contention, and this is a shared latch.
We already know that an exclusive latch in Oracle 9.2-11g uses the static spin value from the x$ksllclass fixed table. This spin can be adjusted by the _latch_class_0 parameter. By default, an exclusive latch spins up to twenty thousand cycles.
This post will show that a shared latch in Oracle 9.2-11g is governed by the _spin_count value and spins up to four thousand cycles by default.
How does an Oracle process spin for a latch? How many times does it check the latch before going to sleep? Everyone knows the answer: it is _spin_count=2000, two thousand cycles by default. Oracle established this value long ago, in version 6 at the beginning of the 90s. However, let me count.
My previous investigation showed that the latch wait was radically changed in Oracle 9.2. At that time, the exponential backoff disappeared. The latches have been using completely new wait posting since 2002. We may expect that the latch spin has changed too. The controversial results of _spin_count tuning in Oracle 9.2 also confirm this. In this series of posts, I will explore how contemporary Oracle latches spin. The first post is about exclusive latches, which form the majority of Oracle latches. For example, 460 out of 551 latches are exclusive in Oracle 188.8.131.52.
I will demonstrate that exclusive latches spin 10 times more than we expected. The _spin_count turned out to be effectively static for exclusive latches, and there is a big difference between not setting _spin_count and setting it to 2000.