Latch, mutex and beyond

December 22, 2011

Latch Timescales

Filed under: Latch,Theory,_spin_count — andreynikolaev @ 12:44 pm

To compare old and new latch mechanisms, I found useful the following illustration. Since it is hard for us, humans, to visualize milli- and microseconds, imagine “time microscope” that zooms in timed events one million times.

Alternatively, just imagine contemporary Oracle software running on 1950th style hardware.

Such microscope will magnify the microsecond to second. One real “second” will transform to one million seconds. It takes more than 11 days. Light will travel at sonic speed. Lunar rocket will crawl like snail.
(more…)

November 24, 2011

UKOUG 2011

Filed under: Conference — andreynikolaev @ 7:35 pm


I will be speaking about latches at UK Oracle User Group Conference 2011. It will be my first visit to this extraordinary conference and UK. Hopefully see you there!

October 25, 2011

Mutex waits. Part II. “Cursor: Pin S” in Oracle 11.2 _mutex_wait_scheme=0. Steps out of shadow.

Filed under: 11.2,Contention,DTrace,Instrumentation,Mutex,OS tuning,Patches — andreynikolaev @ 4:23 pm

I would like to describe how Oracle versions 11.2.0.1-11.2.0.2.1 waited for mutexes. This algorithm also appears to be used in post-11.2.0.2.2 PSUs and new 11.2.0.3 patchset as _mutex_wait_scheme=0.

My previous post demonstrated that before version 11.2:

  • “Cursor: pin S” was pure wait for CPU. Long “cursor: pin S” waits indicated CPU starvation.
  • Mutex contention was almost invisible to Oracle Wait Interface
  • Spin time to acquire mutex was accounted as CPU time. It was service time, not waiting time.

Things changed. Mutex waits in Oracle 11.2 significantly differ from previous versions. Contemporary mutex waits are not CPU aggressive anymore, completely visible to Oracle Wait Interface and highly tunable.
(more…)

July 9, 2011

Mutex waits. Part 1. “Cursor: Pin S” in Oracle 10.2-11.1. Invisible and aggressive.

Filed under: Contention,History,Instrumentation,Mutex,Patches,Spinlock — andreynikolaev @ 1:18 pm

Oracle KGX mutexes appeared more than 7 years ago. However, mutex waits are still obscure. Oracle Documentation provided only brief description of mutex wait events without any information about wait durations and timeouts.
Look at the following timeline:

(more…)

May 22, 2011

MEDIAS-2011

Filed under: Conference,DTrace,Latch,Latch Statistics,Theory,_spin_count — andreynikolaev @ 1:56 pm

A week ago I returned from MEDIAS-2011 conference, which was held in Limassol (Cyprus). It was an exciting experience to speak at general Computer Science conference. This was also an opportunity to discuss topics beyond the usual scope of Oracle conferences and see non addicted to Oracle point of view.

As you may expect, my presentation was entitled “Exploring the Oracle latches”. You can download it here. The presentation contains more math and less X$ materials than usual. Also, I added several introductory slides about Oracle, its performance and tuning.

And, of course, Cyprus is a great place!

Thanks to Professor S.V. Klimenko for kindly inviting me to MEDIAS 2011 conference.
Thanks to RDTEX CEO I.G. Kunitsky for financial support.
Thanks to RDTEX Technical Support Centre Director S.P. Misiura for years of encouragement and support of my investigations.

May 1, 2011

Divide and conquer the “true” mutex contention

Filed under: 11.2,Contention,Mutex,Patches — andreynikolaev @ 7:16 pm

Oracle 11.2.0.2 contains enhancements 9282521 and 9239863 named “Library cache: mutex X” for objects highly contended for. Part I and II. These enhancements introduce new interesting possibilities to tune some types of the mutex contention.

Contention for heavily accessed objects can now be divided between multiple copies of object in the library cache. According to notes 9282521.8 and 9239863.8 describing the patches, the enhancements should be used:
When there is true contention on a specific library cache object….
Let me investigate this deeper. I will use Oracle 11.2.0.2.2 for Solaris SPARC (64-bit) on 8Cores/32Threads Sun Fire T2000. I chose this platform in order to emphasize how the enhancements work. (more…)

April 22, 2011

“Cursor: pin S” mutex contention testcase and diagnostics tools.

Filed under: 11.2,Instrumentation,Mutex — andreynikolaev @ 4:33 pm

I like testcases. One testcase results in more understanding than ten page article or weeks of data collection. This is why we need reproducible testcases if we want to explore mutex contention. Testcases will also give me a possibility to demonstrate how to use mutex contention diagnostics tools embedded in Oracle. I will use Oracle 11.2.0.2 for Linux X86 32bit on my Dual-Core laptop in this posts. Your numbers for other Oracle versions and platforms may vary.

I. “Cursor: pin S” contention testcase:

Each time the session execute SQL operator, it needs to ‘pin’ the cursor in library cache using mutex. True mutex contention arises when the same SQL operator executes concurrently at high frequency. Therefore the simplest testcase for “Cursor: pin S” contention should look like:

begin
for i in 1..1000000
loop
   execute immediate 'select 1 from dual where 1=2';
end loop;
end;
/

(more…)

March 19, 2011

11.2.0.2 is the right patchset for mutexes

Filed under: 11.2,Mutex,Patches — andreynikolaev @ 10:05 am

Many people asked me about the second part of my blog title – the mutex. This is the first post about it. Mutexes is another Oracle spinlock, which was appeared in version 10.2.0.2. Despite being known since 2005, Oracle mutex internals is still Terra incognita.

This post is inspired by several recent escalations due to mutex contention. It occurs that 11.2.0.2 patchset contains extraordinary number of mutex related changes. Some enhancements like 10411618 exist only for 11.2.0.2. The following patches even changed the mutex architecture: (more…)

March 15, 2011

Hotsos 2011 Symposium follow-up

Filed under: Uncategorized — andreynikolaev @ 9:58 pm

Just returned from the Hotsos 2011 Symposium. It was an exciting experience to be speaker at this legendary conference. Many thanks to the Hotsos team for this opportunity.
The Symposium and the presentations were inspiring. I hope my presentation was interesting too. It had to be concise because of one hour time limit. Usually these topics discussed during almost half of my training day. I will cover the topics more detaily in future posts.

February 23, 2011

Latch statistics

Filed under: DTrace,Instrumentation,Latch,Latch Statistics,Theory,Uncategorized — andreynikolaev @ 10:36 am

In previous posts, I investigated how the Oracle process spins and waits for the latch. Now we need the tool to estimate when the latch acquisition works efficiently and when we need to tune it. This tool is the latch statistics. Contemporary Oracle documentation describes v$latch statistics columns as:

Statistic: x$ksllt column Documentation description: When and how it changed:
GETS kslltwgt
“wait gets”
Number of times the latch was requested in willing-to-wait mode Incremented by one after latch acquisition. Therefore protected by latch
MISSES kslltwff
“wait fails”
Number of times the latch was requested in willing-to-wait mode and the requestor had to wait Incremented by one after latch acquisition if miss occured
SLEEPS kslltwsl
“wait sleeps”
Number of times a willing-to-wait latch request resulted in a session sleeping while waiting for the latch Incremented by number of times process slept during latch acquisition
SPIN_GETS ksllthst0 Willing-to-wait latch requests which missed the first try but succeeded while spinning Incremented by one after latch acquisition if miss but not sleep occured. Counts only the first spin
WAIT_TIME kslltwtt
“wait time”
Elapsed time spent waiting for the latch (in microseconds) Incremented by wait time spent during latch acquisition.
IMMEDIATE_GETS kslltngt
“nowait gets”
Number of times a latch was requested in no-wait mode Incremented by one after each no-wait latch get. May not be protected by latch
IMMEDIATE_MISSES kslltnfa
“nowait fails”
Number of times a no-wait latch request did not succeed Incremented by one after unsuccessful no-wait latch get. Not protected by latch

(more…)

Next Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.