Latch, mutex and beyond

July 30, 2012

Mutex waits. Part III. Contemporary Oracle wait schemes diversity.

Filed under: 11.2,Contention,DTrace,Instrumentation,Mutex,OS tuning,Patches,Spinlock — andreynikolaev @ 12:21 pm

Several months have passed since my previous “mutex wait” post. I have been very busy with work and conference presentations. Thanks to all my listeners at the UKOUG2011, Hotsos2012 and Medias2012 conferences and at several seminars for their inspiring questions and conversations.

I. Unexpected change.

Now it is time to discuss how contemporary Oracle waits for mutexes. My previous posts described the evolution of the “invisible and aggressive” 10.2-11.1 mutex waits into the fully accounted and less aggressive 11gR2 mutexes. Surprisingly, Oracle 11.2.0.2.2 (or 11.2.0.2 PSU2), which appeared in April 2011, demonstrated almost negligible CPU consumption during mutex waits. (more…)

May 25, 2012

MEDIAS-2012

Filed under: 11.2,Conference,Mutex,Spinlock,Theory — andreynikolaev @ 7:50 pm

A week ago I returned home from the MEDIAS-2012 conference. It was held in Limassol (Cyprus), near the spectacular ruins of the ancient city of Amathus. This was a general Computer Science conference with a unique style; its speakers included the legendary Soviet cosmonaut Alexandr Serebrov and the inventor of Mean Value Analysis, Professor Martin Reiser.

In my experience, Reiser’s Law, which states that “Software is getting slower more rapidly than hardware becomes faster”, has been repeatedly illustrated by the numerous performance problems I observe.

RDTEX presentations covered Oracle topics ranging from Data Warehousing by Mikhail Kozyr to Oracle Coherence by Alexei Zolotarev.

This conference gave me a unique opportunity to discuss the mathematics related to Oracle mutexes. If you are interested, you can download my presentation here.

And, of course, Cyprus is a great place to visit!

Thanks to Professor S.V. Klimenko for kindly inviting me to the MEDIAS 2012 conference.
Thanks to RDTEX for financial support.
Thanks to S.P. Misiura, Director of the RDTEX Technical Support Centre, for years of encouragement and support of my investigations.

October 25, 2011

Mutex waits. Part II. “Cursor: Pin S” in Oracle 11.2 _mutex_wait_scheme=0. Steps out of shadow.

Filed under: 11.2,Contention,DTrace,Instrumentation,Mutex,OS tuning,Patches — andreynikolaev @ 4:23 pm

I would like to describe how Oracle versions 11.2.0.1-11.2.0.2.1 waited for mutexes. This algorithm also appears to be used in the post-11.2.0.2.2 PSUs and the new 11.2.0.3 patchset as _mutex_wait_scheme=0.

My previous post demonstrated that before version 11.2:

  • “Cursor: pin S” was a pure wait for CPU. Long “cursor: pin S” waits indicated CPU starvation.
  • Mutex contention was almost invisible to the Oracle Wait Interface.
  • Spin time to acquire a mutex was accounted as CPU time. It was service time, not waiting time.

Things have changed. Mutex waits in Oracle 11.2 differ significantly from previous versions. Contemporary mutex waits are no longer CPU aggressive; they are completely visible to the Oracle Wait Interface and highly tunable.
(more…)

May 1, 2011

Divide and conquer the “true” mutex contention

Filed under: 11.2,Contention,Mutex,Patches — andreynikolaev @ 7:16 pm

Oracle 11.2.0.2 contains enhancements 9282521 and 9239863, named “Library cache: mutex X” for objects highly contended for, Part I and Part II. These enhancements introduce interesting new possibilities for tuning some types of mutex contention.

Contention for heavily accessed objects can now be divided between multiple copies of the object in the library cache. According to notes 9282521.8 and 9239863.8 describing the patches, the enhancements should be used:
“When there is true contention on a specific library cache object….”
Let me investigate this more deeply. I will use Oracle 11.2.0.2.2 for Solaris SPARC (64-bit) on an 8-core/32-thread Sun Fire T2000. I chose this platform in order to emphasize how the enhancements work. (more…)

April 22, 2011

“Cursor: pin S” mutex contention testcase and diagnostics tools.

Filed under: 11.2,Instrumentation,Mutex — andreynikolaev @ 4:33 pm

I like testcases. One testcase results in more understanding than a ten-page article or weeks of data collection. This is why we need reproducible testcases if we want to explore mutex contention. Testcases will also give me the opportunity to demonstrate how to use the mutex contention diagnostic tools embedded in Oracle. I will use Oracle 11.2.0.2 for Linux x86 32-bit on my dual-core laptop in this post. Your numbers for other Oracle versions and platforms may vary.

I. “Cursor: pin S” contention testcase:

Each time a session executes a SQL statement, it needs to ‘pin’ the cursor in the library cache using a mutex. True mutex contention arises when the same SQL statement executes concurrently at high frequency. Therefore the simplest testcase for “Cursor: pin S” contention should look like:

begin
   -- Execute the same statement repeatedly so that its cursor is pinned on every iteration
   for i in 1..1000000
   loop
      execute immediate 'select 1 from dual where 1=2';
   end loop;
end;
/

(more…)

March 19, 2011

11.2.0.2 is the right patchset for mutexes

Filed under: 11.2,Mutex,Patches — andreynikolaev @ 10:05 am

Many people have asked me about the second part of my blog title: the mutex. This is the first post about it. The mutex is another Oracle spinlock, which appeared in version 10.2.0.2. Despite being known since 2005, Oracle mutex internals are still terra incognita.

This post was inspired by several recent escalations due to mutex contention. It turns out that the 11.2.0.2 patchset contains an extraordinary number of mutex-related changes. Some enhancements, like 10411618, exist only for 11.2.0.2. The following patches even changed the mutex architecture: (more…)

November 23, 2010

Shared latches by Oracle version

Filed under: 11.2,Latch,shared latch,Summary tables — andreynikolaev @ 6:13 pm

As I described in my previous post, shared and exclusive Oracle latches differ significantly. A shared latch behaves like an enqueue. It has two incompatible modes, S and X. Moreover, X mode serializes the shared latch. Contention for shared and exclusive latches has different patterns. This leads to different methods of tuning such contention.
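The S/X compatibility rules can be illustrated with a toy model. This is only a sketch of the idea, not Oracle's implementation: the class and method names are hypothetical, and it assumes that “X mode serializes the shared latch” means a pending X request blocks new S requests. It also tracks only a single pending-X flag, which is a deliberate simplification.

```python
class SharedLatch:
    """Toy model of a shared latch with S (shared) and X (exclusive) modes."""

    def __init__(self):
        self.s_holders = set()   # sessions holding the latch in S mode
        self.x_holder = None     # session holding the latch in X mode
        self.x_pending = False   # assumption: a waiting X request blocks new S gets

    def get(self, session, mode):
        """Try to acquire the latch without waiting; return True on success."""
        if mode == "S":
            if self.x_holder is None and not self.x_pending:
                self.s_holders.add(session)
                return True
            return False
        if mode == "X":
            if self.x_holder is None and not self.s_holders:
                self.x_holder = session
                self.x_pending = False
                return True
            self.x_pending = True  # remember that an X requester is queued
            return False
        raise ValueError(f"unknown mode: {mode!r}")

    def release(self, session):
        """Release the latch held by the given session in either mode."""
        if self.x_holder == session:
            self.x_holder = None
        else:
            self.s_holders.discard(session)


latch = SharedLatch()
results = [
    latch.get("s1", "S"),   # S on a free latch
    latch.get("s2", "S"),   # S is compatible with S
    latch.get("s3", "X"),   # X is incompatible with the S holders
    latch.get("s4", "S"),   # new S requests are blocked behind the pending X
]
latch.release("s1")
latch.release("s2")
results.append(latch.get("s3", "X"))  # X succeeds once all S holders are gone
results.append(latch.get("s4", "S"))  # S is incompatible with the X holder
print(results)
```

In this model any number of S holders coexist, while a single X request stalls every session behind it. This is one way to see why the two kinds of contention show different patterns.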

But we do not know which latches are shared. Oracle has never published the list of shared latches. Every time we looked at an AWR or Statspack report, we had to guess the type of the contended latch. We only know that the “cache buffers chains” latches have been shared since 9.2.

The Oracle executable internally determines that a latch is shared using a flag hidden somewhere in the x$kslld (v$latchname) structure. A Google search shows that KSLLD means [K]ernel [S]ervice [L]ock [L]atch [D]escriptor. Unfortunately, this shared flag was not externalized to SQL. It is possible to check the flag manually using oradebug peek or DTrace, but the flag offset is version and platform dependent. We need a more systematic way to determine the latch type.
(more…)

August 24, 2010

Exclusive latches in memory. Oracle versions 7-11.2

Filed under: 11.2,Latch,Uncategorized — andreynikolaev @ 8:54 am

Contrary to popular belief, Oracle latches have evolved significantly over the last decade. Not only did additional statistics appear (and disappear) and a new (shared) latch type get introduced; the latch itself was changed.

It is interesting to see how the latch was organized in past and contemporary versions.

Seeing a latch in memory seems hard, since latches are typically held for a very short time. Fortunately, Oracle gives us the ability to call any of its kernel functions using the oradebug call utility. We only need to know that Oracle itself uses the kslgetl(laddr, wait, why, where) function to acquire an exclusive latch. Recently I blogged about this function and its parameters.

This function can also be used to artificially acquire any latch for demonstration purposes:


SQL> oradebug call kslgetl <latch address> <wait> <why> <where>

(more…)

July 29, 2010

Strange “db file async I/O submit” wait event

Filed under: 11.2,Uncategorized — andreynikolaev @ 5:05 pm

This post is not directly related to the blog theme. I would like to discuss the “db file async I/O submit” wait event. This new event was introduced in Oracle 11.2. So far it has not been described in the Oracle documentation or on Metalink.

At the beginning of this story, this event became the topmost background wait for one production instance under HP-UX:


                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file async I/O submit        151,159     0     35,625     236      0.7   96.3
log file parallel write         427,728     0        308       1      2.0     .8
...

This looks mysterious. HP-UX does not support AIO for filesystems at all!
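As a quick sanity check of the report above, the Avg wait (ms) column can be recomputed from the other two figures. The sketch below assumes the column is simply Total Wait Time divided by Waits; the numbers are copied from the report, and the function name is only illustrative.

```python
# Figures copied from the report above: event -> (waits, total wait time in seconds)
events = {
    "db file async I/O submit": (151_159, 35_625),
    "log file parallel write": (427_728, 308),
}


def avg_wait_ms(waits, total_seconds):
    """Average wait per event occurrence, in milliseconds."""
    return total_seconds * 1000 / waits


averages = {name: avg_wait_ms(w, t) for name, (w, t) in events.items()}
for name, ms in averages.items():
    print(f"{name:26s} {ms:8.2f} ms")
```

Under this assumption, each “db file async I/O submit” wait lasted roughly a quarter of a second on average, while a “log file parallel write” took under a millisecond.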
(more…)
