Latch, mutex and beyond

December 16, 2010

Hidden latch wait revolution that we missed

Filed under: History,Latch,shared latch — andreynikolaev @ 8:25 pm

The way an Oracle process waits for a latch. This seems trivial. Oracle 11.2 documentation explicitly states about “latch free” wait event:
… The wait time increases exponentially and does not include spinning on the latch (active waiting). The maximum wait time also depends on the number of latches that the process is holding. There is an incremental wait of up to 2 seconds…

In this post, I would like to show that this exponential backoff was obsoleted in Oracle 9.2. Since that time, all the Oracle versions use completely new latch wait posting mechanism and FIFO queuing for all, except one latch.
Oracle no longer uses “repeatedly spin and sleep” approach. The process spins and waits only once. The pseudo code for contemporary latch acquisition should looks like:

  Immediate latch get 
    Spin latch get 
       Add the process to the queue of latch waiters
          Sleep until posted


The real story:

Couple of years ago, I was involved in hangs diagnostics. This was large Oracle 9.2 instance on Linux. Hangs were rare and unpredictable. No one could logon to database. This sounded like a latch problem. Using LTOM we obtained several SYSTEMSTATE dumps. We immediately found the problem latch:

PROCESS 348:
  ----------------------------------------
  SO: 0x2d93cee80, type: 2, owner: (nil), flag: INIT/-/-/0x00
  (process) Oracle pid=348, calls cur/top: ...
 ...
    (latch info) wait_event=0 bits=30
        Location from where call was made: kcslcu3:
       waiting for  50011868 lgwr LWN SCN level=7 
        Location from where latch is held: kcs024:
        Context saved from call: 0
        state=free
       wtr=50011868, next waiter 2d93cee80
      holding     2e06f4810 Child redo allocation level=5 child#=1 
...
      holding     2e06ecf10 Child redo copy level=4 child#=8  
...
    Short stack dump:
ksdxfstk<-ksdxcb<-sspuser<-__pthread_sighandler_rt<-__semop<-sskgpwwait<-skgpwwait...

The process held two latches “redo allocation” and “redo copy” and waited for the third “lgwr LWN SCN” latch. However, the latch was free! The blocker waited for nothing. We obtained three consecutive systemstate dumps. The process still waited for the free latch in all the systemstates. Many sessions were waiting for busy “redo allocation” and “redo copy” latches. Another strange thing was that processes waited using “semop()” system call. This call waits infinitely until posted. We had to kill the offending process from OS to resolve the hang.

The problem repeated again and again. After several SR escalations, we received excellent advice to set “_enable_reliable_latch_waits”=false. This solved the problem.

The name of parameter turns my previous latch knowledge upside down. What were these reliable latch waits? Why Oracle process waited for latch infinitely? How all this was related to well-known exponential backoff.

After this, I saw the hang due to infinite latch wait several times. This is always preceded by high CPU consumption. The parameter _enable_reliable_latch_waits”=false always helped me.

Something strange happened in Oracle documentation also. 10.2 Reference states that all V$LATCH columns related to exponential backoff: WAITERS_WOKEN, WAITS_HOLDING_LATCH, SLEEP[1-4] have been deprecated and will always have a value of zero.

Alex Fatkullin first discovered this infinite wait for the Result Cache: Latch in “LONGHOLD” latch waits on Linux and FIFO queuing for latch gets in Latch waits in 11G: queuing. He concluded: “It looks like 11G has invalidated all that pseudo code you may find around about latch acquisition algorithm Oracle uses.”

Tanel Poder in Reliable latch waits and a new blog stated that this is well known “latch wait posting”. Tanel conclusion was:”… since 9i … most latches do have wait posting enabled by default.”

This definitely needs further investigation. My point of view is that since Oracle 9.2 the latches use completely new wait algorithm. The latch sleep now is not timed, but relies on posting. The old-fashioned exponential backoff is gone. By default, only one “process allocation” latch used timed sleep.

Experiments with latch wait:

I will use oradebug call technique to explore the latch wait. Let me start with Oracle 10.2.0.4 for Solaris X86 32bit. From the first sqlplus session, I can artificially acquire unused latch:

bash-3.00$ sqlplus / as sysdba
SQL> set sqlprompt "SQL1> " 
SQL1> select addr from v$latch_parent where name ='first spare latch' ; 
ADDR
--------
500063D0
SQL1> oradebug setmypid
SQL1> oradebug call kslgetl 0x500063D0 1 2 3
Function returned 1 

My goal is to see how Oracle process waits for the latch from the OS side. In Solaris OE, I can trace system calls using truss utility. For this purpose, I will create another session, spawn truss and call kslgetl() again:

bash-3.00$ sqlplus / as sysdba
SQL> set sqlprompt "SQL2> "
SQL> set define %
SQL2> SELECT unique spid  FROM v$process
       JOIN v$session ON addr = paddr
       natural JOIN (SELECT sid  FROM v$mystat where rownum=1);
SPID
------------
3808
SQL2> oradebug setmypid
Statement processed.
SQL2> host truss -p 3808 &
SQL2> oradebug call kslgetl 0x500063D0 1 4 5
...
semop(3, 0x08044720, 1)         (sleeping...)  - for semaphore post

In most *nix platforms except AIX and latest HP-UX, Oracle process waits using the semaphore associated with it. If latch spinning was unsuccessful, the process puts itself into the waiter list for the latch. Then the process goes to sleep on its semaphore using semop(…) system call. At this time, I took the systemstate dump to observe the latch wait list contents:

PROCESS 14:          - Latch holder
----------------------------------------
...
  (latch info) wait_event=0 bits=1
    holding    (efd=4) 500063d0 first spare latch level=0
...
        waiters [orapid (seconds since: put on list, posted, alive check)]:   - latch wait list
         16 (36, 1290379647, 36)
         waiter count=1
...
PROCESS 16:        - Latch waiter
----------------------------------------
...
    waiting for 500063d0 first spare latch level=0
...
        possible holder pid = 14 ospid=3736
    on wait list for 500063d0
...

Further attempts to acquire the latch will be queued in FIFO order. The semop() call will be finished only when the latch holder released the latch. At that time, the holder will scan the latch wait list, identify next waiter, and post the waiter’s semaphore using semctl(3,..) system call:

SQL1> oradebug call kslfre 0x500063D0
...
semctl(3, 20, SETVAL, 1)                        = 0    -- post the semaphore
...
Function returned 4EC
SQL1> 

Therefore, process sleeping for the latch will be awakened only when posted. The sleep relies on inter-process post. The sleep even does not have the timeout, unlike most Oracle wait events. This completely differs from classic latch exponential backoff. Oracle 10.2 latch appears not to use “repeatedly spin and timed sleep”. The process spins and waits only once.

All except one latches use wait posting:

Note 34576.1 “WAITEVENT: “latch free” Reference Note” states “…Oracle 9i uses this “latch posting” far more than Oracle8i (and earlier) releases“. What means “far more”? We know that in 8i only several latches used wait posting by default. In this section, I will demonstrate that all 9.2 latches except one use wait posting.

To prove this I need to repeat the truss experiment for all the latches. This is dangerous. If process holds critical latch for long time, the instance crashes. I had to acquire such latches only for seconds. To achieve this, I wrote script latch_wait_trace.sql. The script uses latch number as an argument and spawns sqlplus to acquire corresponding parent latch.

This second sqlplus process will hold the latch for 3 seconds. During this time, the first script spawns truss and tries to acquire the busy latch again. Script redirect the truss output of the latch wait to unique file. This will be one-time experiment for each Oracle version and platform. It will be enough to check trace files contents:

sqlplus / as sysdba
set pagesize 0
spool ll.sh
select 'sqlplus /nolog @latch_wait_trace '||latch#
   from v$latch_parent;
spool off
rem !sh ll.sh
exit

Running this script in Oracle 9.2-11.2 generate truss file for every latch. All the files except one contain the semop() infinite wait.
Only “process allocation” latch shows different system calls. On Solaris this latch waits using pollsys() system call. On Linux and HP-UX it uses select():

gettimeofday({1292009494, 986601}, NULL) = 0
select(0, [], [], [], {0, 8000})        = 0 (Timeout)
select(0, [], [], [], {0, 8000})        = 0 (Timeout)
...

The select() and poll() syscalls are commonly used to poll IO operations. When used with zero first parameter, this calls are “…fairly portable way to sleep with subsecond precision” (man select). We just saw that the “process allocation” latch repeatedly sleeps for 0.008 sec (8000 microseconds). Unlike the exponential backoff, this timeout value is constant. The sleep cannot be interrupted by inter-process post.

I cannot explain why Oracle made the “process allocation” latch to be an exception. However, it is easy to find the origin of timeout values. The “process allocation” is the only latch that has non-zero value of CLASS_KSLLT column in x$ksllt/v$latch (x$kslltr in 11g). The latch classes allow fine grain tuning of latches. It occurs that nondefault class latches use the timed sleeps. The “process allocation” latch belongs to class 2. The sleep timeouts for latch class 2 can be tuned by “_latch_class_2” initialization parameter. Current timeout values are externalized in x$ksllclass table:
SQL> set markup html on
SQL> select indx,sleep0,sleep1,sleep2,sleep3,sleep4,sleep5,sleep6,sleep7
from x$ksllclass where indx=2;

INDX SLEEP0 SLEEP1 SLEEP2 SLEEP3 SLEEP4 SLEEP5 SLEEP6 SLEEP7
2 8000 8000 8000 8000 8000 8000 8000 8000

Oracle process uses SLEEP0-7 values as timeouts for the first 8 sleeps.

Oracle versions that use the new latch wait:

I repeated the above experiment for other Oracle versions. Oracle 11g, 10g and 9.2 demonstrated the same infinite semop() wait. On contrast, the truss of most Oracle 8.1.7 latches showed timed waits and the classic exponential backoff:

...
pollsys(0x08044A60, 0, 0x08044AB0, 0x00000000)	= 0
	timeout: 0.010001000 sec
pollsys(0x08044A60, 0, 0x08044AB0, 0x00000000)	= 0
	timeout: 0.010240000 sec
...	timeout: 0.010240000 sec
...	timeout: 0.030720000 sec
...	timeout: 0.030720000 sec
semctl(57, 6, SETVAL, 1)			= 0     -- post the PMON to check up on the latch holder
pollsys(0x08044A60, 0, 0x08044AB0, 0x00000000)	= 0
	timeout: 0.081920000 sec
...
pollsys(0x08044A60, 0, 0x08044AB0, 0x00000000)	= 0
	timeout: 2.048000000 sec
...

Timeouts were exponentially increased every second wait up to 2sec. Actual series was: 0.01-0.01-0.01-0.03-0.03-0.07-0.07-0.15-0.23-0.39-0.39-0.71-0.71-1.35-1.35-2.0-2.0-2.0-2.0 … seconds. These timed waits were not interruptible by inter-process posts. Oracle 7 and 8.0 latch wait also showed very close exponential backoff pattern.

Under the load, such behavior was ineffective. Most Oracle 8i latch waits were for nothing. The latch holder typically released the latch long before the end of waiter sleep. I will post in future about inefficiencies of exponential backoff.

I want to add that Oracle 9.0 had some transient behavior between pure exponential backoff and contemporary wait posting. “Shared pool“, “library cache” and “library cache load lock” latches used wait posting, 14 shared latches used classic exponential backoff upto 2sec and other Oracle 9.0 latches used used short exponential backoff upto of 0.4 sec. Surprisingly, this is the _max_sleep_holding_latch value. However, the process did not held any latch at that time.

Is this the old 8i “latch wait posting”:

In Oracle 8i several latches uses “wait posting”. Note 104426.1 “Short Wait and Long Wait Latches” describes these as “long wait latches”. The number of “long wait latches” was controllable by _latch_wait_posting parameter in pre 9i Oracle versions. The same note states that “library cache latch” is an example of a long wait latch. Let me truss its sleep:

...setitimer(ITIMER_REAL, 0x0804495C, 0x00000000)  = 0
         value:  interval:    0.000000 sec  value:    0.010000 sec
semop(57, 0x08044B48, 1)                        Err#91 ERESTART
...setitimer(ITIMER_REAL, 0x0804495C, 0x00000000)  = 0
         value:  interval:    0.000000 sec  value:    0.010000 sec
semop(57, 0x08044B48, 1)                        Err#91 ERESTART
...setitimer(ITIMER_REAL, 0x0804495C, 0x00000000)  = 0
         value:  interval:    0.000000 sec  value:    0.030000 sec
semop(57, 0x08044B48, 1)                        Err#91 ERESTART
...setitimer(ITIMER_REAL, 0x0804495C, 0x00000000)  = 0
         value:  interval:    0.000000 sec  value:    0.030000 sec
semop(57, 0x08044B48, 1)                        Err#91 ERESTART
...

The Oracle 8i process waited for long latch using semop() call which was timed out by timer alarm. The timeout exponentially increased:
0.01-0.01-0.03-0.03-0.08-0.07-0.16-0.23-0.4-0.39-0.73-0.72-1.39-1.38-2.04-2.04 sec
We see that old 8i latch wait posting was the combination of exponential backoff and timed sleeps until post. This rather differs from contemporary infinite latch wait. I suppose that the old latch wait posting was more CPU consuming.

Summary:
This post is about new latch wait post mechanics which appeared in Oracle 9.2. Contemporary latch spins and waits until posted only once. Since 2002 for almost a decade, we tuned the latch contention assuming its exponential backoff behavior. I even mistakenly tried to estimate the number of sleeps as a logarithm of “latch free” wait event duration . This is why I named this post: “Hidden latch revolution that we missed”

About these ads

8 Comments »

  1. Good Work.Very Thx

    Comment by Sikimkalkti.com — December 17, 2010 @ 11:23 am | Reply

  2. отличная, познавательная и полезная статья!
    спасибо

    Comment by Игорь Усольцев — December 23, 2010 @ 2:13 pm | Reply

  3. [...] previous investigation showed that latch wait was cardinally changed in Oracle 9.2. At that time, the exponential backoff [...]

    Pingback by Spin tales: Part 1. Exclusive latches in Oracle 9.2-11g « Latch, mutex and beyond — January 6, 2011 @ 9:59 pm | Reply

  4. [...] list and spins again. If the second spin also not succeed, the process sleeps using semop() syscall until posted. Compare this with onetime spin for the exclusive [...]

    Pingback by Spin tales: Part 2. Shared latches in Oracle 9.2-11g « Latch, mutex and beyond — January 14, 2011 @ 6:23 pm | Reply

  5. [...] 0. In my previous posts, I discussed how the standard class exclusive and shared latches spin and wait. Now, it is the time to explore the non-standard class latch behaviors. Parameter _latch_classes [...]

    Pingback by Spin tales: Part 3. Non-standard latch classes in Oracle 9.2-11g « Latch, mutex and beyond — January 19, 2011 @ 7:51 am | Reply

  6. [...] I demonstrated in previous post, since 9.2 Oracle uses completely new latch acquisition [...]

    Pingback by Latch statistics « Latch, mutex and beyond — February 26, 2011 @ 11:45 am | Reply

  7. [...] would like to explore the “cursor: pin S” waits in the way similar to my investigation of latch waits. In previous posts, I acquired latches artificially using oradebug call kslgetl. Unfortunately, KGX [...]

    Pingback by Mutex waits. Part 1. “Cursor: Pin S” in Oracle 10.2-11.1. Invisible and aggressive. « Latch, mutex and beyond — July 9, 2011 @ 1:18 pm | Reply

  8. [...] compare old and new latch mechanisms, I found useful the following illustration. Since it is hard for us, humans, to visualize milli- [...]

    Pingback by Latch Timescales « Latch, mutex and beyond — December 22, 2011 @ 12:44 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 63 other followers

%d bloggers like this: