Latch, mutex and beyond

April 12, 2010

Latch get “where” and “why”.

Filed under: Uncategorized — andreynikolaev @ 7:48 pm

It is widely known that the Oracle server uses kslgetlKernel Service Lock Management Get Latch function to acquire the latch. In 2006 Tanel Poder first demonstrated that  oradebug call  kslgetl/kslfre can be used to acquire the latch manually. This is very useful  to simulate latch related hangs and contention.

For several years it was commonly supposed that kslgetl() has two parameters – latch address and wait. But on AIX 5L we have unique procstack tool. This tool shows the actual number of parameters for function. It occurs that kslgetl has four parameters:

...
sskgpwwait(??, ??, ??, ??, ??) + 0x38
skgpwwait(??, ??, ??, ??, ??) + 0xbc
kslges(??, ??, ??, ??, ??) + 0x54c
kslgetl(??, ??, ??, ??) + 0x33c
ksfglt(??, ??, ??, ??, ??) + 0x198
...


We do know the meaning the first two of parameters. What about the others?
At my work I had to analyze a lot of errostacks for latch related problems. My guess is that these kslgetl parameters should be named like:

    kslgetl(laddr, wait, why, where) – Get exclusive latch

More precisely, to request the latch Oracle kernel needs:

  • laddress – address of latch in SGA
  • wait – flag. If true, this is latch get in willing-to-wait mode..
  • where – code for location from where the latch is acquired.
  • why –  context why the latch is acquired at this where.

“Where” and “why” parameters are using for the instrumentation of latch get.  Integer “where” value is the reason for latch acquisition. This is the index in an array of “locations” strings that literally describe “where” . Oracle externalize this array to SQL in  x$ksllw fixed table.

SQL> select indx, ksllwnam,ksllwlbl from x$ksllw;

INDX KSLLWNAM                            KSLLWLBL
----- ----------------------------------- ----------------------
   0 No latch
   1 kywmcrpln: creating new WLM plan
…
  82 ksuxds                              session ptr
  83 ksukls: mark killed                 session ptr
  84 ksukls: move session                session ptr
…
3707 kghupr1                             Chunk Header
3708 kghchk:dsidx                        heap descriptor
…

Note that this “location” is not the name of function from which the latch have gotten. These strings the DBA commonly see in v$latch_misses and AWR/Statspack reports.

Fixed view v$latch_misses is based on x$kslwsc fixed table. In this table Oracle maintains an array of counters for latch misses by “where” location. X$kslwsc allows us to find which “where” location allowed by Oracle for each latch:

select indx from x$kslwsc where ksllasnam = ‘<latch name>’;

Also, these strings present as  “Location from where latch is held” in process state object dumps. For example, look at the process holding cache buffers chains latch:

PROCESS 299:
 ----------------------------------------
 SO: 0x2d93be720, type: 2, owner: (nil), flag: INIT/-/-/0x00
 (process) Oracle pid=299, calls cur/top: 0x2e9028a38/0x2e9028a38, flag: (0)
…
   (latch info) wait_event=0 bits=2
     holding     2dee1ac50 Child cache buffers chains level=1 child#=124200
        Location from where latch is held: kcbgtcr: fast path:
        Context saved from call: 39022946

Tanel Poder  wrote the  excellent article about systematically troubleshooting latch contention. His famous latchprofx.sql tool samples v$latchholder/x$ksuprlat at extremely high rate. In this article he also discussed meaning of commonly seen “where” parameters

Why” parameter is named “Context saved from call” in dumps. It specifies why the latch is acquired at this “where”. Its meaning depends on latch and “where”. For example, “why” contains DBA address of block that accessed under protection of cache buffers chain latch.  Tanel Poder elegantly used this to investigate the root cause of cache buffers chain latch contention.

Why” meaning for some of “where” may be guessed from ksllwlbl column of x$ksllw. From the above listing one can conclude that “why” contain the SGA chunk address for shared pool latch and the session address for session idle bit latch correspondingly.

“Where” and “why” parameters instrument the latch get. When the latch will be acquired, Oracle will save these values into the latch structure.   Oracle 11g externalizes latch structures in x$kslltr_parent and x$kslltr_children fixed tables for parent and child latches respectively. Versions 10g and before used x$ksllt table. Fixed views v$latch and v$latch_children were created on these tables.

Where” and “why” parameters for last latch acquisition may be seen in kslltwhr and kslltwhy columns of these tables. Note the column name structure, this is common. These columns is the externalization of corresponding latch structure members.

Fixed table x$ksuprlat shows latches that processes are currently holding. View v$latchholder created on it. Again, “where” and “why” parameters of latch get present  in ksulawhr and ksulawhy columns.

And the last, but not the least. When Oracle process waits (sleeps) for the latch, it puts latch address into ksllawat, “where” and “why” values into ksllawer and ksllawhy columns of corresponding  x$ksupr row. This is the fixed table behind the  v$process view. These columns are extremely useful when  exploring why the processes contend for the latch.

Unfortunately Oracle does not include any of these columns in v$latchholder, v$latch and v$process views. This is why I usually prefer to select X$ tables, not V$ views.

Let us see how it works. I will intentionally choose not usually used latch ‘DMON Work Queues Latch’ for my test 11.2.0.1 Linux IA-32 instance:

SQL> select addr, name from v$latch where name='DMON Work Queues Latch';
ADDR     NAME
-------- ------------------------
200222A0 DMON Work Queues Latch


Possible “where” values for this latch I found in  x$kslwsc:

SQL> select indx from x$kslwsc where ksllasnam ='DMON Work Queues Latch'
INDX
-----------
 4642

SQL> oradebug setmypid
Statement processed.
SQL> oradebug call kslgetl 0x200222A0  1 100 4642
Function returned 1


The process now holds the exclusive latch. One can see this from another session in x$ksuprlat/v$latchholder .

SQL> select ksuprpid pid,ksuprsid sid ,ksuprlat laddr,ksuprlnm name,
 2         ksuprlmd,ksulawhy,ksulawhr, ksulagts gets
 3          from x$ksuprlat;
 
 PID   SID LADDR    NAME                    KSUPRLMD  KSULAWHY KSULAWHR GETS
----- ----- -------- ----------------------- --------- -------- -------- ----
 31   138 200222A0 DMON Work Queues Latch  EXCLUSIVE      100     4642    2
 


Indeed,  process with PID=31 hold ‘DMON Work Queues Latch’ at “where”=ksulawhr=4642 with “why”=ksulawhy=100.

Let us look into the latch structure itself. Parameters of the last latch get:


SELECT kslltwhr,kslltwhy  FROM x$kslltr_parent  where kslltnam='DMON Work Queues Latch';
 KSLLTWHR   KSLLTWHY
---------- ----------
4642          100
 

If I request the latch from the another session with PID=41, process will sleep for latch.


SQL> oradebug setmypid
Statement processed.
SQL> oradebug call kslgetl 0x200222A0  1 100 4643
<... wait ...>

Process number 41 currently waits for the latch. Its latch get parameters are:


SQL> SELECT ksllawhy,ksllawer  FROM x$ksupr where indx=41;
 KSLLAWHY   KSLLAWER
---------- ----------
 100        4643

This is all about kslgetl parameters.

7 Comments »

  1. Very interesting.
    Thank you.

    Comment by Alexey — April 27, 2010 @ 9:50 am | Reply

  2. Can process hold several latches?

    Comment by Alexey — April 28, 2010 @ 6:15 am | Reply

  3. […] is easy to write such a script (ksl_args.d). Remember that Oracle acquires exclusive latches using kslgetl(laddr, wait, why, where), and shared latches using kslgetsl(laddr, wait, why, where, rs) (ksl_get_shared_latch() in […]

    Pingback by Appetizer for DTrace « Latch, mutex and beyond — October 28, 2010 @ 3:33 pm | Reply

  4. […] The first DTrace trigger (probe) pid$target::kslgetl:entry turns on in_latch flag at the entry to kslgetl() function. The second trigger pid$target:::entry will fire at the entry to any Oracle function but […]

    Pingback by Spin tales: Part 1. Exclusive latches in Oracle 9.2-11g « Latch, mutex and beyond — January 6, 2011 @ 9:59 pm | Reply

  5. […] mutex sleep location_id corresponds to latch where parameter. Unlike the latches, Oracle did not externalize all possible location_id values in […]

    Pingback by “Cursor: pin S” mutex contention testcase and diagnostics tools. « Latch, mutex and beyond — April 22, 2011 @ 4:33 pm | Reply

  6. […] they would end up waiting on the enqueue holder. A raw version for enqueue of what is described here for latches (as usual amazing job by Tanel and […]

    Pingback by Manually holding an enqueue with SystemTap | Mauro Pagano's Blog — July 6, 2015 @ 3:08 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a comment

Create a free website or blog at WordPress.com.