# Latch, mutex and beyond

## February 23, 2011

### Latch statistics

Filed under: DTrace,Instrumentation,Latch,Latch Statistics,Theory,Uncategorized — andreynikolaev @ 10:36 am

In previous posts, I investigated how the Oracle process spins and waits for the latch. Now we need the tool to estimate when the latch acquisition works efficiently and when we need to tune it. This tool is the latch statistics. Contemporary Oracle documentation describes v$latch statistics columns as:  Statistic: x$ksllt column Documentation description: When and how it changed: GETS kslltwgt “wait gets” Number of times the latch was requested in willing-to-wait mode Incremented by one after latch acquisition. Therefore protected by latch MISSES kslltwff “wait fails” Number of times the latch was requested in willing-to-wait mode and the requestor had to wait Incremented by one after latch acquisition if miss occured SLEEPS kslltwsl “wait sleeps” Number of times a willing-to-wait latch request resulted in a session sleeping while waiting for the latch Incremented by number of times process slept during latch acquisition SPIN_GETS ksllthst0 Willing-to-wait latch requests which missed the first try but succeeded while spinning Incremented by one after latch acquisition if miss but not sleep occured. Counts only the first spin WAIT_TIME kslltwtt “wait time” Elapsed time spent waiting for the latch (in microseconds) Incremented by wait time spent during latch acquisition. IMMEDIATE_GETS kslltngt “nowait gets” Number of times a latch was requested in no-wait mode Incremented by one after each no-wait latch get. May not be protected by latch IMMEDIATE_MISSES kslltnfa “nowait fails” Number of times a no-wait latch request did not succeed Incremented by one after unsuccessful no-wait latch get. Not protected by latch

Since Oracle 10.2 all the following columns described as “have been deprecated … ;they will always have a value of zero”:

• SLEEP1-SLEEP10
• WAITERS_WOKEN
• WAITS_HOLDING_LATCH
• MISSES_WAITLIST: Number of latch misses on the associated wait list latch
• SLEEPS_WAITLIST: Number of sleeps while trying to acquire the associated wait list latch
• WAIT_TIME_WAITLIST: Wait time for the associated wait list latch

Several other never documented latch statistics have been deprecated also in Oracle 10.2. We have lost important additional information about latch performance. Here I will discuss the remaining statistics set.

As I demonstrated in previous post, since 9.2 Oracle uses completely new latch acquisition algorithm:

Immediate latch get
Spin latch get
Add the process to the queue of latch waiters
Sleep until posted

We need to interpret the latch statistics from this new algorithm point of view.

GETS, MISSES, etc. are the integral statistics counted from the start of the instance. These values depend on complete workload history. AWR and Statspack report changes of integral statistics per snapshot interval. Usually these values are “averaged by hour”, which is much longer then typical latch performance spike.

Another problem with AWR/Statspack report is averaging over child latches. By default AWR gathers only summary data from v$latch. This greatly distorts latch efficiency coefficients. The latch statistics should not be averaged over child latches. Parent and child latch statistics are stored in AWR repository only when “statistics_level”=’ALL’ or the snapshots were manually created using dbms_workload_repository.create_snapshot(‘ALL’). The Statspack gathers parent and child latch statistics separately only at 10 level. To avoid averaging distortions the following analysis uses the latch statistics from v$latch_parent and v$latch_children (or x$ksllt in Oracle version less then 11g)

The current workload is characterized by differential latch statistics and ratios. There exist several ways to choose the basic set of statistics. I will choose the most close to Statspack way and define key statistics as:

 Description: Definition: AWR equivalent: Dimensions: Latch requests arrival rate $\ \lambda=\frac{\Delta GETS}{\Delta time}$ $\ \frac{''Get\ Requests''}{''Snap\ Time\ (Elapsed)''}$ $\left[\frac{requests}{sec} =\ Hz\ \right]$ Latch gets efficiency $\rho=\frac{\Delta MISSES}{\Delta GETS}$ $\ ''Pct\ Get\ Miss''/100$ – Latch sleeps ratio $\kappa=\frac{\Delta SLEEPS}{\Delta MISSES}$ $\ ''Avg\ Slps\ /Miss''$ – Latch wait time per second $\ W =\frac{\Delta WAIT\_TIME}{\Delta time}*10^{-6}$ $\ \frac{''Wait\ Time\ (s)''}{''Snap\ Time\ (Elapsed)''}$ $\left[\frac{sec}{sec}=Erlang \right]$ Latch spin efficiency $\ \sigma=\frac{\Delta SPIN\_GETS}{\Delta MISSES}$ $\ \frac{''Spin\ Gets''}{''Misses''}$ –

At this post, I will consider only wait latch gets. The no-wait (“IMMEDIATE”) gets add some complexity only for several latches. I will also assume ${\Delta time}$ to be small enough that workload do not change significantly.
Other statistics reported by AWR depend on these key statistics:

• Latch miss rate is $\frac{\Delta MISSES}{\Delta time}=\rho\lambda$. This is “Misses” from AWR divided by snapshot interval
• Latch waits (sleeps) rate is $\frac{\Delta SLEEPS}{\Delta time}=\kappa\rho\lambda$. This is per second value of “Sleeps” from AWR

From the queuing theory point of view, the latch is G/G/1/(SIRO+FIFO) system with interesting queue discipline including Serve In Random Order spin and First In First Out sleep wait. Using the latch statistics, I can roughly estimate queuing characteristics of latch. I expect that the accuracy of such estimations is about 20-30%.

As a first approximation, I will assume that incoming latch requests stream is Poisson distributed and latch holding (service) times are exponentially distributed. Therefore, our first latch model will be M/M/1/(SIRO+FIFO).

Oracle measures more statistics then usually consumed by classic queuing models. It is interesting what these additional statistics can be used for.

Average service time:

The PASTA (Poisson Arrivals See Time Averages) property connects $\rho$ ratio with the latch utilization. For Poisson streams the latch gets efficiency should be equal to utilization:

$\rho=\frac{\Delta misses}{\Delta gets}\approx U=\frac{\Delta latch\ hold\ time}{\Delta time}$

However, this is not exact for server with finite number of processors. The Oracle process occupies the CPU while acquiring the latch. As a result, the latch get see the utilization induced by other $N_{CPU}-1$ processors only. Compare this with MVA arrival theorem. In some benchmarks there may be only $N_{proc}\leq N_{CPU}$ Oracle shadow processes that generates the latch load. In such case we should substitute $N_{proc}$ instead $N_{CPU}$ in the following estimate:

$\rho \simeq \left( 1-\frac{1}{min(N_{CPU},N_{proc})}\right) U=\frac{1}{\eta}U;\quad\eta=\frac{min(N_{CPU},N_{proc})}{min(N_{CPU},N_{proc})-1}$

Here I introduced the the $\eta$ multiplier to correct naive utilization estimation. Clearly, the $\eta$ multiplier confirms that the entire approach is inapplicable to single CPU machine. Usually $\eta$ significantly differs from one only for demonstrations on my Dual-Core notebook. For servers its impact is below precision of my estimates. For example for small 8 CPU server the $\eta$ multiplier adds only 14% correction.

We can experimentally check the accuracy of these formulas and, therefore, Poisson arrivals approximation. $U$ can be independently measured by sampling of v$latchholder. The latchprofx.sql script by Tanel Poder do this at high frequency. Within our accuracy we can expect that $\rho$ and $U$ should be at least of the same order. To calculate the complete statistics set, I wrote latch_stats_10g.sql script. The script uses latch address as the parameter. It computes differential latch statistics for small interval (30sec) and estimates latch utilization and queue length by sampling. Note that this is only demonstration script. To measure such quantities accurately we need to perform statistical analysis and least squares fits. We know that $U= \lambda S$, where $S$ is average service (latch holding) time. This allows us to estimate the latch holding time as: $S=\frac{\eta\rho}{\lambda}$ This is interesting. We obtained the first estimation of latch holding time directly from statistics. In AWR terms this formula looks like $S=\eta\frac{''Pct\ Get\ Miss''\ \times\ ''Snap\ Time''}{100\ \times\ ''Get\ Requests''}=\eta\frac{''Misses''\times''Snap\ Time''}{''Gets''^2}$ The inverse of service time is the service rate $\mu = 1/S=\frac{\lambda}{\eta\rho}$. This is the estimation of the upper limit for latch request frequency the instance can serve. As we will see later, this estimation is unreliable. Wait time: Look more closely on the summary wait time per second $W$. Each latch acquisition increments the “WAIT_TIME” statistics by time it waited for the latch. According to the Little law, average latch sleeping time is related the length of wait (sleep) queue: $L = \lambda_{waits}\times \left\langle average\ wait\ time \right\rangle; = \lambda \rho \kappa \times \delta (Wait\_Time)$ The right hand side of this identity is exactly the “wait time per second” statistic. Therefore: $W\equiv L$ We can experimentally confirm this conclusion because $L$ can be independently measured by sampling of v$process.latchwait column.

Recurrent sleeps:

In ideal situation the process spins and sleeps only once. Consequently, the latch statistics should satisfy the following identity:

$MISSES\ =\ SPIN\_GETS\ +\ SLEEPS$

Or, equivalently:

$1=\sigma+\kappa$

In reality, some processes had to sleep for the latch several times. This occurred when the sleeping process was posted, but another process got the latch before the first process received the CPU. The awakened process spins and sleeps again. As a results the previously equality became invalid.

Before 10.2 Oracle directly counted these sequential sleeps in SLEEP? statistics. In 10.2 these statistics became obsolete. However, we can estimate the rate of such “sleep misses” from other basic statistics. The recurrent sleep increments only the SLEEP counter. The SPIN_GETS statistics not changed. The $\sigma+\kappa-1$ is the ratio of inefficient latch sleeps to misses. The ratio of “unsuccessful sleep” to “sleeps” is given by:

Recurrent sleeps ratio $=\frac{\sigma+\kappa-1}{\kappa}$

If this ratio exceeds several percents, it should be investigated. Frequent “unsuccessful sleeps” are inefficient and usually a symptom of OS waits posting problems.

Latch acquisition time:

Latch acquisition time is the average time that process spent to acquire the latch. This is the sum of spin time and wait time. Oracle does not measure the spin time explicitly. However, we can measure it on Solaris platform using simple DTrace script:

#!/usr/sbin/dtrace -s
#pragma D option quiet
BEGIN {
in_latch_acquisition=0;
ts=0;
n=0;
}
pid$target::kslgetl:entry / arg0 ==$1 /
{
ts=timestamp;
in_latch_acquisition=1;
}
pid$target::kslgetl:return / in_latch_acquisition / { @timeavg = avg(timestamp - ts); in_latch_acquisition=0; n++; } tick-10s { normalize(@timeavg,1000); printf("Latch gets=%d",n); printa(" average acquisition time=%@d mus\n",@timeavg); trunc(@timeavg); n=0; }  This script sets DTrace triggers (probes) at entry and return of kslgetl() function inside the Oracle shadow process. The script measures timestamp difference between entry to and return from the latch acquisition function. This is the time the process spent acquiring the latch. Using DTrace aggregation this time is averaged over latch acquisitions and reported each 10 sec. Typical output looks like:  /usr/sbin/dtrace -32 -s latch_time.d -p 17141 0x5B7C75F8 ... Latch gets=45810 average acquisition time=24 mus Latch gets=48422 average acquisition time=20 mus Latch gets=49989 average acquisition time=19 mus ... On other platforms we should rely on statistics. Fortunately in Oracle 9.2-10.2 one can count the average number of spinning processes by sampling x$ksupr.ksllalaq. The process set this column equal to address of acquired latch during active phase of latch get. Oracle 8i and before even fill the v$process.latchspin during latch spinning. Little law allows us to connect average number of spinning processes with the spinning time: $N_s = \lambda T_s$ As a result the average latch acquisition time is: $T_a= \lambda^{-1} (N_s+W)$ Note that according to queuing theory the “Serve In Random Order” discipline of latch spin do not affect average latch acquisition time. It does not depend on queuing discipline. In steady state the number of processes served during the passage of incoming request through the system should be equal to the number of spinning and waiting processes. In Oracle 11g the latch spin is no longer instrumented due to a bug. The 11g spin is invisible for SQL. This do not allow us to estimate $N_s$ and related quantities. I will be very usefull if Oracle Develoment fix this and reimplements the LATCHSPIN functionality in future versions. Comparison of results: Let me compare the results of DTrace measurements and latch statistics discussed. I wrote latch_times.d script to measure both latch acquisition and holding times and their distribution. For the following demonstration, I artifically induced the library cache contention repeatedly selecting from v$sql_plan. Typical script results look like:


/usr/sbin/dtrace  -32 -s latch_times.d -p 17242 0x5B7C75F8
...
latch gets traced: 165180
latch acquisition time histogram for "Library cache latch" laddress=5b7c75f8
Acquisition time:
value  ------------- Distribution ------------- count
4096 |                                         0
8192 |@@                                       7324
16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     151748
32768 |@                                        4493
65536 |                                         1676
131072 |                                         988
262144 |                                         464
524288 |                                         225
1048576 |                                         211
2097152 |                                         53
4194304 |                                         21
8388608 |                                         1
16777216 |                                         1
33554432 |                                         0

Holding time:
value  ------------- Distribution ------------- count
8192 |                                         0
16384 |@@@@@@@@@@@@@@@@@@@@@@@@@                105976
32768 |@@@@@@@@@@@@                             50877
65536 |@@                                       6962
131072 |                                         1986
262144 |                                         829
524288 |                                         330
1048576 |                                         205
2097152 |                                         34
4194304 |                                         6
8388608 |                                         0

average acquisition time =26 mus
average holding time =37 mus

Here the histogram shows latch acquisition and holding time distributions in logarithmic scale. The values are in nanoseconds. Compare the above average times with with the results of latch_stats_10g.sql script under the same conditions:


Latch statistics  for  0x5B7C75F8   "library cache"  level#=5   child#=1
Requests rate:       lambda=  20812.2 Hz
Miss /get:              rho=  .078164
Est. Utilization:   eta*rho=  .156327
Sampled   Utilization:    U=  .143333
Slps /Miss:      kappa=    .013027
Wait_time/sec:       W=    .025497
Sampled queue length L=    .043333
Spin_gets/miss:  sigma=    .987073
Sampled spinnning  :Ns=    .123
Derived statistics:
Secondary sleeps ratio =  1.02E-04
Avg latch holding time =      7.5 musec
.        sleeping time =      1.2 musec
.     acquisition time =      7.2 musec
context switch time estimation tau/S =     5.7


We can see that $\eta\rho$ and $W$ are really close to sampled $U$ and $L$ respectively. The holding and acquisition times from both methods are of the same order. Since both methods are intrusive I conclude that this is remarkable agreement. On the other hand for the purpose of contention diagnostics the main result of this post is:

The latch holding time for the contemporary servers should be normally in microseconds range.