Tag Archives: library cache lock

记一次诡异的library cache lock事件

1 发现应用出现大量的library cache lock / library cache: mutex X,都是同一个sql_id 0y30pf6xwqt3x:

  SID USERNAME   MACHINE                EVENT                          PARAM                   W   WT SQL                          ST     LT LOGON_TIME
------ ---------- ---------------------- ------------------------------ -------------------- ---- ---- ---------------------------- -- ------ ----------
  1339 xxxxx_user JDBC Thin Client       library cache lock             592165555264/5905026    0    0 0y30pf6xwqt3x/c8rhf8yvfnfxg  A      70         70
                                                                        83424/5373954

  1343 xxxxx_user JDBC Thin Client       library cache lock             592165555264/5909307    0    0 0y30pf6xwqt3x/c8rhf8yvfnfxg  A      61         61
                                                                        85976/5373954

  1344 xxxxx_user JDBC Thin Client       library cache: mutex X         1442036623/904949609    0    0 0y30pf6xwqt3x/c8rhf8yvfnfxg  A     112        112
                                                                        2672/82

  1347 xxxxx_user JDBC Thin Client       library cache lock             592165555264/5909300    0    0 0y30pf6xwqt3x/459f3z9u4fb3u  A      51         94

2 查询对应的sql_id,找不到对应的sql_text

$sql 0y30pf6xwqt3x

no rows selected

SQL_ID CHILD outline/plan_hash_value Ex DISK_READS bg bg/exec rows LOAD_TIME
————- ———- —————————————————————- ———- ———- ————— ———— ———- ———–
TOTAL

No such SQL or cursor body of the SQL does not exist.

3 做了hanganalyze和systemdump
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug dump hanganalyze 3
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.

awk -f ass109.awk 分析后没有发现什么异常

4 STATSPACK Statistics Report for Physical Standby因为我们这个是adg,已经部署了STATSPACK,显示’failed parse elapsed time’占用了最多的时间 :

Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
library cache lock                           1,576,360     311,235    197   73.5
library cache: mutex X                       3,859,597      98,181     25   23.2
CPU time                                                    12,024           2.8
db file sequential read                      2,176,402         638      0     .2
read by other session                        1,455,746         492      0     .1

Time Model System Stats  DB/Inst: itemadg04/item  Snaps: 10255-10259
-> Ordered by % of DB time desc, Statistic name

Statistic                                       Time (s) % DB time
----------------------------------- -------------------- ---------
failed parse elapsed time                      432,886.6     102.1
parse time elapsed                             409,205.2      96.5

5 经过分析systemdump,发现了错误的sql:select 1

(session) sid: 2831 ser: 20921 trans: (nil), creator: 0x88c3105560 
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/- 
flags2: (0x40009) -/-/INC 
DID: , short-term DID: 
txn branch: (nil) 
oct: 0, prv: 0, sql: 0x8a1fb511e8, psql: 0x89bd9a7e58, user: 45/xxxxx_user 

LibraryObjectLock: Address=0x895f96d108 Handle=0x8a1fb511e8 Mode=N CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 

User=0x88c3961660 Session=0x88c3961660 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=53ed6599 
LibraryHandle: Address=0x8a1fb511e8 Hash=bbcb647d LockMode=N PinMode=0 LoadLockMode=0 Status=VALD 
ObjectName: Name=select 1 
FullHashValue=eae8b8ed4f6c136a0f0c1571bbcb647d Namespace=SQL AREA(00) Type=CURSOR(00) Identifier=3150668925 OwnerIdn=45 

产生竞争的的是一个cursor build lock: 

User=0x88c3961660 Session=0x88c3961660 ReferenceCount=1 Flags=CNB/[0201] SavepointNum=77 
LibraryHandle: Address=0x89dfd14c40 Hash=55f3bb8f LockMode=X PinMode=0 LoadLockMode=0 Status=0 
ObjectName: Name=$BUILD$. f0c1571bbcb647d 
FullHashValue=5185a2c8f9d62898c21e9d0c55f3bb8f Namespace=SQL AREA BUILD(82) Type=CURSOR(00) Identifier=0 OwnerIdn=0 

6 原因总结:
这个语句来自我们的一个jdbc连接池的检查语句,错误的把mysql的放到oracle上了:
#检查数据库连接的sql(注意:以下例句只适合MySQL,Oracle为select 1 from dual)
jdbc.validationQuery=select 1

导致oracle大量的解析错误,所以能看到sql_id,但是我们看不到sql_text.
同时因为这个是物理备库,我们的错误sql trigger是记录不了的。

另外,我们还遇到过另外一种情况可能找不到sql_text:
http://www.vmcd.org/2011/12/%E4%B8%80%E6%AC%A1lob%E5%AD%97%E6%AE%B5%E7%9A%84%E6%9F%A5%E8%AF%A2%E5%AF%BC%E8%87%B4%E7%9A%84%E7%B3%BB%E7%BB%9F%E9%97%AE%E9%A2%98%E4%BB%A5%E5%8F%8Asql_id%E5%8F%98%E5%8C%96%E7%9A%84%E6%8E%A2%E7%A9%B6/