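For the past two days, the CPU usage of a database that serves external queries has repeatedly climbed to 100%. The customer was frantic and kept asking me to capture some SQL for the vendor to optimize. After talking with the customer, I thought to myself: with a system this bad, what good will a few SQL statements do? The underlying problems have been raised several times and nothing ever came of it. Whenever something breaks, everyone scrambles to patch surface symptoms while the root cause is ignored. Having got that off my chest, I started analyzing step by step. That is how it goes: you take their money, and the customer is king. Complaints aside, the work still deserves to be done properly. The analysis report follows. Treat it as a starting point for discussion; if anything is wrong, corrections are welcome. Thanks!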
System environment: AIX 6.1
Oracle 10g 10.2.0.5.4
11:00~~12:00 AWR
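Comparing the wait events above shows that CPU time is prominent and that it is accompanied in every case by the gc cr multi block request and db file sequential read wait events. The prominence of CPU time matches the 100% CPU usage seen on the application side. Because gc cr multi block request and db file sequential read are also prominent, my guess is that the cause is frequent block exchange between nodes (the requests and scheduling involved in constructing gc cr blocks consume CPU time) together with frequent reads and writes between disk and memory (allocating and freeing memory also consumes CPU time).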
Disk information is as follows:
AIX fjlt_zgcx_db02
1 6 00F65AD44C00
configuration: lcpu=32 drives=150
mode=Capped
Although more than 140 disks are attached to this host, reads and writes are concentrated on disk18, disk22, disk44, disk57, and disk61.
Check the hot objects:
SQL> set linesize 180
SQL> col object_name for a40
SQL> select *
       from (select ob.owner, ob.object_name, sum(b.tch) Touchs
               from x$bh b, dba_objects ob
              where b.obj = ob.data_object_id
              group by ob.owner, ob.object_name
              order by sum(tch) desc)
      where rownum <= 10;
OWNER                          OBJECT_NAME                                  TOUCHS
------------------------------ ---------------------------------------- ----------
                               FP_DEAL_INFO
                               IDX_SB_ZSXX_FQ_DZDAH
                               HD_DQDEHDB
                               IDX_SB_SBXX_DZDAH_SSQ
                               DJ_SZ_JBXX
                               IND_SB_ZSXX_FQ_NSRSWJG
                               IDX_FPDEALINFO_NSRDZDAH
Two tables show up as hot objects, DJ_NSRXX among them. Check the indexes on DJ_NSRXX:
SQL> select TABLE_NAME, INDEX_NAME, TABLESPACE_NAME from dba_indexes where table_name='DJ_NSRXX';
TABLE_NAME  INDEX_NAME                 TABLESPACE_NAME
----------  -------------------------  ---------------
DJ_NSRXX    IDX_DJ_NSRXX_NSRDNBM       XMZG_TEM_DAT
DJ_NSRXX    IDX_DJ_NSRXX_NSRSBH        XMZG_TEM_DAT
DJ_NSRXX    PK_DJ_NSRXX                XMZG_TEM_DAT
DJ_NSRXX    PK_DJ_NSRXX                FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_DJZCLX_DM     FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_HY_DM         FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_JDXZ_DM       FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_LRR_DM        FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRDNBM       FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRSBH        FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRZT_DM      FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSR_SWJG_DM   FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_WSPZXH        FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_ZGSWRY_DM     FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_ZJHM          FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRMC         FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRDNBM       FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_DJZCLX_DM
DJ_NSRXX    IDX_DJ_NSRXX_HY_DM
DJ_NSRXX    IDX_DJ_NSRXX_JDXZ_DM
DJ_NSRXX    IDX_DJ_NSRXX_LRR_DM
DJ_NSRXX    IDX_DJ_NSRXX_NSRSBH        FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRMC
DJ_NSRXX    IDX_DJ_NSRXX_SWDJBLX       FJLTAIS_IDX
DJ_NSRXX    IDX_DJ_NSRXX_NSRZT_DM
DJ_NSRXX    IDX_DJ_NSRXX_NSR_SWJG_DM
DJ_NSRXX    IDX_DJ_NSRXX_WSPZXH
DJ_NSRXX    IDX_DJ_NSRXX_ZGSWRY_DM
DJ_NSRXX    IDX_DJ_NSRXX_ZJHM
DJ_NSRXX    PK_DJ_NSRXX                FJLTAIS_DAT
DJ_NSRXX    IDX_DJ_NSRXX_DJZCLX_DM
DJ_NSRXX    IDX_DJ_NSRXX_HY_DM
DJ_NSRXX    IDX_DJ_NSRXX_JDXZ_DM
DJ_NSRXX    IDX_DJ_NSRXX_LRR_DM
DJ_NSRXX    IDX_DJ_NSRXX_NSRMC
DJ_NSRXX    IDX_DJ_NSRXX_NSRZT_DM
DJ_NSRXX    IDX_DJ_NSRXX_NSR_SWJG_DM
DJ_NSRXX    IDX_DJ_NSRXX_WSPZXH
DJ_NSRXX    IDX_DJ_NSRXX_ZGSWRY_DM
DJ_NSRXX    IDX_DJ_NSRXX_ZJHM
DJ_NSRXX    PK_DJ_NSRXX
DJ_NSRXX    IDX_DJ_NSRXX_NSRDNBM
DJ_NSRXX    IDX_DJ_NSRXX_NSRSBH
DJ_NSRXX    PK_DJ_NSRXX
Some of these indexes are not stored in the index tablespace FJLTAIS_IDX as they should be. Next, look at the SQL that consumes the most CPU time:
SELECT (select NSRDNBM from dj_nsrxx where nsrdzdah=a.nsrdzdah) AS NSRDNBM,
       A.MC AS NSRMC,
       (select NSRSBH from dj_nsrxx where nsrdzdah=a.nsrdzdah) AS NSRSBH,
       A.ZJHM AS ZJHM,
       A.XSSR AS JSJE,
       A.SL AS SL,
       A.YNSE AS YNSE,
       A.SE AS SE,
       TO_CHAR(A.SBRQ, 'YYYY-MM-DD') AS SBRQ,
       TO_CHAR(A.SSSQ_Q, 'YYYY-MM-DD') || ' - ' || TO_CHAR(A.SSSQ_Z, 'YYYY-MM-DD') AS SKSSQ,
       TO_CHAR(A.XJQX, 'YYYY-MM-DD') AS XJQX,
       TO_CHAR(A.SJRQ_JZ, 'YYYY-MM-DD') AS SJRQ,
       TO_CHAR(A.RKRQ, 'YYYY-MM-DD') AS RKRQ,
       TO_CHAR(A.RKRQ_JZ, 'YYYY-MM-DD') AS RKRQ_JZ,
       TO_CHAR(A.KPRQ, 'YYYY-MM-DD') AS KPRQ,
       P_GET_CODENAME('DM_SBFS', A.SBFS_DM) AS SBFS_MC,
       P_GET_CODENAME('DM_SKZT', A.SKZT_DM) AS SKZT_MC,
       P_GET_CODENAME('DM_SKSX', A.SKSX_DM) AS SKSX_MC,
       (SELECT ZSPM_MC FROM DM_ZSPM WHERE ZSXM_DM =A.ZSXM_DM AND ZSPM_DM=A.ZSPM_DM) AS ZSPM_MC,
       A.NSR_SWJG_DM AS NSR_SWJG_DM, A.ZSXM_DM AS ZSXM_DM, A.YSKM_DM AS YSKM_DM,
       A.YSFPBL_DM AS YSFPBL_DM, A.JKPZLRR_DM AS JKPZLRR_DM, A.SJXHR_DM AS SJXHR_DM,
       A.RKXHR_DM AS RKXHR_DM, A.ZGSWRY_DM AS ZGSWRY_DM, A.LRR_DM AS LRR_DM,
       A.SKSS_SWJG_DM AS SKSS_SWJG_DM, A.ZSJG_DM AS ZSJG_DM, A.JKPZZL_DM AS JKPZZL_DM,
       TO_CHAR(A.JKPZXH) AS JKPZXH, A.ZG AS ZG, A.ZH AS ZH, A.SKGK_DM as skgkDm,
       DECODE( B.DDLXQY_BZ, 'Z', '????????', 'S', '????????', '') AS "ddlxqyBz"
  FROM SB_ZSXX A,
       DJ_DDLXQY B,
       (SELECT E.SWJG_DM FROM DM_SWJG E WHERE E.SWJG_BZ = 'J'
         CONNECT BY SJ_SWJG_DM = PRIOR SWJG_DM START WITH SWJG_DM = :1) C
 WHERE A.NSR_SWJG_DM = C.SWJG_DM
   AND A.NSRDZDAH = B.NSRDZDAH(+)
   AND A.TZLX_DM IN('1', '4')
   AND A.SE!=0
   AND A.YZFSRQ_JZ IS NOT NULL
   AND A.SBRQ >= TO_DATE(:2, 'YYYY-MM-DD')
   AND A.SBRQ <= TO_DATE(:3, 'YYYY-MM-DD')
   AND A.ZSXM_DM = :4
   AND A.DQ=:5
The execution plan is as follows:
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
| Operation                          | Name                     | TempSpc | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
| SELECT STATEMENT                   |                          |         |        (1) | 00:03:32 |
| TABLE ACCESS BY INDEX              | GF_CBDJYB_DZ             |         |        (0) | 00:00:01 |
| INDEX RANGE SCAN                   | IDX_GF_CBDJYB_DZ_YBIDXH  |         |        (0) | 00:00:01 |
| TABLE ACCESS BY INDEX              | GF_CBDJYB_DZ             |         |        (0) | 00:00:01 |
| INDEX RANGE SCAN                   | IDX_GF_CBDJYB_DZ_YBIDXH  |         |        (0) | 00:00:01 |
| SORT AGGREGATE                     |                          |         |            |          |
| TABLE ACCESS FULL                  | GF_CBDJYB_DZ             |         |        (1) | 00:00:04 |
| SORT AGGREGATE                     |                          |         |            |          |
| TABLE ACCESS FULL                  | GF_CBDJYB_DZ             |         |        (1) | 00:00:04 |
| HASH JOIN SEMI                     |                          |         |        (1) | 00:03:02 |
|                                    |                          |   2104K |  12688 (1) | 00:02:33 |
| NESTED LOOPS                       |                          |         |        (1) | 00:02:02 |
|                                    |                          |         |        (0) | 00:00:01 |
| CONNECT BY WITH FILTERING          |                          |         |            |          |
| TABLE ACCESS BY INDEX              |                          |         |        (0) | 00:00:01 |
| INDEX UNIQUE SCAN                  | PK_DM_SWJG               |         |        (0) | 00:00:01 |
| NESTED LOOPS                       |                          |         |            |          |
| CONNECT BY PUMP                    |                          |         |            |          |
| TABLE ACCESS BY INDEX ROWID        |                          |         |        (0) | 00:00:01 |
| INDEX RANGE SCAN                   | IDX_DM_SWJG_SJ_SWJG_DM   |         |        (0) | 00:00:01 |
| PARTITION RANGE                    |                          |         |        (0) | 00:00:09 |
| TABLE ACCESS BY LOCAL INDEX ROWID  |                          |         |        (0) | 00:00:09 |
| INDEX RANGE SCAN                   | IDX_DJ_NSRXX_NSR_SWJG_DM |         |        (0) | 00:00:01 |
| TABLE ACCESS FULL                  | GF_NFRXX                 |         |        (1) | 00:00:22 |
| TABLE ACCESS FULL                  |                          |         |        (2) | 00:00:30 |
| TABLE ACCESS BY INDEX              | DJ_NSRXX_KZ              |         |        (0) | 00:00:01 |
| INDEX UNIQUE SCAN                  | PK_DJ_NSRXX_KZ           |         |        (0) | 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CBDJYIDZ"."NSRDZDAH"=:B1 AND
"CBDJYIDZ"."ZIDBZ"='Y' AND
"CBDJYIDZ"."YXBZ"='Y' AND
"CBDJYIDZ"."XYBZ"='Y')
2 - access("CBDJYIDZ"."YBIDXH"=:B1)
3 - filter("CBDJYIDZ"."NSRDZDAH"=:B1 AND
"CBDJYIDZ"."ZIDBZ"='Y' AND
"CBDJYIDZ"."YXBZ"='Y' AND
"CBDJYIDZ"."XYBZ"='Y')
4 - access("CBDJYIDZ"."YBIDXH"=:B1)
6 - filter("CBDJYIDZ"."NSRDZDAH"=:B1 AND
"CBDJYIDZ"."ZIDBZ"<>'Y' AND
"CBDJYIDZ"."YXBZ"='Y' AND
"CBDJYIDZ"."XYBZ"='Y')
8 - filter("CBDJYIDZ"."NSRDZDAH"=:B1 AND
"CBDJYIDZ"."ZIDBZ"<>'Y' AND
"CBDJYIDZ"."YXBZ"='Y' AND
"CBDJYIDZ"."XYBZ"='Y')
access("DJXZ"."NSRDZDAH"="NSRXX"."NSRDZDAH")
access("NSRXX"."NSRDZDAH"="NFRXX"."NSRDZDAH")
14 - filter("YXBZ"='Y')
15 - access("SJ_SWJG_DM"=PRIOR "SWJG_DM")
17 - access("SWJG_DM"='')
21 - access("SJ_SWJG_DM"=PRIOR "SWJG_DM")
23 - filter("NSRXX"."HJBZ_DM"<>'000' AND
"NSRXX"."HJBZ_DM"<>'001')
access("F"."SWJG_DM"="NSRXX"."NSR_SWJG_DM")
25 - filter("NFRXX"."NFRZT_DM"<>'50' AND
"NFRXX"."NFRZT_DM"<>'10')
26 - filter("DJXZ"."XZ"='45' AND
"DJXZ"."SXBZ"='Y' AND
"DJXZ"."YXBZ"='Y')
access("NSRXX"."NSRDZDAH"="NSRXXKZ"."NSRDZDAH")
54 rows selected.
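As shown, the IDX_DJ_NSRXX_NSR_SWJG_DM index is used, but because table and index storage are not effectively separated, this may contribute to db file sequential read waits (single-block reads from disk). Personally I do not think db file sequential read plays much of a role in this case, but it is still worth keeping in mind.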
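As a rough sketch of how the table/index separation could be checked and corrected, the statements below list the DJ_NSRXX indexes that sit outside FJLTAIS_IDX and rebuild one of them into it. The schema name APP_OWNER is a placeholder (the owner does not appear in the output above), and the example assumes a non-partitioned index; a partitioned index would have to be rebuilt partition by partition.

```sql
-- Indexes on DJ_NSRXX stored somewhere other than the index tablespace.
-- Rows with a NULL tablespace_name (partitioned indexes) are not returned
-- by this comparison; their segments are listed in dba_ind_partitions.
select owner, index_name, tablespace_name
  from dba_indexes
 where table_name = 'DJ_NSRXX'
   and tablespace_name <> 'FJLTAIS_IDX';

-- Move one non-partitioned index into the index tablespace.
-- APP_OWNER is a hypothetical schema name, not taken from the report.
alter index APP_OWNER.PK_DJ_NSRXX rebuild tablespace FJLTAIS_IDX;
```

A rebuild takes locks and generates I/O of its own, so on a system that is already saturated it is probably best scheduled outside the busy window.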
Using vmstat 1 to check system memory and CPU: there is no sign of a memory shortage, but as many as 38 processes are waiting for CPU.
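I checked the active sessions on the system at that time and found only about 90, so concurrency was not high (the query output was not saved in time).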
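Since the original query output was not kept, here is one way such a count could be taken. This is only a sketch of a typical query, not necessarily the one that was run; it counts active user sessions per RAC instance.

```sql
-- Active foreground sessions per instance (background processes excluded).
select inst_id, count(*) active_sessions
  from gv$session
 where status = 'ACTIVE'
   and type = 'USER'
 group by inst_id
 order by inst_id;
```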
09:00~~10:00 AWR
11:00~~12:00 AWR
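cache buffers chains is very high. This latch wait occurs when different processes contend for the same data block in the buffer cache, that is, a hot object. It easily causes processes to queue for CPU and drives CPU usage up.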
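One common way to tie cache buffers chains contention back to specific blocks is to find the child latches with the most sleeps and then look up the buffers they protect. The sketch below assumes it is run as SYS (x$bh is not visible otherwise); latch_addr is a SQL*Plus substitution variable to be filled with an ADDR value returned by the first query.

```sql
-- Child latches of "cache buffers chains" with the most sleeps.
select *
  from (select addr, gets, misses, sleeps
          from v$latch_children
         where name = 'cache buffers chains'
         order by sleeps desc)
 where rownum <= 5;

-- Buffers hashed to one of those child latches, hottest first.
select b.file#, b.dbablk, b.tch, o.owner, o.object_name
  from x$bh b, dba_objects o
 where b.obj = o.data_object_id
   and b.hladdr = hextoraw('&latch_addr')
 order by b.tch desc;
```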
Checking the database hot objects again, the hot objects include DJ_NSRXX and FP_DEAL_INFO.
Check pct_free for these tables:
SQL> select table_name, pct_free from dba_tables where table_name in ('DJ_NSRXX','SB_ZSXX','FP_DEAL_INFO');
TABLE_NAME                       PCT_FREE
------------------------------ ----------
8 rows selected.
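Given the current pct_free of these tables, consider increasing it so that rows are spread over more data blocks, which reduces contention on hot blocks. The pct_free value should be changed gradually, observing the effect after each change, to find out which value is appropriate.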
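A minimal sketch of what such a change might look like follows. The schema APP_OWNER and the value 30 are placeholders for illustration only, the table is assumed to be non-partitioned, and IDX_FPDEALINFO_NSRDZDAH is assumed (from its name) to be an index on FP_DEAL_INFO.

```sql
-- Raise PCTFREE; this affects only blocks formatted from now on.
alter table APP_OWNER.FP_DEAL_INFO pctfree 30;

-- Rebuild the segment so that existing rows are repacked with the new setting.
alter table APP_OWNER.FP_DEAL_INFO move;

-- MOVE changes rowids, which leaves the table's indexes UNUSABLE.
select owner, index_name, status
  from dba_indexes
 where table_name = 'FP_DEAL_INFO'
   and status = 'UNUSABLE';

-- Rebuild each index reported above, for example:
alter index APP_OWNER.IDX_FPDEALINFO_NSRDZDAH rebuild;
```

A larger pct_free also means more blocks to read for the same number of rows, which is one more reason to adjust in small steps and compare AWR reports before and after.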
09:00~~10:00 AWR
Estd Interconnect traffic (KB): 119,976.69
11:00~~12:00 AWR
Estd Interconnect traffic (KB): 142,074.05
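Estd Interconnect traffic (KB) is very high, which indicates that the nodes are exchanging data blocks frequently in order to construct gc cr blocks. The wait events mentioned earlier show the same thing: gc cr multi block request appears in both reports and is a major wait event.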
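To look at the block traffic behind this figure outside of AWR, the global cache counters can be read directly per instance. As far as I understand, the AWR estimate is derived largely from these block counts (plus message counts), but treat that as an assumption. The values are cumulative since instance startup, so two samples taken some time apart are needed to get a rate.

```sql
-- Global cache blocks received and served, per instance.
select inst_id, name, value
  from gv$sysstat
 where name in ('gc cr blocks received',
                'gc current blocks received',
                'gc cr blocks served',
                'gc current blocks served')
 order by inst_id, name;
```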
09:00~~10:00 AWR
11:00~~12:00 AWR
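Logical reads are very high, which is consistent with the gc cr multi block request wait event. As noted above, gc cr multi block request involves scheduling and managing memory, which consumes CPU time. Taking gc cr multi block request, the very high logical reads, and cache buffers chains together, my view is that gc cr multi block request and hot blocks jointly caused the 100% CPU problem, with gc cr multi block request being the primary one. The main evidence: Estd Interconnect traffic (KB) is abnormally high, and problems caused by cache buffers chains usually show up in the Top 5 wait events. I therefore judge gc cr multi block request to be the main cause.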
My suggestion is that the vendor avoid full table scans when tuning the SQL, which would avoid the frequent memory scheduling.
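To give the vendor a concrete starting list, the cursors currently in the shared pool whose plans contain a full table scan can be pulled from v$sql_plan. This is a sketch under the assumption that the statements of interest are still cached; a statement with several full scans in its plan appears once per scanned table.

```sql
-- Cached statements with a full table scan, ordered by logical reads.
select *
  from (select s.sql_id, p.object_owner, p.object_name,
               s.executions, s.buffer_gets, s.cpu_time
          from v$sql_plan p, v$sql s
         where p.sql_id = s.sql_id
           and p.child_number = s.child_number
           and p.operation = 'TABLE ACCESS'
           and p.options = 'FULL'
           and p.object_owner not in ('SYS', 'SYSTEM')
         order by s.buffer_gets desc)
 where rownum <= 10;
```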
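The gc cr multi block request problem should be addressed at the RAC level by separating applications: different nodes handle different applications, and the nodes are configured as standbys for one another so that when a node goes down another can take over its applications, which provides high availability. In fact, when I put this report together for the customer I also pulled some entries from SQL ordered by CPU Time for them. That is what the customer asked for, after all, and their request ought to be met.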
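In RAC, this kind of separation is usually expressed through services that have preferred and available instances (typically defined with srvctl). Whether the separation is actually in effect can be checked from the database side; the queries below are a sketch of such a check and assume the applications connect through distinct service names.

```sql
-- Which services are currently running on which instance.
select inst_id, name
  from gv$active_services
 order by name, inst_id;

-- How user sessions are spread over services and instances.
select service_name, inst_id, count(*) session_count
  from gv$session
 where type = 'USER'
 group by service_name, inst_id
 order by service_name, inst_id;
```

If one service shows sessions spread evenly over both instances while it reads the same hot tables, that is the pattern that would be expected to generate heavy gc cr traffic.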
This article is original; please credit the source and author when reposting.
If anything is wrong, corrections are welcome.