long长事务回滚的模拟与定位
大型事务的回滚 大型事务的回滚产生非常大的代价,不仅锁定需要的资源,并且消耗的CPU和IO,尤其是IO将极为密集。尤其在KILL大型事务之前检查事务究竟有多大可能是必要的,同时我们也需要知道回滚已经进行了多少程度。一、模拟:1.删除500多万的数据。SQL> conn hr/hr;Connected.SQL> create table test as select * from dba_objects;Table created.SQL> insert into test select * from test;87055 rows created.SQL> insert into test select * from test;174110 rows created.SQL> insert into test select * from test;348220 rows created.SQL> insert into test select * from test;696440 rows created.SQL> insert into test select * from test;1392880 rows created.SQL> select count(*) from test; COUNT(*)---------- 2785760SQL> insert into test select * from test;2785760 rows created.SQL> commit;Commit complete.SQL> select count(*) from test; COUNT(*)---------- 5571520SQL> analyze table test compute statistics;Table analyzed.SQL> SQL> select sid from v$mystat where rownum=1; SID---------- 30SQL>--模拟删除,不commitSQL> delete test; 5571520 rows deleted.--另开窗口,查询kill掉会话SQL> select sid,serial#,sql_id,event,blocking_session from v$session where sid=30; SID SERIAL# SQL_ID EVENT BLOCKING_SESSION---------- ---------- ------------- ---------------------------------------------------------------- ---------------- 30 165 7qqwcq9td6akt log buffer space 11SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';SQL_TEXT---------------------------------------------------------------------- delete testSQL> alter system kill session '30,165' immediate;System altered.--回到原来窗口验证:SQL> select count(*) from test; select count(*) from test *ERROR at line 1:ORA-03135: connection lost contactProcess ID: 28346Session ID: 30 Serial number: 165二、定位:查看回滚进度:可以通过以下两个视图查看回滚的进度,通过单位时间内恢复的undo block来估算恢复时间:1. 通过x$ktuxealter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;SQL> alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';Session altered.SQL> select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;ADDR KTUXEUSN KTUXESLT KTUXESQN KTUXESIZ SYSDATE---------------- ---------- ---------- ---------- ---------- -------------------00007F170E8AAC20 5 11 7784 84438 2017-12-09 14:19:22SQL>可以通过KTUXESLT ,KTUXESQN这两个字段,然后用以下脚本回滚得出大概需要的时间:set serveroutput ondeclare l_start number; l_end number;begin select ktuxesiz into l_start from x$ktuxe where KTUXEUSN = 5 and KTUXESLT = 11; dbms_lock.sleep(60); select ktuxesiz into l_end from x$ktuxe where KTUXEUSN = 5 and KTUXESLT = 11; dbms_output.put_line('time est Day:' || round(l_end / (l_start - l_end) / 60 / 24, 2));end;/time est Day:.01PL/SQL procedure successfully completed.SQL> SQL> 2. 通过v$fast_start_trancsations 状态为recovering表示恢复中;select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;SQL> select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS; USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME XID SYSDATE---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------------- ------------------- 5 11 7784 RECOVERING 123491 179829 120 05000B00681E0000 2017-12-09 14:20:25 通过如下视图观察回滚是串行还是并行回滚的,如下图应是并行恢复的,V$FAST_START_SERVERS provides information about all the recovery slaves performing parallel transaction recovery.通过xid字段与v$fast_start_trancsations关联。select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);SQL> select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);STATE UNDOBLOCKSDONE PID XID----------- -------------- ---------- ----------------RECOVERING 133950 20 05000B00681E00003.查看回滚对象通过dump undo block方式查看回滚的对象:首先通过v$fast_start_trancsations的usn字段查询到使用的回滚段。SQL> select * from v$rollname where usn=5; USN NAME---------- ------------------------------ 5 _SYSSMU5_898567397$ dump这个undo block,因为dump的文件很大,在查询出对象的object_id后,kill掉这个dump会话。alter system dump undo block "" XID ;SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784; 。。。。。。。。。[oracle@wang trace]$ ls -lrttotal 635992-rw-r----- 1 oracle oinstall 81 Apr 27 2017 DBdb_ora_9045.trm-rw-r----- 1 oracle oinstall 59 Apr 27 2017 DBdb_mman_9065.trm-rw-r----- 1 oracle oinstall 60 Apr 27 2017 DBdb_ora_9084.trm-rw-r----- 1 oracle oinstall 111 Apr 27 2017 DBdb_ora_9099.trm.......................................-rw-r----- 1 oracle oinstall 12306 Dec 9 14:23 DBdb_ora_28113.trm-rw-r----- 1 oracle oinstall 536798705 Dec 9 14:23 DBdb_ora_28113.trc[oracle@wang trace]$[oracle@wang trace]$[oracle@wang trace]$[oracle@wang trace]$ grep objn DBdb_ora_28113.trc | head -5* Rec #0x11 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)* Rec #0x10 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)* Rec #0xf slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)* Rec #0xe slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)* Rec #0xd slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)[oracle@wang trace]$[oracle@wang trace]$可以查询到objn为121192,对应的就是dba_objects的object_id,即mosongtao.rollback_test,正是前边测试的对象。再查询v$session_longops配合username,last_update_time,target,可以大概定位到执行sql_id。
注意:在查询到object_id后手工停掉dump undo block 动作
SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
^C^C^C^C^C^C
alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
SQL> SQL> SQL> SQL>
4.查询sqlSQL> select SID,TARGET,SQL_ID,START_TIME,LAST_UPDATE_TIME from v$session_longops where target like '%TEST%' order by LAST_UPDATE_TIME desc; SID TARGET SQL_ID START_TIME LAST_UPDATE_---------- ---------------------------------------------------------------- ------------- ------------ ------------ 30 HR.TEST 7qqwcq9td6akt 09-DEC-17 09-DEC-17SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';SQL_TEXT------------------------------------------------------------------------- delete testSQL>
标题名称:long长事务回滚的模拟与定位
分享URL:
http://cdxtjz.cn/article/gijihs.html