7343696
2023/02/03阅读:95主题:默认主题
特殊恢复:ORA-00704、ORA-00604、ORA-01555的分析与处理
在上一篇文章中,我们介绍了数据库在启动时触发ORA-00704、ORA-00604、ORA-01555报错时,怎么通过Trace文件定位触发报错的sql的执行计划、等待事件及访问的数据块。这边文章继续上篇文章内容,讲讲在数据库启动过程中触发ORA-00704、ORA-00604、ORA-01555报错时,怎么去分析、定位原因、解决问题。报错的信息如下:
www.htz.pw > alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Process ID: 9202
Session ID: 1 Serial number: 3
4个ORA报错,怎么找到那个报错是原因呢?规则就是从最后一个报错往前面看,ORA-01555是整个报错的原因,其它ORA都是因为01555报错而引起的现象。这个报错的说明就是:数据库遇到回滚段过程过小,导致SQL执行失败,引起数据库启动初始化失败,数据库被异常关闭。所以在分析时,我们首先分析产生ORA-01555的原因。
下面简单回顾一下ORA-01555的报错。
ORA-01555是快照过久而触发的报错。其实可以认为是执行SQL时,需要访问历史版本的数据,此时利用UNDO来回滚数据块,在UNDO段中无找到需要的回滚信息而触发报错。
因Oracle 利用undo记录回滚有两种机制:
-
大对象段 大对象段利用数据块内的历史镜像来实现回滚,同PG的回滚机制类似。 -
非大对象段 非大对象段利用undo段来实现回滚,在Oracle中,undo段又分为系统表空间中的undo段和undo表空间中的undo段,所以针对系统表空间中的系统对象所在块的回滚会使用系统undo段,针对普通用户使用undo表空间的undo段。
回滚信息无法找到的原因有很多,这里举例简单说明几个现象:
-
undo段头的scn值异常,导致根据数据块xid找到的undo段不满足要求。 -
undo条目不存在,在undo段中找不到条目的信息。 -
undo段本身不存在 -
undo段有坏块。
那是否undo回滚记录找不到,就一定会触发ORA-01555的报错呢?这个是不一定的。
今天重点是分析数据库在启动过程中重发的ORA-01555的报错,这个ORA-01555的报错一般是因为系统对象所在的块需要做回滚,但是在SYSTEM的undo段中无法找到回滚记录而触发的报错。
什么条件下会触发呢?
-
SQL执行时遇到未提交的事务,此时需要回滚事务,就会使用到undo的信息。 -
SQL执行时需要读取前镜像的数据,此时需要回滚数据块,就会使用到undo的信息。
上面两种情况对应的解决方案也各不一样。
本次案例主要是分享有未提交的事务,导致数据库在启动过程中触发ORA-01555的报错。这个案例也是来至于一个网友,数据库无法打开,希望能帮助他把数据库打开。因为数据库比较小,所以网友是通过云盘的方式共享所有的数据文件,我在自己的环境中操作的。
1,平台与版本
www.htz.pw > select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
www.htz.pw > !lsb_release -a
LSB Version: :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch
Distributor ID: RedHatEnterpriseAS
Description: Red Hat Enterprise Linux AS release 4 (Nahant Update 8)
Release: 4
Codename: NahantUpdate8
2,修改数据文件位置
两个环境的路径不一致,需要修改数据文件的位置和一些参数。
[oracle@www.htz.pw temp]$cp database.zip /oracle/app/oracle/oradata/
[oracle@www.htz.pw temp]$cd !$
cd /oracle/app/oracle/oradata/
[oracle@www.htz.pw oradata]$unzip database.zip
修改后的参数文件如下:
[oracle@www.htz.pw oradata]$cat /tmp/123.ora
cos11g.__db_cache_size=201326592
cos11g.__java_pool_size=4194304
cos11g.__large_pool_size=4194304
cos11g.__oracle_base='/oracle/app/oracle'#ORACLE_BASE set from environment
cos11g.__pga_aggregate_target=272629760
cos11g.__sga_target=511705088
cos11g.__shared_io_pool_size=0
cos11g.__shared_pool_size=289406976
cos11g.__streams_pool_size=4194304
*.audit_file_dest='/oracle/app/oracle/admin/cos11g/adump'
*.audit_trail='NONE'
*.compatible='11.2.0.0.0'
*.control_files='/oracle/app/oracle/oradata/database/cos11g/control01.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='cos11g'
*.db_recovery_file_dest='/oracle/app/oracle/fast_recovery_area'
*.db_recovery_file_dest_size=4322230272
*.diagnostic_dest='/oracle/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=cos11gXDB)'
*.memory_target=783286272
*.open_cursors=300
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
*.undo_management='manual'
通过重建控制文件来修改数据文件的位置,修改数据文件位置方法很多,如alter database rename file的方式都可以的。
[oracle@www.htz.pw ~]$cd rs/sql
使用create_controlfile_sql.sh脚本来自动生成控制文件的内容,脚本就是调用alter database backup controflile to trace as '/tmp/control.ctl';
[oracle@www.htz.pw sql]$sh ./create_controlfile_sql.sh
please input direcotry default /tmp:
please input file name default control.ctl:
Database altered.
www.htz.pw > create spfile from pfile='/tmp/123.ora';
File created.
www.htz.pw > shutdown abort;
ORACLE instance shut down.
www.htz.pw > @/tmp/control.ctl
ORACLE instance started.
Total System Global Area 780824576 bytes
Fixed Size 2232432 bytes
Variable Size 574623632 bytes
Database Buffers 201326592 bytes
Redo Buffers 2641920 bytes
控制文件的内容:
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "COS11G" NORESETLOGS NOARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/oracle/app/oracle/oradata/database/cos11g/redo01.log' SIZE 50M BLOCKSIZE 512,
GROUP 2 '/oracle/app/oracle/oradata/database/cos11g/redo02.log' SIZE 50M BLOCKSIZE 512,
GROUP 3 '/oracle/app/oracle/oradata/database/cos11g/redo03.log' SIZE 50M BLOCKSIZE 512
DATAFILE
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/sysaux01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/undotbs01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/users01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/example01.dbf'
CHARACTER SET AL32UTF8
;
RECOVER DATABASE
ALTER DATABASE OPEN;
www.htz.pw > shutdown abort;
ORACLE instance shut down.
www.htz.pw > @/tmp/control.ctl
ORACLE instance started.
Total System Global Area 780824576 bytes
Fixed Size 2232432 bytes
Variable Size 574623632 bytes
Database Buffers 201326592 bytes
Redo Buffers 2641920 bytes
CREATE CONTROLFILE REUSE DATABASE "COS11G" NORESETLOGS NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00367: checksum error in log file header
ORA-01517: log member: '/oracle/app/oracle/oradata/database/cos11g/redo01.log'
这里看到报日志文件的checksum错误,因为原来的日志文件已经被重建,所以这里需要将控制文件的NORESETLOGS更改为RESETLOGS就可以了
ORA-01507: database not mounted
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01507: database not mounted
更改控制文件
www.htz.pw > !vi /tmp/control.ctl
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "COS11G" RESETLOGS NOARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/oracle/app/oracle/oradata/database/cos11g/redo01.log' SIZE 50M BLOCKSIZE 512,
GROUP 2 '/oracle/app/oracle/oradata/database/cos11g/redo02.log' SIZE 50M BLOCKSIZE 512,
GROUP 3 '/oracle/app/oracle/oradata/database/cos11g/redo03.log' SIZE 50M BLOCKSIZE 512
DATAFILE
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/sysaux01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/undotbs01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/users01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/example01.dbf'
CHARACTER SET AL32UTF8
;
RECOVER DATABASE
ALTER DATABASE OPEN;
www.htz.pw > shutdown abort;
ORACLE instance shut down.
www.htz.pw > @/tmp/control.ctl
ORACLE instance started.
Total System Global Area 780824576 bytes
Fixed Size 2232432 bytes
Variable Size 574623632 bytes
Database Buffers 201326592 bytes
Redo Buffers 2641920 bytes
Control file created.
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
3,开始正常的恢复
www.htz.pw > select open_mode from v$database;
OPEN_MODE
--------------------
MOUNTED
www.htz.pw > recover database using backup controlfile until cancel;
ORA-00279: change 2223501 generated at 08/28/2014 11:22:42 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/fast_recovery_area/COS11G/archivelog/2014_08_28/o1_mf_1_1_%u_
.arc
ORA-00280: change 2223501 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1:
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf'
ORA-01112: media recovery not started
www.htz.pw > alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Process ID: 9202
Session ID: 1 Serial number: 3
这里我们看到了触发了ORA-00704,ORA-00604,ORA-01555的报错,原因是由于oracle2进制中的一条sql语句执行,执行了ORA-01555报错,ORA-01555报错常见原因如下:
1,延迟块清除,这种情况我们通过增加SCN就可以解决。 2,事务回滚,这种情况我们通过BBED来处理就可以了。
此故障在故障中,是很常见的,但是有些特殊情况下,恢复起来不是那么容易,曾经遇到过一个数据库,bbed修改了2个小时的块,最后才将数据库打开
4,初步故障处理过程
4.1 修改undo_tablespace参数
这种修改undo_tablespace参数只是为了试一下
*.undo_tablespace='SYSTEM'
故障现象仍然存在
4.2 修改undo段的参数
_offline_rollback_segments=(_SYSSMU6_3654194381$)
_corrupted_rollback_segments=(_SYSSMU6_3654194381$)
此两个参数的作用是不一样的,一个是将整个UNDO段认识是损坏的,一个是可以正常读取UNDO段头的。更多关于此参数的作用可以见 ROLLBACK SEGMENT CORRUPTION with _OFFLINE & _CORRUPTED_ROLLBACK_SEGMENTS options (Doc ID 152085.1) 通过上面的2个参数,故障现象仍然存在。
5,故障分析过程
通过上面简单的2步解决,故障仍然存在,所以需要进一步详细的分析一下报错的详细过程与原因
5.1 查看alert文件
下面查看一个alert中的日志内容,获取更多详细的信息
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.0021ed9c):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /oracle/app/oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_10655.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Errors in file /oracle/app/oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_10655.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 10655): terminating the instance due to error 704
从10655.trc日志中可以发现下面的信息
KCBR: Influx buffers flushed = 1 times
*** 2014-08-28 12:21:49.070
Completed Media Recovery
*** 2014-08-28 12:21:54.225
Prior to RESETLOGS processing...
ALTER SYSTEM ARCHIVE LOG ALL USING BACKUP CONTROLFILE start
Database is not in archivelog mode
ALTER SYSTEM ARCHIVE LOG ALL USING BACKUP CONTROLFILE complete
*** 2014-08-28 12:21:54.226 4320 krsh.c
Clearing online redo logfile 1 /oracle/app/oracle/oradata/database/cos11g/redo01.log
*** 2014-08-28 12:21:54.379 4320 krsh.c
Clearing online redo logfile 1 complete
*** 2014-08-28 12:21:54.380
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
*** 2014-08-28 12:21:54.458
USER (ospid: 10655): terminating the instance due to error 704
这么在报错的时候没有触发errorstack,日志中无任何有用的信息
5.2 配置event获取详细的日志内容
下面我们配置了10046与errorstack的event来获取详细的信息
www.htz.pw > oradebug setmypid
Statement processed.
www.htz.pw > oradebug event 1555 trace name errorstack level 12;
Statement processed.
www.htz.pw > oradebug event 10046 trace name context forever,level 12;
Statement processed.
www.htz.pw > recover database using backup controlfile until cancel;
ORA-00279: change 2223517 generated at 08/28/2014 12:26:58 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/fast_recovery_area/COS11G/archivelog/2014_08_28/o1_mf_1_1_%u_
.arc
ORA-00280: change 2223517 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1:
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf'
ORA-01112: media recovery not started
www.htz.pw > alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name
"_SYSSMU6_3654194381$" too small
Process ID: 12125
Session ID: 1 Serial number: 3
5.3 详细分析trace文件
下面详细的分析trace文件中的信息,找到ORA-01555报错的原因
5.3.1 获取trace文件中的块的信息
因为是ORA-01555的报错,所以这里首先看的是块的信息,能大概知道那个块导致的问题。通过块的信息,我们估计是由于0x00405144这个块导致报错,不过这里没有直接去看块的内容,继续从trace文件中按正常的分析思路走。
[root@www.htz.pw ~]#grep -E "^Block he|^0x0" /oracle/app/oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_12125.trc
Block header dump: 0x00405144
0x01 0x0002.015.0000009f 0x00c03a8d.001c.01 CB-- 0 scn 0x0000.0002a810
0x02 0x0006.014.00000442 0x00c001b8.010e.11 ---- 1 fsc 0x0000.00000000
Block header dump: 0x00400132
0x01 0x0006.006.00000056 0x00c01899.0013.01 CB-- 0 scn 0x0000.0001784d
0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
Block header dump: 0x0040012b
0x01 0x0000.050.00000007 0x00400083.000c.27 --U- 1 fsc 0x0000.000005c8
0x02 0x0000.045.00000007 0x00400081.000c.46 --U- 5 fsc 0x0000.000005ca
Block header dump: 0x00400181
0x01 0x000a.00d.00000270 0x00c00531.0078.02 C--- 0 scn 0x0000.000f8ac8
Block header dump: 0x00400182
0x01 0x0008.009.0000000e 0x00c00252.0005.01 CBU- 0 scn 0x0000.00004576
0x02 0x0008.002.0000000e 0x00c001f7.0004.0f C-U- 0 scn 0x0000.00004576
Block header dump: 0x00400095
0x01 0x0007.00c.00000271 0x00c00a98.0080.04 C--- 0 scn 0x0000.000fbb30
0x02 0x0007.00b.00000271 0x00c00df9.0083.1d --U- 11 fsc 0x0027.00105067
Block header dump: 0x004000a9
0x01 0x0004.001.00000266 0x00c15c4a.0066.02 C--- 0 scn 0x0000.000ec0d5
Block header dump: 0x004000aa
0x01 0x0004.01a.00000013 0x00c00450.000b.01 CBU- 0 scn 0x0000.00004f16
0x02 0x0007.00e.00000271 0x00c00a98.0080.0f --U- 1 fsc 0x0000.000fbb44
Block header dump: 0x00400159
0x01 0x0006.015.00000398 0x00c10fec.0085.02 C--- 0 scn 0x0000.000d8b9a
Block header dump: 0x0040b2eb
0x01 0x0006.009.000003cd 0x00c000de.0099.03 C--- 0 scn 0x0000.000f3adb
Block header dump: 0x00405144
0x01 0x0002.015.0000009f 0x00c03a8d.001c.01 CB-- 0 scn 0x0000.0002a810
0x02 0x0006.014.00000442 0x00c001b8.010e.11 ---- 1 fsc 0x0000.00000000
Block header dump: 0x00400141
0x01 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x02 0x0000.019.00000012 0x00400217.001a.06 --U- 1 fsc 0x0000.000f24ae
Block header dump: 0x004000e1
0x01 0x0000.018.00000014 0x00400220.001c.01 --U- 1 fsc 0x0000.00105ea7
5.3.2 获取当前会话的游标信息
因为是SQL语句触发的ORA-01555,所以这里直接去查询的当前会话的游标的信息,可以一步定位,由于版本不同,可能这一部分的信息会不一样。
******************************************************
----- Session Open Cursors -----
----------------------------------------
Cursor#2(0x2a97e11bb8) state=FETCH curiob=0x2a97e25ea0
curflg=a007 fl2=6200000 par=0x2a97e11c48 ses=0x8dc4bd60
----- Dump Cursor sql_id=3nkd3g3ju5ph1 xsc=0x2a97e25ea0 cur=0x2a97e11bb8 -----
LibraryHandle: Address=0x8e53caa0 Hash=e3a2d601 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
ObjectName: Name=select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
FullHashValue=cef0429a290691c83a49a378e3a2d601 Namespace=SQL AREA(00) Type=CURSOR(00) Identifier=3819099649 OwnerIdn=0
Statistics: InvalidationCount=0 ExecutionCount=7 LoadCount=2 ActiveLocks=1 TotalLockCount=1 TotalPinCount=1
Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=1 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0
Concurrency: DependencyMutex=0x8e53cb50(0, 1, 0, 0) Mutex=0x8e53cbd0(1, 24, 0, 6)
Flags=RON/PIN/TIM/PN0/DBN/[10012841]
WaitersLists:
Lock=0x8e53cb30[0x8e53cb30,0x8e53cb30]
Pin=0x8e53cb10[0x8e53cb10,0x8e53cb10]
LoadLock=0x8e53cb88[0x8e53cb88,0x8e53cb88]
Timestamp: Current=08-28-2014 12:30:54
HandleReference: Address=0x8e53cd08 Handle=(nil) Flags=[00]
LibraryObject: Address=0x89ac40b0 HeapMask=0000-0001-0001-0000 Flags=EXS[0000] Flags2=[0000] PublicFlags=[0000]
ChildTable: size='16'
Child: id='0' Table=0x89ac4f60 Reference=0x89ac4998 Handle=0x8e53c640
Children:
Child: childNum='0'
LibraryHandle: Address=0x8e53c640 Hash=0 LockMode=N PinMode=S LoadLockMode=0 Status=VALD
Name: Namespace=SQL AREA(00) Type=CURSOR(00)
Statistics: InvalidationCount=0 ExecutionCount=7 LoadCount=1 ActiveLocks=1 TotalLockCount=2 TotalPinCount=12
Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0
Concurrency: DependencyMutex=0x8e53c6f0(0, 0, 0, 0) Mutex=0x8e53cbd0(1, 24, 0, 6)
Flags=RON/PIN/PN0/EXP/CHD/[10012111]
WaitersLists:
Lock=0x8e53c6d0[0x8e53c6d0,0x8e53c6d0]
Pin=0x8e53c6b0[0x8e53c6b0,0x8e53c6b0]
LoadLock=0x8e53c728[0x8e53c728,0x8e53c728]
LibraryObject: Address=0x89ac30b0 HeapMask=0000-0001-0001-0000 Flags=EXS[0000] Flags2=[0000] PublicFlags=[0000]
DataBlocks:
Block: #='0' name=KGLH0^e3a2d601 pins=0 Change=NONE
Heap=0x8e53c588 Pointer=0x89ac3150 Extent=0x89ac3030 Flags=I/-/P/A/-/-
FreedLocation=0 Alloc=3.398438 Size=3.976562 LoadTime=4295814490
Block: #='6' name=SQLA^e3a2d601 pins=0 Change=NONE
Heap=0x89ac4768 Pointer=0x893b19b8 Extent=0x893b0d60 Flags=I/-/P/A/-/E
FreedLocation=0 Alloc=18.210938 Size=19.781250 LoadTime=0
NamespaceDump:
Child Cursor: Heap0=0x89ac3150 Heap6=0x893b19b8 Heap0 Load Time=08-28-2014 12:30:54 Heap6 Load Time=08-28-2014 12:30:54
NamespaceDump:
Parent Cursor: sql_id=3nkd3g3ju5ph1 parent=0x89ac4150 maxchild=1 plk=y ppn=n kkscs=0x89ac4628 nxt=(nil) flg=18 cld=0 hd=0x8e53c640 par=0x89ac4150
Mutex 0x89ac4628(0, 0) idn 3000000000
ct=0 hsh=0 unp=(nil) unn=0 hvl=89ac4ff8 nhv=0 ses=(nil)
hep=0x89ac46c0 flg=80 ld=1 ob=0x89ac30b0 ptr=0x893b19b8 fex=0x893b0d60
cursor instantiation=0x2a97e25ea0 used=1409200254 exec_id=16777222 exec=7
child#0(0x8e53c640) pcs=0x89ac4628
clk=0x8e5ae280 ci=0x89ac3150 pn=0x8e5af620 ctx=0x893b19b8
kgsccflg=9 llk[0x2a97e25ea8,0x2a97e25ea8] idx=2
xscflg=c03504f6 fl2=5040001 fl3=40222108 fl4=100
----- Bind Byte Code (IN) -----
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 0
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 32
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 64
----- Bind Info (kkscoacd) -----
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0001 frm=00 csi=00 siz=80 off=0
kxsbbbfp=2a98203d48 bln=22 avl=01 flg=05
value=0
Bind#1
oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
oacflg=10 fl2=0001 frm=01 csi=873 siz=0 off=24
kxsbbbfp=2a98203d60 bln=32 avl=10 flg=01
value="TAB_STATS$"
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0001 frm=00 csi=00 siz=0 off=56
kxsbbbfp=2a98203d80 bln=22 avl=02 flg=01
value=1
Frames pfr 0x2a97e9a578 siz=7208 efr 0x2a97e9a5d0 siz=7192
Cursor frame dump
enxt: 5.0x00000010 enxt: 4.0x00000bf8 enxt: 3.0x00000588 enxt: 2.0x00000060
enxt: 1.0x00000a38
pnxt: 1.0x00000010
在这一部分,我们发现了当前正在执行的SQL语句,绑定变量的信息。 其实当前SQL在,在trace文件最开始部分就有,这一点是所有版本通用的
----- Current SQL Statement for this session (sql_id=3nkd3g3ju5ph1) -----
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
sql_text_length=203
sql=select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
5.3.3 10046中分析访问的块
这里我们以之前发现的SQL语句在trace中搜10046event的输出信息,找到等待事件等信息,但是不知道为什么,在10046那一部分中,没有找到相应的SQL,可能是trace文件有那里有问题导致的,但是发现了绑定变量与等待事件的信息
BINDS #182936821408:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0001 frm=00 csi=00 siz=80 off=0
kxsbbbfp=2a98203d48 bln=22 avl=01 flg=05
value=0
Bind#1
oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
oacflg=10 fl2=0001 frm=01 csi=873 siz=0 off=24
kxsbbbfp=2a98203d60 bln=32 avl=10 flg=01
value="TAB_STATS$"
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0001 frm=00 csi=00 siz=0 off=56
kxsbbbfp=2a98203d80 bln=22 avl=02 flg=01
value=1
EXEC #182936821408:c=0,e=61,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1409200255032571
WAIT #182936821408: nam='db file sequential read' ela= 8 file#=1 block#=20804 blocks=1 obj#=37 tim=1409200255032603
=====================
PARSING IN CURSOR #182936803264 len=142 dep=2 uid=0 oct=3 lid=0 tim=1409200255032997 hv=361892850 ad='8e534360' sqlid='7bd391hat42zk'
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
END OF STMT
PARSE #182936803264:c=0,e=352,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,plh=0,tim=1409200255032996
BINDS #182936803264:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=2a98202538 bln=22 avl=02 flg=05
value=6
EXEC #182936803264:c=1000,e=507,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,plh=906473769,tim=1409200255033560
WAIT #182936803264: nam='db file sequential read' ela= 10 file#=1 block#=321 blocks=1 obj#=34 tim=1409200255033606
WAIT #182936803264: nam='db file sequential read' ela= 13 file#=1 block#=225 blocks=1 obj#=15 tim=1409200255033655
FETCH #182936803264:c=0,e=99,p=2,cr=2,cu=0,mis=0,r=1,dep=2,og=3,plh=906473769,tim=1409200255033676
STAT #182936803264 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=2 pw=0 time=96 us)'
STAT #182936803264 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=51 us)'
CLOSE #182936803264:c=0,e=4,dep=2,type=0,tim=1409200255033718
WAIT #182936821408: nam='db file sequential read' ela= 13 file#=3 block#=208 blocks=1 obj#=0 tim=1409200255033756
WAIT #182936821408: nam='db file sequential read' ela= 6 file#=3 block#=440 blocks=1 obj#=0 tim=1409200255033873
从上面信息我们可以看到182936821408执行的时候,去访问 file#=1 block#=20804 blocks=1 obj#=37 信息。下一条SQL语句就是去访问报错的UNDO段的信息,跟之前的报错信息也是一致的。 通过上面的信息,我们报到报错的SQL,绑定变量,访问的块的信息
5.3.4 查看块dump的信息
这里以1/20804来搜索,可以发现trace文件中已经dump了块的信息,这里也看到块的rdba地址,跟5.3.1那里怀疑的块是一致的。
BH (0x7cbf9548) file#: 1 rdba: 0x00405144 (1/20804) class: 1 ba: 0x7cb84000
set: 3 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
dbwrid: 0 obj: 37 objn: 37 tsn: 0 afn: 1 hint: f
hash: [0x8d7e0908,0x8d7e0908] lru: [0x7cbf9760,0x7cbf9500]
obj-flags: object_ckpt_list
ckptq: [0x8d19c920,0x8d19c920] fileq: [0x8d19c940,0x8d19c940] objq: [0x8a1ae3c8,0x8a1ae3c8] objaq: [0x7cbf98c8,0x8a1ae3a8]
use: [0x8d1633b0,0x8d1633b0] wait: [NULL]
st: XCURRENT md: SHR fpin: 'qeilwhnp: qeilbk' tch: 0
flags: buffer_dirty redo_since_read
LRBA: [0x1.3.0] LSCN: [0x0.21eda4] HSCN: [0x0.21eda4] HSUB: [1]
Using State Objects
----------------------------------------
SO: 0x8d163330, type: 38, owner: 0x8dc90ea0, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
proc=0x8e098608, name=buffer handle, file=kcb2.h LINE:2677, pg=0
(buffer) (CR) PR: 0x8e098608 FLG: 0x100000
class bit: 0x0
cr[0]:
sh[0]:
kcbbfbp: [BH: 0x7cbf9548, LINK: 0x8d1633b0]
type: normal pin
where: qeilwhnp: qeilbk, why: 54104
buffer tsn: 0 rdba: 0x00405144 (1/20804)
scn: 0x0000.0021eda4 seq: 0x01 flg: 0x00 tail: 0xeda40601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data
下面继续查看块的信息
Block header dump: 0x00405144
Object id on Block? Y
seg/obj: 0x25 csc: 0x00.21eda4 itc: 2 flg: - typ: 2 - INDEX
fsl: 0 fnx: 0x0 ver: 0x01
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0002.015.0000009f 0x00c03a8d.001c.01 CB-- 0 scn 0x0000.0002a810
0x02 0x0006.014.00000442 0x00c001b8.010e.11 ---- 1 fsc 0x0000.00000000
这里我们可以看到XID的信息,ITL 02上面有一行的数据正在被修改,使用的undo段为6,使用的uba地址为00c001b8,使用的插槽号为14,trap#为442,块为索引块。其实这里看到是索引块,如果正常情况下,我们重建索引就可以解决问题了。或者是不走索引也是可以的。下面是将uba中的地址转为数据文件与块号
www.htz.pw > @dba_to_fno_bno.sql
Enter value for dba: 00c001b8
FILE BLOCK
---------- ----------
3 440
下面继续查看块中行记录的信息
row#102[1300] flag: ------, lock: 2, len=32, data:(6): 00 41 42 1e 00 3d
col 0; len 1; (1): 80
col 1; len 5; (5): 54 45 53 54 32
col 2; len 2; (2): c1 02
col 3; NULL
col 4; NULL
col 5; NULL
col 6; len 2; (2): c1 03
col 7; len 1; (1): 80
col 8; len 4; (4): c3 08 45 2e
这里看到了索引中的行记录。 下面继续查看块中对象的信息
www.htz.pw > set echo off
Enter Search Object Id (i.e. 1235) : 37
Object Create Last_Ddl
OWNEROBJECT_NAME SUBOBJECT_NAME Type Time Time STATUS
----------------- ------------------------------ -------------------- -------------------- ----------
SYS I_OBJ2 INDEX 2013-08-24 11:37:35 2013-08-24 11:37:35 VALID
这里可以看到是索引的名字等详细的信息,
5.3.5 查看undo块与头的信息
从5.3.4中我们查看了undo块的信息,下面我们来看看undo的信息
BH (0x7cbf91b8) file#: 3 rdba: 0x00c000d0 (3/208) class: 27 ba: 0x7cb7e000
set: 3 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
dbwrid: 0 obj: -1 objn: 0 tsn: 2 afn: 3 hint: f
hash: [0x8d040fa0,0x8d040fa0] lru: [0x7cbf93d0,0x7cbf9170]
ckptq: [NULL] fileq: [NULL] objq: [0x8a1adeb8,0x8a1adeb8] objaq: [0x8a1adea8,0x8a1adea8]
st: XCURRENT md: NULL fpin: 'ktuwh05: ktugct' tch: 1
flags:
LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
buffer tsn: 2 rdba: 0x00c000d0 (3/208)
scn: 0x0000.0012a779 seq: 0x02 flg: 0x04 tail: 0xa7792602
frmt: 0x02 chkval: 0x33bf type: 0x26=KTU SMU HEADER BLOCK
07CB7FFF0 00000000 00000000 00000000 A7792602 [.............&y.]
Extent Control Header
-----------------------------------------------------------------
Extent Header:: spare1: 0 spare2: 0 #extents: 5 #blocks: 39
last map 0x00000000 #maps: 0 offset: 4080
Highwater:: 0x00c001a5 ext#: 1 blk#: 5 ext size: 8
#blocks in seg. hdr's freelists: 0
#blocks below: 0
mapblk 0x00000000 offset: 1
Unlocked
Map Header:: next 0x00000000 #extents: 5 obj#: 0 flag: 0x40000000
Extent Map
-----------------------------------------------------------------
0x00c000d1 length: 7
0x00c001a0 length: 8
0x00c001b8 length: 8
0x00c000d8 length: 8
0x00c00138 length: 8
Retention Table
-----------------------------------------------------------
Extent Number:0 Commit Time: 1409188509
Extent Number:1 Commit Time: 1409188509
Extent Number:2 Commit Time: 0
Extent Number:3 Commit Time: 1409188428
Extent Number:4 Commit Time: 1409188492
TRN CTL:: seq: 0x010d chd: 0x000d ctl: 0x0015 inc: 0x00000000 nfb: 0x0001
mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
uba: 0x00c001a5.010d.25 scn: 0x0000.0012a420
Version: 0x01
FREE BLOCK POOL::
uba: 0x00c001a5.010d.25 ext: 0x1 spc: 0xe10
uba: 0x00000000.0109.07 ext: 0x0 spc: 0x1a20
uba: 0x00000000.00d1.25 ext: 0x5 spc: 0x608
uba: 0x00000000.00b0.01 ext: 0x6 spc: 0x1ed0
uba: 0x00000000.00b0.01 ext: 0x6 spc: 0x1ed0
TRN TBL::
index state cflags wrap# uel scn dba parent-xid nub stmt_num cmt
------------------------------------------------------------------------------------------------
0x00 9 0x00 0x0442 0x001f 0x0000.0012a6b9 0x00c001a4 0x0000.000.00000000 0x00000001 0x00000000 1409188524
0x01 9 0x00 0x0442 0x0015 0x0000.0012a760 0x00c001a5 0x0000.000.00000000 0x00000001 0x00000000 1409188658
0x02 9 0x00 0x0441 0x0014 0x0000.0012a515 0x00c0013b 0x0000.000.00000000 0x00000003 0x00000000 1409188485
0x03 9 0x00 0x0441 0x000b 0x0000.0012a4b5 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188478
0x04 9 0x00 0x0441 0x0003 0x0000.0012a4a1 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188478
0x05 9 0x00 0x0441 0x0007 0x0000.0012a451 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188477
0x06 9 0x00 0x0442 0x0008 0x0000.0012a71f 0x00c001a5 0x0000.000.00000000 0x00000002 0x00000000 1409188606
0x07 9 0x00 0x0441 0x0010 0x0000.0012a477 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188478
0x08 9 0x00 0x0441 0x0001 0x0000.0012a73d 0x00c001a5 0x0000.000.00000000 0x00000001 0x00000000 1409188608
0x09 9 0x00 0x0441 0x0002 0x0000.0012a502 0x00c00139 0x0000.000.00000000 0x00000001 0x00000000 1409188485
0x0a 9 0x00 0x0442 0x0020 0x0000.0012a6de 0x00c001a4 0x0000.000.00000000 0x00000001 0x00000000 1409188545
0x0b 9 0x00 0x0441 0x0011 0x0000.0012a4cc 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188481
0x0c 9 0x00 0x0441 0x001a 0x0000.0012a5af 0x00c000d3 0x0000.000.00000000 0x00000006 0x00000000 1409188492
0x0d 9 0x00 0x0440 0x0005 0x0000.0012a43b 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188471
0x0e 9 0x00 0x0441 0x0013 0x0000.0012a552 0x00c0013b 0x0000.000.00000000 0x00000001 0x00000000 1409188486
0x0f 9 0x00 0x0441 0x0009 0x0000.0012a4f4 0x00c00139 0x0000.000.00000000 0x00000002 0x00000000 1409188485
0x10 9 0x00 0x0441 0x0004 0x0000.0012a48a 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188478
0x11 9 0x00 0x0441 0x000f 0x0000.0012a4e2 0x00c00138 0x0000.000.00000000 0x00000001 0x00000000 1409188485
0x12 9 0x00 0x0441 0x0000 0x0000.0012a6a9 0x00c001a4 0x0000.000.00000000 0x00000001 0x00000000 1409188523
0x13 9 0x00 0x0441 0x0016 0x0000.0012a56a 0x00c0013c 0x0000.000.00000000 0x00000002 0x00000000 1409188487
0x14 9 0x00 0x0441 0x0017 0x0000.0012a52a 0x00c0013b 0x0000.000.00000000 0x00000001 0x00000000 1409188485
0x15 9 0x00 0x0441 0xffff 0x0000.0012a779 0x00c001a5 0x0000.000.00000000 0x00000001 0x00000000 1409188665
0x16 9 0x00 0x0441 0x0019 0x0000.0012a57e 0x00c0013c 0x0000.000.00000000 0x00000001 0x00000000 1409188489
0x17 9 0x00 0x0441 0x000e 0x0000.0012a541 0x00c0013b 0x0000.000.00000000 0x00000001 0x00000000 1409188486
0x18 9 0x00 0x0441 0x001c 0x0000.0012a67e 0x00c001a1 0x0000.000.00000000 0x00000007 0x00000000 1409188509
0x19 9 0x00 0x0441 0x000c 0x0000.0012a59b 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1409188490
0x1a 9 0x00 0x0441 0x001b 0x0000.0012a5c5 0x00c000d3 0x0000.000.00000000 0x00000001 0x00000000 1409188494
0x1b 9 0x00 0x0441 0x0021 0x0000.0012a5eb 0x00c000d3 0x0000.000.00000000 0x00000001 0x00000000 1409188494
0x1c 9 0x00 0x0441 0x0012 0x0000.0012a689 0x00c001a4 0x0000.000.00000000 0x00000003 0x00000000 1409188509
0x1d 9 0x00 0x0441 0x001e 0x0000.0012a617 0x00c000d3 0x0000.000.00000000 0x00000001 0x00000000 1409188499
0x1e 9 0x00 0x0441 0x0018 0x0000.0012a632 0x00c000d3 0x0000.000.00000000 0x00000001 0x00000000 1409188507
0x1f 9 0x00 0x0441 0x000a 0x0000.0012a6cb 0x00c001a4 0x0000.000.00000000 0x00000001 0x00000000 1409188538
0x20 9 0x00 0x0441 0x0006 0x0000.0012a701 0x00c001a4 0x0000.000.00000000 0x00000001 0x00000000 1409188586
0x21 9 0x00 0x0440 0x001d 0x0000.0012a601 0x00c000d3 0x0000.000.00000000 0x00000001 0x00000000 1409188499
这里可以看到undo段头中index为14的事务状态是9(非活动的),wrap#值为0441,比之前的块中的442还小。所以此时的undo段头块比我们所需要的段头块还要旧。 下面继续查看undo块的信息
********************************************************************************
UNDO BLK:
xid: 0x0008.007.00000389 seq: 0xdb cnt: 0x4 irb: 0x4 icl: 0x0 flg: 0x0000
Rec Offset Rec Offset Rec Offset Rec Offset Rec Offset
---------------------------------------------------------------------------
0x01 0x1f38 0x02 0x1eb0 0x03 0x1e50 0x04 0x1dc8
*-----------------------------
* Rec #0x1 slt: 0x18 objn: 457(0x000001c9) objd: 457 tblspc: 0(0x00000000)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0x00c0014f.00da.3d ctl max scn: 0x0000.00126bb5 prv tx scn: 0x0000.00126bcd
txn start scn: scn: 0x0000.00126f93 logon user: 0
prev brb: 12583242 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L itl: xid: 0x0005.020.00000375 uba: 0x00c001ae.00ba.2e
flg: C--- lkc: 0 scn: 0x0000.00126b8b
KDO Op code: URP row dependencies Disabled
xtype: XAxtype KDO_KDOM2 flags: 0x00000080 bdba: 0x00400c21 hdba: 0x00400c18
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 173(0xad) flag: 0x2c lock: 0 ckix: 11
ncol: 9 nnew: 7 size: 0
Vector content:
col 2: [ 2] c1 07
col 3: [ 2] c1 04
col 4: [ 1] 80
col 5: [ 1] 80
col 6: [ 1] 80
col 7: [ 1] 80
col 8: [ 7] 78 72 08 1a 0c 01 10
*-----------------------------
* Rec #0x2 slt: 0x1b objn: 461(0x000001cd) objd: 461 tblspc: 0(0x00000000)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0x00c001b8.00db.01 ctl max scn: 0x0000.00126bcd prv tx scn: 0x0000.00126bd9
txn start scn: scn: 0x0000.00126f9d logon user: 0
prev brb: 12583244 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L itl: xid: 0x0001.00c.000002b9 uba: 0x00c0023a.00cc.36
flg: C--- lkc: 0 scn: 0x0000.00126f9b
KDO Op code: DRP row dependencies Disabled
xtype: XA flags: 0x00000000 bdba: 0x00400c72 hdba: 0x00400c70
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 171(0xab)
*-----------------------------
* Rec #0x3 slt: 0x1b objn: 462(0x000001ce) objd: 462 tblspc: 0(0x00000000)
* Layer: 10 (Index) opc: 22 rci 0x02
Undo type: Regular undo Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
index undo for leaf key operations
KTB Redo
op: 0x04 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L itl: xid: 0x0009.006.00000394 uba: 0x00c00155.00e6.29
flg: C--- lkc: 0 scn: 0x0000.00126f9c
Dump kdilk : itl=2, kdxlkflg=0x1 sdc=32655 indexid=0x400c90 block=0x00400c91
(kdxlpu): purge leaf row
key :(5): 04 c3 08 13 29
*-----------------------------
* Rec #0x4 slt: 0x07 objn: 71834(0x0001189a) objd: 71834 tblspc: 1(0x00000001)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0x00c001b8.00db.02 ctl max scn: 0x0000.00126bd9 prv tx scn: 0x0000.00126c09
txn start scn: scn: 0x0000.00126ec1 logon user: 71
prev brb: 12583244 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L itl: xid: 0x000a.009.000002d3 uba: 0x00c00356.009a.0c
flg: C--- lkc: 0 scn: 0x0000.0012666e
KDO Op code: DRP row dependencies Disabled
xtype: XA flags: 0x00000000 bdba: 0x00810933 hdba: 0x00810932
itli: 1 ispac: 0 maxfr: 4858
tabn: 0 slot: 1(0x1)
在undo数据块中,也没有查看我们所需要的事务的UNDO记录数
5.3.6 分析结果
由于undo中没有包含事务所需要的undo记录,导致事务rollback的时候,触发了ORA-01555报错。这里知道原因后,解决问题就很简单了
5.3.7 bbed验证一下块中数据
[oracle@www.htz.pw trace]$bbed
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Thu Aug 28 12:39:42 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/oracle/app/oracle/oradata/database/cos11g/system01.dbf';
FILENAME /oracle/app/oracle/oradata/database/cos11g/system01.dbf
BBED> set block 20804
BLOCK# 20804
BBED> map
File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
Block: 20804 Dba:0x00000000
------------------------------------------------------------
KTB Data Block (Index Leaf)
struct kcbh, 20 bytes @0
struct ktbbh, 72 bytes @20
struct kdxle, 32 bytes @92
sb2 kd_off[156] @124
ub1 freespace[952] @436
ub1 rowdata[6732] @1388
ub4 tailchk @8188
BBED> p ktbbh
struct ktbbh, 72 bytes @20
ub1 ktbbhtyp @20 0x02 (KDDBTINDEX)
union ktbbhsid, 4 bytes @24
ub4 ktbbhsg1 @24 0x00000025
ub4 ktbbhod1 @24 0x00000025
struct ktbbhcsc, 8 bytes @28
ub4 kscnbas @28 0x0012a91c
ub2 kscnwrp @32 0x0000
sb2 ktbbhict @36 7938
ub1 ktbbhflg @38 0x02 (NONE)
ub1 ktbbhfsl @39 0x00
ub4 ktbbhfnx @40 0x00000000
struct ktbbhitl[0], 24 bytes @44
struct ktbitxid, 8 bytes @44
ub2 kxidusn @44 0x0002
ub2 kxidslt @46 0x0015
ub4 kxidsqn @48 0x0000009f
struct ktbituba, 8 bytes @52
ub4 kubadba @52 0x00c03a8d
ub2 kubaseq @56 0x001c
ub1 kubarec @58 0x01
ub2 ktbitflg @60 0xc000 (KTBFIBI, KTBFCOM)
union _ktbitun, 2 bytes @62
sb2 _ktbitfsc @62 0
ub2 _ktbitwrp @62 0x0000
ub4 ktbitbas @64 0x0002a810
struct ktbbhitl[1], 24 bytes @68
struct ktbitxid, 8 bytes @68
ub2 kxidusn @68 0x0006
ub2 kxidslt @70 0x0014
ub4 kxidsqn @72 0x00000442
struct ktbituba, 8 bytes @76
ub4 kubadba @76 0x00c001b8
ub2 kubaseq @80 0x010e
ub1 kubarec @82 0x11
ub2 ktbitflg @84 0x0001 (NONE)
union _ktbitun, 2 bytes @86
sb2 _ktbitfsc @86 0
ub2 _ktbitwrp @86 0x0000
ub4 ktbitbas @88 0x00000000
BBED> x /rncnn *kd_off[104]
rowdata[4] @1392
----------
flag@1392: 0x00 (NONE)
lock@1393: 0x02
keydata[6]: 0x00 0x41 0x42 0x1e 0x00 0x3d
data key:
col 0[1] @1401: 0
col 1[5] @1403: TEST2
col 2[2] @1409: 1
col 3[0] @1412: *NULL*
col 4[0] @1413: *NULL*
col 5[0] @1414: *NULL*
col 6[2] @1415: 2
col 7[1] @1418: 0
col 8[4] @1420: 76845
6 故障处理过程
此故障处理的方法一般有2种
-
走全表扫描 -
手动提交事务信息
6.1 走全表扫描
由于这里是oracle2进制中的sql触发的报错,所以要走全表扫描,需要修改oracle2进制文件,见曾经的笔记ORA-08103,使用ue修改oracle2进制文件来完美解决,连接ue修改oracle文件
6.2 手动提交事务
相信这种情况下,大家一般会选择bbed的方法来解决,因为这种方法更简单,但是这种方法风险更高。如果要修改oracle2进制文件,需要在sql中增加full提示或者修改where后面列的信息,上面sql中引用的列都是字符集,增加更改起来比较麻烦,不仅需要改SQL内容,还需要更改其它地方,不然会open的时候会触发ORA-07445报错。
BBED> x /rncnn *kd_off[104]
rowdata[4] @1392
----------
flag@1392: 0x00 (NONE)
lock@1393: 0x02
keydata[6]: 0x00 0x41 0x42 0x1e 0x00 0x3d
data key:
col 0[1] @1401: 0
col 1[5] @1403: TEST2
col 2[2] @1409: 1
col 3[0] @1412: *NULL*
col 4[0] @1413: *NULL*
col 5[0] @1414: *NULL*
col 6[2] @1415: 2
col 7[1] @1418: 0
col 8[4] @1420: 76845
BBED> set offset 1393
OFFSET 1393
BBED> set count 10
COUNT 10
BBED> set mode edit
MODE Edit
BBED> dump
File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
Block: 20804 Offsets: 1393 to 1402 Dba:0x00000000
------------------------------------------------------------------------
02004142 1e003d01 8005
<32 bytes per line>
BBED> modify /x 00 offset 1393
File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
Block: 20804 Offsets: 1393 to 1402 Dba:0x00000000
------------------------------------------------------------------------
00004142 1e003d01 8005
<32 bytes per line>
BBED> verify
DBVERIFY - Verification starting
FILE = /oracle/app/oracle/oradata/database/cos11g/system01.dbf
BLOCK = 20804
Block Checking: DBA = 4215108, Block Type = KTB-managed data block
**** actual rows locked by itl 2 = 0 != # in trans. header = 1
---- end index block validation
Block 20804 failed with check code 6401
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing (Index): 1
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0
Message 531 not found; product=RDBMS; facility=BBED
BBED> p ktbbh
struct ktbbh, 72 bytes @20
ub1 ktbbhtyp @20 0x02 (KDDBTINDEX)
union ktbbhsid, 4 bytes @24
ub4 ktbbhsg1 @24 0x00000025
ub4 ktbbhod1 @24 0x00000025
struct ktbbhcsc, 8 bytes @28
ub4 kscnbas @28 0x0012a91c
ub2 kscnwrp @32 0x0000
sb2 ktbbhict @36 7938
ub1 ktbbhflg @38 0x02 (NONE)
ub1 ktbbhfsl @39 0x00
ub4 ktbbhfnx @40 0x00000000
struct ktbbhitl[0], 24 bytes @44
struct ktbitxid, 8 bytes @44
ub2 kxidusn @44 0x0002
ub2 kxidslt @46 0x0015
ub4 kxidsqn @48 0x0000009f
struct ktbituba, 8 bytes @52
ub4 kubadba @52 0x00c03a8d
ub2 kubaseq @56 0x001c
ub1 kubarec @58 0x01
ub2 ktbitflg @60 0xc000 (KTBFIBI, KTBFCOM)
union _ktbitun, 2 bytes @62
sb2 _ktbitfsc @62 0
ub2 _ktbitwrp @62 0x0000
ub4 ktbitbas @64 0x0002a810
struct ktbbhitl[1], 24 bytes @68
struct ktbitxid, 8 bytes @68
ub2 kxidusn @68 0x0006
ub2 kxidslt @70 0x0014
ub4 kxidsqn @72 0x00000442
struct ktbituba, 8 bytes @76
ub4 kubadba @76 0x00c001b8
ub2 kubaseq @80 0x010e
ub1 kubarec @82 0x11
ub2 ktbitflg @84 0x0001 (NONE)
union _ktbitun, 2 bytes @86
sb2 _ktbitfsc @86 0
ub2 _ktbitwrp @86 0x0000
ub4 ktbitbas @88 0x00000000
BBED> modify /x offset 84
BBED-00209: invalid number (offset)
BBED> modify /x 0080 offset 84
File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
Block: 20804 Offsets: 84 to 93 Dba:0x00000000
------------------------------------------------------------------------
00800000 00000000 0000
<32 bytes per line>
BBED> p ktbbh
struct ktbbh, 72 bytes @20
ub1 ktbbhtyp @20 0x02 (KDDBTINDEX)
union ktbbhsid, 4 bytes @24
ub4 ktbbhsg1 @24 0x00000025
ub4 ktbbhod1 @24 0x00000025
struct ktbbhcsc, 8 bytes @28
ub4 kscnbas @28 0x0012a91c
ub2 kscnwrp @32 0x0000
sb2 ktbbhict @36 7938
ub1 ktbbhflg @38 0x02 (NONE)
ub1 ktbbhfsl @39 0x00
ub4 ktbbhfnx @40 0x00000000
struct ktbbhitl[0], 24 bytes @44
struct ktbitxid, 8 bytes @44
ub2 kxidusn @44 0x0002
ub2 kxidslt @46 0x0015
ub4 kxidsqn @48 0x0000009f
struct ktbituba, 8 bytes @52
ub4 kubadba @52 0x00c03a8d
ub2 kubaseq @56 0x001c
ub1 kubarec @58 0x01
ub2 ktbitflg @60 0xc000 (KTBFIBI, KTBFCOM)
union _ktbitun, 2 bytes @62
sb2 _ktbitfsc @62 0
ub2 _ktbitwrp @62 0x0000
ub4 ktbitbas @64 0x0002a810
struct ktbbhitl[1], 24 bytes @68
struct ktbitxid, 8 bytes @68
ub2 kxidusn @68 0x0006
ub2 kxidslt @70 0x0014
ub4 kxidsqn @72 0x00000442
struct ktbituba, 8 bytes @76
ub4 kubadba @76 0x00c001b8
ub2 kubaseq @80 0x010e
ub1 kubarec @82 0x11
ub2 ktbitflg @84 0x8000 (KTBFCOM)
union _ktbitun, 2 bytes @86
sb2 _ktbitfsc @86 0
ub2 _ktbitwrp @86 0x0000
ub4 ktbitbas @88 0x00000000
BBED> verify
DBVERIFY - Verification starting
FILE = /oracle/app/oracle/oradata/database/cos11g/system01.dbf
BLOCK = 20804
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0
Message 531 not found; product=RDBMS; facility=BBED
7,数据库打开
ww.htz.pw > select open_mode from v$database;
OPEN_MODE
--------------------
MOUNTED
www.htz.pw > recover database using backup controlfile until cancel;
ORA-00279: change 2223521 generated at 08/28/2014 12:30:54 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/fast_recovery_area/COS11G/archivelog/2014_08_28/o1_mf_1_1_%u_
.arc
ORA-00280: change 2223521 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1:
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf'
ORA-01112: media recovery not started
www.htz.pw > alter database open resetlogs;
Database altered.
这里看到数据库已经打开了,下面我们需要增加TEMP文件,观察ALERT中是否有报错,如果没有报错,取消参数,以正常的方式打开数据库。如果alert中没有任何报错,一般情况下,取消参数是可以正常打开数据库的,当然因为数据库采用特殊方式打开,所以建议将数据库重建。
8,后续分享
本次我们分享的是bbed提交内容,下一次我们再分享通过修改oracle二进制文件的方式来修改此故障。
作者介绍