在上一篇文章中，我们介绍了数据库在启动时触发ORA-00704、ORA-00604、ORA-01555报错时，怎么通过Trace文件定位触发报错的sql的执行计划、等待事件及访问的数据块。这边文章继续上篇文章内容，讲讲在数据库启动过程中触发ORA-00704、ORA-00604、ORA-01555报错时，怎么去分析、定位原因、解决问题。报错的信息如下：

www.htz.pw > alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Process ID: 9202
Session ID: 1 Serial number: 3

4个ORA报错，怎么找到那个报错是原因呢？规则就是从最后一个报错往前面看，ORA-01555是整个报错的原因，其它ORA都是因为01555报错而引起的现象。这个报错的说明就是：数据库遇到回滚段过程过小，导致SQL执行失败，引起数据库启动初始化失败，数据库被异常关闭。所以在分析时，我们首先分析产生ORA-01555的原因。

下面简单回顾一下ORA-01555的报错。

ORA-01555是快照过久而触发的报错。其实可以认为是执行SQL时，需要访问历史版本的数据，此时利用UNDO来回滚数据块，在UNDO段中无找到需要的回滚信息而触发报错。

因Oracle 利用undo记录回滚有两种机制：

大对象段大对象段利用数据块内的历史镜像来实现回滚，同PG的回滚机制类似。
非大对象段非大对象段利用undo段来实现回滚，在Oracle中，undo段又分为系统表空间中的undo段和undo表空间中的undo段，所以针对系统表空间中的系统对象所在块的回滚会使用系统undo段，针对普通用户使用undo表空间的undo段。

回滚信息无法找到的原因有很多，这里举例简单说明几个现象：

undo段头的scn值异常，导致根据数据块xid找到的undo段不满足要求。
undo条目不存在，在undo段中找不到条目的信息。
undo段本身不存在
undo段有坏块。

那是否undo回滚记录找不到，就一定会触发ORA-01555的报错呢？这个是不一定的。

今天重点是分析数据库在启动过程中重发的ORA-01555的报错，这个ORA-01555的报错一般是因为系统对象所在的块需要做回滚，但是在SYSTEM的undo段中无法找到回滚记录而触发的报错。

什么条件下会触发呢？

SQL执行时遇到未提交的事务，此时需要回滚事务，就会使用到undo的信息。
SQL执行时需要读取前镜像的数据，此时需要回滚数据块，就会使用到undo的信息。

上面两种情况对应的解决方案也各不一样。

本次案例主要是分享有未提交的事务，导致数据库在启动过程中触发ORA-01555的报错。这个案例也是来至于一个网友，数据库无法打开，希望能帮助他把数据库打开。因为数据库比较小，所以网友是通过云盘的方式共享所有的数据文件，我在自己的环境中操作的。

1，平台与版本

www.htz.pw > select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

www.htz.pw > !lsb_release -a
LSB Version:    :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch
Distributor ID: RedHatEnterpriseAS
Description:    Red Hat Enterprise Linux AS release 4 (Nahant Update 8)
Release:        4
Codename:       NahantUpdate8

2，修改数据文件位置

两个环境的路径不一致，需要修改数据文件的位置和一些参数。

[oracle@www.htz.pw temp]$cp database.zip  /oracle/app/oracle/oradata/
[oracle@www.htz.pw temp]$cd !$
cd /oracle/app/oracle/oradata/
[oracle@www.htz.pw oradata]$unzip database.zip

修改后的参数文件如下：

[oracle@www.htz.pw oradata]$cat /tmp/123.ora
cos11g.__db_cache_size=201326592
cos11g.__java_pool_size=4194304
cos11g.__large_pool_size=4194304
cos11g.__oracle_base='/oracle/app/oracle'#ORACLE_BASE set from environment
cos11g.__pga_aggregate_target=272629760
cos11g.__sga_target=511705088
cos11g.__shared_io_pool_size=0
cos11g.__shared_pool_size=289406976
cos11g.__streams_pool_size=4194304
*.audit_file_dest='/oracle/app/oracle/admin/cos11g/adump'
*.audit_trail='NONE'
*.compatible='11.2.0.0.0'
*.control_files='/oracle/app/oracle/oradata/database/cos11g/control01.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='cos11g'
*.db_recovery_file_dest='/oracle/app/oracle/fast_recovery_area'
*.db_recovery_file_dest_size=4322230272
*.diagnostic_dest='/oracle/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=cos11gXDB)'
*.memory_target=783286272
*.open_cursors=300
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
*.undo_management='manual'

通过重建控制文件来修改数据文件的位置，修改数据文件位置方法很多，如alter database rename file的方式都可以的。

[oracle@www.htz.pw ~]$cd rs/sql

使用create_controlfile_sql.sh脚本来自动生成控制文件的内容，脚本就是调用alter database backup controflile to trace as '/tmp/control.ctl'；

[oracle@www.htz.pw sql]$sh ./create_controlfile_sql.sh 
please input direcotry default /tmp:
please input file name default control.ctl:
Database altered.
www.htz.pw > create spfile from pfile='/tmp/123.ora';
File created.
www.htz.pw > shutdown abort;
ORACLE instance shut down.
www.htz.pw > @/tmp/control.ctl
ORACLE instance started.

Total System Global Area  780824576 bytes
Fixed Size                  2232432 bytes
Variable Size             574623632 bytes
Database Buffers          201326592 bytes
Redo Buffers                2641920 bytes

控制文件的内容：

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "COS11G" NORESETLOGS  NOARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/oracle/app/oracle/oradata/database/cos11g/redo01.log'  SIZE 50M BLOCKSIZE 512,
GROUP 2 '/oracle/app/oracle/oradata/database/cos11g/redo02.log'  SIZE 50M BLOCKSIZE 512,
GROUP 3 '/oracle/app/oracle/oradata/database/cos11g/redo03.log'  SIZE 50M BLOCKSIZE 512
DATAFILE
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/sysaux01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/undotbs01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/users01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/example01.dbf'
CHARACTER SET AL32UTF8
;
RECOVER DATABASE
ALTER DATABASE OPEN;

www.htz.pw > shutdown abort;
ORACLE instance shut down.
www.htz.pw > @/tmp/control.ctl
ORACLE instance started.

Total System Global Area  780824576 bytes
Fixed Size                  2232432 bytes
Variable Size             574623632 bytes
Database Buffers          201326592 bytes
Redo Buffers                2641920 bytes
CREATE CONTROLFILE REUSE DATABASE "COS11G" NORESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00367: checksum error in log file header
ORA-01517: log member: '/oracle/app/oracle/oradata/database/cos11g/redo01.log'

这里看到报日志文件的checksum错误，因为原来的日志文件已经被重建，所以这里需要将控制文件的NORESETLOGS更改为RESETLOGS就可以了

ORA-01507: database not mounted


ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01507: database not mounted

更改控制文件

www.htz.pw > !vi /tmp/control.ctl
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "COS11G" RESETLOGS  NOARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/oracle/app/oracle/oradata/database/cos11g/redo01.log'  SIZE 50M BLOCKSIZE 512,
GROUP 2 '/oracle/app/oracle/oradata/database/cos11g/redo02.log'  SIZE 50M BLOCKSIZE 512,
GROUP 3 '/oracle/app/oracle/oradata/database/cos11g/redo03.log'  SIZE 50M BLOCKSIZE 512
DATAFILE
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/sysaux01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/undotbs01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/users01.dbf',
'/oracle/app/oracle/oradata/database/cos11g/example01.dbf'
CHARACTER SET AL32UTF8
;
RECOVER DATABASE
ALTER DATABASE OPEN;
www.htz.pw > shutdown abort;
ORACLE instance shut down.
www.htz.pw > @/tmp/control.ctl
ORACLE instance started.

Total System Global Area  780824576 bytes
Fixed Size                  2232432 bytes
Variable Size             574623632 bytes
Database Buffers          201326592 bytes
Redo Buffers                2641920 bytes

Control file created.

ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open

3，开始正常的恢复

www.htz.pw > select open_mode from v$database;

OPEN_MODE
--------------------
MOUNTED

www.htz.pw > recover database using backup controlfile until cancel;
ORA-00279: change 2223501 generated at 08/28/2014 11:22:42 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/fast_recovery_area/COS11G/archivelog/2014_08_28/o1_mf_1_1_%u_
.arc
ORA-00280: change 2223501 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1:
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf'


ORA-01112: media recovery not started


www.htz.pw > alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Process ID: 9202
Session ID: 1 Serial number: 3

这里我们看到了触发了ORA-00704,ORA-00604,ORA-01555的报错，原因是由于oracle2进制中的一条sql语句执行，执行了ORA-01555报错，ORA-01555报错常见原因如下：

1，延迟块清除，这种情况我们通过增加SCN就可以解决。 2，事务回滚，这种情况我们通过BBED来处理就可以了。

此故障在故障中，是很常见的，但是有些特殊情况下，恢复起来不是那么容易，曾经遇到过一个数据库，bbed修改了2个小时的块，最后才将数据库打开

4，初步故障处理过程

4.1 修改undo_tablespace参数

这种修改undo_tablespace参数只是为了试一下

*.undo_tablespace='SYSTEM'

故障现象仍然存在

4.2 修改undo段的参数

_offline_rollback_segments=(_SYSSMU6_3654194381$)
_corrupted_rollback_segments=(_SYSSMU6_3654194381$)

此两个参数的作用是不一样的，一个是将整个UNDO段认识是损坏的，一个是可以正常读取UNDO段头的。更多关于此参数的作用可以见 ROLLBACK SEGMENT CORRUPTION with _OFFLINE & _CORRUPTED_ROLLBACK_SEGMENTS options (Doc ID 152085.1) 通过上面的2个参数，故障现象仍然存在。

5，故障分析过程

通过上面简单的2步解决，故障仍然存在，所以需要进一步详细的分析一下报错的详细过程与原因

5.1 查看alert文件

下面查看一个alert中的日志内容，获取更多详细的信息

SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.0021ed9c):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /oracle/app/oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_10655.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Errors in file /oracle/app/oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_10655.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 10655): terminating the instance due to error 704

从10655.trc日志中可以发现下面的信息

KCBR: Influx buffers flushed = 1 times

*** 2014-08-28 12:21:49.070
Completed Media Recovery

*** 2014-08-28 12:21:54.225
Prior to RESETLOGS processing...
ALTER SYSTEM ARCHIVE LOG ALL USING BACKUP CONTROLFILE start
Database is not in archivelog mode
ALTER SYSTEM ARCHIVE LOG ALL USING BACKUP CONTROLFILE complete
*** 2014-08-28 12:21:54.226 4320 krsh.c
Clearing online redo logfile 1 /oracle/app/oracle/oradata/database/cos11g/redo01.log
*** 2014-08-28 12:21:54.379 4320 krsh.c
Clearing online redo logfile 1 complete

*** 2014-08-28 12:21:54.380
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_3654194381$" too small

*** 2014-08-28 12:21:54.458
USER (ospid: 10655): terminating the instance due to error 704

这么在报错的时候没有触发errorstack，日志中无任何有用的信息

5.2 配置event获取详细的日志内容

下面我们配置了10046与errorstack的event来获取详细的信息

www.htz.pw > oradebug setmypid
Statement processed.
www.htz.pw > oradebug event 1555 trace name errorstack level 12;
Statement processed.
www.htz.pw > oradebug event 10046 trace name context forever,level 12;
Statement processed.
www.htz.pw > recover database using backup controlfile until cancel;
ORA-00279: change 2223517 generated at 08/28/2014 12:26:58 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/fast_recovery_area/COS11G/archivelog/2014_08_28/o1_mf_1_1_%u_
.arc
ORA-00280: change 2223517 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1:
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf'


ORA-01112: media recovery not started


www.htz.pw > alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 6 with name
"_SYSSMU6_3654194381$" too small
Process ID: 12125
Session ID: 1 Serial number: 3

5.3 详细分析trace文件

下面详细的分析trace文件中的信息，找到ORA-01555报错的原因

5.3.1 获取trace文件中的块的信息

因为是ORA-01555的报错，所以这里首先看的是块的信息，能大概知道那个块导致的问题。通过块的信息，我们估计是由于0x00405144这个块导致报错，不过这里没有直接去看块的内容，继续从trace文件中按正常的分析思路走。

[root@www.htz.pw ~]#grep -E "^Block he|^0x0"   /oracle/app/oracle/diag/rdbms/cos11g/cos11g/trace/cos11g_ora_12125.trc
Block header dump:  0x00405144
0x01   0x0002.015.0000009f  0x00c03a8d.001c.01  CB--    0  scn 0x0000.0002a810
0x02   0x0006.014.00000442  0x00c001b8.010e.11  ----    1  fsc 0x0000.00000000
Block header dump:  0x00400132
0x01   0x0006.006.00000056  0x00c01899.0013.01  CB--    0  scn 0x0000.0001784d
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
Block header dump:  0x0040012b
0x01   0x0000.050.00000007  0x00400083.000c.27  --U-    1  fsc 0x0000.000005c8
0x02   0x0000.045.00000007  0x00400081.000c.46  --U-    5  fsc 0x0000.000005ca
Block header dump:  0x00400181
0x01   0x000a.00d.00000270  0x00c00531.0078.02  C---    0  scn 0x0000.000f8ac8
Block header dump:  0x00400182
0x01   0x0008.009.0000000e  0x00c00252.0005.01  CBU-    0  scn 0x0000.00004576
0x02   0x0008.002.0000000e  0x00c001f7.0004.0f  C-U-    0  scn 0x0000.00004576
Block header dump:  0x00400095
0x01   0x0007.00c.00000271  0x00c00a98.0080.04  C---    0  scn 0x0000.000fbb30
0x02   0x0007.00b.00000271  0x00c00df9.0083.1d  --U-   11  fsc 0x0027.00105067
Block header dump:  0x004000a9
0x01   0x0004.001.00000266  0x00c15c4a.0066.02  C---    0  scn 0x0000.000ec0d5
Block header dump:  0x004000aa
0x01   0x0004.01a.00000013  0x00c00450.000b.01  CBU-    0  scn 0x0000.00004f16
0x02   0x0007.00e.00000271  0x00c00a98.0080.0f  --U-    1  fsc 0x0000.000fbb44
Block header dump:  0x00400159
0x01   0x0006.015.00000398  0x00c10fec.0085.02  C---    0  scn 0x0000.000d8b9a
Block header dump:  0x0040b2eb
0x01   0x0006.009.000003cd  0x00c000de.0099.03  C---    0  scn 0x0000.000f3adb
Block header dump:  0x00405144
0x01   0x0002.015.0000009f  0x00c03a8d.001c.01  CB--    0  scn 0x0000.0002a810
0x02   0x0006.014.00000442  0x00c001b8.010e.11  ----    1  fsc 0x0000.00000000
Block header dump:  0x00400141
0x01   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   0x0000.019.00000012  0x00400217.001a.06  --U-    1  fsc 0x0000.000f24ae
Block header dump:  0x004000e1
0x01   0x0000.018.00000014  0x00400220.001c.01  --U-    1  fsc 0x0000.00105ea7

5.3.2 获取当前会话的游标信息

因为是SQL语句触发的ORA-01555，所以这里直接去查询的当前会话的游标的信息，可以一步定位，由于版本不同，可能这一部分的信息会不一样。

******************************************************
----- Session Open Cursors -----

----------------------------------------
Cursor#2(0x2a97e11bb8) state=FETCH curiob=0x2a97e25ea0
 curflg=a007 fl2=6200000 par=0x2a97e11c48 ses=0x8dc4bd60
----- Dump Cursor sql_id=3nkd3g3ju5ph1 xsc=0x2a97e25ea0 cur=0x2a97e11bb8 -----

LibraryHandle:  Address=0x8e53caa0 Hash=e3a2d601 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
  ObjectName:  Name=select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
    FullHashValue=cef0429a290691c83a49a378e3a2d601 Namespace=SQL AREA(00) Type=CURSOR(00) Identifier=3819099649 OwnerIdn=0
  Statistics:  InvalidationCount=0 ExecutionCount=7 LoadCount=2 ActiveLocks=1 TotalLockCount=1 TotalPinCount=1
  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=1 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0
  Concurrency:  DependencyMutex=0x8e53cb50(0, 1, 0, 0) Mutex=0x8e53cbd0(1, 24, 0, 6)
  Flags=RON/PIN/TIM/PN0/DBN/[10012841]
  WaitersLists:
    Lock=0x8e53cb30[0x8e53cb30,0x8e53cb30]
    Pin=0x8e53cb10[0x8e53cb10,0x8e53cb10]
    LoadLock=0x8e53cb88[0x8e53cb88,0x8e53cb88]
  Timestamp:  Current=08-28-2014 12:30:54
  HandleReference:  Address=0x8e53cd08 Handle=(nil) Flags=[00]
  LibraryObject:  Address=0x89ac40b0 HeapMask=0000-0001-0001-0000 Flags=EXS[0000] Flags2=[0000] PublicFlags=[0000]
    ChildTable:  size='16'
      Child:  id='0' Table=0x89ac4f60 Reference=0x89ac4998 Handle=0x8e53c640
    Children:
      Child:  childNum='0'
        LibraryHandle:  Address=0x8e53c640 Hash=0 LockMode=N PinMode=S LoadLockMode=0 Status=VALD
          Name:  Namespace=SQL AREA(00) Type=CURSOR(00)
          Statistics:  InvalidationCount=0 ExecutionCount=7 LoadCount=1 ActiveLocks=1 TotalLockCount=2 TotalPinCount=12
          Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0
          Concurrency:  DependencyMutex=0x8e53c6f0(0, 0, 0, 0) Mutex=0x8e53cbd0(1, 24, 0, 6)
          Flags=RON/PIN/PN0/EXP/CHD/[10012111]
          WaitersLists:
            Lock=0x8e53c6d0[0x8e53c6d0,0x8e53c6d0]
            Pin=0x8e53c6b0[0x8e53c6b0,0x8e53c6b0]
            LoadLock=0x8e53c728[0x8e53c728,0x8e53c728]
          LibraryObject:  Address=0x89ac30b0 HeapMask=0000-0001-0001-0000 Flags=EXS[0000] Flags2=[0000] PublicFlags=[0000]
            DataBlocks: 
              Block:  #='0' name=KGLH0^e3a2d601 pins=0 Change=NONE  
                Heap=0x8e53c588 Pointer=0x89ac3150 Extent=0x89ac3030 Flags=I/-/P/A/-/-
                FreedLocation=0 Alloc=3.398438 Size=3.976562 LoadTime=4295814490
              Block:  #='6' name=SQLA^e3a2d601 pins=0 Change=NONE  
                Heap=0x89ac4768 Pointer=0x893b19b8 Extent=0x893b0d60 Flags=I/-/P/A/-/E
                FreedLocation=0 Alloc=18.210938 Size=19.781250 LoadTime=0
          NamespaceDump: 
            Child Cursor:  Heap0=0x89ac3150 Heap6=0x893b19b8 Heap0 Load Time=08-28-2014 12:30:54 Heap6 Load Time=08-28-2014 12:30:54
  NamespaceDump: 
    Parent Cursor:  sql_id=3nkd3g3ju5ph1 parent=0x89ac4150 maxchild=1 plk=y ppn=n    kkscs=0x89ac4628 nxt=(nil) flg=18 cld=0 hd=0x8e53c640 par=0x89ac4150
   Mutex 0x89ac4628(0, 0) idn 3000000000
   ct=0 hsh=0 unp=(nil) unn=0 hvl=89ac4ff8 nhv=0 ses=(nil)
   hep=0x89ac46c0 flg=80 ld=1 ob=0x89ac30b0 ptr=0x893b19b8 fex=0x893b0d60
cursor instantiation=0x2a97e25ea0 used=1409200254 exec_id=16777222 exec=7
 child#0(0x8e53c640) pcs=0x89ac4628
  clk=0x8e5ae280 ci=0x89ac3150 pn=0x8e5af620 ctx=0x893b19b8
 kgsccflg=9 llk[0x2a97e25ea8,0x2a97e25ea8] idx=2
 xscflg=c03504f6 fl2=5040001 fl3=40222108 fl4=100
----- Bind Byte Code (IN) -----
  Opcode = 1   Unoptimized
  Offsi = 48, Offsi = 0
  Opcode = 1   Unoptimized
  Offsi = 48, Offsi = 32
  Opcode = 1   Unoptimized
  Offsi = 48, Offsi = 64
----- Bind Info (kkscoacd) -----
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=80 off=0
  kxsbbbfp=2a98203d48  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=01 csi=873 siz=0 off=24
  kxsbbbfp=2a98203d60  bln=32  avl=10  flg=01
  value="TAB_STATS$"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=0 off=56
  kxsbbbfp=2a98203d80  bln=22  avl=02  flg=01
  value=1
 Frames pfr 0x2a97e9a578 siz=7208 efr 0x2a97e9a5d0 siz=7192
 Cursor frame dump
  enxt: 5.0x00000010  enxt: 4.0x00000bf8  enxt: 3.0x00000588  enxt: 2.0x00000060
  enxt: 1.0x00000a38
  pnxt: 1.0x00000010

在这一部分，我们发现了当前正在执行的SQL语句，绑定变量的信息。其实当前SQL在，在trace文件最开始部分就有，这一点是所有版本通用的

----- Current SQL Statement for this session (sql_id=3nkd3g3ju5ph1) -----
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
sql_text_length=203
sql=select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null

5.3.3 10046中分析访问的块

这里我们以之前发现的SQL语句在trace中搜10046event的输出信息，找到等待事件等信息，但是不知道为什么，在10046那一部分中，没有找到相应的SQL，可能是trace文件有那里有问题导致的，但是发现了绑定变量与等待事件的信息

BINDS #182936821408:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=80 off=0
  kxsbbbfp=2a98203d48  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=01 csi=873 siz=0 off=24
  kxsbbbfp=2a98203d60  bln=32  avl=10  flg=01
  value="TAB_STATS$"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=0 off=56
  kxsbbbfp=2a98203d80  bln=22  avl=02  flg=01
  value=1
EXEC #182936821408:c=0,e=61,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1409200255032571
WAIT #182936821408: nam='db file sequential read' ela= 8 file#=1 block#=20804 blocks=1 obj#=37 tim=1409200255032603
=====================
PARSING IN CURSOR #182936803264 len=142 dep=2 uid=0 oct=3 lid=0 tim=1409200255032997 hv=361892850 ad='8e534360' sqlid='7bd391hat42zk'
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
END OF STMT
PARSE #182936803264:c=0,e=352,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,plh=0,tim=1409200255032996
BINDS #182936803264:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=2a98202538  bln=22  avl=02  flg=05
  value=6
EXEC #182936803264:c=1000,e=507,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,plh=906473769,tim=1409200255033560
WAIT #182936803264: nam='db file sequential read' ela= 10 file#=1 block#=321 blocks=1 obj#=34 tim=1409200255033606
WAIT #182936803264: nam='db file sequential read' ela= 13 file#=1 block#=225 blocks=1 obj#=15 tim=1409200255033655
FETCH #182936803264:c=0,e=99,p=2,cr=2,cu=0,mis=0,r=1,dep=2,og=3,plh=906473769,tim=1409200255033676
STAT #182936803264 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=2 pw=0 time=96 us)'
STAT #182936803264 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=51 us)'
CLOSE #182936803264:c=0,e=4,dep=2,type=0,tim=1409200255033718
WAIT #182936821408: nam='db file sequential read' ela= 13 file#=3 block#=208 blocks=1 obj#=0 tim=1409200255033756
WAIT #182936821408: nam='db file sequential read' ela= 6 file#=3 block#=440 blocks=1 obj#=0 tim=1409200255033873

从上面信息我们可以看到182936821408执行的时候，去访问 file#=1 block#=20804 blocks=1 obj#=37 信息。下一条SQL语句就是去访问报错的UNDO段的信息，跟之前的报错信息也是一致的。通过上面的信息，我们报到报错的SQL，绑定变量，访问的块的信息

5.3.4 查看块dump的信息

这里以1/20804来搜索，可以发现trace文件中已经dump了块的信息，这里也看到块的rdba地址，跟5.3.1那里怀疑的块是一致的。

    BH (0x7cbf9548) file#: 1 rdba: 0x00405144 (1/20804) class: 1 ba: 0x7cb84000
         set: 3 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
         dbwrid: 0 obj: 37 objn: 37 tsn: 0 afn: 1 hint: f
         hash: [0x8d7e0908,0x8d7e0908] lru: [0x7cbf9760,0x7cbf9500]
         obj-flags: object_ckpt_list
         ckptq: [0x8d19c920,0x8d19c920] fileq: [0x8d19c940,0x8d19c940] objq: [0x8a1ae3c8,0x8a1ae3c8] objaq: [0x7cbf98c8,0x8a1ae3a8]
         use: [0x8d1633b0,0x8d1633b0] wait: [NULL]
         st: XCURRENT md: SHR fpin: 'qeilwhnp: qeilbk' tch: 0
         flags: buffer_dirty redo_since_read
         LRBA: [0x1.3.0] LSCN: [0x0.21eda4] HSCN: [0x0.21eda4] HSUB: [1]
         Using State Objects
           ----------------------------------------
           SO: 0x8d163330, type: 38, owner: 0x8dc90ea0, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
            proc=0x8e098608, name=buffer handle, file=kcb2.h LINE:2677, pg=0
           (buffer) (CR) PR: 0x8e098608 FLG: 0x100000
           class bit: 0x0
            cr[0]:
            sh[0]:
           kcbbfbp: [BH: 0x7cbf9548, LINK: 0x8d1633b0]
           type: normal pin
           where: qeilwhnp: qeilbk, why: 54104
         buffer tsn: 0 rdba: 0x00405144 (1/20804)
         scn: 0x0000.0021eda4 seq: 0x01 flg: 0x00 tail: 0xeda40601
         frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

下面继续查看块的信息

Block header dump:  0x00405144
 Object id on Block? Y
 seg/obj: 0x25  csc: 0x00.21eda4  itc: 2  flg: -  typ: 2 - INDEX
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0002.015.0000009f  0x00c03a8d.001c.01  CB--    0  scn 0x0000.0002a810
0x02   0x0006.014.00000442  0x00c001b8.010e.11  ----    1  fsc 0x0000.00000000

这里我们可以看到XID的信息，ITL 02上面有一行的数据正在被修改，使用的undo段为6，使用的uba地址为00c001b8，使用的插槽号为14，trap#为442，块为索引块。其实这里看到是索引块，如果正常情况下，我们重建索引就可以解决问题了。或者是不走索引也是可以的。下面是将uba中的地址转为数据文件与块号

www.htz.pw > @dba_to_fno_bno.sql
Enter value for dba: 00c001b8

      FILE      BLOCK
---------- ----------
         3        440

下面继续查看块中行记录的信息

row#102[1300] flag: ------, lock: 2, len=32, data:(6):  00 41 42 1e 00 3d
col 0; len 1; (1):  80
col 1; len 5; (5):  54 45 53 54 32
col 2; len 2; (2):  c1 02
col 3; NULL
col 4; NULL
col 5; NULL
col 6; len 2; (2):  c1 03
col 7; len 1; (1):  80
col 8; len 4; (4):  c3 08 45 2e

这里看到了索引中的行记录。下面继续查看块中对象的信息

www.htz.pw > set echo off
Enter Search Object Id (i.e. 1235) : 37

                                 Object          Create               Last_Ddl
OWNEROBJECT_NAME  SUBOBJECT_NAME Type            Time                 Time                 STATUS
----------------- ------------------------------ -------------------- -------------------- ----------
SYS  I_OBJ2                      INDEX           2013-08-24 11:37:35  2013-08-24 11:37:35  VALID

这里可以看到是索引的名字等详细的信息，

5.3.5 查看undo块与头的信息

从5.3.4中我们查看了undo块的信息，下面我们来看看undo的信息

BH (0x7cbf91b8) file#: 3 rdba: 0x00c000d0 (3/208) class: 27 ba: 0x7cb7e000
  set: 3 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 0 obj: -1 objn: 0 tsn: 2 afn: 3 hint: f
  hash: [0x8d040fa0,0x8d040fa0] lru: [0x7cbf93d0,0x7cbf9170]
  ckptq: [NULL] fileq: [NULL] objq: [0x8a1adeb8,0x8a1adeb8] objaq: [0x8a1adea8,0x8a1adea8]
  st: XCURRENT md: NULL fpin: 'ktuwh05: ktugct' tch: 1
  flags:
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  buffer tsn: 2 rdba: 0x00c000d0 (3/208)
  scn: 0x0000.0012a779 seq: 0x02 flg: 0x04 tail: 0xa7792602
  frmt: 0x02 chkval: 0x33bf type: 0x26=KTU SMU HEADER BLOCK
 
 07CB7FFF0 00000000 00000000 00000000 A7792602  [.............&y.]  
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 5      #blocks: 39    
                  last map  0x00000000  #maps: 0      offset: 4080  
      Highwater::  0x00c001a5  ext#: 1      blk#: 5      ext size: 8     
  #blocks in seg. hdr's freelists: 0     
  #blocks below: 0     
  mapblk  0x00000000  offset: 1       
                   Unlocked
     Map Header:: next  0x00000000  #extents: 5    obj#: 0      flag: 0x40000000
  Extent Map    
  -----------------------------------------------------------------
   0x00c000d1  length: 7      
   0x00c001a0  length: 8      
   0x00c001b8  length: 8      
   0x00c000d8  length: 8      
   0x00c00138  length: 8      
   
 Retention Table 
  -----------------------------------------------------------
 Extent Number:0  Commit Time: 1409188509
 Extent Number:1  Commit Time: 1409188509
 Extent Number:2  Commit Time: 0      
 Extent Number:3  Commit Time: 1409188428
 Extent Number:4  Commit Time: 1409188492
   
  TRN CTL:: seq: 0x010d chd: 0x000d ctl: 0x0015 inc: 0x00000000 nfb: 0x0001
            mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00c001a5.010d.25 scn: 0x0000.0012a420
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00c001a5.010d.25 ext: 0x1  spc: 0xe10   
    uba: 0x00000000.0109.07 ext: 0x0  spc: 0x1a20
    uba: 0x00000000.00d1.25 ext: 0x5  spc: 0x608   
    uba: 0x00000000.00b0.01 ext: 0x6  spc: 0x1ed0  
    uba: 0x00000000.00b0.01 ext: 0x6  spc: 0x1ed0  
  TRN TBL::
   
  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num    cmt
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0x0442  0x001f  0x0000.0012a6b9  0x00c001a4  0x0000.000.00000000  0x00000001   0x00000000  1409188524
   0x01    9    0x00  0x0442  0x0015  0x0000.0012a760  0x00c001a5  0x0000.000.00000000  0x00000001   0x00000000  1409188658
   0x02    9    0x00  0x0441  0x0014  0x0000.0012a515  0x00c0013b  0x0000.000.00000000  0x00000003   0x00000000  1409188485
   0x03    9    0x00  0x0441  0x000b  0x0000.0012a4b5  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188478
   0x04    9    0x00  0x0441  0x0003  0x0000.0012a4a1  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188478
   0x05    9    0x00  0x0441  0x0007  0x0000.0012a451  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188477
   0x06    9    0x00  0x0442  0x0008  0x0000.0012a71f  0x00c001a5  0x0000.000.00000000  0x00000002   0x00000000  1409188606
   0x07    9    0x00  0x0441  0x0010  0x0000.0012a477  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188478
   0x08    9    0x00  0x0441  0x0001  0x0000.0012a73d  0x00c001a5  0x0000.000.00000000  0x00000001   0x00000000  1409188608
   0x09    9    0x00  0x0441  0x0002  0x0000.0012a502  0x00c00139  0x0000.000.00000000  0x00000001   0x00000000  1409188485
   0x0a    9    0x00  0x0442  0x0020  0x0000.0012a6de  0x00c001a4  0x0000.000.00000000  0x00000001   0x00000000  1409188545
   0x0b    9    0x00  0x0441  0x0011  0x0000.0012a4cc  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188481
   0x0c    9    0x00  0x0441  0x001a  0x0000.0012a5af  0x00c000d3  0x0000.000.00000000  0x00000006   0x00000000  1409188492
   0x0d    9    0x00  0x0440  0x0005  0x0000.0012a43b  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188471
   0x0e    9    0x00  0x0441  0x0013  0x0000.0012a552  0x00c0013b  0x0000.000.00000000  0x00000001   0x00000000  1409188486
   0x0f    9    0x00  0x0441  0x0009  0x0000.0012a4f4  0x00c00139  0x0000.000.00000000  0x00000002   0x00000000  1409188485
   0x10    9    0x00  0x0441  0x0004  0x0000.0012a48a  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188478
   0x11    9    0x00  0x0441  0x000f  0x0000.0012a4e2  0x00c00138  0x0000.000.00000000  0x00000001   0x00000000  1409188485
   0x12    9    0x00  0x0441  0x0000  0x0000.0012a6a9  0x00c001a4  0x0000.000.00000000  0x00000001   0x00000000  1409188523
   0x13    9    0x00  0x0441  0x0016  0x0000.0012a56a  0x00c0013c  0x0000.000.00000000  0x00000002   0x00000000  1409188487
   0x14    9    0x00  0x0441  0x0017  0x0000.0012a52a  0x00c0013b  0x0000.000.00000000  0x00000001   0x00000000  1409188485
   0x15    9    0x00  0x0441  0xffff  0x0000.0012a779  0x00c001a5  0x0000.000.00000000  0x00000001   0x00000000  1409188665
   0x16    9    0x00  0x0441  0x0019  0x0000.0012a57e  0x00c0013c  0x0000.000.00000000  0x00000001   0x00000000  1409188489
   0x17    9    0x00  0x0441  0x000e  0x0000.0012a541  0x00c0013b  0x0000.000.00000000  0x00000001   0x00000000  1409188486
   0x18    9    0x00  0x0441  0x001c  0x0000.0012a67e  0x00c001a1  0x0000.000.00000000  0x00000007   0x00000000  1409188509
   0x19    9    0x00  0x0441  0x000c  0x0000.0012a59b  0x00000000  0x0000.000.00000000  0x00000000   0x00000000  1409188490
   0x1a    9    0x00  0x0441  0x001b  0x0000.0012a5c5  0x00c000d3  0x0000.000.00000000  0x00000001   0x00000000  1409188494
   0x1b    9    0x00  0x0441  0x0021  0x0000.0012a5eb  0x00c000d3  0x0000.000.00000000  0x00000001   0x00000000  1409188494
   0x1c    9    0x00  0x0441  0x0012  0x0000.0012a689  0x00c001a4  0x0000.000.00000000  0x00000003   0x00000000  1409188509
   0x1d    9    0x00  0x0441  0x001e  0x0000.0012a617  0x00c000d3  0x0000.000.00000000  0x00000001   0x00000000  1409188499
   0x1e    9    0x00  0x0441  0x0018  0x0000.0012a632  0x00c000d3  0x0000.000.00000000  0x00000001   0x00000000  1409188507
   0x1f    9    0x00  0x0441  0x000a  0x0000.0012a6cb  0x00c001a4  0x0000.000.00000000  0x00000001   0x00000000  1409188538
   0x20    9    0x00  0x0441  0x0006  0x0000.0012a701  0x00c001a4  0x0000.000.00000000  0x00000001   0x00000000  1409188586
   0x21    9    0x00  0x0440  0x001d  0x0000.0012a601  0x00c000d3  0x0000.000.00000000  0x00000001   0x00000000  1409188499

这里可以看到undo段头中index为14的事务状态是9(非活动的)，wrap#值为0441，比之前的块中的442还小。所以此时的undo段头块比我们所需要的段头块还要旧。下面继续查看undo块的信息

********************************************************************************
UNDO BLK:
xid: 0x0008.007.00000389  seq: 0xdb  cnt: 0x4   irb: 0x4   icl: 0x0   flg: 0x0000

 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x1f38     0x02 0x1eb0     0x03 0x1e50     0x04 0x1dc8

*-----------------------------
* Rec #0x1  slt: 0x18  objn: 457(0x000001c9)  objd: 457  tblspc: 0(0x00000000)
*       Layer:  11 (Row)   opc: 1   rci 0x00
Undo type:  Regular undo    Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0x00c0014f.00da.3d ctl max scn: 0x0000.00126bb5 prv tx scn: 0x0000.00126bcd
txn start scn: scn: 0x0000.00126f93 logon user: 0
 prev brb: 12583242 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0005.020.00000375 uba: 0x00c001ae.00ba.2e
                      flg: C---    lkc:  0     scn: 0x0000.00126b8b
KDO Op code: URP row dependencies Disabled
  xtype: XAxtype KDO_KDOM2 flags: 0x00000080  bdba: 0x00400c21  hdba: 0x00400c18
itli: 1  ispac: 0  maxfr: 4863
tabn: 0 slot: 173(0xad) flag: 0x2c lock: 0 ckix: 11
ncol: 9 nnew: 7 size: 0
Vector content:
col  2: [ 2]  c1 07
col  3: [ 2]  c1 04
col  4: [ 1]  80
col  5: [ 1]  80
col  6: [ 1]  80
col  7: [ 1]  80
col  8: [ 7]  78 72 08 1a 0c 01 10

*-----------------------------
* Rec #0x2  slt: 0x1b  objn: 461(0x000001cd)  objd: 461  tblspc: 0(0x00000000)
*       Layer:  11 (Row)   opc: 1   rci 0x00
Undo type:  Regular undo    Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0x00c001b8.00db.01 ctl max scn: 0x0000.00126bcd prv tx scn: 0x0000.00126bd9
txn start scn: scn: 0x0000.00126f9d logon user: 0
 prev brb: 12583244 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0001.00c.000002b9 uba: 0x00c0023a.00cc.36
                      flg: C---    lkc:  0     scn: 0x0000.00126f9b
KDO Op code: DRP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x00400c72  hdba: 0x00400c70
itli: 1  ispac: 0  maxfr: 4863
tabn: 0 slot: 171(0xab)

*-----------------------------
* Rec #0x3  slt: 0x1b  objn: 462(0x000001ce)  objd: 462  tblspc: 0(0x00000000)
*       Layer:  10 (Index)   opc: 22   rci 0x02
Undo type:  Regular undo   Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000
*-----------------------------
index undo for leaf key operations
KTB Redo
op: 0x04  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0009.006.00000394 uba: 0x00c00155.00e6.29
                      flg: C---    lkc:  0     scn: 0x0000.00126f9c
Dump kdilk : itl=2, kdxlkflg=0x1 sdc=32655 indexid=0x400c90 block=0x00400c91
(kdxlpu): purge leaf row
key :(5):  04 c3 08 13 29

*-----------------------------
* Rec #0x4  slt: 0x07  objn: 71834(0x0001189a)  objd: 71834  tblspc: 1(0x00000001)
*       Layer:  11 (Row)   opc: 1   rci 0x00
Undo type:  Regular undo    Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0x00c001b8.00db.02 ctl max scn: 0x0000.00126bd9 prv tx scn: 0x0000.00126c09
txn start scn: scn: 0x0000.00126ec1 logon user: 71
 prev brb: 12583244 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x000a.009.000002d3 uba: 0x00c00356.009a.0c
                      flg: C---    lkc:  0     scn: 0x0000.0012666e
KDO Op code: DRP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x00810933  hdba: 0x00810932
itli: 1  ispac: 0  maxfr: 4858
tabn: 0 slot: 1(0x1)

在undo数据块中，也没有查看我们所需要的事务的UNDO记录数

5.3.6 分析结果

由于undo中没有包含事务所需要的undo记录，导致事务rollback的时候，触发了ORA-01555报错。这里知道原因后，解决问题就很简单了

5.3.7 bbed验证一下块中数据

[oracle@www.htz.pw trace]$bbed
Password: 

BBED: Release 2.0.0.0.0 - Limited Production on Thu Aug 28 12:39:42 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> set filename '/oracle/app/oracle/oradata/database/cos11g/system01.dbf';
        FILENAME        /oracle/app/oracle/oradata/database/cos11g/system01.dbf

BBED> set block 20804
        BLOCK#          20804

BBED> map
 File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
 Block: 20804                                 Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Index Leaf)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 72 bytes                     @20      

 struct kdxle, 32 bytes                     @92      

 sb2 kd_off[156]                            @124     

 ub1 freespace[952]                         @436     

 ub1 rowdata[6732]                          @1388    

 ub4 tailchk                                @8188    


BBED> p ktbbh
struct ktbbh, 72 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x02 (KDDBTINDEX)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000025
      ub4 ktbbhod1                          @24       0x00000025
   struct ktbbhcsc, 8 bytes                 @28      
      ub4 kscnbas                           @28       0x0012a91c
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       7938
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0002
         ub2 kxidslt                        @46       0x0015
         ub4 kxidsqn                        @48       0x0000009f
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x00c03a8d
         ub2 kubaseq                        @56       0x001c
         ub1 kubarec                        @58       0x01
      ub2 ktbitflg                          @60       0xc000 (KTBFIBI, KTBFCOM)
      union _ktbitun, 2 bytes               @62      
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x0002a810
   struct ktbbhitl[1], 24 bytes             @68      
      struct ktbitxid, 8 bytes              @68      
         ub2 kxidusn                        @68       0x0006
         ub2 kxidslt                        @70       0x0014
         ub4 kxidsqn                        @72       0x00000442
      struct ktbituba, 8 bytes              @76      
         ub4 kubadba                        @76       0x00c001b8
         ub2 kubaseq                        @80       0x010e
         ub1 kubarec                        @82       0x11
      ub2 ktbitflg                          @84       0x0001 (NONE)
      union _ktbitun, 2 bytes               @86      
         sb2 _ktbitfsc                      @86       0
         ub2 _ktbitwrp                      @86       0x0000
      ub4 ktbitbas                          @88       0x00000000
      
      
     

BBED>  x /rncnn *kd_off[104]
rowdata[4]                                  @1392    
----------
flag@1392:     0x00 (NONE)
lock@1393:     0x02
keydata[6]:    0x00  0x41  0x42  0x1e  0x00  0x3d 
data key:
col    0[1] @1401: 0 
col    1[5] @1403: TEST2
col    2[2] @1409: 1 
col    3[0] @1412: *NULL*
col    4[0] @1413: *NULL*
col    5[0] @1414: *NULL*
col    6[2] @1415: 2 
col    7[1] @1418: 0 
col    8[4] @1420: 76845

6 故障处理过程

此故障处理的方法一般有2种

走全表扫描
手动提交事务信息

6.1 走全表扫描

由于这里是oracle2进制中的sql触发的报错，所以要走全表扫描，需要修改oracle2进制文件，见曾经的笔记ORA-08103，使用ue修改oracle2进制文件来完美解决，连接ue修改oracle文件

6.2 手动提交事务

相信这种情况下，大家一般会选择bbed的方法来解决，因为这种方法更简单，但是这种方法风险更高。如果要修改oracle2进制文件，需要在sql中增加full提示或者修改where后面列的信息，上面sql中引用的列都是字符集，增加更改起来比较麻烦，不仅需要改SQL内容，还需要更改其它地方，不然会open的时候会触发ORA-07445报错。

BBED> x /rncnn *kd_off[104]
rowdata[4]                                  @1392    
----------
flag@1392:     0x00 (NONE)
lock@1393:     0x02
keydata[6]:    0x00  0x41  0x42  0x1e  0x00  0x3d 
data key:
col    0[1] @1401: 0 
col    1[5] @1403: TEST2
col    2[2] @1409: 1 
col    3[0] @1412: *NULL*
col    4[0] @1413: *NULL*
col    5[0] @1414: *NULL*
col    6[2] @1415: 2 
col    7[1] @1418: 0 
col    8[4] @1420: 76845 


BBED> set offset 1393
        OFFSET          1393

BBED> set count 10
        COUNT           10

BBED> set mode edit
        MODE            Edit

BBED> dump
 File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
 Block: 20804            Offsets: 1393 to 1402           Dba:0x00000000
------------------------------------------------------------------------
 02004142 1e003d01 8005 

 <32 bytes per line>

BBED> modify /x 00 offset 1393
 File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
 Block: 20804            Offsets: 1393 to 1402           Dba:0x00000000
------------------------------------------------------------------------
 00004142 1e003d01 8005 

 <32 bytes per line>
 

BBED> verify
DBVERIFY - Verification starting
FILE = /oracle/app/oracle/oradata/database/cos11g/system01.dbf
BLOCK = 20804

Block Checking: DBA = 4215108, Block Type = KTB-managed data block
**** actual rows locked by itl 2  = 0 != # in trans. header = 1
---- end index block validation
Block 20804 failed with check code 6401

DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 1
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED


BBED> p ktbbh
struct ktbbh, 72 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x02 (KDDBTINDEX)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000025
      ub4 ktbbhod1                          @24       0x00000025
   struct ktbbhcsc, 8 bytes                 @28      
      ub4 kscnbas                           @28       0x0012a91c
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       7938
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0002
         ub2 kxidslt                        @46       0x0015
         ub4 kxidsqn                        @48       0x0000009f
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x00c03a8d
         ub2 kubaseq                        @56       0x001c
         ub1 kubarec                        @58       0x01
      ub2 ktbitflg                          @60       0xc000 (KTBFIBI, KTBFCOM)
      union _ktbitun, 2 bytes               @62      
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x0002a810
   struct ktbbhitl[1], 24 bytes             @68      
      struct ktbitxid, 8 bytes              @68      
         ub2 kxidusn                        @68       0x0006
         ub2 kxidslt                        @70       0x0014
         ub4 kxidsqn                        @72       0x00000442
      struct ktbituba, 8 bytes              @76      
         ub4 kubadba                        @76       0x00c001b8
         ub2 kubaseq                        @80       0x010e
         ub1 kubarec                        @82       0x11
      ub2 ktbitflg                          @84       0x0001 (NONE)
      union _ktbitun, 2 bytes               @86      
         sb2 _ktbitfsc                      @86       0
         ub2 _ktbitwrp                      @86       0x0000
      ub4 ktbitbas                          @88       0x00000000

BBED> modify /x offset 84
BBED-00209: invalid number (offset)


BBED> modify /x 0080 offset 84
 File: /oracle/app/oracle/oradata/database/cos11g/system01.dbf (0)
 Block: 20804            Offsets:   84 to   93           Dba:0x00000000
------------------------------------------------------------------------
 00800000 00000000 0000 

 <32 bytes per line>

BBED> p ktbbh
struct ktbbh, 72 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x02 (KDDBTINDEX)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000025
      ub4 ktbbhod1                          @24       0x00000025
   struct ktbbhcsc, 8 bytes                 @28      
      ub4 kscnbas                           @28       0x0012a91c
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       7938
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0002
         ub2 kxidslt                        @46       0x0015
         ub4 kxidsqn                        @48       0x0000009f
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x00c03a8d
         ub2 kubaseq                        @56       0x001c
         ub1 kubarec                        @58       0x01
      ub2 ktbitflg                          @60       0xc000 (KTBFIBI, KTBFCOM)
      union _ktbitun, 2 bytes               @62      
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x0002a810
   struct ktbbhitl[1], 24 bytes             @68      
      struct ktbitxid, 8 bytes              @68      
         ub2 kxidusn                        @68       0x0006
         ub2 kxidslt                        @70       0x0014
         ub4 kxidsqn                        @72       0x00000442
      struct ktbituba, 8 bytes              @76      
         ub4 kubadba                        @76       0x00c001b8
         ub2 kubaseq                        @80       0x010e
         ub1 kubarec                        @82       0x11
      ub2 ktbitflg                          @84       0x8000 (KTBFCOM)
      union _ktbitun, 2 bytes               @86      
         sb2 _ktbitfsc                      @86       0
         ub2 _ktbitwrp                      @86       0x0000
      ub4 ktbitbas                          @88       0x00000000
      
BBED> verify
DBVERIFY - Verification starting
FILE = /oracle/app/oracle/oradata/database/cos11g/system01.dbf
BLOCK = 20804


DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

7，数据库打开

ww.htz.pw > select open_mode from v$database;

OPEN_MODE
--------------------
MOUNTED

www.htz.pw > recover database using backup controlfile until cancel;
ORA-00279: change 2223521 generated at 08/28/2014 12:30:54 needed for thread 1
ORA-00289: suggestion :
/oracle/app/oracle/fast_recovery_area/COS11G/archivelog/2014_08_28/o1_mf_1_1_%u_
.arc
ORA-00280: change 2223521 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1:
'/oracle/app/oracle/oradata/database/cos11g/system01.dbf'
        

ORA-01112: media recovery not started


www.htz.pw > alter database open resetlogs;

Database altered.

这里看到数据库已经打开了，下面我们需要增加TEMP文件，观察ALERT中是否有报错，如果没有报错，取消参数，以正常的方式打开数据库。如果alert中没有任何报错，一般情况下，取消参数是可以正常打开数据库的，当然因为数据库采用特殊方式打开，所以建议将数据库重建。

8，后续分享

本次我们分享的是bbed提交内容，下一次我们再分享通过修改oracle二进制文件的方式来修改此故障。