精华内容
下载资源
问答
  • Oracle 坏块修复

    2014-01-03 12:47:48
    Oracle 坏块修复 Oracle 坏块 总结
  • oracle坏块修复

    2020-11-02 10:49:47
    oracle坏块修复坏块模拟问题现象表坏块修复处理原则blockrecoverevent 10231,跳过坏块dbms_rowid 跳过坏块DBMS_REPAIR标记坏块,跳过坏块索引坏块修复 坏块模拟 SQL> select SEGMENT_NAME,EXTENT_ID,FILE_ID,...

    坏块模拟

    SQL> select SEGMENT_NAME,EXTENT_ID,FILE_ID,BLOCK_ID,BLOCKS from dba_extents where SEGMENT_NAME='TEST';
    SQL> set pages 1000 lines 1000
    SEGMENT_NAME                                                                       EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
    --------------------------------------------------------------------------------- ---------- ---------- ---------- ----------
    TEST                                                                                       0          5        128          8
    TEST                                                                                       1          5        136          8
    TEST                                                                                       2          5        144          8
    TEST                                                                                       3          5        152          8
    TEST                                                                                       4          5        160          8
    TEST                                                                                       5          5        168          8
    TEST                                                                                       6          5        176          8
    TEST                                                                                       7          5        184          8
    TEST                                                                                       8          5        192          8
    TEST                                                                                       9          5        200          8
    TEST                                                                                      10          5        208          8
    TEST                                                                                      11          5        216          8
    TEST                                                                                      12          5        224          8
    TEST                                                                                      13          5        232          8
    TEST                                                                                      14          5        240          8
    TEST                                                                                      15          5        248          8
    TEST                                                                                      16          5        256        128
    TEST                                                                                      17          5        384        128
    TEST                                                                                      18          5        512        128
    TEST                                                                                      19          5        640        128
    TEST                                                                                      20          5        768        128
    TEST                                                                                      21          5        896        128
    TEST                                                                                      22          5       1024        128
    TEST                                                                                      23          5       1152        128
    TEST                                                                                      24          5       1280        128
    
    SQL> select count(*) from zhuo.test;
    
      COUNT(*)
    ----------
         80065
    25 rows selected.
    
    BBED> p seq_kcbh 
    ub1 seq_kcbh                                @14       0x02
    
    BBED> set file 5 block 152
    BBED> m /x 04
    BBED> sum apply
    Check value for File 5, Block 152:
    current = 0xed83, required = 0xed83
    
    BBED> v
    SQL> select count(*) from test;
    select count(*) from test
                         *
    ERROR at line 1:
    ORA-01578: ORACLE data block corrupted (file # 5, block # 152)
    ORA-01110: data file 5:
    '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf'
    

    alert日志报错:

    Thu Oct 29 20:37:25 2020
    Hex dump of (file 5, block 152) in trace file /u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_ora_3004.trc
    Corrupt block relative dba: 0x01400098 (file 5, block 152)
    Fractured block found during user buffer read
    Data in bad block:
     type: 6 format: 2 rdba: 0x01400098
     last change scn: 0x0000.000ba240 seq: 0x4 flg: 0x04
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0xa2400602
     check value in block header: 0xed83
     computed block checksum: 0x0
    Reading datafile '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf' for corruption at rdba: 0x01400098 (file 5, block 152)
    Reread (file 5, block 152) found same corrupt data (no logical check)
    Thu Oct 29 20:37:25 2020
    Corrupt Block Found
             TSN = 5, TSNAME = ZHUO
             RFN = 5, BLK = 152, RDBA = 20971672
             OBJN = 80610, OBJD = 80610, OBJECT = TEST, SUBOBJECT = 
             SEGMENT OWNER = ZHUO, SEGMENT TYPE = Table Segment
    Errors in file /u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_ora_3004.trc  (incident=20161):
    ORA-01578: ORACLE data block corrupted (file # 5, block # 152)
    ORA-01110: data file 5: '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf'
    Incident details in: /u01/app/oracle/diag/rdbms/zhuo/zhuo/incident/incdir_20161/zhuo_ora_3004_i20161.trc
    Thu Oct 29 20:37:27 2020
    Sweep [inc][20161]: completed
    Thu Oct 29 20:37:27 2020
    Sweep [inc2][20161]: completed
    Hex dump of (file 5, block 152) in trace file /u01/app/oracle/diag/rdbms/zhuo/zhuo/incident/incdir_20161/zhuo_m000_3126_i20161_a.trc
    Corrupt block relative dba: 0x01400098 (file 5, block 152)
    Fractured block found during validation
    Data in bad block:
     type: 6 format: 2 rdba: 0x01400098
     last change scn: 0x0000.000ba240 seq: 0x4 flg: 0x04
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0xa2400602
     check value in block header: 0xed83
     computed block checksum: 0x0
    Reread of blocknum=152, file=/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf. found same corrupt data
    Reread of blocknum=152, file=/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf. found same corrupt data
    Reread of blocknum=152, file=/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf. found same corrupt data
    Reread of blocknum=152, file=/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf. found same corrupt data
    Reread of blocknum=152, file=/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf. found same corrupt data
    Checker run found 1 new persistent data failures
    Thu Oct 29 20:37:28 2020
    Dumping diagnostic data in directory=[cdmp_20201029203728], requested by (instance=1, osid=3004), summary=[incident=20161].
    Thu Oct 29 20:38:57 2020
    WARNING: Heavy swapping observed on system in last 5 mins.
    pct of memory swapped in [19.00%] pct of memory swapped out [16.85%].
    Please make sure there is no memory pressure and the SGA and PGA 
    are configured correctly. Look at DBRM trace file for more details.
    

    问题现象

    尝试导出:

    [oracle@oracle11g ~]$ exp zhuo/zhuo file=test.dmp tables=test
    
    Export: Release 11.2.0.4.0 - Production on Fri Oct 30 00:33:35 2020
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    
    
    EXP-00056: ORACLE error 28002 encountered
    ORA-28002: the password will expire within 6 days
    Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
    
    About to export specified tables via Conventional Path ...
    . . exporting table                           TEST
    EXP-00056: ORACLE error 1578 encountered
    ORA-01578: ORACLE data block corrupted (file # 5, block # 152)
    ORA-01110: data file 5: '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf'
    Export terminated successfully with warnings.
    

    导出时alert日志报错:

    Fri Oct 30 00:39:10 2020
    Hex dump of (file 5, block 152) in trace file /u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_ora_3691.trc
    Corrupt block relative dba: 0x01400098 (file 5, block 152)
    Fractured block found during user buffer read
    Data in bad block:
     type: 6 format: 2 rdba: 0x01400098
     last change scn: 0x0000.000ba240 seq: 0x4 flg: 0x04
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0xa2400602
     check value in block header: 0xed83
     computed block checksum: 0x0
    Reading datafile '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf' for corruption at rdba: 0x01400098 (file 5, block 152)
    Reread (file 5, block 152) found same corrupt data (no logical check)
    Fri Oct 30 00:39:10 2020
    Corrupt Block Found
             TSN = 5, TSNAME = ZHUO
             RFN = 5, BLK = 152, RDBA = 20971672
             OBJN = 80610, OBJD = 80610, OBJECT = TEST, SUBOBJECT = 
             SEGMENT OWNER = ZHUO, SEGMENT TYPE = Table Segment
    Errors in file /u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_ora_3691.trc  (incident=20226):
    ORA-01578: ORACLE data block corrupted (file # 5, block # 152)
    ORA-01110: data file 5: '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf'
    Incident details in: /u01/app/oracle/diag/rdbms/zhuo/zhuo/incident/incdir_20226/zhuo_ora_3691_i20226.trc
    Fri Oct 30 00:39:11 2020
    Dumping diagnostic data in directory=[cdmp_20201030003911], requested by (instance=1, osid=3691), summary=[incident=20226].
    

    坏块修复方法论表

    首先判断坏块影响的数据库对象是否是已经不使用的数据对象了,如果是,则啥也不用做了。
    其次,判断坏块影响的数据库对象是否处于临时表空间,如果是,创建一个新的临时表空间,并将受到影响的用户的临时表空间设置为新的临时表空间。
    CREATE TEMPORARY TABLESPACE temp2 TEMPFILE ‘/u01/oradata/temp02.dbf’ SIZE 500M;
    Alter user <user_name> temporary tablespace temp2;
    第三,如果坏块影响的数据库对象是索引,则进一步判断索引所在的表是否也有坏块。如果有,则先解决表的坏块问题。如果没有,则可以通过索引重建方式进行恢复。
    若该索引有外键存在,则需要按如下步骤进行:
    – For each foreign key
    ALTER TABLE <child_table> DISABLE CONSTRAINT <fk_constraint>;
    – Rebuild the primary key using
    ALTER TABLE

    DISABLE CONSTRAINT <pk_constraint>;
    DROP INDEX <index_name>;
    CREATE INDEX <index_name> … with appropriate storage clause
    ALTER TABLE
    ENABLE CONSTRAINT <pk_constraint>;
    – Enable the foreign key constraints
    ALTER TABLE <child_table> ENABLE CONSTRAINT <fk_constraint>;
    如果是分区索引,则重建索引语句如下:
    ALTER INDEX … REBUILD PARTITION …
    需要注意的几点是:
    (1) 尽量不要使用“ALTER INDEX … REBUILD”语句去重建非分区索引,因为该语句可能通过已经含有坏块的旧索引数据进行重建,而“ALTER INDEX … REBUILD ONLINE”和“ALTER INDEX … REBUILD PARTITION …”则不会通过已经含有坏块的旧索引数据进行重建,因此应以后两种语句方式进行索引重建。
    (2)假设有坏块的索引字段是另外一个复合索引字段的子集,则Oracle可能利用该复合索引的数据进行重建。若该复合索引也有坏块,那就太不幸了。此时,最好将这两个索引都删除掉,并重建。
    (3)在重新创建索引时,一定要正确设置相关存储属性,例如将新索引创建在确保没有硬件故障的表空间中。

    第四,此时可考虑数据库的完全恢复了。但应该满足如下条件:
    1)数据库处于归档状态
    2)备份数据是完整的
    建议通过dbv程序对备份数据检查其完整性。如果最新备份数据也含有坏块数据,则需要查找更旧的备份数据。
    1)归档日志必须是完整的
    从备份数据到当前时间的归档日志必须是完整的。
    1)联机日志必须是完整的
    2)没有对实施了nologging操作的数据对象进行recover操作。
    例如,若坏块只出现在少量数据块上,则建议进行数据块级恢复。以下是数据块级恢复的相关命令:
    blockrecover datafile 8 block 13;
    Select * from v$database_block_corruption
    blockrecover corruption list;
    请注意:数据块级恢复只能做到完全恢复,而不能做到不完全恢复。
    若坏块只出现在少数几个数据文件上,则建议进行数据文件级恢复。以下是数据文件级恢复的步骤和相关命令:
    — 将含坏块的数据文件设置为OFFLINE状态
    ALTER DATABASE DATAFILE ‘name_of_file’ OFFLINE;
    — 将该文件复制到安全位置,以防备份数据也包含了坏块
    cp < name_of_file > <安全位置>
    — 从最新的备份数据中restore该文件至安全位置
    — 通过DBVERIFY检查该文件是否包含坏块
    — 假设该文件不包含坏块,则对该文件目录进行RENAME操作:
    ALTER DATABASE RENAME FILE ‘old_name’ TO ‘new_name’;
    — 对该文件进行recover操作
    RECOVER DATAFILE ‘name_of_file’;
    — 将该数据文件恢复为ONLINE状态
    ALTER DATABASE DATAFILE ‘name_of_file’ ONLINE;
    若坏块出现在多个数据文件上,则可以考虑进行数据库级恢复。以下是数据库级恢复的步骤和相关命令:
    — 关闭数据库
    Shutdown (Immediate or Abort)
    — 将所有文件复制到安全位置,以防备份数据也包含了坏块
    cp < name_of_file > <安全位置>
    — 从最新的备份数据中restore所有文件至安全位置,但不要restore控制文件和联机日志文件
    — 通过DBVERIFY检查所有文件是否包含坏块
    — 将数据库启动到mount状态
    Startup MOUNT
    — 假设所有文件不包含坏块,则对被改动位置的文件进行RENAME操作:
    ALTER DATABASE RENAME FILE ‘old_name’ TO ‘new_name’;
    — 确保所有文件处于ONLINE状态:
    ALTER DATABASE DATAFILE ‘name_of_file’ ONLINE;
    — 对数据库进行recover操作
    RECOVER DATABASE
    — 打开数据库
    ALTER DATABASE OPEN;
    数据库完全恢复之后,建议对受影响的数据对象进行完整性检查,例如:
    ANALYZE <table_name> VALIDATE STRUCTURE CASCADE;
    确认是否有数据和索引不匹配的情况存在。进一步,建议在应用级检查数据的逻辑完整性。

    第五,如果上述完全恢复仍然不能恢复坏块数据,而且被损坏的表为关键业务数据,则此时需要考虑尽可能先确保这些表的正常对外访问,并且从这些表中抢救尽可能多的数据。
    此时,可供选择的办法包括:
    通过DBMS_REPAIR包或设置10231事件,抢救坏块之外数据。
    通过ROWID扫描方法,抢救坏块之外数据。

    表坏块修复

    blockrecover

    该命令是从oracle 9i引入的。可以针对坏块进行单独的修复。
    而不用我们去进行restore datafile,然后再进行recover datafile.

    ocp关于块修复的解释
    在这里插入图片描述
    在这里插入图片描述
    在这里插入图片描述
    在这里插入图片描述
    在这里插入图片描述
    在这里插入图片描述
    请注意,限制条件:
    1、备份必须完整。
    2、数据块级恢复只能做到完全恢复,而不能做到不完全恢复。

    先备份;

    RMAN> backup datafile 5 format '/home/oracle/zhuo.bak';
    
    Starting backup at 30-OCT-20
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=39 device type=DISK
    channel ORA_DISK_1: starting full datafile backup set
    channel ORA_DISK_1: specifying datafile(s) in backup set
    input datafile file number=00005 name=/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf
    channel ORA_DISK_1: starting piece 1 at 30-OCT-20
    channel ORA_DISK_1: finished piece 1 at 30-OCT-20
    piece handle=/home/oracle/zhuo.bak tag=TAG20201030T171734 comment=NONE
    channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
    Finished backup at 30-OCT-20
    
    1. blockrecover datafile block

    blockrecover进行恢复:

    RMAN> recover datafile 5 block 151;
    
    Starting recover at 30-OCT-20
    using channel ORA_DISK_1
    
    channel ORA_DISK_1: restoring block(s)
    channel ORA_DISK_1: specifying block(s) to restore from backup set
    restoring blocks of datafile 00005
    channel ORA_DISK_1: reading from backup piece /home/oracle/zhuo.bak
    channel ORA_DISK_1: piece handle=/home/oracle/zhuo.bak tag=TAG20201030T171734
    channel ORA_DISK_1: restored block(s) from backup piece 1             --restore block**加粗样式**
    channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
    
    starting media recovery                                               --开始recover
    media recovery complete, elapsed time: 00:00:03
    
    Finished recover at 30-OCT-20
    RMAN> blockrecover datafile 5 block 151;
    
    Starting recover at 30-OCT-20
    using channel ORA_DISK_1
    
    channel ORA_DISK_1: restoring block(s)
    channel ORA_DISK_1: specifying block(s) to restore from backup set
    restoring blocks of datafile 00005
    channel ORA_DISK_1: reading from backup piece /home/oracle/zhuo.bak
    channel ORA_DISK_1: piece handle=/home/oracle/zhuo.bak tag=TAG20201030T171734
    channel ORA_DISK_1: restored block(s) from backup piece 1
    channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
    
    starting media recovery
    media recovery complete, elapsed time: 00:00:03
    
    Finished recover at 30-OCT-20
    

    恢复完成。
    blockrecover和recover命令执行是一样的。
    blockrecover其实也要先进行restore,然后再进行recover操作,所以一般耗时很长
    2. blockrecover corrupt list
    也可以使用下面命令恢复。

    SQL> Select * from v$database_block_corruption;
    
         FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
    ---------- ---------- ---------- ------------------ ---------
             5        152          1                  0 FRACTURED
    
    RMAN> blockrecover corruption list;
    
    Starting recover at 30-OCT-20
    using channel ORA_DISK_1
    
    channel ORA_DISK_1: restoring block(s)
    channel ORA_DISK_1: specifying block(s) to restore from backup set
    restoring blocks of datafile 00005
    channel ORA_DISK_1: reading from backup piece /home/oracle/zhuo.bak
    channel ORA_DISK_1: piece handle=/home/oracle/zhuo.bak tag=TAG20201030T171734
    channel ORA_DISK_1: restored block(s) from backup piece 1
    channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
    
    starting media recovery
    media recovery complete, elapsed time: 00:00:03
    
    Finished recover at 30-OCT-20
    

    必须视图里面有内容,blockrecover corruption list才能够恢复。
    如果视图里面没内容,则此命令恢复没有用
    所以可以通过RMAN> blockrecover corruption list进行块的恢复,这是在大量块损坏时或全部块损坏时使用,前提是先执行RMAN>backup validate database,在V$DATABASE_BLOCK_CORRUPTION里有对应的坏块的列表。

    1. blockrecover只能做完全恢复,归档日志要全。
    RMAN> delete archivelog all;
    
    released channel: ORA_DISK_1
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=29 device type=DISK
    List of Archived Log Copies for database with db_unique_name ZHUO
    =====================================================================
    
    Key     Thrd Seq     S Low Time 
    ------- ---- ------- - ---------
    1       1    53      A 21-NOV-19
            Name: /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_53_1024919054.dbf
    
    2       1    54      A 30-OCT-20
            Name: /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_54_1024919054.dbf
    
    3       1    55      A 30-OCT-20
            Name: /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_55_1024919054.dbf
    
    4       1    56      A 30-OCT-20
            Name: /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_56_1024919054.dbf
    
    5       1    57      A 30-OCT-20
            Name: /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_57_1024919054.dbf
    
    
    Do you really want to delete the above objects (enter YES or NO)? yes
    deleted archived log
    archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_53_1024919054.dbf RECID=1 STAMP=1055179540
    deleted archived log
    archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_54_1024919054.dbf RECID=2 STAMP=1055179540
    deleted archived log
    archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_55_1024919054.dbf RECID=3 STAMP=1055179541
    deleted archived log
    archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_56_1024919054.dbf RECID=4 STAMP=1055179542
    deleted archived log
    archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_57_1024919054.dbf RECID=5 STAMP=1055179542
    Deleted 5 objects
    RMAN>  blockrecover datafile 5 block 151;
    
    Starting recover at 30-OCT-20
    using channel ORA_DISK_1
    
    channel ORA_DISK_1: restoring block(s)
    channel ORA_DISK_1: specifying block(s) to restore from backup set
    restoring blocks of datafile 00005
    channel ORA_DISK_1: reading from backup piece /home/oracle/zhuo.bak
    channel ORA_DISK_1: piece handle=/home/oracle/zhuo.bak tag=TAG20201030T171734
    channel ORA_DISK_1: restored block(s) from backup piece 1
    channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
    
    starting media recovery
    
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of recover command at 10/30/2020 17:58:29
    RMAN-06053: unable to perform media recovery because of missing log
    RMAN-06025: no backup of archived log for thread 1 with sequence 57 and starting SCN of 764196 found to restore
    RMAN-06025: no backup of archived log for thread 1 with sequence 56 and starting SCN of 764193 found to restore
    RMAN-06025: no backup of archived log for thread 1 with sequence 55 and starting SCN of 764190 found to restore
    RMAN-06025: no backup of archived log for thread 1 with sequence 54 and starting SCN of 764187 found to restore
    RMAN-06025: no backup of archived log for thread 1 with sequence 53 and starting SCN of 753214 found to restore
    

    观察alert:
    alter database recover datafile list clear
    Sun Apr 14 07:42:14 PDT 2013
    Completed: alter database recover datafile list clear
    Sun Apr 14 07:42:14 PDT 2013
    Starting block media recovery
    Sun Apr 14 07:42:16 PDT 2013
    Media Recovery Log /home/ora10g/archivelog/0001_1_2_812148360.dbf
    Sun Apr 14 07:42:16 PDT 2013
    Media Recovery Log /home/ora10g/archivelog/0001_1_3_812148360.dbf
    Sun Apr 14 07:42:16 PDT 2013
    Media Recovery Log /home/ora10g/archivelog/0001_1_4_812148360.dbf
    Sun Apr 14 07:42:16 PDT 2013
    Media Recovery Log /home/ora10g/archivelog/0001_1_5_812148360.dbf
    Sun Apr 14 07:42:17 PDT 2013
    Media Recovery Log /home/ora10g/archivelog/0001_1_6_812148360.dbf
    Sun Apr 14 07:42:17 PDT 2013
    Media Recovery Log /home/ora10g/archivelog/0001_1_7_812148360.dbf
    Sun Apr 14 07:42:18 PDT 2013
    Recovery of Online Redo Log: Thread 1 Group 3 Seq 8 Reading mem 0
    Mem# 0: /home/ora10g/oradata/roger/redo03.log
    Sun Apr 14 07:42:19 PDT 2013
    Recovery of Online Redo Log: Thread 1 Group 2 Seq 9 Reading mem 0
    Mem# 0: /home/ora10g/oradata/roger/redo02.log
    Sun Apr 14 07:42:20 PDT 2013
    Completed block media recovery
    从alert日志也可以看出,必须应用全部所有的归档日志,所以归档日志要全。

    在这里插入图片描述

    [oracle@oracle11g datafile]$ rman target /
    
    Recovery Manager: Release 11.2.0.4.0 - Production on Wed May 20 14:55:37 2020
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    
    connected to target database: ZHUO (DBID=3982294478)
    
    RMAN> list failure;
    
    using target database control file instead of recovery catalog
    List of Database Failures
    =========================
    
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    1942       HIGH     OPEN      20-MAY-20     Datafile 5: '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf' contains one or more corrupt blocks
    
    RMAN> advise failure;
    
    List of Database Failures
    =========================
    
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    1942       HIGH     OPEN      20-MAY-20     Datafile 5: '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf' contains one or more corrupt blocks
    
    analyzing automatic repair options; this may take some time
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=35 device type=DISK
    analyzing automatic repair options complete
    
    Mandatory Manual Actions
    ========================
    no manual actions available
    
    Optional Manual Actions
    =======================
    no manual actions available
    
    Automated Repair Options
    ========================
    Option Repair Description
    ------ ------------------
    1      Recover multiple corrupt blocks in datafile 5  
      Strategy: The repair includes complete media recovery with no data loss
      Repair script: /u01/app/oracle/diag/rdbms/zhuo/zhuo/hm/reco_1838466509.hm
    
    RMAN> repair failure 
    2> ;
    
    Strategy: The repair includes complete media recovery with no data loss
    Repair script: /u01/app/oracle/diag/rdbms/zhuo/zhuo/hm/reco_1838466509.hm
    
    contents of repair script:
       # block media recovery for multiple blocks
       recover datafile 5 block 1428, 1500;
    
    Do you really want to execute the above repair (enter YES or NO)? yes
    executing repair script
    
    Starting recover at 20-MAY-20
    using channel ORA_DISK_1
    
    channel ORA_DISK_1: restoring block(s) from datafile copy /home/oracle/tbs.bak
    
    starting media recovery
    
    archived log for thread 1 with sequence 63 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_19/o1_mf_1_63_hd6qrj61_.arc
    archived log for thread 1 with sequence 64 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_19/o1_mf_1_64_hd6qrnrd_.arc
    archived log for thread 1 with sequence 65 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_19/o1_mf_1_65_hd6sjrhy_.arc
    archived log for thread 1 with sequence 66 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_19/o1_mf_1_66_hd7n1cd5_.arc
    archived log for thread 1 with sequence 67 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_19/o1_mf_1_67_hd7n1go0_.arc
    archived log for thread 1 with sequence 68 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_20/o1_mf_1_68_hd87pr90_.arc
    archived log for thread 1 with sequence 69 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_20/o1_mf_1_69_hd9h75gc_.arc
    archived log for thread 1 with sequence 70 is already on disk as file /u01/app/oracle/oradata/arch/ZHUO/archivelog/2020_05_20/o1_mf_1_70_hd9hjxoh_.arc
    media recovery complete, elapsed time: 00:00:06
    Finished recover at 20-MAY-20
    repair failure complete
    
    RMAN> list failure;
    
    no failures found that match specification
    

    event 10231,跳过坏块

    SQL> ALTER SYSTEM SET EVENTS='10231 trace name context forever,level 10';
    
    System altered.
    [oracle@oracle11g dpdump]$ exp zhuo/zhuo file=test.dmp tables=test
    
    Export: Release 11.2.0.4.0 - Production on Fri Oct 30 00:40:55 2020
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    
    
    EXP-00056: ORACLE error 28002 encountered
    ORA-28002: the password will expire within 6 days
    Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
    
    About to export specified tables via Conventional Path ...
    . . exporting table                           TEST      79984 rows exported
    Export terminated successfully without warnings.
    

    79984,丢失了数据,坏块里面的内容已经丢失。
    补充:
    expdp有没有问题。
    接上面出现坏块的时间。

    [oracle@oracle11g ~]$ expdp \'/ as sysdba\' directory=DATA_PUMP_DIR dumpfile=test.dmp tables=zhuo.test          
    
    Export: Release 11.2.0.4.0 - Production on Fri Oct 30 00:36:41 2020
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    
    Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    Starting "SYS"."SYS_EXPORT_TABLE_01":  "/******** AS SYSDBA" directory=DATA_PUMP_DIR dumpfile=test.dmp tables=zhuo.test 
    Estimate in progress using BLOCKS method...
    Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
    Total estimation using BLOCKS method: 10 MB
    Processing object type TABLE_EXPORT/TABLE/TABLE
    . . exported "ZHUO"."TEST"                               7.771 MB   79984 rows
    Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
    ******************************************************************************
    Dump file set for SYS.SYS_EXPORT_TABLE_01 is:
      /u01/app/oracle/admin/zhuo/dpdump/test.dmp
    Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at Fri Oct 30 00:36:49 2020 elapsed 0 00:00:07
    

    导出时候的alert日志:

    Fri Oct 30 00:36:42 2020
    DM00 started with pid=33, OS id=3677, job SYS.SYS_EXPORT_TABLE_01
    Fri Oct 30 00:36:43 2020
    DW00 started with pid=34, OS id=3679, wid=1, job SYS.SYS_EXPORT_TABLE_01
    Hex dump of (file 5, block 152) in trace file /u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_dw00_3679.trc
    Corrupt block relative dba: 0x01400098 (file 5, block 152)
    Fractured block found during user buffer read
    Data in bad block:
     type: 6 format: 2 rdba: 0x01400098
     last change scn: 0x0000.000ba240 seq: 0x4 flg: 0x04
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0xa2400602
     check value in block header: 0xed83
     computed block checksum: 0x0
    Reading datafile '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf' for corruption at rdba: 0x01400098 (file 5, block 152)
    Reread (file 5, block 152) found same corrupt data (no logical check)
    Fri Oct 30 00:36:49 2020
    Corrupt Block Found
             TSN = 5, TSNAME = ZHUO
             RFN = 5, BLK = 152, RDBA = 20971672
             OBJN = 80610, OBJD = 80610, OBJECT = TEST, SUBOBJECT = 
             SEGMENT OWNER = ZHUO, SEGMENT TYPE = Table Segment
    

    虽然alert里边有报错,但是还是能正常导出。

    之前expdp测试能否导入:

    SQL> drop table zhuo.test purge
      2  ;
    
    Table dropped.
    [oracle@oracle11g ~]$ impdp \'/ as sysdba\' directory=DATA_PUMP_DIR dumpfile=test.dmp  
    
    Import: Release 11.2.0.4.0 - Production on Fri Oct 30 15:14:49 2020
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    
    Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    Master table "SYS"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
    Starting "SYS"."SYS_IMPORT_FULL_01":  "/******** AS SYSDBA" directory=DATA_PUMP_DIR dumpfile=test.dmp 
    Processing object type TABLE_EXPORT/TABLE/TABLE
    Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
    . . imported "ZHUO"."TEST"                               7.771 MB   79984 rows
    Job "SYS"."SYS_IMPORT_FULL_01" successfully completed at Fri Oct 30 15:14:53 2020 elapsed 0 00:00:02
    SQL> select count(*) from zhuo.test;
    
      COUNT(*)
    ----------
         79984
    

    发现只是丢失了数据,而不再报错。alert日志也不再报错。

    expdp能正常导出含有坏块的表,再导入的时候,也是会损失坏块里面的数据。

    dbms_rowid 跳过坏块

    SQL> select object_id,data_object_id,object_name from dba_objects where object_name='TEST';
    
     OBJECT_ID DATA_OBJECT_ID
    ---------- --------------
    OBJECT_NAME
    --------------------------------------------------------------------------------
         80610          80610
    TEST
    
    
    SQL> SELECT dbms_rowid.rowid_create(1,80610,5,152,0) LOW_RID from DUAL;
    
    LOW_RID
    ------------------
    AAATriAAFAAAACYAAA
    
    SQL> SELECT dbms_rowid.rowid_create(1,80610,5,153,0) HI_RID  from DUAL;
    
    HI_RID
    ------------------
    AAATriAAFAAAACZAAA
    
    SQL> CREATE TABLE salvage_table AS SELECT /*+ ROWID(A) */ * FROM ZHUO.TEST A WHERE rowid < 'AAATriAAFAAAACYAAA';
    
    Table created.
    
    SQL> INSERT INTO salvage_table SELECT /*+ ROWID(A) */ * FROM ZHUO.TEST A WHERE rowid >= 'AAATriAAFAAAACZAAA' ;
    
    78407 rows created.
    
    SQL> commit;
    
    Commit complete.
    
    SQL> select count(*) from salvage_table;
    
      COUNT(*)
    ----------
         79984
    

    79984,也丢失了坏块里面的内容。

    具体的使用方法和原理参考另一篇文章。

    DBMS_REPAIR标记坏块,跳过坏块

    利用dbms_repair包来修复:
    admin_tables —创建或管理repair table,用于记录检测到的坏块信息
    check_object —用于检测对象是否存在坏块
    fix_corrupt_blocks —修复数据库坏块(实际上仅仅是对坏块进行标记,将坏块信息写入到repair table中)
    skip_corrupt_blocks —用于开启或关闭全表扫描或index fast full scan时是否跳过坏块
    segment_fix_status —修复segment状态
    dump_orphan_keys —通过检查index entry对应到的数据库坏块

    1. 创建repair table
    BEGIN
    DBMS_REPAIR.ADMIN_TABLES (
    TABLE_NAME => 'REPAIR_TABLE',
    TABLE_TYPE => dbms_repair.repair_table,
    ACTION => dbms_repair.create_action,
    TABLESPACE => 'ZHUO');
    END;
    /
    
    1. 检查对象上是否存在坏块
    set serveroutput on
    DECLARE num_corrupt INT;
    BEGIN
    num_corrupt := 0;
    DBMS_REPAIR.CHECK_OBJECT (
    SCHEMA_NAME => 'ZHUO',
    OBJECT_NAME => 'TEST',
    REPAIR_TABLE_NAME => 'REPAIR_TABLE',
    corrupt_count => num_corrupt);
    DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR (num_corrupt));
    END;
    /
    

    通过check后,就可以查询repair table了,如下:

    SQL> select OBJECT_ID,RELATIVE_FILE_ID,BLOCK_ID,CORRUPT_TYPE,OBJECT_NAME,CORRUPT_DESCRIPTION
      2  from repair_table;
    
     OBJECT_ID RELATIVE_FILE_ID   BLOCK_ID CORRUPT_TYPE
    ---------- ---------------- ---------- ------------
    OBJECT_NAME
    ------------------------------
    CORRUPT_DESCRIPTION
    --------------------------------------------------------------------------------
         80610                5        152         6148
    TEST
    

    实际上跟dbv的检测是一致的
    3) 标记坏块

    declare       
      fix_count int;
      begin
        fix_count := 0;
        dbms_repair.fix_corrupt_blocks (
        schema_name => 'ZHUO',
        object_name => 'TEST',
        object_type => dbms_repair.table_object,
        repair_table_name => 'REPAIR_TABLE',
        fix_count => fix_count);
      dbms_output.put_line('fix count: ' || to_char(fix_count));
    end;
    /
    

    另一个session查询,报错:

    
    SQL> select count(*) from zhuo.test;                
    select count(*) from zhuo.test
                              *
    ERROR at line 1:
    ORA-01578: ORACLE data block corrupted (file # 5, block # 152)
    ORA-01110: data file 5:
    '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf'
    
    
    1. 设置全表扫描跳过标记的坏块
    begin
      dbms_repair.skip_corrupt_blocks (
        schema_name => 'ZHUO',
        object_name => 'TEST',
        object_type => dbms_repair.table_object,
        flags => dbms_repair.skip_flag);
    end;
    /
    

    查询:

    SQL> select count(*) from zhuo.test;   
    
      COUNT(*)
    ----------
         79984
    

    这个skip的本质是什么
    实际上就是标记为坏块,把块的seq值标记为0xff
    是不是多块读的
    一旦这样标记坏块后,该坏块内的数据都将丢失,那么是否有其他方法可以最大程度的恢复数据呢?
    这个块内的数据都没了。
    5)做完上述修复,记得清除标记
    BEGIN
    DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
    SCHEMA_NAME => ‘&schema_name’,
    OBJECT_NAME => ‘&object_name’,
    OBJECT_TYPE => dbms_repair.table_object,
    FLAGS => dbms_repair.NOSKIP_FLAG);
    END;
    /

    索引坏块修复

    创建索引

    SQL> create index idx_obj on test(owner);
    
    Index created.
    

    模拟坏块

    SEGMENT_NAME                                                                       EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
    --------------------------------------------------------------------------------- ---------- ---------- ---------- ----------
    IDX_OBJ                                                                                    0          5       1416          8
    IDX_OBJ                                                                                    1          5       1424          8
    IDX_OBJ                                                                                    2          5       1432          8
    IDX_OBJ                                                                                    3          5       1440          8
    IDX_OBJ                                                                                    4          5       1448          8
    IDX_OBJ                                                                                    5          5       1456          8
    IDX_OBJ                                                                                    6          5       1464          8
    IDX_OBJ                                                                                    7          5       1472          8
    IDX_OBJ                                                                                    8          5       1480          8
    IDX_OBJ                                                                                    9          5       1488          8
    IDX_OBJ                                                                                   10          5       1496          8
    IDX_OBJ                                                                                   11          5       1504          8
    IDX_OBJ                                                                                   12          5       1512          8
    IDX_OBJ                                                                                   13          5       1520          8
    IDX_OBJ                                                                                   14          5       1528          8
    IDX_OBJ                                                                                   15          5       1536          8
    IDX_OBJ                                                                                   16          5       1664        128
    BBED> set file 5 block 1430
    BBED> p seq_kcbh
    ub1 seq_kcbh                                @14       0x02
    
    BBED> m /x 04
    
    SQL>  select * from v$database_block_corruption;
    
         FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
    ---------- ---------- ---------- ------------------ ---------
             5       1430          1                  0 FRACTURED
    

    索引坏块出现,rebuild进行修复:

    SQL>  alter index zhuo.idx_obj rebuild;
    
    Index altered.
    

    alert日志里面还有报错:

    Mon Nov 02 10:35:26 2020
    Hex dump of (file 5, block 1430) in trace file /u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_ora_1630.trc
    Corrupt block relative dba: 0x01400596 (file 5, block 1430)
    Fractured block found during multiblock buffer read
    Data in bad block:
     type: 6 format: 2 rdba: 0x01400596
     last change scn: 0x0000.000ba8e3 seq: 0x4 flg: 0x04
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0xa8e30602
     check value in block header: 0x7655
     computed block checksum: 0x0
    Reading datafile '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf' for corruption at rdba: 0x01400596 (file 5, block 1430)
    Reread (file 5, block 1430) found same corrupt data (no logical check)
    Mon Nov 02 10:35:26 2020
    Corrupt Block Found
             TSN = 5, TSNAME = ZHUO
             RFN = 5, BLK = 1430, RDBA = 20972950
             OBJN = 80624, OBJD = 80624, OBJECT = IDX_OBJ, SUBOBJECT = 
             SEGMENT OWNER = ZHUO, SEGMENT TYPE = Index Segment
    Corrupt Block Found
             TSN = 5, TSNAME = ZHUO
             RFN = 5, BLK = 1430, RDBA = 20972950
             OBJN = 80624, OBJD = 80624, OBJECT = IDX_OBJ, SUBOBJECT = 
             SEGMENT OWNER = ZHUO, SEGMENT TYPE = Index Segment
    

    重新检测后,坏块没有消失

    SQL>  select * from v$database_block_corruption;
    
         FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
    ---------- ---------- ---------- ------------------ ---------
             5       1430          1                  0 FRACTURED
    

    我们采用rebuild online进行修复

    SQL> alter index zhuo.idx_obj rebuild online;
    
    Index altered.
    

    alert日志里面在没有报错。
    重新校验后,坏块消失:

    SQL>  select * from v$database_block_corruption;
    no rows
    

    尽量不要使用“ALTER INDEX … REBUILD”语句去重建非分区索引,因为该语句可能通过已经含有坏块的旧索引数据进行重建,而“ALTER INDEX … REBUILD ONLINE”和“ALTER INDEX … REBUILD PARTITION …”则不会通过已经含有坏块的旧索引数据进行重建,因此应以后两种语句方式进行索引重建。

    抢救数据之后的处理
    一旦将数据抢救完毕,例如重建新表,或者将数据export出来,则应进行如下事后处理:
    对被抢救数据进行备份
    保存重建表、索引的SQL脚本
    将诊断和处理过程中,Oracle技术支持人员需要的相关诊断信息加以保存
    删除10231事件或将清除SKIP_CORRPUPT标志
    对原来有问题的表进行RENAME或DROP操作。如果空间富裕,最好是RENAME操作。
    通过import操作等重建原表。
    重建相关索引、触发器等其它对象

    展开全文
  • Oracle坏块修复

    2019-12-15 17:46:10
    数据库坏块(corruption) 的类型可以按照坏块所属对象的不同,分为用户数据坏块,数据字典坏块,Undo坏块,控制文件坏块,Redo坏块,Lob坏块,index坏块等等;也可以按照坏块产生的原因,分为物理坏块(physical ...

     
    数据库坏块(corruption) 的类型可以按照坏块所属对象的不同,分为用户数据坏块,数据字典坏块,Undo坏块,控制文件坏块,Redo坏块,Lob坏块,index坏块等等;也可以按照坏块产生的原因,分为物理坏块(physical corruption)和逻辑坏块(logical corruption )。

    物理坏块(physical corruption)
    常见的物理坏块(Physical Block Corruptions)有块头和块尾信息不一致(Fractured/Incomplete),checksum值无效,数据块信息全部为0等情况,并且可能伴随错误ORA-1578和ORA-1110

    为了及时发现物理坏块和准确定位坏块产生的原因,oracle建议设置初始化参数DB_BLOCK_CHECKSUM=TYPICAL(默认值)。一般情况下,物理坏块是由于底层OS/disk系统错误/损坏,导致数据块被修改,数据块标志为坏块(corruption)。
    数据块的Checksum值无效是一种常见的物理坏块,当数据库初始化参数DB_BLOCK_CHECKSUM=TYPICAL(默认值)时,DBWR进程将数据块写入disk时会计算数据块的Checksum,并且将Checksum值记录在数据块的位置offset 16和17;当从disk读取该数据块时,oracle重新计算数据块的Checksum,并且与记录在数据块中的Checksum做异或运算(Xor),如果异或结果为非0,说明数据块被修改过,数据块为坏块(corruption)。

    一、坏块的产生原因:


    1.硬件问题


    Oracle进程在处理一个数据块时,首先将其读入物理内存空间,在处理完成后,再由特定进程将其写回磁盘;如果在这个过程中,出现内存故障,CPU计算失误,都会导致内存数据块的内容混乱,最后反映到写回磁盘的数据块内容有误。同样,如果存储子系统出现异常,数据块损坏也就随之出现了。


    2.操作系统BUG


    由于Oracle进程对数据块的读写,都是以操作系统内核调用(system call)的方式完成的,如果操作系统在内核调用存在问题,必然导致Oracle进程写入非法的内容。


    3.操作系统的I/O错误或缓冲问题


    4.内存或paging问题


    Oracle软件BUG Oracle软件特定版本上,可能出现导致数据块的内容出现异常BUG。


    5.非Oracle进程扰乱Oracle共享内存区域


    在当数据块的内容被读入主机的物理内存时,如果其他非Oracle进程,对Oracle使用的共享内存区域形成了扰乱,最终导致写回磁盘的数据块内容混乱。


    6.异常关机,掉电,终止服务


    异常关机,掉电,终止服务使进程异常终止,而破坏数据块的完整性,导致坏块产生。
    注:这也是为什么突然断电会导致数据库无法启动
    由上可见,坏块的形成原因复杂。当出现坏块时,为了找到确切的原因,需要大量的分析时间和排查操作,甚至需要多次重现才能找出根本原因。但当故障发生在生产系统上,我们为了减少停机时间,会尽快实施应急权变措施以保证系统的可用性,这样就破坏了故障现场,对根本原因的分析因而也更加困难了。

    二、坏块预防(检查)


    1.对于Oracle bug问题引起的物理坏块问题,Oracle会对这些BUG以严重(Noticable)问题标出(标记为*或+)相应的patch。
    2.使用 RMAN进行检查:
    RMAN> BACKUP CHECK LOGICAL VALIDATE DATAFILE n ; --可以检查数据文件是否包含坏块,同时并不产生实际的备份输出。
    3.使用dbv工具检查
    ANALYZE TABLE tablename VALIDATE STRUCTURE CASCADE
    它执行坏块的检查,但是不会标记坏块为corrupt,检测的结果保存在USER_DUMP_DEST目录下的用户trace文件中。
    dbv file=d:\oracle\oradata\mydb\RONLY.DBF blocksize=8192
    4.使用dbms包检查
    ①根据alert中的报错file_id和block_id查询对象
    SELECT tablespace_name, segment_type, owner, segment_name
    FROM dba_extents
    WHERE file_id = &fileid
    and &blockid between block_id AND block_id + blocks - 1;

    -If V$DATABASE_BLOCK_CORRUPTION contains rows please run this query to find the objects that contains the corrupted blocks:
    SELECT e.owner,
    e.segment_type,
    e.segment_name,
    e.partition_name,
    c.file#,
    greatest(e.block_id, c.block#) corr_start_block#,
    least(e.block_id + e.blocks - 1, c.block# + c.blocks - 1) corr_end_block#,
    least(e.block_id + e.blocks - 1, c.block# + c.blocks - 1) -
    greatest(e.block_id, c.block#) + 1 blocks_corrupted,
    null description
    FROM dba_extents e, v$database_block_corruption c
    WHERE e.file_id = c.file#
    AND e.block_id <= c.block# + c.blocks - 1
    AND e.block_id + e.blocks - 1 >= c.block#
    UNION
    SELECT s.owner,
    s.segment_type,
    s.segment_name,
    s.partition_name,
    c.file#,
    header_block corr_start_block#,
    header_block corr_end_block#,
    1 blocks_corrupted,
    'Segment Header' description
    FROM dba_segments s, v$database_block_corruption c
    WHERE s.header_file = c.file#
    AND s.header_block between c.block# and c.block# + c.blocks - 1
    UNION
    SELECT null owner,
    null segment_type,
    null segment_name,
    null partition_name,
    c.file#,
    greatest(f.block_id, c.block#) corr_start_block#,
    least(f.block_id + f.blocks - 1, c.block# + c.blocks - 1) corr_end_block#,
    least(f.block_id + f.blocks - 1, c.block# + c.blocks - 1) -
    greatest(f.block_id, c.block#) + 1 blocks_corrupted,
    'Free Block' description
    FROM dba_free_space f, v$database_block_corruption c
    WHERE f.file_id = c.file#
    AND f.block_id <= c.block# + c.blocks - 1
    AND f.block_id + f.blocks - 1 >= c.block#
    order by file#, corr_start_block#;

    ②给定一个表空间,并在此表空间下建立维修表:
    BEGIN
     DBMS_REPAIR.ADMIN_TABLES (
     TABLE_NAME => 'REPAIR_TABLE',
     TABLE_TYPE => dbms_repair.repair_table,
     ACTION => dbms_repair.create_action,
     TABLESPACE => '&tablespace_name');
    END;
    /
    ③对指定的<schema>.<object>检查并确认其中坏块(如果同时指定PARTITION_NAME也可以进行分区级别检查):
    set serveroutput on
    DECLARE num_corrupt INT;
    BEGIN
     num_corrupt := 0;
     DBMS_REPAIR.CHECK_OBJECT (
     SCHEMA_NAME => '&schema_name',
     OBJECT_NAME => '&object_name',
     REPAIR_TABLE_NAME => 'REPAIR_TABLE',
     corrupt_count => num_corrupt);
     DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR (num_corrupt));
    END;
    /

    5.利用exp工具导出整个数据库可以检测坏块
    对以下情况的坏块是检测不出来的:
    ①HWM以上的坏块是不会发现的
    ②索引中存在的坏块是不会发现的
    ③数据字典中的坏块是不会发现的


    三、修复方式


    1. 当前数据库初始化参数配置DB_BLOCK_CHECKSUM=TYPICAL,因此从disk读取数据块时校验checksum:

    SQL> show parameter DB_BLOCK_CHECKSUM
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    db_block_checksum                    string      TYPICAL

     

    2. 查询表dept时发现有坏块,报错信息ORA-1578和ORA-1110,坏块为file # 4, block # 133

    SQL> select * from dept;
     select * from dept
    *
    ERROR at line 1:
    ORA-01578: ORACLE data block corrupted (file # 4, block # 133)
    ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl/users01.dbf'

     

    3. 出现以上错误的同时在alert log中也有详细错误信息,这些错误信息说明数据块(file # 4, block # 133)损坏的原因是checksum无效。数据块中记录的checksum值为0x8167(这个值是上一次DBWR写入磁盘时计算的),读取数据块时重新计算得到的checksum是0x8122,checksum值异或运算(Xor)的结果是0x45 (computed block checksum)。由于两次checksum值不同(即异或结果为非0),说明数据块被修改过,数据块为坏块(corruption)。

    Alert log错误信息:

    Hex dump of (file 4, block 133) in trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_20892.trc
    Corrupt block relative dba: 0x01000085 (file 4, block 133)
    Bad check value found during multiblock buffer read  <<<<<<<<<<<<<< 说明坏块的原因是checksum无效
    Data in bad block:
     type: 6 format: 2 rdba: 0x01000085
     last change scn: 0x0000.0023d69a seq: 0x5 flg: 0x06
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0xd69a0605
     check value in block header: 0x8167   <<<<<<<<<<<<<< 数据块中记录的checksum值为0x8167
     computed block checksum: 0x45         <<<<<<<<<<<<<< 0x8167与0x8122异或运算(Xor)的结果是0x45
    Reading datafile '/u01/app/oracle/oradata/orcl/users01.dbf' for corruption at rdba: 0x01000085 (file 4, block 133)
    Reread (file 4, block 133) found same corrupt data (no logical check)
    Sun Mar 23 22:53:40 2014
    Corrupt Block Found
             TSN = 4, TSNAME = USERS
             RFN = 4, BLK = 133, RDBA = 16777349
             OBJN = 14343, OBJD = 14343, OBJECT = DEPT, SUBOBJECT = 
             SEGMENT OWNER = JAMES, SEGMENT TYPE = Table Segment         <<<<<<<<<<<<<< 坏块对应的object ID
    Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_20892.trc  (incident=182595):
    ORA-01578: ORACLE data block corrupted (file # 4, block # 133)
    ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl/users01.dbf'

    4.1 对应的orcl_ora_20892.trc中也有数据块的信息,其中数据块上记录的checksum值是0x8167(chkval)

    Block dump from disk:
    buffer tsn: 4 rdba: 0x01000085 (4/133)
    scn: 0x0000.0023d69a seq: 0x05 flg: 0x06 tail: 0xd69a0605
    frmt: 0x02 chkval: 0x8167 type: 0x06=trans data
    Hex dump of block: st=0, typ_found=1


    4.2 通过dd也查看数据块中记录的checksum值, offset 16,17 对应的是checksum值0x8167

    $ dd if=/u01/app/oracle/oradata/orcl/users01.dbf bs=8192 count=1 skip=133 of=/tmp/dd133.out
    
    $ od -x /tmp/dd133.out
    0000000 a206 0000 0085 0100 d69a 0023 0000 0605
    0000020 8167 0000 0001 0000 3807 0000 2fef 000c
    ^^^^

    解决方法:

    5. 修复数据坏块的方法可以通过备份恢复或者DBMS_REPAIR.SKIP_CORRUPT_BLOCKS跳过坏块。
     方法1 RMAN数据块恢复:
    首先要存在Rman的最新备份集,然后执行如下命令:

    RMAN>backup validate datafile 4;
    
    RMAN> run {blockrecover datafile 4 block 133;}
    
    SQL> select * from dept;
    
        DEPTNO DNAME          LOC
    ---------- -------------- -------------
            10 ACCOUNTING     DALIAN
            20 RESEARCH       DALLAS
            30 SALES          CHICAGO
            40 OPERATIONS     BOSTON

     

    方法2 bbed方法修改文件头


    逻辑坏块:


    alert日志报错:

    Reading datafile '/oradata/datafiles/oadb/oa01.dbf' for corruption at rdba: 0x016d4dd5 (file 5, block 2969045)
    Reread (file 5, block 2969045) found same corrupt data (no logical check)
    Tue Aug 18 10:53:51 2015
    Corrupt Block Found
            TSN = 6, TSNAME = OA
            RFN = 5, BLK = 2969045, RDBA = 23940565
            OBJN = 95690, OBJD = 95690, OBJECT = EDOC_BASE_WORKFLOW, SUBOBJECT = 
            SEGMENT OWNER = INSPUROA, SEGMENT TYPE = Table Segment
    Tue Aug 18 10:55:03 2015
    Hex dump of (file 5, block 2969045) in trace file /u01/app/oracle/diag/rdbms/oadb/oadb/trace/oadb_ora_4565.trc
    Corrupt block relative dba: 0x016d4dd5 (file 5, block 2969045)
    Bad header found during buffer read
    Data in bad block:
     type: 117 format: 0 rdba: 0x20206b73
     last change scn: 0x2020.20202020 seq: 0x20 flg: 0x20
     spare1: 0x64 spare2: 0x69 spare3: 0x0
     consistency value in tail: 0x4d240601
     check value in block header: 0x5f49
     block checksum disabled
    Reading datafile '/oradata/datafiles/oadb/oa01.dbf' for corruption at rdba: 0x016d4dd5 (file 5, block 2969045)
    Reread (file 5, block 2969045) found same corrupt data (no logical check)
    Tue Aug 18 10:55:03 2015
    Corrupt Block Found
            TSN = 6, TSNAME = OA
            RFN = 5, BLK = 2969045, RDBA = 23940565
            OBJN = 95690, OBJD = 95690, OBJECT = EDOC_BASE_WORKFLOW, SUBOBJECT = 
            SEGMENT OWNER = INSPUROA, SEGMENT TYPE = Table Segment
    Tue Aug 18 10:57:29 2015
    Hex dump of (file 5, block 2969045) in trace file /u01/app/oracle/diag/rdbms/oadb/oadb/trace/oadb_ora_21708.trc
    Corrupt block relative dba: 0x016d4dd5 (file 5, block 2969045)
    Bad header found during buffer read
    Data in bad block:
     type: 117 format: 0 rdba: 0x20206b73
     last change scn: 0x2020.20202020 seq: 0x20 flg: 0x20
     spare1: 0x64 spare2: 0x69 spare3: 0x0
     consistency value in tail: 0x4d240601
     check value in block header: 0x5f49
     block checksum disabled

    执行修复 根据报错信息

    方法1

    Reading datafile '/oradata/datafiles/oadb/oa01.dbf' for corruption at rdba: 0x016d4dd5 (file 5, block 2969045)
    Reread (file 5, block 2969045) found same corrupt data (no logical check)
    Corrupt Block Found
            TSN = 6, TSNAME = OA
            RFN = 5, BLK = 2969045, RDBA = 23940565
            OBJN = 95690, OBJD = 95690, OBJECT = EDOC_BASE_WORKFLOW, SUBOBJECT = 
            SEGMENT OWNER = INSPUROA, SEGMENT TYPE = Table Segment

    确定数据文件 datafile 5,oa01.dbf出现坏块现象
    1.查看坏块信息:

    SQL> select * from v$database_block_corruption;
    
        FILE#    BLOCK#    BLOCKS CORRUPTION_CHANGE# CORRUPTIO
    ---------- ---------- ---------- ------------------ ---------
            5    2969045          1                  0 CORRUPT

    确定坏块为2969045号,检查备份日志(增量,全量)是否完整备份
    2.检查备份datafile 5 是否完整

    RMAN> backup validate datafile 5;
    
    Starting backup at 18-AUG-15
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=982 device type=DISK
    channel ORA_DISK_1: starting full datafile backup set
    channel ORA_DISK_1: specifying datafile(s) in backup set
    input datafile file number=00005 name=/oradata/datafiles/oadb/oa01.dbf
    channel ORA_DISK_1: backup set complete, elapsed time: 00:05:35
    List of Datafiles
    =================
    File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
    ---- ------ -------------- ------------ --------------- ----------
    5    FAILED 0              1840        4190720        9484751217293
      File Name: /oradata/datafiles/oadb/oa01.dbf
      Block Type Blocks Failing Blocks Processed
      ---------- -------------- ----------------
      Data      0              2842014        
      Index      0              182983          
      Other      1              1163883       
    
    validate found one or more corrupt blocks
    See trace file /u01/app/oracle/diag/rdbms/oadb/oadb/trace/oadb_ora_13513.trc for details
    Finished backup at 18-AUG-15


    3.使用RMAN工具修复
    RMAN> blockrecover datafile 5 block 2969045;
    4.再次查询故障块信息:
    SQL> select * from v$database_block_corruption;

    no rows selected

    方法2 DBMS_REPAIR.SKIP_CORRUPT_BLOCKS跳过坏块,然后将dept表中的其他数据导出重建表
    设置多块读
    alter session set db_file_multiblock_read_count=1;

    对检查出的坏块,可选择性地进行标记:
    select BLOCK_ID, CORRUPT_TYPE, CORRUPT_DESCRIPTION
    from REPAIR_TABLE;
     
    REM Mark the identified blocks as corrupted ( Soft Corrupt - reference Note 1496934.1 )
    DECLARE num_fix INT;
    BEGIN
     num_fix := 0;
     DBMS_REPAIR.FIX_CORRUPT_BLOCKS (
     SCHEMA_NAME => '&schema_name',
     OBJECT_NAME=> '&object_name',
     OBJECT_TYPE => dbms_repair.table_object,
     REPAIR_TABLE_NAME => 'REPAIR_TABLE',
     FIX_COUNT=> num_fix);
     DBMS_OUTPUT.PUT_LINE('num fix: ' || to_char(num_fix));
    END;
    /
    在将来进行DML操作时,对坏块进行跳过处理:
    BEGIN
     DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
     SCHEMA_NAME => '&schema_name',
     OBJECT_NAME => '&object_name',
     OBJECT_TYPE => dbms_repair.table_object,
     FLAGS => dbms_repair.SKIP_FLAG);
    END;
    /
    注意:
    使用DBMS_REPAIR访问坏块后,INDEX scan可能会出现报错,碰到这类报错,你需要重建这些索引。如果是唯一索引,那么相同数据的重新插入可能会报ORA-1错误。
    如果在dbms_repair.SKIP_FLAG已经启用后,希望将跳块标记清除以重新访问坏块,可以在执行DBMS_REPAIR.SKIP_CORRUPT_BLOCKS时,使用dbms_repair.NOSKIP_FLAG进行参数设置。
    使用DBMS_REPAIR.SKIP_CORRUPT_BLOCKS来跳块仅能针对出现ORA-1578报错的那些坏块情况。如果是针对其它类型坏块,就需要额外执行ADMIN_TABLES, CHECK_OBJECT 和FIX_CORRUPT_BLOCKS来对坏块进行标记处理。
    在执行过SKIP_CORRUPT_BLOCKS后,如果需要将表中的坏块进行清理,可以对表使用”alter table <name> MOVE”,而不是重建或truncate掉它。然后使用dbms_repair.NOSKIP_FLAG去除掉跳块标记即可。注意,坏块中的数据会被丢失掉。
     
    SQL> alter session set db_file_multiblock_read_count=1;
     
    SQL> create table dept_new as select * from dept;

    展开全文

空空如也

空空如也

1 2 3 4 5 ... 20
收藏数 5,870
精华内容 2,348
热门标签
关键字:

oracle坏块修复