精华内容
下载资源
问答
  • Oracle数据库服务器CPU持续100%之等待事件asynch descriptor resize
    2017年5月11日下午13点,一地市技术反应服务器响应慢,CPU长时间100%如图:

    经观察数据库服务器资源监视器,发现是oracle进程导致的,登录数据库查询数据库等待事件,发现asynch descriptor resize居高不下

    按照等待事件类型查询对应的会话信息如下,有30多个会话同时执行同一条sql语句:b7rng1bdrzzkq

    查询sql语句b7rng1bdrzzkq对应的sql文本如下:

    b7rng1bdrzzkq的执行计划如下:
    PLAN_TABLE_OUTPUT 
    --------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID b7rng1bdrzzkq, child number 0 
    ------------------------------------- 
    select zdb.HOSPITAL_ID, 
    zdh.HOSPITAL_NAME, 
    zdb.hisid, 
    zdb.HS_PATIENT_NAME as 
    PATIENT_NAME, 
    hs_area_code as 
    DEPTNAME, 
    admission_disease_name as 
    RULE_NAME, 
    round(zdb.TOTAL_COST/10000,4) as Total_Costs 
    from zk_dw_bill zdb, zk_dim_hospital zdh 
    where zdb.HOSPITAL_ID = zdh.HOSPITAL_ID_SZ 
    and exists 
    (select 1 from 
    zk_dw_billdetail zdbd where zdb.HISID = zdbd.pid 
    AND zdbd.item_date >= 
    to_date('2017/5/11 00:00:00', 'yyyy/mm/dd HH24:mi:ss') 
    AND zdbd.item_date <= 
    to_date('2017/5/11 23:59:59', 'yyyy/mm/dd HH24: 
    Plan hash value: 32811706 
    ------------------------------------------------------------------------------------------------------------------------------- 
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | 
    ------------------------------------------------------------------------------------------------------------------------------- 
    | 0 | SELECT STATEMENT | | | | 13 (100)| | | | 
    | 1 | SORT ORDER BY | | 1 | 424 | 13 (24)| 00:00:01 | | | 
    | 2 | NESTED LOOPS | | | | | | | | 
    | 3 | NESTED LOOPS | | 1 | 424 | 12 (17)| 00:00:01 | | | 
    | 4 | MERGE JOIN CARTESIAN | | 1 | 362 | 12 (17)| 00:00:01 | | | 
    | 5 | MERGE JOIN CARTESIAN | | 1 | 296 | 6 (17)| 00:00:01 | | | 
    | 6 | SORT UNIQUE | | 1 | 261 | 2 (0)| 00:00:01 | | | 
    | 7 | PARTITION RANGE SINGLE | | 1 | 261 | 2 (0)| 00:00:01 | 862 | 862 | 
    |* 8 | TABLE ACCESS FULL | CLAIMDETAILHOSPITAL | 1 | 261 | 2 (0)| 00:00:01 | 862 | 862 | 
    | 9 | BUFFER SORT | | 31 | 1085 | 4 (25)| 00:00:01 | | | 
    | 10 | TABLE ACCESS FULL | DW_ZD_HOSPITAL_YB | 31 | 1085 | 3 (0)| 00:00:01 | | | 
    | 11 | BUFFER SORT | | 1 | 66 | 9 (23)| 00:00:01 | | | 
    | 12 | VIEW | VW_SQ_1 | 1 | 66 | 6 (17)| 00:00:01 | | | 
    | 13 | HASH UNIQUE | | 1 | 2083 | | | | | 
    |* 14 | HASH JOIN | | 1 | 2083 | 6 (17)| 00:00:01 | | | 
    | 15 | PARTITION RANGE SINGLE | | 1 | 2077 | 2 (0)| 00:00:01 | 862 | 862 | 
    |* 16 | TABLE ACCESS FULL | AUDITRESULT4HOSPITAL | 1 | 2077 | 2 (0)| 00:00:01 | 862 | 862 | 
    | 17 | TABLE ACCESS FULL | GZ_LIST | 29 | 174 | 3 (0)| 00:00:01 | | | 
    |* 18 | INDEX UNIQUE SCAN | PK_CLAIMHOSPITAL_HISID | 1 | | 0 (0)| | | | 
    |* 19 | TABLE ACCESS BY GLOBAL INDEX ROWID| CLAIMHOSPITAL | 1 | 62 | 0 (0)| | ROWID | ROWID | 
    ------------------------------------------------------------------------------------------------------------------------------- 
    Query Block Name / Object Alias (identified by operation id): 
    ------------------------------------------------------------- 
    1 - SEL$817CBF02 
    8 - SEL$817CBF02 / CLAIMDETAILHOSPITAL@SEL$5 
    10 - SEL$817CBF02 / DW_ZD_HOSPITAL_YB@SEL$3 
    12 - SEL$A7D54A5B / VW_SQ_1@SEL$E4B10583 
    13 - SEL$A7D54A5B 
    16 - SEL$A7D54A5B / AA@SEL$7 
    17 - SEL$A7D54A5B / GL@SEL$7 
    18 - SEL$817CBF02 / CLAIMHOSPITAL@SEL$2 
    19 - SEL$817CBF02 / CLAIMHOSPITAL@SEL$2 
    Outline Data 
    ------------- 
    /*+ 
    BEGIN_OUTLINE_DATA 
    IGNORE_OPTIM_EMBEDDED_HINTS 
    OPTIMIZER_FEATURES_ENABLE('11.2.0.1') 
    DB_VERSION('11.2.0.1') 
    OPT_PARAM('_optimizer_use_feedback' 'false') 
    ALL_ROWS 
    OUTLINE_LEAF(@"SEL$A7D54A5B") 
    OUTLINE_LEAF(@"SEL$817CBF02") 
    UNNEST(@"SEL$68B588A0") 
    UNNEST(@"SEL$7286615E") 
    OUTLINE(@"SEL$68B588A0") 
    MERGE(@"SEL$7") 
    OUTLINE(@"SEL$E4B10583") 
    OUTLINE(@"SEL$7286615E") 
    MERGE(@"SEL$5") 
    OUTLINE(@"SEL$6") 
    OUTLINE(@"SEL$7") 
    OUTLINE(@"SEL$5428C7F1") 
    MERGE(@"SEL$2") 
    MERGE(@"SEL$3") 
    OUTLINE(@"SEL$4") 
    OUTLINE(@"SEL$5") 
    OUTLINE(@"SEL$1") 
    OUTLINE(@"SEL$2") 
    OUTLINE(@"SEL$3") 
    FULL(@"SEL$817CBF02" "CLAIMDETAILHOSPITAL"@"SEL$5") 
    FULL(@"SEL$817CBF02" "DW_ZD_HOSPITAL_YB"@"SEL$3") 
    NO_ACCESS(@"SEL$817CBF02" "VW_SQ_1"@"SEL$E4B10583") 
    INDEX(@"SEL$817CBF02" "CLAIMHOSPITAL"@"SEL$2" ("CLAIMHOSPITAL"."HISID")) 
    LEADING(@"SEL$817CBF02" "CLAIMDETAILHOSPITAL"@"SEL$5" "DW_ZD_HOSPITAL_YB"@"SEL$3" "VW_SQ_1"@"SEL$E4B10583" 
    "CLAIMHOSPITAL"@"SEL$2") 
    USE_MERGE_CARTESIAN(@"SEL$817CBF02" "DW_ZD_HOSPITAL_YB"@"SEL$3") 
    USE_MERGE(@"SEL$817CBF02" "VW_SQ_1"@"SEL$E4B10583") 
    USE_NL(@"SEL$817CBF02" "CLAIMHOSPITAL"@"SEL$2") 
    NLJ_BATCHING(@"SEL$817CBF02" "CLAIMHOSPITAL"@"SEL$2") 
    FULL(@"SEL$A7D54A5B" "AA"@"SEL$7") 
    FULL(@"SEL$A7D54A5B" "GL"@"SEL$7") 
    LEADING(@"SEL$A7D54A5B" "AA"@"SEL$7" "GL"@"SEL$7") 
    USE_HASH(@"SEL$A7D54A5B" "GL"@"SEL$7") 
    USE_HASH_AGGREGATION(@"SEL$A7D54A5B") 
    END_OUTLINE_DATA 
    */ 

    Predicate Information (identified by operation id): 
    --------------------------------------------------- 
    8 - filter("ITEM_DATE"<=TO_DATE(' 2017-05-11 23:59:59', 'syyyy-mm-dd hh24:mi:ss')) 
    14 - access("GL"."ID"=TO_NUMBER("AA"."RULECODE")) 
    16 - filter("AA"."ITEM_DATE"<=TO_DATE(' 2017-05-11 23:59:59', 'syyyy-mm-dd hh24:mi:ss')) 
    18 - access("HISID"="PID") 
    filter("ITEM_1"="HISID") 
    19 - filter("HOSPITAL_ID"="HOSPITAL_ID_SZ") 
    Column Projection Information (identified by operation id): 
    ----------------------------------------------------------- 
    1 - (#keys=1) INTERNAL_FUNCTION("SETTLE_DATE")[7], "HOSPITAL_ID"[VARCHAR2,128], "HOSPITAL_NAME"[VARCHAR2,128], 
    "HISID"[VARCHAR2,128], "HS_PATIENT_NAME"[VARCHAR2,200], "HS_AREA_CODE"[VARCHAR2,100], 
    "ADMISSION_DISEASE_NAME"[VARCHAR2,128], ROUND("TOTAL_COST"/10000,4)[22] 
    2 - "HOSPITAL_NAME"[VARCHAR2,128], "HISID"[VARCHAR2,128], "HOSPITAL_ID"[VARCHAR2,128], 
    "ADMISSION_DISEASE_NAME"[VARCHAR2,128], "HS_AREA_CODE"[VARCHAR2,100], "TOTAL_COST"[NUMBER,22], 
    "HS_PATIENT_NAME"[VARCHAR2,200], "SETTLE_DATE"[DATE,7] 
    3 - "HOSPITAL_ID_SZ"[VARCHAR2,128], "HOSPITAL_NAME"[VARCHAR2,128], "CLAIMHOSPITAL".ROWID[ROWID,10], 
    "HISID"[VARCHAR2,128] 
    4 - "PID"[VARCHAR2,500], "HOSPITAL_ID_SZ"[VARCHAR2,128], "HOSPITAL_NAME"[VARCHAR2,128], "ITEM_1"[VARCHAR2,128] 
    5 - "PID"[VARCHAR2,500], "HOSPITAL_ID_SZ"[VARCHAR2,128], "HOSPITAL_NAME"[VARCHAR2,128] 
    6 - (#keys=1) "PID"[VARCHAR2,500] 
    7 - "PID"[VARCHAR2,500] 
    8 - "PID"[VARCHAR2,500] 
    9 - (#keys=0) "HOSPITAL_ID_SZ"[VARCHAR2,128], "HOSPITAL_NAME"[VARCHAR2,128] 
    10 - "HOSPITAL_ID_SZ"[VARCHAR2,128], "HOSPITAL_NAME"[VARCHAR2,128] 
    11 - (#keys=0) "ITEM_1"[VARCHAR2,128] 
    12 - "ITEM_1"[VARCHAR2,128] 
    13 - "AA"."CLAIM_ID"[VARCHAR2,128] 
    14 - (#keys=1) "AA"."CLAIM_ID"[VARCHAR2,128] 
    15 - "AA"."CLAIM_ID"[VARCHAR2,128], "AA"."RULECODE"[VARCHAR2,4000] 
    16 - "AA"."CLAIM_ID"[VARCHAR2,128], "AA"."RULECODE"[VARCHAR2,4000] 
    17 - "GL"."ID"[NUMBER,22] 
    18 - "CLAIMHOSPITAL".ROWID[ROWID,10], "HISID"[VARCHAR2,128] 
    19 - "HOSPITAL_ID"[VARCHAR2,128], "ADMISSION_DISEASE_NAME"[VARCHAR2,128], "HS_AREA_CODE"[VARCHAR2,100], 
    "TOTAL_COST"[NUMBER,22], "HS_PATIENT_NAME"[VARCHAR2,200], "SETTLE_DATE"[DATE,7] 
    已选择144行。
    由于服务器CPU100%,响应极慢,由于是select查询语句,与地市技术人员沟通后,决定查杀等待事件asynch descriptor resize对应的会话进程:

    如图所示,会话查杀后,服务器CPU恢复正常水平。
    后期处理:
    需要开发或实施测试人员优化对应的sql语句
    网上有说关闭操作系统磁盘异步IO,由于时间关系来不及验证:alter system set disk_asynch_io=false scope=spfile;
    展开全文
  • 数据库和文件夹中读取图片并且resize container c,cImage; str PicfileName; int w,h,tmp; real bmpW...

     

    从数据库和文件夹中读取图片并且resize

    container           c,cImage;

        str                 PicfileName;

        int                 w,h,tmp;

        real                bmpW = bmp1.widthValue(),bmpH = bmp1.heightValue();

        image               img = new image();

        Act_DesignNoTable   Act_DesignNoTable;

        DocuImgS            DocuImgS;

        binData             Bin = new BinData();

        ;

     

        if (!_itemid && !_ACT_DesignNo)

        return c;

        PicfileName = DocuRef::NJ_findRefId_Sketch(TableNum(Act_DesignNoTable),

        Act_DesignNoTable::find(InventTable::find(_itemid).ACT_DesignNo).RecId).completeFilename();

        if (winapi::fileExists(PicfileName))

        {

            img.loadImage(PicfileName);

     

        }

         else

         {

            cImage = DocuImgS::getBitmap(tablenum(Act_DesignNoTable), Act_DesignNoTable::find(_ACT_DesignNo).RecId);

            if (conlen(cImage) > 0)

                img.setData(cImage);

            else

                return c;

         }

            w = img.width();

            h = img.height();

            tmp = img.size();

            if (w>h)

            {

                  img.resize(138,138/(w/h),interpolationMode::InterpolationModeHighQuality);

            }

            else

            {

                  img.resize(138*(w/h),138,interpolationMode::InterpolationModeHighQuality);

            }

            return img.getData();

    -----------------------------

    date2str(ACT_InvoiceTable.InvoiceDate,123,2,1,3,1,4) +"\n";

    转载于:https://www.cnblogs.com/perock/archive/2011/07/28/2119110.html

    展开全文
  • 问题是aix上10.2.0.5...后续经过分析排查发现原因是由于使用sga自动管理,SGA各组件频繁的resize,导致数据库出现大量SGA: allocation forcing component growth及cursor: pin S wait on X等待,数据库hang,表现...

    问题是aix上10.2.0.5版本RAC,用户发现应用慢,登陆数据库进行查看时发现RAC其中一个节点的sqlplus登陆时一直hang住无法登陆;后续经过分析排查发现原因是由于使用sga自动管理,SGA各组件频繁的resize,导致数据库出现大量SGA: allocation forcing component growth及cursor: pin S wait on X等待,数据库hang,表现出的故障现象就成了sqlplus客户端hang住无法正常登陆。

     介绍下此问题的排查步骤思路: 

    1.首选接到反馈是sqlplus / as sysdba登陆hang住无法登陆,在了解到是在AIX系统上后,第一反应是检查AIX上sqlplus hang的几个常见问题:用了NFS挂载在/根目录下但是NFS出了问题、/etc/resolv.conf中写了DNS出现了问题,同时/etc/netsvc.conf(LINUX是/etc/nsswitch.conf)中的首选是DNS时等,简单的检查后排除了这些问题;检查alert日志,也无明显报错。 

    2.通过集群命令crs_stat -t查看集群资源,均显示正常。检查主机负载(cpu/内存swap/io),无明显异常,监听正常,应用到数据库的连接(ps -ef|grep LOCAL=NO|wc -l)仍在。 

    3.在简单的排查了常见的问题仍没有头绪之后,准备使用操作系统层面的trace及数据库层面的systemstate/hang命令工具来分析问题;并最终通过数据库的systemstate trace分析来确定出故障原因,进而得出临时处理方法(已经hang住只能kill进程或者sqlplus -prelim / as sysdba进去SHUTDOWN ABORT了)及后续解决方法(SGA自动管理相关的设置,在最后); 

    如下是对收集的TRACE的分析过程: 

    1.如下是操作系统的trace分析部分: AIX上的操作系统层面trace方法:AIX5L:$ truss -aefo <output file> <executable> truss -aefo /tmp/sqlplus_trace.log sqlplus / as sysdba

    截取部分重要的trace;
    877354: accessx("/etc/group", 04, 0)                    = 0
    877354: statx("/etc/group", 0x0FFFFFFFFFFF4090, 176, 0) = 0
    877354: getdomainname(0x0FFFFFFFFFFF4FC8, 256)          = 0
    877354: close(12)                                       = 0
    877354: getgidx(2)                                      = 101
    877354: open("/oracle/admin/cmdb/adump/cmdb1_ora_877354_1.aud", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) = 12
    877354: kwrite(12, " A u d i t   f i l e  ", 11)        = 11
    ……省略部分输出
    877354: kwrite(12, " F r i   J u l     7   1".., 25)    = 25
    877354: kwrite(12, " L E N G T H   :   ' 1 6".., 14)    = 14
    877354: kwrite(12, "\n", 1)                             = 1
    877354: kwrite(12, " A C T I O N   : [ 7 ]  ".., 159)   = 159
    877354: kwrite(12, "\n", 1)                             = 1
    877354: times(0x0FFFFFFFFFFF5E30)                       = 1191467758
    877354: times(0x0FFFFFFFFFFF5B10)                       = 1191467758
    877354: times(0x0FFFFFFFFFFF5D30)                       = 1191467758
    …………省略部分输出
    877354: times(0x0FFFFFFFFFFF4E10)                       = 1191467759
    877354: times(0x0FFFFFFFFFFF45F0)                       = 1191467759
    877354: times(0x0FFFFFFFFFFF45F0)                       = 1191467759
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    877354: thread_wait(10)                                 = 1
    --后面均是877354:   thread_wait(10)                                 = 1这种输出

    thread_wait(10)这个函数虽然不懂,从字面意思上来猜是进程等待,但是在等待什么呢,OS层面的资源(CPU/IO/MEMORY)目前未发现明显瓶颈,只能继续从ORACLE数据库的进程状态中查找原因了。

    2.数据库的systemstate/hang输出

    使用命令:
    Hanganalyze:
    sqlplus -prelim / as sysdba
    SQL>oradebug setmypid
    SQL>oradebug unlimit
    SQL>oradebug -g all hanganalyze 3
    Wait for 30 seconds
    SQL>oradebug -g all hanganalyze 3
    SQL>exit
    
    Systemstate dump:
    sqlplus -prelim / as sysdba
    SQL>oradebug setmypid
    SQL>oradebug unlimit
    SQL>oradebug -g all dump systemstate 10
    Wait for 30 seconds
    SQL>oradebug -g all dump systemstate 10
    SQL>oradebug tracefile_name
    SQL>exit

    对于Systemstate命令产生的TRACE文件,可以使用MOS官方提供的AWK脚本ass109.awk来进行快速分析,找出产生阻塞的进程及各进程的等待事件。

    ]$ awk -f ass109.awk cmdb1_ora_791490.trc 
    Starting Systemstate 1
    ..............................................................................
    Ass.Awk Version 1.0.9 - Processing cmdb1_ora_791490.trc
    System State 1
    ~~~~~~~~~~~~~~~~
    1:                                      
    2:  waiting for 'pmon timer'            
    3:  waiting for 'DIAG idle wait'        
    4:  waiting for 'rdbms ipc message'     
    5:  waiting for 'rdbms ipc message'     
    6:  waiting for 'ges remote message'    
    7:  waiting for 'gcs remote message'    
    8:  waiting for 'gcs remote message'    
    9:  waiting for 'gcs remote message'    
    10: waiting for 'gcs remote message'    
    11: last wait for 'rdbms ipc message'   
    12: waiting for 'rdbms ipc message'     
    13: waiting for 'rdbms ipc message'     
    14: waiting for 'rdbms ipc message'     
    15: waiting for 'rdbms ipc message'     
    16: waiting for 'SGA: allocation forcing component growth'   ============>>>>>>>>>>
    17: waiting for 'cursor: pin S wait on X'      =============>>>>>>>>>>
    18: waiting for 'SGA: allocation forcing component growth' 
    19: waiting for 'SGA: allocation forcing component growth' 
    20: waiting for 'rdbms ipc message'     
    21:                                     
    22:                                     
    23: waiting for 'rdbms ipc message'     
    24: waiting for 'SQL*Net message from client' 
         Cmd: Insert
    25: waiting for 'SQL*Net message from client' 
         Cmd: Insert
    26: waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    27: waiting for 'SQL*Net message from client' 
    28: waiting for 'cursor: pin S wait on X' 
    29: waiting for 'SQL*Net message from client' 
    30: waiting for 'SQL*Net message from client' 
         Cmd: Insert
    31: waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    32: waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    …………省略部分类似输出
    54: waiting for 'SQL*Net message from client' 
    55: waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    56: waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    57: waiting for 'SQL*Net message from client' 
    58: waiting for 'SQL*Net message from client' 
         Cmd: Delete
    59: waiting for 'SQL*Net message from client' 
    60: waiting for 'SQL*Net message from client' 
    61: last wait for 'SGA: allocation forcing component growth' 
    ………… 
    100:waiting for 'SGA: allocation forcing component growth' 
    101:waiting for 'SQL*Net message from client' 
    102:waiting for 'SGA: allocation forcing component growth' 
    ……省略部分类似输出
    119:waiting for 'cursor: pin S wait on X' 
    120:waiting for 'cursor: pin S wait on X' 
    121:waiting for 'SGA: allocation forcing component growth' 
    122:waiting for 'SQL*Net message from client' 
    123:waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    124:waiting for 'cursor: pin S wait on X' 
    125:waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    126:waiting for 'SGA: allocation forcing component growth' 
    127:waiting for 'SQL*Net message from client' 
    128:waiting for 'SQL*Net message from client' 
         Cmd: Insert
    129:waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    ………… 省略部分类似输出
    321:waiting for 'cursor: pin S wait on X' 
    322:waiting for 'SGA: allocation forcing component growth' 
         Cmd: Select
    323:waiting for 'Streams AQ: qmn slave idle wait' 
    324:waiting for 'cursor: pin S wait on X' 
    325:waiting for 'cursor: pin S wait on X' 
    326:waiting for 'cursor: pin S wait on X' 
    327:waiting for 'cursor: pin S wait on X' 
    328:waiting for 'cursor: pin S wait on X' 
    329:waiting for 'cursor: pin S wait on X' 
    330:waiting for 'cursor: pin S wait on X' 
    331:waiting for 'cursor: pin S wait on X' 
    332:waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    333:waiting for 'SGA: allocation forcing component growth' 
         Cmd: Select
    334:waiting for 'cursor: pin S wait on X' 
    335:waiting for 'cursor: pin S wait on X' 
    336:waiting for 'cursor: pin S wait on X' 
    337:waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    338:waiting for 'cursor: pin S wait on X' 
    …………
    376:waiting for 'cursor: pin S wait on X' 
    532:waiting for 'SQL*Net message from client' 
    533:waiting for 'SQL*Net message from client' 
    534:waiting for 'cursor: pin S wait on X' 
         Cmd: Select
    
    NO BLOCKING PROCESSES FOUND   ===========>>>>>>>>>从SYSTEMSDUMP中未发现阻塞进程
    
    554983 Lines Processed.

    从分析来看,是存在大量的cursor: pin S wait on X及SGA: allocation forcing component growth异常等待事件,对SGA内存管理类问题有过研究的相信到这里已经基本可以推断出此次引起系统hang的原因了,具体在后面总结部分再展开。 查看hang trace: 从hang trace中也没有发现有直接的 hang住的进程,

    *** 2017-07-07 19:00:26.013
    ==============
    HANG ANALYSIS: level 3 syslevel 1
    ==============
    session (kjzddbx) switches to a new action
    PORADATA TRACEFILE /oracle/admin/cmdb/bdump/cmdb1_diag_95122.trc
    PORADEBUG END ORIGINATING INST:1 SERIAL:0 PID:762600
    ********************************************************************
    Found 22 objects waiting for <cnode/sid/sess_srno/proc_ptr/ospid/wait_event>
        <0/1096/1/0x93708b8/176578/No Wait>
    Open chains found:
    Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
        <0/1088/1/0xa3bcc48/250860/cursor: pin S wait on X>
     -- <1/1090/1/enq: DR - contention>
    Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
        <0/1096/1/0x93708b8/176578/No Wait>
     -- <0/571/33923/0xf3cbf88/831752/SGA: allocation forcing componen>
    Other chains found:
    Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
        <0/516/51097/0xa3dd828/668272/cursor: pin S wait on X>
    Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
        <0/517/11143/0x937a778/430216/cursor: pin S wait on X>
    Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
        <0/519/1603/0x83a8210/860644/cursor: pin S wait on X>
    ……………………
    [2193]/1/1089/1/0x9560e50//IGN/1291/1292//none
    [2194]/1/1090/1/0x8565b58//NLEAF/1293/1294/[1087]/none
    [2195]/1/1091/1/0xa5a5cb8//IGN/1295/1296//none
    …………
    [2206]/1/1102/1/0x8569bc0//IGN/1317/1318//none
    [2207]/1/1103/1/0xa5a9d20//IGN/1319/1320//none
    [2208]/1/1104/1/0xf594508//SINGLE_NODE_NW/1321/1322//none
    [2209]/1/1105/1/0x9566430//IGN/1323/1324//none
    session (kjzddbx) switches to a new action
    *** 2017-07-07 19:00:34.443
    ====================
    END OF HANG ANALYSIS
    ====================

    从上面trace分析来看,数据库出现大量的cursor: pin S wait on X及SGA: allocation forcing component growth异常等待事件,数据库hang住。 根据已有信息来分析一下,首先是产生cursor: pin S wait on X及SGA: allocation forcing component growth的原因,可能是业务压力变大、业务变化等因素,导致对SGA内存组件(主要是buffer cache/shared_pool)的需求发生变化后发生了resize;

    在SGA自动管理时,各组件resize时会使SGA内存类似冻结,同时会出现处于resize中间状态的KGH: NO ACCESS内存,经常会引起SQL解析的异常(如常见的ORA-04031问题、Cursor: Pin S Wait On X', 'Library Cache Lock' And "Latch: Shared Pool"等)。由于这个故障不是当前在维护的客户,所以后续也没有在重启数据库后收集故障时间段的AWR/ASH来分析更深层次的原因了,对于希望从技术角度来学习、深究问题根源来说,算是一个小小遗憾了;

    好在从对trace的分析后提出的对SGA自动管理的相关设置建议,基本可以规避SGA resize引起的问题了。 对于SGA自动管理中如何规避SGA中各组件resize对系统稳定进行的影响,MOS上官方的建议是: 

    1.首先设置合理的 SGA_TARGET和SGA_MAX_SIZE值,可以参考AWR中或视图V$SGA_TARGET_ADVICE中的建议值,甚至有必要时可以关闭SGA自动管理。

     2.设置合理的shared_pool_size及db_cache_size;默认在SGA自动管理时是不需要设置这两个参数值的;通过设置这两个参数值,会使这两个组件在SGA自动管理时不会低于设置值,也即设置的是这两个组件的最小值。如果SGA大小去除这两个参数后还有调整余地,这两个组件如果需要可以继续扩张;可以减少RESIZE的发生及避免RESIZE过多;

     3.设置隐含参数:alter system set“_memory_broker_stat_interval”= 999;(单位秒) 减少resize操作的次数,即最多999秒resize一次,此参数默认设置为30秒,RAC需要在所有节点上设置。 

    4.调整应用减少SAL解析,包括硬解析、软解析。 5.打补丁解决一些与此相关的已知BUG或升级到较新的数据库版本。

    可参考的MOS文档如下:
    High Waits for "SGA: allocation forcing component growth" and Slow Database Performance in 10g (文档 ID 1170814.1)
    High 'Cursor: Pin S Wait On X', 'Library Cache Lock' And "Latch: Shared Pool" Waits due to Shared Pool/Buffer Cache Resize Activity (文档 ID 742599.1)
     Troubleshooting 'cursor: pin S wait on X' waits. (文档 ID 1349387.1)
    展开全文
  • Resize 的作用是调整 instance 的 vCPU、内存和磁盘资源。Instance 需要多少资源是定义在 flavor 中的,resize 操作是通过为 instance 选择新的 flavor 来调整资源的分配。 在使用nova resize去修改虚拟机的...
    Resize 的作用是调整 instance 的 vCPU、内存和磁盘资源。Instance 需要多少资源是定义在 flavor 中的,resize 操作是通过为 instance 选择新的 flavor 来调整资源的分配。

    在使用nova resize去修改虚拟机的配置的时候,实际调用的是migrate的代码。

         为什么需要在resize的时候进行迁移?要知道迁移过程涉及到底层网络、磁盘迁移、调度算法等,这个过程是相当复杂的 。
         我认为之所以会进行迁移,面对的场景一般是用户增加配置(disk等),这个时候出于OpenStack的一个调度策略,会进行nova-schedule来选择计算节点中最能够满足的一台机器进行resize(所以也有可能调度到本机)

    在默认情况下,执行resize的时候,首先会进行磁盘disk镜像和后端镜像进行一个合并成raw,然后转换成qcow2的过程。而且会进行迁移的动作。整个过程会相当的消耗时间,而我们一般只需要修改cpu和mem的大小,并不需要去合并镜像,更不需要去做迁移,所以有时觉得迁移是不需要的。

    resize与migrate底层接口一致,假若前端传入了新的flavor,则是resize,该新的flavor传入底层。

         迁移传入底层的flavor则为自身实例相同的flavor。底层根据传入进来的flavor参数走相同的逻辑。resize与migrate的区别是在迁移的同时,是否改变虚拟机的flavor。

         resize底层与冷迁移共用接口,因此迁移过程中会关闭实例。(注:在confirm_resize过程则不会关闭实例)实质是:当resize完成,新虚拟机已经迁移完毕,且能对外提供服务。 可以通过以下的源代码跟踪对应的流程

    @wrap_check_policy
    @check_instance_lock
    @check_instance_cell
    @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED])
    def resize(self, context, instance, flavor_id=None, clean_shutdown=True,
               **extra_instance_updates):
        """Resize (ie, migrate) a running instance.
        If flavor_id is None, the process is considered a migration, keeping
        the original flavor_id. If flavor_id is not None, the instance should
        be migrated to a new host and resized to the new flavor_id.
        """
        self._check_auto_disk_config(instance, **extra_instance_updates)
        current_instance_type = instance.get_flavor()
        # If flavor_id is not provided, only migrate the instance.
        #通过flavor_id获取虚拟机类型,加入为空,新类型为原来的flavor
        if not flavor_id:
            LOG.debug("flavor_id is None. Assuming migration.",
                      instance=instance)
            new_instance_type = current_instance_type
        else:
            new_instance_type = flavors.get_flavor_by_flavor_id(
                    flavor_id, read_deleted="no")
            if (new_instance_type.get('root_gb') == 0 and
                current_instance_type.get('root_gb') != 0):
                reason = _('Resize to zero disk flavor is not allowed.')
                raise exception.CannotResizeDisk(reason=reason)
        if not new_instance_type:
            raise exception.FlavorNotFound(flavor_id=flavor_id)
        current_instance_type_name = current_instance_type['name']
        new_instance_type_name = new_instance_type['name']
        LOG.debug("Old instance type %(current_instance_type_name)s, "
                  " new instance type %(new_instance_type_name)s",
                  {'current_instance_type_name': current_instance_type_name,
                   'new_instance_type_name': new_instance_type_name},
                  instance=instance)
        same_instance_type = (current_instance_type['id'] ==
                              new_instance_type['id'])
        # NOTE(sirp): We don't want to force a customer to change their flavor
        # when Ops is migrating off of a failed host.
        if not same_instance_type and new_instance_type.get('disabled'):
            raise exception.FlavorNotFound(flavor_id=flavor_id)
        if same_instance_type and flavor_id and self.cell_type != 'compute':
            raise exception.CannotResizeToSameFlavor()
        # ensure there is sufficient headroom for upsizes
        if flavor_id:
         #计算resize所需要的资源配额,主要统计vCpu和mem
            deltas = self._upsize_quota_delta(context, new_instance_type,
                                              current_instance_type)
            try:
                quotas = self._reserve_quota_delta(context, deltas, instance)
            except exception.OverQuota as exc:
                quotas = exc.kwargs['quotas']
                overs = exc.kwargs['overs']
                usages = exc.kwargs['usages']
                headroom = self._get_headroom(quotas, usages, deltas)
                resource = overs[0]
                used = quotas[resource] - headroom[resource]
                total_allowed = used + headroom[resource]
                overs = ','.join(overs)
                LOG.warning(_LW("%(overs)s quota exceeded for %(pid)s,"
                                " tried to resize instance."),
                            {'overs': overs, 'pid': context.project_id})
                raise exception.TooManyInstances(overs=overs,
                                                 req=deltas[resource],
                                                 used=used,
                                                 allowed=total_allowed,
                                                 resource=resource)
        else:
            quotas = objects.Quotas(context=context)
        instance.task_state = task_states.RESIZE_PREP
        instance.progress = 0
        instance.update(extra_instance_updates)
        instance.save(expected_task_state=[None])
        filter_properties = {'ignore_hosts': []}
        
        # 判断是否可以resize到本机
        if not CONF.allow_resize_to_same_host:
            filter_properties['ignore_hosts'].append(instance.host)
        # Here when flavor_id is None, the process is considered as migrate.
        if (not flavor_id and not CONF.allow_migrate_to_same_host):
            filter_properties['ignore_hosts'].append(instance.host)
        if self.cell_type == 'api':
            # Commit reservations early and create migration record.
            self._resize_cells_support(context, quotas, instance,
                                       current_instance_type,
                                       new_instance_type)
        if not flavor_id:
            self._record_action_start(context, instance,
                                      instance_actions.MIGRATE)
        else:
            self._record_action_start(context, instance,
                                      instance_actions.RESIZE)
        scheduler_hint = {'filter_properties': filter_properties}
        #调用nova-conductor
        self.compute_task_api.resize_instance(context, instance,
                extra_instance_updates, scheduler_hint=scheduler_hint,
                flavor=new_instance_type,
                reservations=quotas.reservations or [],
                clean_shutdown=clean_shutdown)
    


    为了使得VM在resize之后能够resize至本机,在/etc/nova/nova.conf配置文件中添加如下(所有控制节点和计算节点):
    allow_resize_to_same_host=True
    scheduler_default_filters=AllHostsFilter By
    (changing the default nova scheduler filter to AllHostsFilter , all compute hosts will be available and unfiltered. It is not a good idea to keep this setting in a multi-compute environment)
    配置好之后重启控制节点所有nova-*服务和计算节点nova-compute服务

         事实上,在reboot一个instance的时候,这个instance的libvirt.xml(只考虑kvm环境)会重新从数据库中生成。所以只需要修改数据库就行了。那么我们就会想能不能直接修改数据库然后使得resize生效呢?

    答案是肯定的

    首先分析数据库里面的nova表,直觉就是修改instances表,因为这个记录了所有instances的基本信息

    update instances set  instance_type_id='3',  vcpus='4',memory_mb='8192',root_gb='40'   where hostname='cinder-cirros'  AND vm_state != 'deleted' AND vm_state != 'error';

         修改好之后在刷新Dashboard或者是执行 nova --debug list,会发现完全没有反应,说明其实单单修改了instances表远远不够,通过观察,发现其实真正生效的数据库表在instance_extra表里面

    flavor字段信息(json数据)
    {
      "new": null,
      "old": null,
      "cur": {
        "nova_object.version": "1.1",
        "nova_object.changes": [
          "deleted",
          "ephemeral_gb",
          "updated_at",
          "disabled",
          "extra_specs",
          "rxtx_factor",
          "is_public",
          "deleted_at",
          "id",
          "root_gb",
          "name",
          "flavorid",
          "created_at",
          "memory_mb",
          "vcpus",
          "swap",
          "vcpu_weight"
        ],
        "nova_object.name": "Flavor",
        "nova_object.data": {
          "disabled": false,
          "root_gb": 1,
          "name": "m1.tiny",
          "flavorid": "1",
          "deleted": false,
          "created_at": null,
          "ephemeral_gb": 0,
          "updated_at": null,
          "memory_mb": 512,
          "vcpus": 1,
          "extra_specs": {},
          "swap": 0,
          "rxtx_factor": 1.0,
          "is_public": true,
          "deleted_at": null,
          "vcpu_weight": 0,
          "id": 2
        },
        "nova_object.namespace": "nova"
      }
    }


         在这个json数据中修改nova_object.data中的root_gb、name、flavorid、memory_mb、vcpus、id(对应flavor表中的主键),比如:

         在修改了对应的数据之后,同时修改instances表中对应的flavor信息相关字段(确保一致性),然后执行nova-list或者刷新Dashboard会看到信息都修改了,这时候进入虚拟机通过  
         cat /proc/meminfo  
    会看到虚拟机的实际配置是没有改变的,这时候需要执行
         nova --hard reboot instance
    来使得libvirt.xml文件通过元数据注入重新生效,OK!修改成功

         在这里还需要测试这样做之后是否对整个系统的信息是否有损害,比如这样更改之后对于用户的资源用量统计信息是否会同步更新,因此这种做法还是值得商榷与测试的。但是这样的确是可以达到修改VM配置的目的
    展开全文
  • 由于是基于Jquery的,所以至于拖拽和大小变化的功能...="CodeList_resizeDiv"  class ="resizeDiv" > div > div >   最后放上 源码 转载于:https://www.cnblogs.com/qiba78/archive/2009/06/14/1503037.html
  • 第 5 章 数据库存储管理 设置和修改表空间属性 3 更改大文件表空间大小 例修改大文件表空间 bigtbs 的数据文件大 小为 4GB 代码如下 SQL> ALTER TABLESPACE bigtbs RESIZE 4G; 表空间已更改 第 5 章 数据库存储管理 ...
  • PI数据库例程

    2012-07-09 10:57:20
    PI数据库的示例程序,包括ManageServers,modattrsvb,PI SQC Automation Example,Recalculate and Resize DataLink code example,PIDELETE_2_1e (GUI)。
  • asynch descriptor resize

    2014-10-09 09:48:42
    今天在做数据库巡检时发现等待事件asynch descriptor resize出现在TOP 5里,而且次数很高。 对于这个等待事件很陌生,百度了一下,有种说法是Bug。呵呵,既然说是Bug,那在MOS里应该...
  • 数据库内存抖动

    2016-05-14 21:02:48
    数据库巡检的过程中,发现在AWR报告中(该报告的采集间隔为1小时),两个采集点的Buffer Cache和Shared Pool Size发生了较大变化: 接着查询v$SGA_RESIZE_OPS视图: 确实在报告对应的时间里发生了shared pool的...
  • Api-resize-observer.zip

    2019-09-18 10:16:23
    Api-resize-observer.zip,polyfill resize observer api。resize observer polyfill,一个api可以被认为是多个软件...例如,api可用于web应用程序之间的数据库通信。通过提取实现并将数据放弃到对象中,api简化了编程。
  • resize数据文件大小

    2020-10-19 14:32:36
    缩减数据文件大小,用于系统存储空间紧张的情况下,保证数据库正常运行 (分享以下内容旨在传递信息之用途 , 请在使用后及时完成评估并在24小时内删除 , 否则一切后果请您自负)
  • 客户数据库服务器cpu很高,查询等待事件 发现 asynch descriptor resize等待事件 从AWR报告里也看到的asynch descriptor resize事件。 SQL> show parameter async NAME_COL_PLUS_SHOW_PARAM TYPE VALUE_COL...
  • pytorch 自定义数据库

    2018-12-05 12:33:45
    from torch.utils import data import os from PIL import Image from torchvision import transforms as T ... T.Resize(224), # 缩放图片,保持长宽比不变,最短边为224px T.CenterCrop(224), # 从图...
  • 假设数据库 BOOKSHOP 页面大小为 8K,数据文件存放路径为 C:\DMDBMS\data。 例1 给数据库增加一个日志文件C:\DMDBMS\data\dmlog_0.log,其大小为200M。 ALTER DATABASE ADD LOGFILE '...ALTER DATABASE RESIZE LOGFIL
  • 数据库试题1

    2009-02-07 21:59:38
    创建用户时,需要赋予新用户什么权限才能使它联上数据库。[/b] Connect [b]2. 如何在tablespace里增加数据文件?[/b] alter tablespace add datafile size [b]3. 如何变动数据文件的大小?[/b] alter ...
  • 达梦数据库表空间的管理:包括查询表空间:默认有7个表空间;规划表空间;维护表空间:数据导入导出,Resize 数据文件大小,增加数据文件;更换存储位置;删除表空间
  • 数据库的一些操作

    2020-11-10 17:20:54
    文章目录oraclemongomysql oracle 创建一个名为YC的表空间,初始大小为1024M,并自动增长 create tablespace YC ...alter database datafile 'D:\oracle11\oradata\tablespaces\test.dbf' resize 100m; 查询空表 s
  • Orale 12c RAC环境ALERT LOG中出现Resize operation completed for file# 查看数据库版本: 1 BANNER CON_ID 2 -----------------------------------...
  • 本示例主要解决GridPanel TabPanel ComboBox,动态生成数据,无须任何配置,并且大部分...源码 " onerror="this.src='/images/ifnoimg.gif'" src="/uploads/allimg/090912/121UKI7-0.jpg" width="468" resize="true" />
  • Oracle数据库-表空间

    2021-04-17 11:50:07
    如何调整表空间的尺寸(表空间的大小等同它下的数据文件大小之和) 当发生表空间不足的问题时常用的 3 个...alter database tbs01 datafile '/u01/app/oracle/oradata/db01/tbs01_01.dbf' resize 20m; 2)增加数据文
  • jQuery Resize And Crop是一个对图片进行局部选取的插件,如上图所示,特别适用来做网站的头像处理,只不过您还要进行二次开发才可以,因为上传头像还要连接数据库和上传组件。
  • 安装完成数据库之后,需要一些常规的优化。以下是我工作中用到的一些常见的优化命令 ------------------------------------------------------------------------ alter database resize logfile 'DAMENG01.log' to...
  • ORACLE 数据库操作语言

    2017-04-19 14:29:03
    --创建表空间 create tablespace tbs_LiuYa datafile 'E:/orcl.ora' size 100m autoextend on;...alter database datafile 'E:/orcl.ora' resize 200m; --改变表空间的读写状态 alter tablespace tbs_LiuYa re
  • Oracle-35-闪回数据库

    2018-07-30 17:06:10
    通过引入闪回日志和归档日志将数据库还原到某个历史时间点,但有几种误操作是无法还原的:删除表空间、删除数据文件、对数据文件进行resize、重建控制文件、使用nologing子句、resetlog等 首先查看数据库是否开启闪...
  • 通过作者历时3年的写作,本书介绍了Oracle的安装及卸载、数据库的启动关闭、表空间和数据文件的管理、Oracle网络管理、口令文件管理、内存管理、进程管理、用户管理、数据的导入导出、备份与恢复、模式对象的管理...
  • 通过作者历时3年的写作,本书介绍了Oracle的安装及卸载、数据库的启动关闭、表空间和数据文件的管理、Oracle网络管理、口令文件管理、内存管理、进程管理、用户管理、数据的导入导出、备份与恢复、模式对象的管理...
  • Oracle变更数据库数据文件大小

    千次阅读 2018-10-14 12:33:58
    ALTER DATABASE DATAFILE '文件名' RESIZE 大小;

空空如也

空空如也

1 2 3 4 5 ... 15
收藏数 298
精华内容 119
关键字:

数据库resize