  • Deploying a SkyWalking cluster with Docker
    2020-12-30 09:31:22

    Environment preparation

    Node name   IP address
    node1       192.168.130.20
    node2       192.168.130.19
    node3       192.168.130.21

    Install Docker

    Install a ZooKeeper cluster

    Reference: https://blog.csdn.net/kk3909/article/details/111937681

    Install Elasticsearch 7

    Reference: https://blog.csdn.net/kk3909/article/details/111937621

    Install SkyWalking

    1. Create the configuration directory
    mkdir -p /root/skywalking/config
    
    2. Create the configuration file
    cd /root/skywalking/config
    vim application.yml
    

    Add the following content:

    cluster:
      selector: ${SW_CLUSTER:zookeeper}
      standalone:
      # Please check your ZooKeeper is 3.5+. However, it is also compatible with ZooKeeper 3.4.x: replace the ZooKeeper 3.5+
      # library in the oap-libs folder with your ZooKeeper 3.4.x library.
      zookeeper:
        nameSpace: ${SW_NAMESPACE:""}
        hostPort: ${SW_CLUSTER_ZK_HOST_PORT:192.168.130.20:2181,192.168.130.19:2181,192.168.130.21:2181}
        # Retry Policy
        baseSleepTimeMs: ${SW_CLUSTER_ZK_SLEEP_TIME:1000} # initial amount of time to wait between retries
        maxRetries: ${SW_CLUSTER_ZK_MAX_RETRIES:3} # max number of times to retry
        # Enable ACL
        enableACL: ${SW_ZK_ENABLE_ACL:false} # disable ACL in default
        schema: ${SW_ZK_SCHEMA:digest} # only support digest schema
        expression: ${SW_ZK_EXPRESSION:skywalking:skywalking}
      kubernetes:
        watchTimeoutSeconds: ${SW_CLUSTER_K8S_WATCH_TIMEOUT:60}
        namespace: ${SW_CLUSTER_K8S_NAMESPACE:default}
        labelSelector: ${SW_CLUSTER_K8S_LABEL:app=collector,release=skywalking}
        uidEnvName: ${SW_CLUSTER_K8S_UID:SKYWALKING_COLLECTOR_UID}
      consul:
        serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
        # Consul cluster nodes, example: 10.0.0.1:8500,10.0.0.2:8500,10.0.0.3:8500
        hostPort: ${SW_CLUSTER_CONSUL_HOST_PORT:localhost:8500}
        aclToken: ${SW_CLUSTER_CONSUL_ACLTOKEN:""}
      nacos:
        serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
        hostPort: ${SW_CLUSTER_NACOS_HOST_PORT:localhost:8848}
        # Nacos Configuration namespace
        namespace: ${SW_CLUSTER_NACOS_NAMESPACE:"public"}
      etcd:
        serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
        # etcd cluster nodes, example: 10.0.0.1:2379,10.0.0.2:2379,10.0.0.3:2379
        hostPort: ${SW_CLUSTER_ETCD_HOST_PORT:localhost:2379}
    
    core:
      selector: ${SW_CORE:default}
      default:
        # Mixed: Receive agent data, Level 1 aggregate, Level 2 aggregate
        # Receiver: Receive agent data, Level 1 aggregate
        # Aggregator: Level 2 aggregate
        role: ${SW_CORE_ROLE:Mixed} # Mixed/Receiver/Aggregator
        restHost: ${SW_CORE_REST_HOST:0.0.0.0}
        restPort: ${SW_CORE_REST_PORT:12800}
        restContextPath: ${SW_CORE_REST_CONTEXT_PATH:/}
        gRPCHost: ${SW_CORE_GRPC_HOST:0.0.0.0}
        gRPCPort: ${SW_CORE_GRPC_PORT:11800}
        gRPCSslEnabled: ${SW_CORE_GRPC_SSL_ENABLED:false}
        gRPCSslKeyPath: ${SW_CORE_GRPC_SSL_KEY_PATH:""}
        gRPCSslCertChainPath: ${SW_CORE_GRPC_SSL_CERT_CHAIN_PATH:""}
        gRPCSslTrustedCAPath: ${SW_CORE_GRPC_SSL_TRUSTED_CA_PATH:""}
        downsampling:
          - Hour
          - Day
          - Month
        # Set a timeout on metrics data. After the timeout has expired, the metrics data will automatically be deleted.
        enableDataKeeperExecutor: ${SW_CORE_ENABLE_DATA_KEEPER_EXECUTOR:true} # Turn it off and automatic metrics data deletion will be disabled.
        dataKeeperExecutePeriod: ${SW_CORE_DATA_KEEPER_EXECUTE_PERIOD:5} # How often the data keeper executor runs periodically, unit is minute
        recordDataTTL: ${SW_CORE_RECORD_DATA_TTL:90} # Unit is minute
        minuteMetricsDataTTL: ${SW_CORE_MINUTE_METRIC_DATA_TTL:90} # Unit is minute
        hourMetricsDataTTL: ${SW_CORE_HOUR_METRIC_DATA_TTL:36} # Unit is hour
        dayMetricsDataTTL: ${SW_CORE_DAY_METRIC_DATA_TTL:45} # Unit is day
        monthMetricsDataTTL: ${SW_CORE_MONTH_METRIC_DATA_TTL:18} # Unit is month
        # Cache metric data for 1 minute to reduce database queries, and if the OAP cluster changes within that minute,
        # the metrics may not be accurate within that minute.
        enableDatabaseSession: ${SW_CORE_ENABLE_DATABASE_SESSION:true}
        topNReportPeriod: ${SW_CORE_TOPN_REPORT_PERIOD:10} # top_n record worker report cycle, unit is minute
        # Extra model columns are the columns defined in the code. These model columns are not logically required for aggregation or further queries,
        # and they add load to OAP memory, network, and storage.
        # But when activated, users can see the names in the storage entities, which makes it easier to query the data with 3rd-party tools, such as Kibana over ES.
        activeExtraModelColumns: ${SW_CORE_ACTIVE_EXTRA_MODEL_COLUMNS:false}
    
    storage:
      selector: ${SW_STORAGE:elasticsearch7}
      elasticsearch:
        nameSpace: ${SW_NAMESPACE:""}
        clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
        protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
        trustStorePath: ${SW_SW_STORAGE_ES_SSL_JKS_PATH:"../es_keystore.jks"}
        trustStorePass: ${SW_SW_STORAGE_ES_SSL_JKS_PASS:""}
        user: ${SW_ES_USER:""}
        password: ${SW_ES_PASSWORD:""}
        secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""} # Secrets management file in the properties format includes the username, password, which are managed by 3rd party tool.
        enablePackedDownsampling: ${SW_STORAGE_ENABLE_PACKED_DOWNSAMPLING:true} # Hour and Day metrics will be merged into minute index.
        dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
        indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:2}
        indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:0}
        # Those data TTL settings will override the same settings in core module.
        recordDataTTL: ${SW_STORAGE_ES_RECORD_DATA_TTL:7} # Unit is day
        otherMetricsDataTTL: ${SW_STORAGE_ES_OTHER_METRIC_DATA_TTL:45} # Unit is day
        monthMetricsDataTTL: ${SW_STORAGE_ES_MONTH_METRIC_DATA_TTL:18} # Unit is month
        # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
        bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000} # Execute the bulk every 1000 requests
        flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
        concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
        resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
        metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
        segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
        profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
        advanced: ${SW_STORAGE_ES_ADVANCED:""}
      elasticsearch7:
        nameSpace: ${SW_NAMESPACE:""}
        clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.130.20:9200,192.168.130.19:9200,192.168.130.21:9200}
        protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
        trustStorePath: ${SW_SW_STORAGE_ES_SSL_JKS_PATH:"../es_keystore.jks"}
        trustStorePass: ${SW_SW_STORAGE_ES_SSL_JKS_PASS:""}
        enablePackedDownsampling: ${SW_STORAGE_ENABLE_PACKED_DOWNSAMPLING:true} # Hour and Day metrics will be merged into minute index.
        dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
        user: ${SW_ES_USER:""}
        password: ${SW_ES_PASSWORD:""}
        secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""} # Secrets management file in the properties format includes the username, password, which are managed by 3rd party tool.
        indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:2}
        indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:0}
        # Those data TTL settings will override the same settings in core module.
        recordDataTTL: ${SW_STORAGE_ES_RECORD_DATA_TTL:7} # Unit is day
        otherMetricsDataTTL: ${SW_STORAGE_ES_OTHER_METRIC_DATA_TTL:45} # Unit is day
        monthMetricsDataTTL: ${SW_STORAGE_ES_MONTH_METRIC_DATA_TTL:18} # Unit is month
        # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
        bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000} # Execute the bulk every 1000 requests
        flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
        concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
        resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
        metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
        segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
        profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
        advanced: ${SW_STORAGE_ES_ADVANCED:""}
      h2:
        driver: ${SW_STORAGE_H2_DRIVER:org.h2.jdbcx.JdbcDataSource}
        url: ${SW_STORAGE_H2_URL:jdbc:h2:mem:skywalking-oap-db}
        user: ${SW_STORAGE_H2_USER:sa}
        metadataQueryMaxSize: ${SW_STORAGE_H2_QUERY_MAX_SIZE:5000}
      mysql:
        properties:
          jdbcUrl: ${SW_JDBC_URL:"jdbc:mysql://localhost:3306/swtest"}
          dataSource.user: ${SW_DATA_SOURCE_USER:root}
          dataSource.password: ${SW_DATA_SOURCE_PASSWORD:root@1234}
          dataSource.cachePrepStmts: ${SW_DATA_SOURCE_CACHE_PREP_STMTS:true}
          dataSource.prepStmtCacheSize: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_SIZE:250}
          dataSource.prepStmtCacheSqlLimit: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_LIMIT:2048}
          dataSource.useServerPrepStmts: ${SW_DATA_SOURCE_USE_SERVER_PREP_STMTS:true}
        metadataQueryMaxSize: ${SW_STORAGE_MYSQL_QUERY_MAX_SIZE:5000}
      influxdb:
        # Metadata storage provider configuration
        metabaseType: ${SW_STORAGE_METABASE_TYPE:H2} # There are 2 options as Metabase provider, H2 or MySQL.
        h2Props:
          dataSourceClassName: ${SW_STORAGE_METABASE_DRIVER:org.h2.jdbcx.JdbcDataSource}
          dataSource.url: ${SW_STORAGE_METABASE_URL:jdbc:h2:mem:skywalking-oap-db}
          dataSource.user: ${SW_STORAGE_METABASE_USER:sa}
          dataSource.password: ${SW_STORAGE_METABASE_PASSWORD:}
        mysqlProps:
          jdbcUrl: ${SW_STORAGE_METABASE_URL:"jdbc:mysql://localhost:3306/swtest"}
          dataSource.user: ${SW_STORAGE_METABASE_USER:root}
          dataSource.password: ${SW_STORAGE_METABASE_PASSWORD:root@1234}
          dataSource.cachePrepStmts: ${SW_STORAGE_METABASE_CACHE_PREP_STMTS:true}
          dataSource.prepStmtCacheSize: ${SW_STORAGE_METABASE_PREP_STMT_CACHE_SQL_SIZE:250}
          dataSource.prepStmtCacheSqlLimit: ${SW_STORAGE_METABASE_PREP_STMT_CACHE_SQL_LIMIT:2048}
          dataSource.useServerPrepStmts: ${SW_STORAGE_METABASE_USE_SERVER_PREP_STMTS:true}
        metadataQueryMaxSize: ${SW_STORAGE_METABASE_QUERY_MAX_SIZE:5000}
        # InfluxDB configuration
        url: ${SW_STORAGE_INFLUXDB_URL:http://localhost:8086}
        user: ${SW_STORAGE_INFLUXDB_USER:root}
        password: ${SW_STORAGE_INFLUXDB_PASSWORD:}
        database: ${SW_STORAGE_INFLUXDB_DATABASE:skywalking}
        actions: ${SW_STORAGE_INFLUXDB_ACTIONS:1000} # the number of actions to collect
        duration: ${SW_STORAGE_INFLUXDB_DURATION:1000} # the time to wait at most (milliseconds)
        fetchTaskLogMaxSize: ${SW_STORAGE_INFLUXDB_FETCH_TASK_LOG_MAX_SIZE:5000} # the max number of fetch task log in a request
    
    receiver-sharing-server:
      selector: ${SW_RECEIVER_SHARING_SERVER:default}
      default:
        authentication: ${SW_AUTHENTICATION:""}
    receiver-register:
      selector: ${SW_RECEIVER_REGISTER:default}
      default:
    
    receiver-trace:
      selector: ${SW_RECEIVER_TRACE:default}
      default:
        bufferPath: ${SW_RECEIVER_BUFFER_PATH:../trace-buffer/}  # Path to trace buffer files, suggest to use absolute path
        bufferOffsetMaxFileSize: ${SW_RECEIVER_BUFFER_OFFSET_MAX_FILE_SIZE:100} # Unit is MB
        bufferDataMaxFileSize: ${SW_RECEIVER_BUFFER_DATA_MAX_FILE_SIZE:500} # Unit is MB
        bufferFileCleanWhenRestart: ${SW_RECEIVER_BUFFER_FILE_CLEAN_WHEN_RESTART:false}
        sampleRate: ${SW_TRACE_SAMPLE_RATE:10000} # The sample rate precision is 1/10000. 10000 means 100% sample in default.
        slowDBAccessThreshold: ${SW_SLOW_DB_THRESHOLD:default:200,mongodb:100} # The slow database access thresholds. Unit ms.
    
    receiver-jvm:
      selector: ${SW_RECEIVER_JVM:default}
      default:
    
    receiver-clr:
      selector: ${SW_RECEIVER_CLR:default}
      default:
    
    receiver-profile:
      selector: ${SW_RECEIVER_PROFILE:default}
      default:
    
    service-mesh:
      selector: ${SW_SERVICE_MESH:default}
      default:
        bufferPath: ${SW_SERVICE_MESH_BUFFER_PATH:../mesh-buffer/}  # Path to trace buffer files, suggest to use absolute path
        bufferOffsetMaxFileSize: ${SW_SERVICE_MESH_OFFSET_MAX_FILE_SIZE:100} # Unit is MB
        bufferDataMaxFileSize: ${SW_SERVICE_MESH_BUFFER_DATA_MAX_FILE_SIZE:500} # Unit is MB
        bufferFileCleanWhenRestart: ${SW_SERVICE_MESH_BUFFER_FILE_CLEAN_WHEN_RESTART:false}
    
    istio-telemetry:
      selector: ${SW_ISTIO_TELEMETRY:default}
      default:
    
    envoy-metric:
      selector: ${SW_ENVOY_METRIC:default}
      default:
        alsHTTPAnalysis: ${SW_ENVOY_METRIC_ALS_HTTP_ANALYSIS:""}
    
    receiver_zipkin:
      selector: ${SW_RECEIVER_ZIPKIN:-}
      default:
        host: ${SW_RECEIVER_ZIPKIN_HOST:0.0.0.0}
        port: ${SW_RECEIVER_ZIPKIN_PORT:9411}
        contextPath: ${SW_RECEIVER_ZIPKIN_CONTEXT_PATH:/}
    
    receiver_jaeger:
      selector: ${SW_RECEIVER_JAEGER:-}
      default:
        gRPCHost: ${SW_RECEIVER_JAEGER_HOST:0.0.0.0}
        gRPCPort: ${SW_RECEIVER_JAEGER_PORT:14250}
    
    query:
      selector: ${SW_QUERY:graphql}
      graphql:
        path: ${SW_QUERY_GRAPHQL_PATH:/graphql}
    
    alarm:
      selector: ${SW_ALARM:default}
      default:
    
    telemetry:
      selector: ${SW_TELEMETRY:none}
      none:
      prometheus:
        host: ${SW_TELEMETRY_PROMETHEUS_HOST:0.0.0.0}
        port: ${SW_TELEMETRY_PROMETHEUS_PORT:1234}
      so11y:
        prometheusExporterEnabled: ${SW_TELEMETRY_SO11Y_PROMETHEUS_ENABLED:true}
        prometheusExporterHost: ${SW_TELEMETRY_PROMETHEUS_HOST:0.0.0.0}
        prometheusExporterPort: ${SW_TELEMETRY_PROMETHEUS_PORT:1234}
    
    receiver-so11y:
      selector: ${SW_RECEIVER_SO11Y:-}
      default:
    
    configuration:
      selector: ${SW_CONFIGURATION:none}
      none:
      apollo:
        apolloMeta: http://106.12.25.204:8080
        apolloCluster: default
        apolloEnv: ""
        appId: skywalking
        period: 5
      nacos:
        # Nacos Server Host
        serverAddr: 127.0.0.1
        # Nacos Server Port
        port: 8848
        # Nacos Configuration Group
        group: 'skywalking'
        # Nacos Configuration namespace
        namespace: ''
        # Unit seconds, sync period. Default fetch every 60 seconds.
        period : 60
        # the name of current cluster, set the name if you want to upstream system known.
        clusterName: "default"
      zookeeper:
        period : 60 # Unit seconds, sync period. Default fetch every 60 seconds.
        nameSpace: /default
        hostPort: localhost:2181
        # Retry Policy
        baseSleepTimeMs: 1000 # initial amount of time to wait between retries
        maxRetries: 3 # max number of times to retry
      etcd:
        period : 60 # Unit seconds, sync period. Default fetch every 60 seconds.
        group :  'skywalking'
        serverAddr: localhost:2379
        clusterName: "default"
      consul:
        # Consul host and ports, separated by comma, e.g. 1.2.3.4:8500,2.3.4.5:8500
        hostAndPorts: ${consul.address}
        # Sync period in seconds. Defaults to 60 seconds.
        period: 1
        # Consul aclToken
        #aclToken: ${consul.aclToken}
    
    exporter:
      selector: ${SW_EXPORTER:-}
      grpc:
        targetHost: ${SW_EXPORTER_GRPC_HOST:127.0.0.1}
        targetPort: ${SW_EXPORTER_GRPC_PORT:9870}
    

    The main settings to modify are cluster.zookeeper.hostPort and storage.elasticsearch7.clusterNodes.
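    The `${ENV_VAR:default}` placeholders in the YAML above are resolved from environment variables at startup, with the text after the colon used as the fallback. A minimal Python sketch of that convention (an illustration only, not SkyWalking's actual parser):

```python
import os
import re

def resolve_placeholder(value: str, env=None) -> str:
    """Resolve a SkyWalking-style ${NAME:default} placeholder against an env map."""
    env = os.environ if env is None else env
    match = re.fullmatch(r"\$\{(\w+):(.*)\}", value)
    if not match:
        return value  # plain value, nothing to resolve
    name, default = match.groups()
    return env.get(name, default)

# With SW_STORAGE set in the environment, the env value wins; otherwise the default applies.
print(resolve_placeholder("${SW_STORAGE:h2}", {"SW_STORAGE": "elasticsearch7"}))  # elasticsearch7
print(resolve_placeholder("${SW_CLUSTER_ZK_HOST_PORT:localhost:2181}", {}))       # localhost:2181
```

    This is why the docker run commands below can override storage settings with -e flags without editing the mounted file.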

    skywalking oap

    node1

     docker run --name sk-node1 -d -p 1234:1234 -p 11800:11800 \
     -p 12800:12800 --restart always  \
     -e TZ=Asia/Shanghai \
     -e SW_STORAGE=elasticsearch7 \
     -e SW_STORAGE_ES_CLUSTER_NODES=192.168.130.20:9200,192.168.130.19:9200,192.168.130.21:9200 \
     -v /root/skywalking/config/application.yml:/skywalking/config/application.yml \
     apache/skywalking-oap-server:7.0.0-es7
    

    node2

     docker run --name sk-node2 -d -p 1234:1234 -p 11800:11800 \
     -p 12800:12800 --restart always  \
     -e TZ=Asia/Shanghai \
     -e SW_STORAGE=elasticsearch7 \
     -e SW_STORAGE_ES_CLUSTER_NODES=192.168.130.20:9200,192.168.130.19:9200,192.168.130.21:9200 \
     -v /root/skywalking/config/application.yml:/skywalking/config/application.yml \
     apache/skywalking-oap-server:7.0.0-es7
    
    

    skywalking ui

    node1

    docker run --name sk-ui-node1 -d -p 8080:8080  \
    -e TZ=Asia/Shanghai \
    -e SW_OAP_ADDRESS=192.168.130.20:12800 \
    -e SW_TIMEOUT=20000 \
    --restart always apache/skywalking-ui:7.0.0 
    

    node2

    docker run --name sk-ui-node2 -d -p 8080:8080  \
    -e TZ=Asia/Shanghai \
    -e SW_OAP_ADDRESS=192.168.130.19:12800 \
    -e SW_TIMEOUT=20000 \
    --restart always apache/skywalking-ui:7.0.0 
    

    Finally, add a load balancer in front of the UI nodes to distribute traffic.
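    For example, nginx can serve as that load balancer across the two UI instances. A sketch, assuming nginx is available and using the node IPs above (the upstream name and listen port are illustrative):

```nginx
# Hypothetical nginx LB in front of the two SkyWalking UI nodes.
upstream skywalking_ui {
    server 192.168.130.20:8080;
    server 192.168.130.19:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://skywalking_ui;
        proxy_set_header Host $host;
    }
}
```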


    Preface

    SkyWalking is a very good APM product, but using it comes with one painful problem: with ES-based storage, as soon as the ES data has an issue, the entire SkyWalking web UI becomes unavailable. You then have to disable the agent service by service, redeploy each service, and walk through the whole process again. The same problem also appears during SkyWalking version upgrades. Moreover, APM data of this kind is allowed to be discarded; by default SkyWalking keeps trace records for only 90 minutes. So I decided to containerize the SkyWalking deployment for one-click deployment and upgrades. The rest of this article walks through the whole containerized SkyWalking deployment.

    Goal: run the SkyWalking Docker image in a Kubernetes cluster to provide the service.

    Building the Docker image

    FROM registry.cn-xx.xx.com/keking/jdk:1.8
    ADD apache-skywalking-apm-incubating/  /opt/apache-skywalking-apm-incubating/
    RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai  /etc/localtime \
        && echo 'Asia/Shanghai' >/etc/timezone \
        && chmod +x /opt/apache-skywalking-apm-incubating/config/setApplicationEnv.sh \
        && chmod +x /opt/apache-skywalking-apm-incubating/webapp/setWebAppEnv.sh \
        && chmod +x /opt/apache-skywalking-apm-incubating/bin/startup.sh \
        && echo "tail -fn 100 /opt/apache-skywalking-apm-incubating/logs/webapp.log" >> /opt/apache-skywalking-apm-incubating/bin/startup.sh
    
    EXPOSE 8080 10800 11800 12800
    CMD /opt/apache-skywalking-apm-incubating/config/setApplicationEnv.sh \
         && sh /opt/apache-skywalking-apm-incubating/webapp/setWebAppEnv.sh \
         && /opt/apache-skywalking-apm-incubating/bin/startup.sh

    A few questions need to be considered when writing the Dockerfile: which SkyWalking settings need to be dynamic (set at runtime)? And how do we keep the process running (SkyWalking's startup.sh, like Tomcat's startup.sh, exits after launching the service)?

    application.yml

    #cluster:
    #  zookeeper:
    #    hostPort: localhost:2181
    #    sessionTimeout: 100000
    naming:
      jetty:
        #OS real network IP(binding required), for agent to find collector cluster
        host: 0.0.0.0
        port: 10800
        contextPath: /
    cache:
    #  guava:
      caffeine:
    remote:
      gRPC:
        # OS real network IP(binding required), for collector nodes communicate with each other in cluster. collectorN --(gRPC) --> collectorM
        host: #real_host
        port: 11800
    agent_gRPC:
      gRPC:
        #os real network ip(binding required), for agent to uplink data(trace/metrics) to collector. agent--(grpc)--> collector
        host: #real_host
        port: 11800
        # Set these two setting to open ssl
        #sslCertChainFile: $path
        #sslPrivateKeyFile: $path
    
        # Set your own token to active auth
        #authentication: xxxxxx
    agent_jetty:
      jetty:
        # OS real network IP(binding required), for agent to uplink data(trace/metrics) to collector through HTTP. agent--(HTTP)--> collector
        # SkyWalking native Java/.Net/node.js agents don't use this.
        # Open this for other implementor.
        host: 0.0.0.0
        port: 12800
        contextPath: /
    analysis_register:
      default:
    analysis_jvm:
      default:
    analysis_segment_parser:
      default:
        bufferFilePath: ../buffer/
        bufferOffsetMaxFileSize: 10M
        bufferSegmentMaxFileSize: 500M
        bufferFileCleanWhenRestart: true
    ui:
      jetty:
        # Stay in `localhost` if UI starts up in default mode.
        # Change it to OS real network IP(binding required), if deploy collector in different machine.
        host: 0.0.0.0
        port: 12800
        contextPath: /
    storage:
      elasticsearch:
        clusterName: #elasticsearch_clusterName
        clusterTransportSniffer: true
        clusterNodes: #elasticsearch_clusterNodes
        indexShardsNumber: 2
        indexReplicasNumber: 0
        highPerformanceMode: true
        # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
        bulkActions: 2000 # Execute the bulk every 2000 requests
        bulkSize: 20 # flush the bulk every 20mb
        flushInterval: 10 # flush the bulk every 10 seconds whatever the number of requests
        concurrentRequests: 2 # the number of concurrent requests
        # Set a timeout on metric data. After the timeout has expired, the metric data will automatically be deleted.
        traceDataTTL: 2880 # Unit is minute
        minuteMetricDataTTL: 90 # Unit is minute
        hourMetricDataTTL: 36 # Unit is hour
        dayMetricDataTTL: 45 # Unit is day
        monthMetricDataTTL: 18 # Unit is month
    #storage:
    #  h2:
    #    url: jdbc:h2:~/memorydb
    #    userName: sa
    configuration:
      default:
        #namespace: xxxxx
        # alarm threshold
        applicationApdexThreshold: 2000
        serviceErrorRateThreshold: 10.00
        serviceAverageResponseTimeThreshold: 2000
        instanceErrorRateThreshold: 10.00
        instanceAverageResponseTimeThreshold: 2000
        applicationErrorRateThreshold: 10.00
        applicationAverageResponseTimeThreshold: 2000
        # thermodynamic
        thermodynamicResponseTimeStep: 50
        thermodynamicCountOfResponseTimeSteps: 40
        # max collection's size of worker cache collection, setting it smaller when collector OutOfMemory crashed.
        workerCacheMaxSize: 10000
    #receiver_zipkin:
    #  default:
    #    host: localhost
    #    port: 9411
    #    contextPath: /

    webapp.yml

    Dynamic configuration: the password, the gRPC addresses, and anything else that must be bound to a host IP need to be set at runtime. Here, before SkyWalking's startup.sh runs, we first execute two configuration scripts that substitute the dynamic parameters using environment variables set by Kubernetes at runtime.

    setApplicationEnv.sh

    #!/usr/bin/env sh
    sed -i "s/#elasticsearch_clusterNodes/${elasticsearch_clusterNodes}/g" /opt/apache-skywalking-apm-incubating/config/application.yml
    sed -i "s/#elasticsearch_clusterName/${elasticsearch_clusterName}/g" /opt/apache-skywalking-apm-incubating/config/application.yml
    sed -i "s/#real_host/${real_host}/g" /opt/apache-skywalking-apm-incubating/config/application.yml
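    The substitution can be tried outside the image on a throwaway file; a minimal sketch with made-up values:

```shell
# Simulate setApplicationEnv.sh against a scratch config file (values are made up).
tmp=$(mktemp)
printf 'host: #real_host\nclusterNodes: #elasticsearch_clusterNodes\n' > "$tmp"

real_host=10.0.0.5
elasticsearch_clusterNodes=10.0.0.6:9200

sed -i "s/#real_host/${real_host}/g" "$tmp"
sed -i "s/#elasticsearch_clusterNodes/${elasticsearch_clusterNodes}/g" "$tmp"

cat "$tmp"
```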

    setWebAppEnv.sh

    #!/usr/bin/env sh
    sed -i "s/#skywalking_password/${skywalking_password}/g" /opt/apache-skywalking-apm-incubating/webapp/webapp.yml
    sed -i "s/#real_host/${real_host}/g" /opt/apache-skywalking-apm-incubating/webapp/webapp.yml

    Keeping the process alive: appending "tail -fn 100 /opt/apache-skywalking-apm-incubating/logs/webapp.log" to the end of SkyWalking's startup.sh keeps the process in the foreground while continuously printing the webapp.log output.
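    The effect of that append can be reproduced on a scratch script (paths here are placeholders):

```shell
# Append a foreground tail to a scratch startup script, as the Dockerfile does.
dir=$(mktemp -d)
printf '#!/bin/sh\necho "collector started"\n' > "$dir/startup.sh"
echo "tail -fn 100 $dir/logs/webapp.log" >> "$dir/startup.sh"

# The appended tail never exits, so the container's main process keeps running.
tail -n 1 "$dir/startup.sh"
```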

    Deploying in Kubernetes

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: skywalking
      namespace: uat
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: skywalking
      template:
        metadata:
          labels:
            app: skywalking
        spec:
          imagePullSecrets:
          - name: registry-pull-secret
          nodeSelector:
             apm: skywalking
          containers:
          - name: skywalking
            image: registry.cn-xx.xx.com/keking/kk-skywalking:5.2
            imagePullPolicy: Always
            env:
            - name: elasticsearch_clusterName
              value: elasticsearch
            - name: elasticsearch_clusterNodes
              value: 172.16.16.129:31300
            - name: skywalking_password
              value: xxx
            - name: real_host
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            resources:
              limits:
                cpu: 1000m
                memory: 4Gi
              requests:
                cpu: 700m
                memory: 2Gi
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: skywalking
      namespace: uat
      labels:
        app: skywalking
    spec:
      selector:
        app: skywalking
      ports:
      - name: web-a
        port: 8080
        targetPort: 8080
        nodePort: 31180
      - name: web-b
        port: 10800
        targetPort: 10800
        nodePort: 31181
      - name: web-c
        port: 11800
        targetPort: 11800
        nodePort: 31182
      - name: web-d
        port: 12800
        targetPort: 12800
        nodePort: 31183
      type: NodePort

    The only thing to watch in the Kubernetes manifest is obtaining the pod IP in env: several SkyWalking addresses must be bound to the container's real IP, and it can be injected into the container as an environment variable (via fieldRef on status.podIP).

    Conclusion

    The whole SkyWalking containerization took about a day from testing to working. Of that, an hour or so went into trying Tan's skywalking-docker image (https://hub.docker.com/r/wutang/skywalking-docker/). I found a script with a permission problem (Tan reports it is fixed, but I have not retested), plus a few things I could not easily control, so I built my own Docker image. The biggest problem was cluster network communication: at first I set all the SkyWalking service IPs to 0.0.0.0 and exposed them through the cluster's NodePort. The agent could then reach the naming service via cluster IP + 31181, but the collector gRPC address returned by the naming service came back as 0.0.0.0:11800, which the agent obviously cannot reach. Binding the pod IP instead solved the problem.

  • Deploying SkyWalking with Docker
    2022-04-15 11:50:36

    SkyWalking overview

    SkyWalking is an application performance monitoring tool for distributed systems, designed for microservices, cloud-native, and container-based (Docker, Kubernetes) architectures. It provides distributed tracing, service mesh telemetry analysis, metrics aggregation, and visualization in one solution.

    Steps to deploy SkyWalking with Docker

    • Deploy Elasticsearch
    • Deploy the SkyWalking OAP server
    • Deploy the SkyWalking UI
    • Deploy the application with the SkyWalking agent

    Deploy Elasticsearch

    Here we do a simple single-node Docker deployment; for regular and cluster deployments see the official documentation.

    Pull the image

    docker pull elasticsearch:7.6.2
    

    Start in single-node mode

    Note: set the initial Elasticsearch heap via ES_JAVA_OPTS, otherwise the container may fail to start during verification.

    docker run --restart=always -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" \
    -e ES_JAVA_OPTS="-Xms512m -Xmx512m" \
    --name='elasticsearch' --cpuset-cpus="1" -m 2G -d elasticsearch:7.6.2
    

    Verify the Elasticsearch installation

    Enter http://127.0.0.1:9200/ in the browser address bar; the page should display content like the following:

    {
      "name" : "6eebe74f081b",
      "cluster_name" : "docker-cluster",
      "cluster_uuid" : "jgCr_SQbQXiimyAyOEqk9g",
      "version" : {
        "number" : "7.6.2",
        "build_flavor" : "default",
        "build_type" : "docker",
        "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
        "build_date" : "2020-03-26T06:34:37.794943Z",
        "build_snapshot" : false,
        "lucene_version" : "8.4.0",
        "minimum_wire_compatibility_version" : "6.8.0",
        "minimum_index_compatibility_version" : "6.0.0-beta1"
      },
      "tagline" : "You Know, for Search"
    }
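    The same check can be scripted instead of using a browser; a sketch that parses a response like the one above (the JSON below is the sample response copied from this page, not live output):

```python
import json

# Sample body of GET http://127.0.0.1:9200/ (abridged from the response above).
body = '''
{
  "name": "6eebe74f081b",
  "cluster_name": "docker-cluster",
  "version": {"number": "7.6.2", "lucene_version": "8.4.0"},
  "tagline": "You Know, for Search"
}
'''

info = json.loads(body)
# A 7.x node is what the elasticsearch7 storage selector used below expects.
assert info["version"]["number"].startswith("7."), "expected an Elasticsearch 7.x node"
print(info["cluster_name"], info["version"]["number"])  # docker-cluster 7.6.2
```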
    

    Deploy the SkyWalking OAP server

    Pull the image

    docker pull apache/skywalking-oap-server:8.3.0-es7
    

    Start the SkyWalking OAP server

    Note: the first part of the --link argument must match the Elasticsearch container name; SW_STORAGE_ES_CLUSTER_NODES can also be set to the IP address where your ES server is deployed, i.e. ip:9200.

    docker run --name oap --restart always -d --restart=always -e TZ=Asia/Shanghai -p 12800:12800 -p 11800:11800 --link elasticsearch:elasticsearch -e SW_STORAGE=elasticsearch7 -e SW_STORAGE_ES_CLUSTER_NODES=elasticsearch:9200 apache/skywalking-oap-server:8.3.0-es7
    

    Deploy the SkyWalking UI

    Pull the image

    docker pull apache/skywalking-ui:8.3.0
    

    Start the SkyWalking UI

    Note: the first part of the --link argument must match the SkyWalking OAP container name.

    docker run -d --name skywalking-ui \
    --restart=always \
    -e TZ=Asia/Shanghai \
    -p 8088:8080 \
    --link oap:oap \
    -e SW_OAP_ADDRESS=oap:12800 \
    apache/skywalking-ui:8.3.0
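    The three containers above can also be declared in a single Compose file. A sketch using the same images, names, and ports (my own equivalent, not taken from the original article, so treat it as untested):

```yaml
version: "3.3"
services:
  elasticsearch:
    image: elasticsearch:7.6.2
    restart: always
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    ports:
      - "9200:9200"
  oap:
    image: apache/skywalking-oap-server:8.3.0-es7
    restart: always
    depends_on:
      - elasticsearch
    environment:
      - SW_STORAGE=elasticsearch7
      - SW_STORAGE_ES_CLUSTER_NODES=elasticsearch:9200
      - TZ=Asia/Shanghai
    ports:
      - "11800:11800"
      - "12800:12800"
  skywalking-ui:
    image: apache/skywalking-ui:8.3.0
    restart: always
    depends_on:
      - oap
    environment:
      - SW_OAP_ADDRESS=oap:12800
      - TZ=Asia/Shanghai
    ports:
      - "8088:8080"
```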
    

    Deploy the application with the SkyWalking agent

    Download skywalking-agent from the official site

    Download address: https://archive.apache.org/dist/skywalking/8.3.0/

    One thing to note: be sure to download the skywalking-agent version that matches your skywalking-oap version, otherwise the agent may fail to report. I initially assumed the agent was backward compatible and downloaded version 8.8, which made monitoring data uploads fail because the oap-server reported the method does not exist: grpc-message: Method not found: skywalking.v3.JVMMetricReportService/collect. The log looks like this:

    (log screenshot)
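    As a rule of thumb from that failure, keep the agent version no newer than the OAP server version. A hypothetical helper sketching such a check (the comparison rule is a conservative assumption on my part, not an official compatibility matrix):

```python
def agent_compatible(agent_version: str, oap_version: str) -> bool:
    """Conservative check: the agent's (major, minor) must not exceed the OAP's."""
    def major_minor(version: str) -> tuple:
        return tuple(int(part) for part in version.split(".")[:2])
    return major_minor(agent_version) <= major_minor(oap_version)

print(agent_compatible("8.3.0", "8.3.0"))  # True  - matching versions
print(agent_compatible("8.8.0", "8.3.0"))  # False - the mismatch that failed above
```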

    Running the Java agent

    spring-transaction-2.2.6.RELEASE.jar is a test program I wrote; if you are interested you can get it from my GitHub at https://github.com/Redick01/my-transaction.git (the spring-transaction module is the test code).

    java -javaagent:/Users/penghuiliu/geek_learn/skywalking-agent/skywalking-agent.jar=agent.service_name=fw-gateway,collector.backend_service=127.0.0.1:11800 -jar spring-transaction-2.2.6.RELEASE.jar
    

    After the service starts, call an interface; the SkyWalking UI then shows the following:

    Dashboard (screenshot)

    Topology (screenshot)

    Trace (screenshot)

  • SkyWalking Deployment: Docker Agent Client Cluster

    SkyWalking Deployment: Docker Agent Client Cluster

    Document status: [ ] Draft   [√] Being revised
    Current version: 1.0
    Revision history: 1.0
    Author: 杜有龙
    Completion date: 2019-02-15

    I. Configure agent.config in the project

    1. Add the configuration file

    [Note] Add the SkyWalking configuration file agent.config under resources.

    2. Modify the configuration file

    [Note] Set the agent.application_code option in agent.config to the name of the current application.

    [Example]
    agent.application_code=mall-dubbo

    3. Make the packaging support the agent.config file

    3.1 Modify package.xml

    Add: <include>agent.config</include>

    3.2 Modify pom.xml

    Add: <include>*.config</include>

    II. Rewriting the Dockerfile

    1. Base image

    [Note] Use ccr.ccs.tencentyun.com/eqxiu/jre8-agent5 as the base image.

    2. SkyWalking agent support

    [Note] Two settings need to be added; see steps 2.1 and 2.2.

    2.1 Copy the configuration file

    Add this line:

    RUN \cp -f /app/config/agent.config /skywalking/agent/config

    2.2 Set the environment variable

    Add this line:

    ENV JAVA_OPTS="$JAVA_OPTS -javaagent:/skywalking/agent/skywalking-agent.jar"
