  • OpenGrok offline Docker image bundle

    2020-12-03 12:10:29
    An offline bundle of the OpenGrok Docker image. OpenGrok is an open-source source-code search and browsing tool; this resource packages its Docker image. Load it with docker load, then run the relevant docker run commands to start the service. Because of upload size limits the image was split into parts; after downloading, merge them back into a tar file with cat
  • Download the grok_exporter-$ARCH.zip for your operating system from the releases page, unzip the archive, cd grok_exporter-$ARCH, then run ./grok_exporter -config ./example/config.yml against the example log file exim-rejected-RCPT-...
  • A simple library that parses grok patterns. This small tool reads from standard input, applies a grok pattern, and dumps the captured fields as JSON. Examples: cat apache.log | groker -pattern=%{COMMONAPACHELOG} echo " Hello 123 " | groker -pattern= " %...
  • Java Grok is a simple API that lets you easily parse logs and other files (line by line). With Java Grok, you can turn unstructured log and event data into structured data (JSON). What can I use Grok for? Reporting errors and other patterns from logs and processes; parsing complex...
  • Setting up an OPENGROK environment

    2019-05-23 16:40:12
    OpenGrok makes browsing code very fast and integrates git functionality
  • The grok library lets you quickly parse potentially unstructured data and match it into structured results. It is especially useful when analyzing all kinds of log files. This version is mainly a port, drawing inspiration from the original. Usage: add this to your Cargo.toml : [ dependencies ]...
  • OpenGrok, the wicked fast source browser. 1. Introduction: OpenGrok is a fast and usable source code search and cross-reference engine written in Java. It helps you search, cross-reference, and navigate your source tree. It understands the program file formats and version histories of many source code management systems...
  • OpenGrok [WIP]: an open-source, self-hosted ngrok imitation. OpenGrok is designed to be faster than any other such project. To do: timeout handling. For example, when a client sends an incomplete packet, OpenGrok will wait for it forever; if that packet is specially crafted it can fill your RAM with garbage, so...
  • grok_exporter: export Prometheus metrics from arbitrary unstructured log data. About: Grok is a tool for parsing useless unstructured log data into something structured and queryable. Grok is heavily used in Logstash to provide...
  • Set up OpenGrok to browse source files: place them under the directory and configure the relevant environment variables
  • opengrok-1.1-rc28.tar.gz

    2019-01-13 13:28:18
    This resource is the OpenGrok executable package for Ubuntu, version: opengrok-1.1-rc28.tar.gz
  • opengrok_tool.zip

    2019-10-16 17:28:47
    A toolkit for setting up the openGrok code-reading tool on Ubuntu. 1. Download the resource and unpack it into a directory on a disk with plenty of free space. 2. cd opengrok_tool and run ./CreatOpenGrok $project_name $source_code_dir; the first argument is the URL suffix used when the page is opened, ...
  • logstash grok with custom regular expressions added, able to extract the log level, log timestamp, and log thread id
  • This extension renames the tab title to the file name, and only on OPENGROK sites. Developed for LG internal use. Supported language: English
  • Android code: grok

    2019-08-07 18:19:21
    Grok is a simple tool that allows you to easily parse logs and other files (single line). With Grok, you can turn unstructured log and event data into structured data (JSON). Grok is an ...
  • grok

    2017-08-08 10:56:00

    grok (verb) 
    understand (something) intuitively or by empathy.

    One of the most common tasks when parsing log data is to decompose raw lines of text into a set of structured fields which other tools can manipulate. If you’re using the Elastic Stack, you can leverage Elasticsearch’s aggregations and Kibana’s visualizations to answer both business and operational questions from the information extracted in the logs, like IP addresses, timestamps, and domain-specific data.

    For Logstash, this deconstruction job is carried out by logstash-filter-grok, a filter plugin that helps you describe the structure of your log formats.

    There are over 200 grok patterns available, abstracting concepts such as IPv6 addresses, UNIX paths, and names of months.
    In order to match a line with the format:

    2016-09-19T18:19:00 [8.8.8.8:prd] DEBUG this is an example log message
    

    with the grok library, it’s only necessary to compose a handful of patterns to come up with:

    %{TIMESTAMP_ISO8601:timestamp} \[%{IPV4:ip}:%{WORD:environment}\] %{LOGLEVEL:log_level} %{GREEDYDATA:message}
    

    Which will create the structure:

    {
      "timestamp": "2016-09-19T18:19:00",
      "ip": "8.8.8.8",
      "environment": "prd",
      "log_level": "DEBUG",
      "message": "this is an example log message"
    }
    

    Easy, right?

    Yes!

    Great! Are we done here? No! Because..

    “I’m using grok and it’s super slow!!”

    That is a very common remark! Performance is a topic often brought up by the community since, often enough, users or customers create a grok expression that greatly reduces the number of events per second processed by the logstash pipeline.

    As mentioned before, grok patterns are regular expressions, and therefore this plugin’s performance is severely impacted by the behaviour of the regular expression engine. In the following chapters, we’ll provide some guidelines on do’s and don’ts when creating grok expressions to match your log lines.

    Measure, measure, measure

    In order to validate decisions and experiments during grok expression design, we need a way to quickly measure performance between two or more expressions. For this, I created a small jruby script that uses the logstash-filter-grok plugin directly, bypassing the logstash pipeline.

    You can fetch this script here. We’ll be using it to collect performance numbers to validate (or destroy!) our assumptions.
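
    Since that script isn’t reproduced here, below is a rough stand-in (not the author’s script) that shows the same idea in plain Ruby with ordinary regular expressions: time a large number of match attempts and report matches per second.

    require "benchmark"

    # Candidate expressions to compare; these regexes are illustrative stand-ins
    # for compiled grok patterns.
    LINE = '2016-09-19T18:19:00 [8.8.8.8:prd] DEBUG this is an example log message'
    PATTERNS = {
      "unanchored" => /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} \[\S+\] \w+ .*/,
      "anchored"   => /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} \[\S+\] \w+ .*\z/
    }
    N = 100_000

    PATTERNS.each do |name, re|
      secs = Benchmark.realtime { N.times { re.match(LINE) } }
      puts format("%-10s %10.0f matches/sec", name, N / secs)
    end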

    Beware of the performance impact when grok fails to match

    Although it is very important to know how fast your grok pattern matches a log entry, it is also essential to understand what happens when it doesn’t. Successful matches can perform very differently than unsuccessful ones.

    When grok fails to match an event, it will add a tag to the event. By default, this tag is _grokparsefailure.

    Logstash allows you then to route those events somewhere where they can be counted and reviewed. For example, you can write all the failed matches to a file:

    input {
      # ...
    }
    filter {
      grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{IPV4:ip}:%{WORD:environment}\] %{LOGLEVEL:log_level} %{GREEDYDATA:message}" }
      }
    }
    output {
      if "_grokparsefailure" in [tags] {
        # write events that didn't match to a file
        file { path => "/tmp/grok_failures.txt" }
      } else {
        elasticsearch { }
      }
    }
    

    If you find that there are multiple pattern match failures, you can benchmark those lines and find out their impact on the pipeline throughput.

    We’ll now use a grok expression that is meant to parse apache log lines and study its behaviour. First, we start with an example log entry:

    220.181.108.96 - - [13/Jun/2015:21:14:28 +0000] "GET /blog/geekery/xvfb-firefox.html HTTP/1.1" 200 10975 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
    

    And use the following grok pattern to match it:

    %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}
    

    Now, we’ll compare the matching speed of a successful match against three other log entries which don’t conform to the format, either at the start, the middle, or at the end of the line:

    # beginning mismatch - doesn't start with an IPORHOST
    'tash-scale11x/css/fonts/Roboto-Regular.ttf HTTP/1.1" 200 41820 "http://semicomplete.com/presentations/logs'
    
    # middle mismatch - instead of an HTTP verb like GET or PUT there's the number 111
    '220.181.108.96 - - [13/Jun/2015:21:14:28 +0000] "111 /blog/geekery/xvfb-firefox.html HTTP/1.1" 200 10975 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"'
    
    # end mismatch - the last element isn't a quoted string, but a number
    '220.181.108.96 - - [13/Jun/2015:21:14:28 +0000] "GET /blog/geekery/xvfb-firefox.html HTTP/1.1" 200 10975 "-" 1'
    

    These log lines were benchmarked using the script described at the start, and the result is presented below:

    [chart: matching events per second, successful match vs. beginning/middle/end mismatch]

    We can see that, for this grok expression, depending on the location of the mismatch, the time spent checking that a line doesn’t match can be up to 6 times slower than a regular (successful) match. This helps explain user reports on grok maximizing CPU usage when lines don’t match, like https://github.com/logstash-plugins/logstash-filter-grok/issues/37.

    What can we do about it?

    Fail Faster, Set Anchors

    So now that we understand that match failures are dangerous to your pipeline’s performance, we need to fix them. In regular expression design, the best thing you can do to aid the regex engine is to reduce the amount of guessing it needs to do. This is why greedy patterns are generally avoided, but we’ll come back to that in a bit, as there’s a much simpler change that alters how your patterns are matched.

    Let’s come back to our lovely apache log line…

    220.181.108.96 - - [13/Jun/2015:21:14:28 +0000] "GET /blog/geekery/xvfb-firefox.html HTTP/1.1" 200 10975 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
    

    …which is parsed by the grok pattern below:

    %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}
    

    There’s a performance problem hiding in plain sight, which exists due to the natural expectations of users of the grok plugin: the assumption that the grok expression we wrote will only match our log line from start to finish. In reality, what grok is being told is to “find this sequence of elements within a line of text”.

    Wait, what? That’s right, “within a line of text”. This means that a line such as…

    OMG OMG OMG EXTRA INFORMATION 220.181.108.96 - - [13/Jun/2015:21:14:28 +0000] "GET /blog/geekery/xvfb-firefox.html HTTP/1.1" 200 10975 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" OH LOOK EVEN MORE STUFF
    

    …will still match the grok pattern! The good news is that the fix is simple: we just need to add a couple of anchors!

    Anchors allow you to pin the regular expression to a certain position of the string. By adding the start and end of line anchors (^ and $) to our grok expression, we make sure that we’ll only match those patterns against the whole string from start to finish, and nothing else.

    This is very important in the case of failure to match. If the anchors aren’t in place and the regex engine can’t match a line, it will start trying to find the pattern within substrings of the initial string, hence the performance degradation we saw above.

    So, to see the performance impact, we benchmarked the previous expression against a new one, now with anchors:

    ^%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}$
    

    Here are the results:

    [chart: matching events per second, unanchored vs. anchored patterns]

    It’s a pretty dramatic change in behavior for the non-matching scenarios! Not only did we remove the huge performance drops in the middle and end scenarios, but we also made initial match-failure detection around 10 times faster. Sweet.

    Beware of matching the same thing twice

    You might be tempted to say: “well, all my lines are correctly formatted, so we don’t have failed matches”, but this might not be the case.

    Over time, we’ve seen a very common pattern of grok usage, especially when lines from multiple applications come through a single gateway like syslog which adds a common header to all messages. Let’s take an example: imagine that we have three applications which log using a “common_header: payload” format:

    Application 1: '8.8.8.8 process-name[666]: a b 1 2 a lot of text at the end'
    Application 2: '8.8.8.8 process-name[667]: a 1 2 3 a lot of text near the end;4'
    Application 3: '8.8.8.8 process-name[421]: a completely different format | 1111'
    

    A common grok setup for this would be to match the three formats in one grok:

    grok {
      match => { "message" => [
        '%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{WORD:word_1} %{WORD:word_2} %{NUMBER:number_1} %{NUMBER:number_2} %{DATA:data}',
        '%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{WORD:word_1} %{NUMBER:number_1} %{NUMBER:number_2} %{NUMBER:number_3} %{DATA:data};%{NUMBER:number_4}',
        '%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{DATA:data} \| %{NUMBER:number}'
      ] }
    }
    

    Now notice that even if your applications log correctly, grok will still sequentially try to match the incoming log line against the three expressions, breaking at the first match.

    This means that it’s still important to ensure we skip to the right one as fast as possible, since you’ll always have one failed match for Application 2 and two failed matches for Application 3.

    The first tactic we often see is to tier the grok matching: first match the header, overwrite the message field, then match only the bodies:

    filter {
      grok {
        match => { "message" => '%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{GREEDYDATA:message}' }
        overwrite => "message"
      }
      grok {
        match => { "message" => [
          '%{WORD:word_1} %{WORD:word_2} %{NUMBER:number_1} %{NUMBER:number_2} %{GREEDYDATA:data}',
          '%{WORD:word_1} %{NUMBER:number_1} %{NUMBER:number_2} %{NUMBER:number_3} %{DATA:data};%{NUMBER:number_4}',
          '%{DATA:data} \| %{NUMBER:number}'
        ] }
      }
    }
    

    This alone provides an interesting performance boost, matching lines 2.5x faster than the initial approach. But what if we add our fellow anchors?

    [chart: matching events per second, single grok vs. tiered matching, with and without anchors]

    Interesting! Adding anchors makes both architectures perform equally well! In fact, because of the greatly increased failed match performance, our initial single grok design performs slightly better since there is one less match being executed.
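
    For reference, the anchored version of the original single-grok setup is just the three patterns from before wrapped in ^ and $:

    grok {
      match => { "message" => [
        '^%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{WORD:word_1} %{WORD:word_2} %{NUMBER:number_1} %{NUMBER:number_2} %{DATA:data}$',
        '^%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{WORD:word_1} %{NUMBER:number_1} %{NUMBER:number_2} %{NUMBER:number_3} %{DATA:data};%{NUMBER:number_4}$',
        '^%{IPORHOST:clientip} %{DATA:process_name}\[%{NUMBER:process_id}\]: %{DATA:data} \| %{NUMBER:number}$'
      ] }
    }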

    Ok, so how can I know things aren’t going well?

    We’ve already come to the conclusion that monitoring the existence of “_grokparsefailure” events is essential, but there is more that you can do:

    Since version 3.2.0 of the grok plugin, there are a couple of settings that help you understand when an event is taking a long time to match (or fail to match). Using timeout_millis and tag_on_timeout, it’s possible to set an upper bound on the execution time of the grok match. If this limit is reached, the match is interrupted and the event is tagged, by default, with _groktimeout.

    Using the same conditional strategy we presented before, you can redirect those events to a file or a different index in elasticsearch for later analysis.
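
    A minimal sketch of both ideas combined follows; the timeout value and index name here are placeholders, not recommendations:

    filter {
      grok {
        match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:msg}$" }
        timeout_millis => 500              # abort any single match attempt after 500ms
        tag_on_timeout => "_groktimeout"   # the default tag, shown here for clarity
      }
    }
    output {
      if "_groktimeout" in [tags] {
        # park slow events in a dedicated index for later analysis
        elasticsearch { index => "grok-timeouts-%{+YYYY.MM.dd}" }
      } else {
        elasticsearch { }
      }
    }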

    Another really cool thing that we will be introducing in Logstash 5.0 in the context of metrics is the ability to extract data on the pipeline performance and, most importantly, per plugin statistics. While logstash is running, you can query its API endpoint and see how much cumulative time logstash is spending on a plugin:

    $ curl localhost:9600/_node/stats/pipeline?pretty | jq ".pipeline.plugins.filters"
    [
      {
        "id": "grok_b61938f3833f9f89360b5fba6472be0ad51c3606-2",
        "events": {
          "duration_in_millis": 7,
          "in": 24,
          "out": 24
        },
        "failures": 24,
        "patterns_per_field": {
          "message": 1
        },
        "name": "grok"
      },
      {
        "id": "kv_b61938f3833f9f89360b5fba6472be0ad51c3606-3",
        "events": {
          "duration_in_millis": 2,
          "in": 24,
          "out": 24
        },
        "name": "kv"
      }
    ]
    

    With this information, you can see if grok’s “duration_in_millis” is growing rapidly or not and if the number of failures is increasing, which could serve as a warning flag that some pattern is not well designed or consuming more time than expected.
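
    One way to keep an eye on this (a sketch; adjust the interval and field selection to taste) is to poll the endpoint and pull out just the per-filter failure and timing counters:

    $ watch -n 5 'curl -s localhost:9600/_node/stats/pipeline?pretty | jq ".pipeline.plugins.filters[] | {name, failures, duration: .events.duration_in_millis}"'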

    Conclusion

    Hopefully this blog post will help you understand how grok is behaving and how to improve its throughput. To sum up our conclusions:

      1. Grok may not perform well when a match fails;
      2. Monitor the occurrence of _grokparsefailures and then benchmark their cost;
      3. Use anchors such as ^ and $ to remove ambiguity and aid the regex engine;
      4. Tiered matching increases performance if you don’t use anchors; otherwise don’t bother. When in doubt, measure!
      5. Using either the timeout settings or the upcoming Metrics API allows you to get a better look into how grok is behaving, and serves as a starting point for performance analysis.

    Reposted from: https://www.cnblogs.com/alexhjl/p/7305718.html

  • Logstash patterns ... Add new patterns from the test configuration to the grok pre-filter list RSYSLOGMESSAGE, as in: %{RSYSLOGPREFIX}%{<something>_MSG}. This requires a new rpm and a configuration change in quattor, and should only be done once a pattern is considered stable.
  • GrokToRegex is a simple command-line utility that converts known grok aliases into their corresponding regular-expression values. Installation: to install GrokToRegex, use go get: go get github.com/jrxfive/groktoregex. Usage: once built, groktoregex only needs...
  • OpenGrok Title Fix crx plugin

    2021-04-01 22:52:26
    Language: English. This extension renames the tab title to the file name, and only on OPENGROK sites. Developed for LG internal use only
  • opengrok tools

    2018-04-04 10:12:11
    OpenGrok tooling for Windows 10 plus my own setup write-up; definitely worth it. Includes JDK, Tomcat, OpenGrok, and ctags
  • Logstash plugin. This is a plugin for Logstash. It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want. Documentation: Logstash provides infrastructure to automatically generate documentation for this plugin...
  • Grok is a high-performance JPEG 2000 codec. Library features include: fast random-access sub-image decoding, using TLM and PLT markers when available; full encode/decode support for ICC color profiles; full encode/decode support for XML, IPTC, XMP, and EXIF metadata; support for mono, ...
  • opengrok-src-updater: a companion plugin snap for the opengrok snap
  • postfix-grok-patterns: Logstash configuration and grok patterns for parsing postfix logs. A set of grok patterns for parsing postfix logging with grok, plus a sample Logstash configuration file that applies the grok patterns as a filter. ...
  • Setting up OpenGrok on Ubuntu

    2021-10-16 13:19:26

    OpenGrok platform setup walkthrough

    Preparation

    Software download link: https://pan.baidu.com/s/1kCeXNlj2l3FujyMza3rM0w
    Extraction code: iniy

    Operating system
    Ubuntu 16 is recommended, as this release is relatively stable. (Details not yet updated)

    Python environment
    Python 2.7 or later is recommended as a relatively stable version. (Python installation details not yet updated)

    Java environment
    JDK 1.8 or later is recommended. (Installation details not yet updated)

    You can check the versions with java -version and javac -version.

    Setting up OpenGrok proper:

    Installing Tomcat
    Tomcat introduction: search Baidu

    Software download

    This guide uses the apache-tomcat-8.5.31.tar.gz package as its example.

    Tomcat installation

    The installation directory below is used as the reference throughout. Installing the software under the system root directory is not recommended: after a code repository is synced in later, large index files are generated, and if the root partition is too small the system can easily become unusable.
    Installation directory (created manually): /mnt/code/software_install/opengrok_platform

    Copy the Tomcat package into that directory, then open a terminal with the ctrl+alt+t shortcut. cd into the installation directory and confirm the package is there:

    $ cd /mnt/code/software_install/opengrok_platform
    $ ls -al
    

    The result is shown below (screenshot omitted).
    Unpack the package:

    $ tar xvf apache-tomcat-8.5.31.tar.gz
    

    After extraction, a new directory appears under the installation directory (screenshot omitted).
    Since the extracted directory name is unwieldy, rename it and delete the original package:

    $ mv apache-tomcat-8.5.31 tomcat          # rename the apache-tomcat-8.5.31 directory to tomcat
    $ rm apache-tomcat-8.5.31.tar.gz          # delete the original package
    $ ls -al
    

    The extracted directory now carries the name you chose (screenshot omitted).
    Tomcat is now installed, but it cannot be reached yet because the Tomcat service has not been started. Start it with the following commands in a terminal

    $ cd /mnt/code/software_install/opengrok_platform/tomcat/bin
    $ ./startup.sh
    

    A success message appears once the command completes, meaning Tomcat has started (screenshot omitted).
    After starting Tomcat, confirm that its home page is reachable: open a browser and go to http://localhost:8080; the Tomcat home page should appear (screenshot omitted).
    This shows the Tomcat service is up and reachable. At this point Tomcat can only be accessed locally; other machines cannot reach it. To allow access from other machines, the access configuration has to be changed

    Change local-only access to IP-based access. In a terminal, run

    $ cd /mnt/code/software_install/opengrok_platform/tomcat/conf/
    $ gedit server.xml
    

    When the command runs, the system opens gedit (the default GNOME text editor). Search for the keyword localhost to locate the places to change, replace localhost with your system's IP address, then save and close the editor (screenshot of the result omitted).
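
    As a sketch, assuming the stock Tomcat 8.5 server.xml layout, the edited fragment would look roughly like this (with 192.168.147.75 replaced by your own IP):

    <!-- server.xml: default host changed from localhost to the machine's IP -->
    <Engine name="Catalina" defaultHost="192.168.147.75">
      <Host name="192.168.147.75" appBase="webapps"
            unpackWARs="true" autoDeploy="true">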

    If you don't know your system's IP, you can look it up from a terminal

    $ ifconfig
    

    Your IP appears in the ifconfig output (screenshot omitted).
    Any machine on the LAN can now reach this Tomcat instance. To check, open the following address in a browser on any machine:
    http://192.168.147.75:8080/

    To stop Tomcat, run the following in a terminal

    $ cd /mnt/code/software_install/opengrok_platform/tomcat/bin
    $ ./shutdown.sh
    

    This completes the Tomcat installation and configuration. The next step is installing Ctags.

    Installing Exuberant Ctags:

    Ctags introduction: search Baidu

    Ctags installation
    Ctags is the tagging tool that OpenGrok depends on. Install it from a terminal:

    $ sudo apt-get install exuberant-ctags
    

    With Ctags installed, the next task is installing OpenGrok.

    Installing OpenGrok

    OpenGrok introduction: search Baidu

    OpenGrok download

    OpenGrok installation
    This guide uses the opengrok-1.1-rc27.tar.gz package as its example.
    Place the package in the installation directory created at the start, /mnt/code/software_install/opengrok_platform/. Unpack it and rename the extracted directory, just as with Tomcat, by running the following in a terminal

    $ cd /mnt/code/software_install/opengrok_platform
    $ ls -al
    $ tar xvf opengrok-1.1-rc27.tar.gz
    $ mv opengrok-1.1-rc27 opengrok
    $ rm opengrok-1.1-rc27.tar.gz
    

    The final result is shown below (screenshot omitted).

    OpenGrok is now installed but not yet usable: it still has to be connected to Tomcat. Concretely, copy the source.war file from OpenGrok's lib directory into Tomcat's webapps directory; Tomcat then automatically creates a source directory under webapps. The following commands do this

    $ cd /mnt/code/software_install/opengrok_platform/opengrok/lib
    $ cp source.war  /mnt/code/software_install/opengrok_platform/tomcat/webapps/
    $ cd /mnt/code/software_install/opengrok_platform/tomcat/webapps/
    $ rm source.war
    $ ls -al
    

    The final result is shown below (screenshot omitted).

    If the source directory has not been generated after running those commands, the following sequence works around it

    $ cd /mnt/code/software_install/opengrok_platform/tomcat/webapps/
    $ rm source.war
    $ mkdir source
    $ cd /mnt/code/software_install/opengrok_platform/opengrok/lib
    $ cp source.war /mnt/code/software_install/opengrok_platform/tomcat/webapps/source
    $ cd /mnt/code/software_install/opengrok_platform/tomcat/webapps/source
    $ unzip source.war
    $ rm source.war
    $ ls -al
    

    This likewise produces a source directory under tomcat/webapps. The real purpose of this step is to extract the contents of the source.war archive (a war file is just a zip), which has the same effect as the directory Tomcat generates automatically.

    Testing OpenGrok
    OpenGrok is now fully installed; verify that it starts and works correctly.
    Start OpenGrok with the following commands

    $ cd /mnt/code/software_install/opengrok_platform/opengrok/bin
    $ ./OpenGrok deploy
    

    A success message appears when the command completes (screenshot omitted).

    OpenGrok is now running, and its home page can be opened in a browser at
    http://192.168.147.75:8080/source/
    where the IP address is the local address configured in Tomcat (screenshot of the result omitted).

    This confirms that OpenGrok and Tomcat are connected. The next step is linking in the source code repositories.

    Linking source repositories

    Creating the source-link directories
    With the steps above complete, the OpenGrok platform itself is ready; the next step is linking in the source code. Linking generates source-code index files and code-sync files, which are large; if the root partition is small, this can bring the system down, which is why installing under the root directory was discouraged earlier.
    OpenGrok accesses source code through an index directory and index files rather than reading the source directly, so a source-link directory and an index-file directory have to be created. Open a terminal and run

    $ cd /mnt/code/software_install/opengrok_platform/opengrok
    $ mkdir source         # linked directly to the source code
    $ mkdir data           # holds the source-code index files
    $ mkdir etc            # holds the files required for later code syncs
    $ ls -al
    

    The final result is shown below (screenshot omitted).

    Configuring environment variables

    With the link directories created, the source code has to be attached. Attaching the source, and later indexing runs, both use Tomcat and OpenGrok commands, whose environment variables have not been configured yet. Configure them now by running the following in a terminal

    $ echo "export OPENGROK_SRC_ROOT=/mnt/code/software_install/opengrok_platform/opengrok/source"   >> /etc/profile
    $ echo "export OPENGROK_SRC_ROOT=/mnt/code/software_install/opengrok_platform/opengrok/data"   >> /etc/profile
    $ echo "export OPENGROK_TOMCAT_BASE=/mnt/code/software_install/opengrok_platform/tomcat"   >> /etc/profile
    $ source /etc/profile                     
    $ cat /etc/profile     
    

    The profile file now contains the exported lines above (screenshot omitted).

    Alternatively, the following steps achieve the same result (this variant has not been tried in this guide):

    $ echo "export OPENGROK_SRC_ROOT=/mnt/code/software_install/opengrok_platform/opengrok/source"   >>  ~/.bashrc   
    $ echo "export OPENGROK_SRC_ROOT=/mnt/code/software_install/opengrok_platform/opengrok/data"   >>  ~/.bashrc   
    $ echo "export OPENGROK_TOMCAT_BASE=/mnt/code/software_install/opengrok_platform/tomcat"   >>  ~/.bashrc   
    $ source  ~/.bashrc                        
    $ cat  ~/.bashrc   
    

    What the environment variables mean:
    OPENGROK_SRC_ROOT
    The directory holding the source code viewed through OpenGrok; it links directly to where the source code actually lives
    OPENGROK_DATA_ROOT
    The directory where OpenGrok stores the source-code index files it generates
    OPENGROK_TOMCAT_BASE
    The Tomcat installation path

    Linking the source directories
    OpenGrok reaches the source code through the source directory, in combination with the index files. The link is made with symbolic links. To make the process easier to follow, three temporary source directories were created here and then linked into OpenGrok's source directory (listing of the temporary directories omitted).

    Create the symlinks with the following commands

    $ cd /mnt/code/software_install/opengrok_platform/opengrok/source
    $ ln -s /mnt/code/basin/project/tmp1 /mnt/code/software_install/opengrok_platform/opengrok/source
    $ ln -s /mnt/code/basin/project/tmp2 /mnt/code/software_install/opengrok_platform/opengrok/source
    $ ln -s /mnt/code/basin/project/tmp3 /mnt/code/software_install/opengrok_platform/opengrok/source
    $ ls -al
    

    The result is shown below (screenshot omitted).

    Generating the index files and configuration.xml
    The previous step linked OpenGrok to the source directories, but OpenGrok cannot jump straight into them to look up target code; the source-code index files are still missing (and they also cut code-search time dramatically). Create them as follows

    $ cd /mnt/code/software_install/opengrok_platform/opengrok
    $ java -jar /mnt/code/software_install/opengrok_platform/opengrok/lib/opengrok.jar -P -S -v -s /mnt/code/software_install/opengrok_platform/opengrok/source -d /mnt/code/software_install/opengrok_platform/opengrok/data -I *.java -I *.c -I *.h -I *.cpp -i *.dat -i *.bin -i d:.git -i d:.repo -i d:log -i d:out -W /mnt/code/software_install/opengrok_platform/opengrok/etc/configuration.xml
    

    -i: ignore the given files or directories
    If anything in the command above is unclear, see https://ox0spy.github.io/post/install/setup-opengrok/
    This step can take a long time; the duration depends on the amount of source code, with two hours as a reference point. The source tree used here is small, so it took about a minute. A success message appears in the terminal once the index has been created (screenshot omitted)

    Attaching the configuration.xml file

    Creating the source index in the previous step also generates a configuration file, configuration.xml. This file now has to be attached to Tomcat. Open a terminal and run

    $ cd /mnt/code/software_install/opengrok_platform/tomcat/webapps/source/WEB-INF       
    $ gedit  web.xml 
    

    In the editor, change the CONFIGURATION parameter to the path of the configuration.xml generated in the previous step (screenshot of the edit omitted).
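
    As a sketch, assuming the stock layout of the web.xml shipped inside source.war, the edited fragment would look roughly like:

    <context-param>
      <param-name>CONFIGURATION</param-name>
      <param-value>/mnt/code/software_install/opengrok_platform/opengrok/etc/configuration.xml</param-value>
    </context-param>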

    Visiting OpenGrok again in a browser, at http://192.168.147.75:8080/source/, now shows the new state (screenshot omitted).

    The temporary code directories created earlier have been loaded into OpenGrok, and the source code can now be searched through it

    Syncing code

    When the source code is updated, OpenGrok cannot search the updated portion until the source index is rebuilt. To simplify that step, the rebuild can be scripted.
    Place the script in the /mnt/code/software_install/opengrok_platform/opengrok directory and run

    $ cd /mnt/code/software_install/opengrok_platform/opengrok              
    $ bash update.sh           
    

    Wait for the script to finish; it takes about as long as the original index build. This step is essentially the same as the index creation above, only with the required command wrapped in a script, as sketched below. A success message appears when it completes (screenshot omitted).
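
    The tutorial does not reproduce update.sh itself; a minimal sketch of what such a script could contain (not the author's actual script, just the indexing command from above wrapped in a shell script) is:

    #!/bin/bash
    # update.sh - rebuild the OpenGrok index so newly synced code becomes searchable
    OG=/mnt/code/software_install/opengrok_platform/opengrok
    java -jar $OG/lib/opengrok.jar -P -S -v \
        -s $OG/source -d $OG/data \
        -I '*.java' -I '*.c' -I '*.h' -I '*.cpp' \
        -i '*.dat' -i '*.bin' -i d:.git -i d:.repo -i d:log -i d:out \
        -W $OG/etc/configuration.xml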

    The updated source code can now be searched in OpenGrok again

    References

    [1] https://github.com/oracle/opengrok/wiki/How-to-install-OpenGrok
    [2] http://blog.csdn.net/sauphy/article/details/50301815
    [3] https://blog.csdn.net/luohuiwu/article/details/82382701
    [4] https://blog.csdn.net/luzhenrong45/article/details/52734781
    
  • OpenGrok package: opengrok-0.12.1.5.tar.gz; Tomcat package: apache-tomcat-7.0.40.tar.gz
  • opengrok-1.2.23.tar.gz

    2019-10-07 19:27:25
    opengrok-1.2.23.tar, downloaded from the official site
  • opengrok-1.2.24.tar.gz

    2019-10-07 19:25:52
    opengrok-1.2.24.tar, downloaded from the official site
  • Common Grok patterns

    2020-03-19 16:27:28

    Grok's default patterns

    Logstash ships with 120 built-in patterns; see the patterns directory, where the patterns are grouped with one file per group and the pattern definitions inside each file. Below are just some of the common ones.
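
    For a sense of the format, each file defines a pattern as a name followed by a regular expression, and patterns can reference one another; for instance, the built-in grok-patterns file contains entries like these (abridged):

    USERNAME [a-zA-Z0-9._-]+
    USER %{USERNAME}
    INT (?:[+-]?(?:[0-9]+))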

    Common patterns

    | Pattern | Name | Details | Match examples |
    |---|---|---|---|
    | USERNAME or USER | username | digits, upper/lowercase letters, and the special characters (._-) | 1234, Bob, Alex.Wong |
    | EMAILLOCALPART | email local part | first character is a letter; the rest may be digits, letters, or the special characters (_.+-=:). Pure-numeric QQ mail accounts will not match without modifying the regex | windcoder, windcoder_com, abc-123 |
    | EMAILADDRESS | email address | | windcoder@abc.com, windcoder_com@gmail.com, abc-123@163.com |
    | HTTPDUSER | Apache server user | an EMAILADDRESS or a USERNAME | |
    | INT | integer | includes 0 and positive/negative integers | 0, -123, 43987 |
    | BASE10NUM or NUMBER | decimal number | integers and floats | 0, 18, 5.23 |
    | BASE16NUM | hexadecimal number | integer | 0x0045fa2d, -0x3F8709 |
    | WORD | word | digits and upper/lowercase letters | String, 3529345, ILoveYou |
    | NOTSPACE | string without any whitespace | | |
    | SPACE | whitespace string | | |
    | QUOTEDSTRING or QS | quoted string | | "This is an apple", 'What is your name?' |
    | UUID | standard UUID | | 550E8400-E29B-11D4-A716-446655440000 |
    | MAC | MAC address | Cisco-style as well as common/Windows-style MAC addresses | |
    | IP | IP address | IPv4 or IPv6 address | 127.0.0.1, FE80:0000:0000:0000:AAAA:0000:00C2:0002 |
    | HOSTNAME | IP or host name | | |
    | HOSTPORT | host name (IP) plus port | | 127.0.0.1:3306, api.windcoder.com:8000 |
    | PATH | path | Unix or Windows path format | /usr/local/nginx/sbin/nginx, c:\windows\system32\clr.exe |
    | URIPROTO | URI protocol | | http, ftp |
    | URIHOST | URI host | | windcoder.com, 10.0.0.1:22 |
    | URIPATH | URI path | | //windcoder.com/abc/, /api.php |
    | URIPARAM | URI GET parameters | | ?a=1&b=2&c=3 |
    | URIPATHPARAM | URI path plus GET parameters | | /windcoder.com/abc/api.php?a=1&b=2&c=3 |
    | URI | full URI | | https://windcoder.com/abc/api.php?a=1&b=2&c=3 |
    | LOGLEVEL | log level | log-level strings | Alert, alert, ALERT, Error |

    Date and time patterns

    | Pattern | Name | Match examples |
    |---|---|---|
    | MONTH | month name | Jan, January |
    | MONTHNUM | month number | 03, 9, 12 |
    | MONTHDAY | day of month | 03, 9, 31 |
    | DAY | day-of-week name | Mon, Monday |
    | YEAR | year number | |
    | HOUR | hour number | |
    | MINUTE | minute number | |
    | SECOND | second number | |
    | TIME | time | 00:01:23 |
    | DATE_US | US date | 10-01-1892, 10/01/1892/ |
    | DATE_EU | European date format | 01-10-1892, 01/10/1882, 01.10.1892 |
    | ISO8601_TIMEZONE | ISO8601 timezone | +10:23, -1023 |
    | TIMESTAMP_ISO8601 | ISO8601 timestamp | 2016-07-03T00:34:06+08:00 |
    | DATE | date | US date %{DATE_US} or European date %{DATE_EU} |
    | DATESTAMP | full date plus time | 07-03-2016 00:34:06 |
    | HTTPDATE | default HTTP date format | 03/Jul/2016:00:36:53 +0800 |

    Worked examples:

    Example 1:

    [2020-08-22 12:25:51.441] [TSC_IHU] [ERROR] [c.e.c.t.i.t.s.IhuTsaUplinkServiceImpl] Activation/Bind uplink, query UserSession by Token failure!

    The grok pattern:

    \[%{TIMESTAMP_ISO8601:time}\]\s*%{DATA:thread}\s*\[%{LOGLEVEL:level}\]\s*%{GREEDYDATA:data}

    The output:

    {
      "data": "[c.e.c.t.i.t.s.IhuTsaUplinkServiceImpl] Activation/Bind uplink, query UserSession by Token failure!",
      "level": "ERROR",
      "time": "2020-08-22 12:25:51.441",
      "thread": "[TSC_IHU]"
    }
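
    In a Logstash pipeline, the same pattern would sit inside a grok filter, along the lines of:

    filter {
      grok {
        match => { "message" => "\[%{TIMESTAMP_ISO8601:time}\]\s*%{DATA:thread}\s*\[%{LOGLEVEL:level}\]\s*%{GREEDYDATA:data}" }
      }
    }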

     

    Example 2:

    2020-09-12 14:16:36.320+08:00 INFO 930856f7-c78f-4f12-a0f1-83a2610b2dfc DispatcherConnector ip-192-168-114-244 [Mqtt-Device-1883-worker-18-1] com.ericsson.sep.dispatcher.api.transformer.v1.MessageTransformer {"TraceID":"930856f7-c78f-4f12-a0f1-83a2610b2dfc","clientId":"5120916600003466K4GA1059","username":"LB37622Z3KX609880"}

    The grok pattern:

    %{TIMESTAMP_ISO8601:access_time}\s*%{LOGLEVEL:level}\s*%{UUID:uuid}\s*%{WORD:word}\s*%{HOSTNAME:hostname}\s*\[%{DATA:work}\]\s*(?<api>([\S+]*))\s*(?<TraceID>([\S+]*))\s*%{GREEDYDATA:message_data}

    The output:

    {
      "level": "INFO",
      "work": "Mqtt-Device-1883-worker-18-1",
      "uuid": "930856f7-c78f-4f12-a0f1-83a2610b2dfc",
      "hostname": "ip-192-168-114-244",
      "message_data": "",
      "TraceID": "{\"TraceID\":\"930856f7-c78f-4f12-a0f1-83a2610b2dfc\",\"clientId\":\"5120916600003466K4GA1059\",\"username\":\"LB37622Z3KX609880\"}",
      "api": "com.ericsson.sep.dispatcher.api.transformer.v1.MessageTransformer",
      "word": "DispatcherConnector",
      "access_time": "2020-09-12 14:16:36.320+08:00"
    }

     

    Example 3:

    192.168.125.138 - - [12/Sep/2020:14:10:58 +0800] "GET /backend/services/ticketRemind/query?cid=&msgType=1&pageSize=100&pageIndex=1&langCode=zh HTTP/1.1" 200 91

    The grok pattern:

    %{IP:ip}\s*%{DATA:a}\s*\[%{HTTPDATE:access_time}\]\s*%{DATA:b}%{WORD:method}\s*%{URIPATH:url}%{URIPARAM:param}\s*%{URIPROTO:uri}%{DATA:c}%{NUMBER:treaty}%{DATA:d}\s*%{NUMBER:status}\s*%{NUMBER:latency_millis}

    The output:

    {
      "a": "- -",
      "b": "\"",
      "c": "/",
      "method": "GET",
      "d": "\"",
      "ip": "192.168.125.138",
      "latency_millis": "91",
      "uri": "HTTP",
      "url": "/backend/services/ticketRemind/query",
      "param": "?cid=&msgType=1&pageSize=100&pageIndex=1&langCode=zh",
      "treaty": "1.1",
      "access_time": "12/Sep/2020:14:10:58 +0800",
      "status": "200"
    }

     

    Example 4:

    [08/Nov/2020:11:40:24 +0800] tc-com.g-netlink.net - - 192.168.122.58 192.168.122.58 192.168.125.135 80 GET 200 /geelyTCAccess/tcservices/capability/L6T7944Z0JN427155 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17

    The grok pattern:

    \[%{HTTPDATE:access_time}\] %{DATA:hostname} %{DATA:username} %{DATA:fwd_for} %{DATA:remote_hostname} %{IP:remote_ip} %{IP:local_ip} %{NUMBER:local_port} %{DATA:method} %{DATA:status} %{DATA:uri} %{DATA:query} %{NUMBER:bytes} %{NUMBER:latency_ms}

    The output:

    {
      "method": "GET",
      "local_port": "80",
      "fwd_for": "-",
      "query": "?pageIndex=1&pageSize=2000&vehicleType=0",
      "remote_hostname": "192.168.122.58",
      "uri": "/geelyTCAccess/tcservices/capability/L6T7944Z0JN427155",
      "latency_ms": "17",
      "local_ip": "192.168.125.135",
      "hostname": "tc-com.g-netlink.net",
      "remote_ip": "192.168.122.58",
      "bytes": "21067",
      "access_time": "08/Nov/2020:11:40:24 +0800",
      "username": "-",
      "status": "200"
    }

     

