A rule-based log analyzer & filter

上山打老虎

Last update: Jun 23, 2022

Flog

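A tool that processes text logs according to a set of rules.

Preface

In day-to-day development, the lack of logging conventions leads many people to log indiscriminately, so an unpacked log archive often contains hundreds of thousands of lines.

This flood of logs sharply reduces information density and makes troubleshooting considerably harder.

Previously, the workflow was to pick out the useful lines with tools such as grep and then inspect them one by one, which was slow and tedious. Having finally had enough, I wrote this tool to analyze logs automatically according to rules and strip out the noise.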

Usage

Installation

python setup.py install

Basic usage

flog -r rules.yaml /path/to/1.log /path/to/2.log /path/to/3.log -o /path/to/filtered.log

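Where:

rules.yaml is the rule file.
/path/to/x.log is an original log file; several log files can be passed in at once.
/path/to/filtered.log is the filtered log file. If no file name is given (a bare -o), one is generated automatically.

If you only need the analysis results and do not need to filter the log content, run: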

flog -r rules.yaml /path/to/your.log

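Rule syntax

Basics

name: Rule Name # name of the rule set
patterns: # list of rules
  # Single-line mode: if a line matches ^Hello, print "Match Hello"
  - match: "^Hello"
    message: "Match Hello"
    action: bypass # keep this log entry (it is written to the file given by -o)

  # Multi-line mode: starts at ^Hello and ends at ^End; prints "Match Hello to End" and drops the entry
  - start: "^Hello"
    end: "^End"
    message: "Match Hello to End"
    action: drop

  - start: "Start"
    start_message: "Match Start" # message shown when the match starts
    end: "End"
    end_message: "Match End" # message shown when the match ends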
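
To make the difference between single-line and multi-line rules concrete, here is a minimal Python sketch of how such a rule list could be applied. It is not Flog's actual implementation: the function name apply_rules is hypothetical, and it assumes that every rule carries an explicit action, that unmatched lines are kept, and that a multi-line rule's message is emitted when its end line is reached.

```python
import re


def apply_rules(lines, rules):
    """Return (messages, kept_lines); `rules` mirrors the YAML `patterns` list."""
    messages, kept = [], []
    active = None  # multi-line rule whose block is currently open
    for line in lines:
        rule, events = active, []
        if active is not None:
            # Inside a start/end block: only look for the end pattern.
            if re.search(active["end"], line):
                events = ["message", "end_message"]
                active = None
        else:
            # Outside any block: the first rule whose match/start hits wins.
            for candidate in rules:
                if re.search(candidate.get("match") or candidate["start"], line):
                    rule = candidate
                    break
            if rule is None:
                kept.append(line)  # assumption: unmatched lines pass through
                continue
            if "match" in rule:
                events = ["message"]
            else:
                events = ["start_message"]
                active = rule
        messages.extend(rule[key] for key in events if key in rule)
        if rule.get("action") == "bypass":
            kept.append(line)
    return messages, kept


rules = [
    {"match": "^Hello", "message": "Match Hello", "action": "bypass"},
    {"start": "^Begin", "end": "^End", "message": "Match Begin to End", "action": "drop"},
]
log = ["Hello world", "Begin block", "inside", "End block", "other"]
messages, kept = apply_rules(log, rules)
print(messages)  # ['Match Hello', 'Match Begin to End']
print(kept)      # ['Hello world', 'other']
```

The whole Begin...End block is dropped as one unit, while the single-line match is reported and kept.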

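Pure filtering mode

name: Rule Name
patterns:
  - match: "^Hello" # delete lines that start with Hello
  - start: "^Hello" # multi-line mode: delete everything from Hello through End
    end: "^End"

Filtering log content and printing a message

name: Rule Name
patterns:
  - match: "^Hello" # matches lines that start with Hello
    message: "Match Hello"
    action: drop # delete this log line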

Rule nesting

Only multi-line rules support nesting.

name: Rule
patterns:
  - start: "Response.*{$"
    end: "^}"
    patterns:
      - match: "username = (.*)"
        message: "Current user: {{ captures[0] }}"

Input:

Login Response {
  username = zorro
  userid = 123456
}

Output:

Current user: zorro
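
The nesting behavior can be approximated in a few lines of Python. This is only a sketch under assumptions, not Flog's code: nested_captures is a hypothetical helper, the inner pattern is tried only on lines strictly between the start and end lines, and the start pattern is written without a leading ^ so that it matches the sample line "Login Response {".

```python
import re


def nested_captures(lines, start, end, inner):
    """Collect the capture groups of `inner` for lines inside start..end blocks."""
    results = []
    inside = False
    for line in lines:
        if not inside:
            # Look for the line that opens a block.
            inside = re.search(start, line) is not None
        elif re.search(end, line):
            inside = False  # the block is closed
        else:
            found = re.search(inner, line)
            if found:
                results.append(found.groups())
    return results


log = [
    "Login Response {",
    "  username = zorro",
    "  userid = 123456",
    "}",
]
for captures in nested_captures(log, r"Response.*\{$", r"^\}", r"username = (.*)"):
    print(f"Current user: {captures[0]}")  # Current user: zorro
```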

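action

The action field controls whether a log entry is filtered out. It only takes effect when the -o option is given. Allowed values: drop, bypass.

To make pure filtering rules shorter to write, the default value of action is chosen as follows:

If the rule contains a message, start_message, or end_message field, action defaults to bypass, i.e. the entry is written to the output file.
If the rule contains no message-related field, action defaults to drop, which makes it a pure filtering rule.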
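
The defaulting rule above can be written as a small function. This is an illustrative sketch: effective_action and the dict representation of a rule are assumptions, not part of Flog's documented interface.

```python
MESSAGE_KEYS = ("message", "start_message", "end_message")


def effective_action(rule):
    """Return the action for a rule dict, applying the documented default."""
    if "action" in rule:
        return rule["action"]  # an explicit action always wins
    # No explicit action: rules that print something keep the entry,
    # rules without any message field become pure filters.
    return "bypass" if any(key in rule for key in MESSAGE_KEYS) else "drop"


print(effective_action({"match": "^Hello"}))                          # drop
print(effective_action({"match": "^Hello", "message": "Hi"}))         # bypass
print(effective_action({"match": "^Hello", "message": "Hi", "action": "drop"}))  # drop
```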

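message

The message field prints information to standard output. It supports Jinja template syntax for customizing the output, which makes simple log analysis possible.

The template variables currently supported are:

lines: all matched lines (in multi-line mode)
content: the matched log content
captures: the groups captured by the regular expression (match/start/end)

For example:

name: Rule Name
patterns:
  - match: "^Hello (.*)"
    message: "Match {{captures[0]}}"

For the input "Hello lilei", the terminal shows "Match lilei".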

context

Regular expressions that recur throughout the logs can be factored out into the context field to avoid copying and pasting them, for example:

name: Rule Name

context:
  timestamp: "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}.\\d{3}"
patterns:
  - match: "hello ([^:]*):"
    message: "{{ timestamp }} - {{ captures[0] }}"

Input: 2022-04-08 16:52:37.152 hello world: this is a test message
Output: 2022-04-08 16:52:37.152 - world
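
The example output suggests one plausible reading of context: each entry is a regular expression that is searched in the current line, and the text it matches is exposed to the template under the entry's name. The Python sketch below follows that reading. It is an interpretation rather than confirmed behavior, template_variables is a hypothetical helper, and an f-string stands in for Jinja rendering.

```python
import re


def template_variables(line, context, pattern):
    """Build the variables a message template could see for one log line."""
    variables = {}
    for name, regex in context.items():
        found = re.search(regex, line)
        if found:
            variables[name] = found.group(0)  # text matched by the context regex
    matched = re.search(pattern, line)
    variables["captures"] = matched.groups() if matched else ()
    return variables


context = {"timestamp": r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}"}
line = "2022-04-08 16:52:37.152 hello world: this is a test message"
values = template_variables(line, context, r"hello ([^:]*):")
print(f"{values['timestamp']} - {values['captures'][0]}")
# 2022-04-08 16:52:37.152 - world
```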

Highlighting

Several built-in Jinja filters highlight output in the terminal. Currently available:

black, red, green, yellow, blue, purple, cyan, white, bold, light, italic, underline, blink, reverse, strike

For example:

patterns:
  - match: "Error: (.*)"
    message: "{{ captures[0] | red }}"

Input: Error: file not found
Output: file not found (shown in red in the terminal)
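
Filters of this kind are typically thin wrappers around ANSI escape sequences. The sketch below shows the idea for a subset of the names. It is an assumption about how such filters could be built, not Flog's source. In a Jinja setup, functions like these would be registered in the environment's filters mapping.

```python
# Standard ANSI SGR codes for a subset of the filter names listed above.
ANSI_CODES = {
    "black": 30, "red": 31, "green": 32, "yellow": 33,
    "blue": 34, "purple": 35, "cyan": 36, "white": 37,
    "bold": 1, "italic": 3, "underline": 4, "blink": 5,
    "reverse": 7, "strike": 9,
}
RESET = "\033[0m"


def make_filter(name):
    """Return a function that wraps text in the ANSI sequence for `name`."""
    code = ANSI_CODES[name]

    def apply(text):
        return f"\033[{code}m{text}{RESET}"

    return apply


red = make_filter("red")
print(red("file not found"))  # rendered in red by ANSI-capable terminals
```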

include

Other rule files can be included, for example:

name: Rule
include: base # includes base.yaml or base.yml from the same directory

include accepts one or more files, for example:

name: Rule
include:
  - base
  - ../base
  - base.yaml
  - base/base1
  - base/base2.yaml
  - ../base.yaml
  - /usr/etc/rules/base.yml
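
The include names above can be resolved to candidate files roughly as follows. This is a sketch under assumptions: candidate_paths is a hypothetical helper, relative names are taken to be relative to the directory of the including rule file, and a name without an extension is tried as .yaml first, then .yml.

```python
from pathlib import Path


def candidate_paths(name, rule_dir):
    """List the files an include entry could refer to, in lookup order."""
    path = Path(rule_dir) / name  # an absolute `name` replaces rule_dir
    if path.suffix in (".yaml", ".yml"):
        return [path]  # extension given explicitly
    return [path.with_name(path.name + ext) for ext in (".yaml", ".yml")]


for entry in ("base", "base/base2.yaml", "/usr/etc/rules/base.yml"):
    print(entry, "->", [str(p) for p in candidate_paths(entry, "rules")])
```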

context and patterns are merged in include order. If two context entries share a name, the later one replaces the earlier one.
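
The merge order can be illustrated with plain dictionaries and lists. This is a minimal sketch: merge_rule_sets is a hypothetical name, and the rule sets are assumed to be already parsed from YAML into dicts.

```python
def merge_rule_sets(rule_sets):
    """Merge parsed rule sets in the order given (included files first)."""
    merged = {"context": {}, "patterns": []}
    for rule_set in rule_sets:
        # Later context entries with the same name replace earlier ones.
        merged["context"].update(rule_set.get("context", {}))
        # Patterns are concatenated, preserving include order.
        merged["patterns"].extend(rule_set.get("patterns", []))
    return merged


base = {"context": {"ts": "A", "level": "B"}, "patterns": [{"match": "^base"}]}
main = {"context": {"ts": "C"}, "patterns": [{"match": "^main"}]}
merged = merge_rule_sets([base, main])
print(merged["context"])   # {'ts': 'C', 'level': 'B'}
print(merged["patterns"])  # [{'match': '^base'}, {'match': '^main'}]
```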

License

MIT
