GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser.

Overview

What is it?

GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser. It provides fast and valuable HTTP statistics for system administrators that require a visual server report on the fly. More info at: https://goaccess.io.

Features

GoAccess parses the specified web log file and outputs the data to the X terminal. Features include:

  • Completely Real Time
    All panels and metrics are timed to be updated every 200 ms on the terminal output and every second on the HTML output.

  • Minimal Configuration needed
    You can just run it against your access log file, pick the log format and let GoAccess parse the access log and show you the stats.

  • Track Application Response Time
    Track the time taken to serve the request. Extremely useful if you want to track pages that are slowing down your site.

  • Nearly All Web Log Formats
    GoAccess allows any custom log format string. Predefined options include Apache, Nginx, Amazon S3, Elastic Load Balancing, CloudFront, etc.

  • Incremental Log Processing
    Need data persistence? GoAccess has the ability to process logs incrementally through the on-disk persistence options.

  • Only one dependency
    GoAccess is written in C. To run it, you only need ncurses as a dependency. That's it. It even features its own WebSocket server: http://gwsocket.io/.

  • Visitors
    Determine the number of hits, visitors, bandwidth, and metrics for the slowest running requests by hour or date.

  • Metrics per Virtual Host
    Have multiple Virtual Hosts (Server Blocks)? It features a panel that displays which virtual host is consuming most of the web server resources.

  • Color Scheme Customizable
    Tailor GoAccess to suit your own color taste/schemes. Either through the terminal, or by simply applying the stylesheet on the HTML output.

  • Support for Large Datasets
    GoAccess features the ability to parse large logs due to its optimized in-memory hash tables. It has very good memory usage and pretty good performance. This storage has support for on-disk persistence as well.

  • Docker Support
    Ability to build GoAccess' Docker image from upstream. You can still fully configure it by using volume mapping and editing goaccess.conf. See the Docker section below.

Nearly all web log formats...

GoAccess allows any custom log format string. Predefined options include, but are not limited to:

  • Amazon CloudFront (Download Distribution)
  • Amazon Simple Storage Service (S3)
  • AWS Elastic Load Balancing
  • Combined Log Format (XLF/ELF) Apache | Nginx
  • Common Log Format (CLF) Apache
  • Google Cloud Storage
  • Apache virtual hosts
  • Squid Native Format
  • W3C format (IIS)
  • Caddy's JSON Structured format

Why GoAccess?

GoAccess was designed to be a fast, terminal-based log analyzer. Its core idea is to quickly analyze and view web server statistics in real time without needing to use your browser (great if you want to do a quick analysis of your access log via SSH, or if you simply love working in the terminal).

While the terminal output is the default output, it has the capability to generate a complete, self-contained, real-time HTML report, as well as a JSON, and CSV report.

You can think of it more as a monitoring command-line tool than anything else.

Installation

Build from release

GoAccess can be compiled and used on *nix systems.

Download, extract and compile GoAccess with:

$ wget https://tar.goaccess.io/goaccess-1.4.5.tar.gz
$ tar -xzvf goaccess-1.4.5.tar.gz
$ cd goaccess-1.4.5/
$ ./configure --enable-utf8 --enable-geoip=legacy
$ make
# make install

Build from GitHub (Development)

$ git clone https://github.com/allinurl/goaccess.git
$ cd goaccess
$ autoreconf -fiv
$ ./configure --enable-utf8 --enable-geoip=legacy
$ make
# make install

Distributions

It is easiest to install GoAccess on Linux using the preferred package manager of your Linux distribution. Please note that not all distributions will have the latest version of GoAccess available.

Debian/Ubuntu

# apt-get install goaccess

Note: This will likely install an outdated version of GoAccess. To make sure that you're running the latest stable version of GoAccess, see the alternative option below.

Official GoAccess Debian & Ubuntu repository

$ echo "deb https://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list
$ wget -O - https://deb.goaccess.io/gnugpg.key | sudo apt-key --keyring /etc/apt/trusted.gpg.d/goaccess.gpg add -
$ sudo apt-get update
$ sudo apt-get install goaccess

Note:

  • .deb packages in the official repo are available through HTTPS as well. You may need to install apt-transport-https.

Fedora

# yum install goaccess

Arch Linux

# pacman -S goaccess

Gentoo

# emerge net-analyzer/goaccess

OS X / Homebrew

$ brew install goaccess

FreeBSD

# cd /usr/ports/sysutils/goaccess/ && make install clean
# pkg install sysutils/goaccess

OpenBSD

# cd /usr/ports/www/goaccess && make install clean
# pkg_add goaccess

openSUSE

# zypper ar -f obs://server:http http
# zypper in goaccess

OpenIndiana

# pkg install goaccess

pkgsrc (NetBSD, Solaris, SmartOS, ...)

# pkgin install goaccess

Windows

CowAxess is a GoAccess implementation for Windows systems. It is a packaging of GoAccess, Cygwin and many other related tools to make it a complete and ready-to-use solution for real-time web log analysis, all in a 4 MB package.

If you prefer to go the more tedious route, GoAccess can be used on Windows through Cygwin (see Cygwin's packages) or through the Windows Subsystem for Linux on Windows 10.

Distribution Packages

GoAccess has minimal requirements: it's written in C and requires only ncurses. However, below is a table of some optional dependencies in some distros to build GoAccess from source.

Distro          NCurses            GeoIP (opt)      GeoIP2 (opt)            OpenSSL (opt)
Ubuntu/Debian   libncursesw5-dev   libgeoip-dev     libmaxminddb-dev        libssl-dev
RHEL/CentOS     ncurses-devel      geoip-devel      libmaxminddb-devel      openssl-devel
Arch Linux      ncurses            geoip            libmaxminddb            openssl
Gentoo          sys-libs/ncurses   dev-libs/geoip   dev-libs/libmaxminddb   dev-libs/openssl
Slackware       ncurses            GeoIP            libmaxminddb            openssl

Note: You may need to install build tools like gcc, autoconf, gettext, autopoint etc for compiling/building software from source. e.g., base-devel, build-essential, "Development Tools".

Docker

A Docker image has been updated, capable of directing output from an access log. If you only want to output a report, you can pipe a log from the external environment to a Docker-based process:

cat access.log | docker run --rm -i -e LANG=$LANG allinurl/goaccess -a -o html --log-format COMBINED - > report.html

OR real-time

cat access.log | docker run -p 7890:7890 --rm -i -e LANG=$LANG allinurl/goaccess -a -o html --log-format COMBINED --real-time-html - > report.html

You can read more about using the docker image in DOCKER.md.

Storage

Default Hash Tables

In-memory storage provides better performance at the cost of limiting the dataset size to the amount of available physical memory. GoAccess uses in-memory hash tables. It has very good memory usage and pretty good performance. This storage has support for on-disk persistence as well.

Command Line / Config Options

See options that can be supplied to the command or specified in the configuration file. If specified in the configuration file, long options need to be used without prepending --.
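
As a rough illustration of that rule, the sketch below writes a small configuration file in which each command-line option appears without its leading dashes. The file name my-goaccess.conf and its location are assumptions made for this example; log-format, real-time-html and port are options used elsewhere in this document.

```shell
# Hypothetical config path, used only for this example.
conf="${TMPDIR:-/tmp}/my-goaccess.conf"

# Command-line form:
#   goaccess access.log --log-format=COMBINED --real-time-html --port=7890
# Config-file form: the same long options, one per line, without the "--".
cat > "$conf" <<'EOF'
log-format COMBINED
real-time-html true
port 7890
EOF

# Show the result and how it would be handed to GoAccess.
cat "$conf"
echo "usage: goaccess access.log -o report.html --config-file=$conf"
```

Keeping these settings in a file is mainly a convenience when the same format and port are reused across many invocations.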

Usage / Examples

Note: Piping data into GoAccess won't prompt a log/date/time configuration dialog; you will need to define it beforehand in your configuration file or on the command line.

Getting Started

To output to a terminal and generate an interactive report:

# goaccess access.log

To generate an HTML report:

# goaccess access.log -a > report.html

To generate a JSON report:

# goaccess access.log -a -d -o json > report.json

To generate a CSV file:

# goaccess access.log --no-csv-summary -o csv > report.csv

GoAccess also allows great flexibility for real-time filtering and parsing. For instance, to quickly diagnose issues by monitoring logs since GoAccess was started:

# tail -f access.log | goaccess -

Even better, to filter while keeping the pipe open to preserve real-time analysis, we can make use of tail -f and a pattern-matching tool such as grep, awk, sed, etc.:

# tail -f access.log | grep -i --line-buffered 'firefox' | goaccess --log-format=COMBINED -

Or, to parse from the beginning of the file while keeping the pipe open and applying a filter:

# tail -f -n +0 access.log | grep -i --line-buffered 'firefox' | goaccess -o report.html --real-time-html -

Multiple Log files

There are several ways to parse multiple logs with GoAccess. The simplest is to pass multiple log files to the command line:

# goaccess access.log access.log.1

It's even possible to parse files from a pipe while reading regular files:

# cat access.log.2 | goaccess access.log access.log.1 -

Note: the single dash is appended to the command line to let GoAccess know that it should read from the pipe.

Now if we want to add more flexibility to GoAccess, we can use zcat --force to read compressed and uncompressed files. For instance, if we would like to process all log files access.log*, we can do:

# zcat --force access.log* | goaccess -

Note: On Mac OS X, use gunzip -c instead of zcat.
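
To see what zcat --force does with a mix of plain and compressed logs, here is a minimal sketch that builds two throwaway files and counts the entries that come out of the pipe. The log entries are made up, and the example assumes GNU gzip's zcat; the GoAccess invocation is left as a comment so the snippet runs without it installed.

```shell
# Work in a throwaway directory so no real logs are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# One entry in the current log, one in a rotated log that gets compressed.
echo '127.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET /new HTTP/1.1" 200 10 "-" "curl"' > access.log
echo '127.0.0.1 - - [01/Mar/2016:08:14:04 -0600] "GET /old HTTP/1.1" 200 10 "-" "curl"' > access.log.1
gzip access.log.1    # becomes access.log.1.gz

# --force (-f) passes plain files through unchanged while still
# decompressing the gzip ones, so both entries reach the pipe.
lines=$(zcat -f access.log* | wc -l | tr -d ' ')
echo "entries read: $lines"

# With GoAccess installed, the same stream can be piped in:
#   zcat -f access.log* | goaccess --log-format=COMBINED -
```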

Real-time HTML outputs

GoAccess has the ability to output real-time data in the HTML report. You can even email the HTML file, since it is a single file with no external dependencies. How neat is that!

The process of generating a real-time HTML report is very similar to the process of creating a static report. Only --real-time-html is needed to make it real-time.

# goaccess access.log -o /usr/share/nginx/html/your_site/report.html --real-time-html

To view the report you can navigate to http://your_site/report.html.

By default, GoAccess will use the host name of the generated report. Optionally, you can specify the URL to which the client's browser will connect. See the FAQ for a more detailed example.

# goaccess access.log -o report.html --real-time-html --ws-url=goaccess.io

By default, GoAccess listens on port 7890. To use a different port, specify it as follows (make sure the port is open):

# goaccess access.log -o report.html --real-time-html --port=9870

And to bind the WebSocket server to an address other than 0.0.0.0, you can specify it as:

# goaccess access.log -o report.html --real-time-html --addr=127.0.0.1

Note: To output real time data over a TLS/SSL connection, you need to use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.

Filtering

Working with dates

Another useful pipe would be filtering dates out of the web log.

The following will get all HTTP requests starting on 05/Dec/2010 until the end of the file.

# sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -

Or, using relative dates such as yesterday's or tomorrow's date:

# sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log | goaccess -a -
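
The date substitution in that command can be inspected on its own. The following sketch, which assumes either GNU date or BSD/macOS date is available, prints the sed address that would be generated for "one week ago"; the variable name pattern is just for illustration.

```shell
# Build the escaped dd/Mon/yyyy pattern for one week ago.
# LC_ALL=C keeps the month abbreviation in English, as found in access logs.
# GNU date understands -d '1 week ago'; BSD/macOS date uses -v-1w instead.
pattern=$(LC_ALL=C date -d '1 week ago' '+%d\/%b\/%Y' 2>/dev/null) ||
    pattern=$(LC_ALL=C date -v-1w '+%d\/%b\/%Y')

# This is the address sed would receive in the command above.
echo "sed address: /$pattern/,\$ p"
```

Note that sed only starts printing once a line matches the pattern, so if no request was logged on that exact day, nothing is passed on to GoAccess.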

If we want to parse only a certain time-frame from DATE a to DATE b, we can do:

# sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
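
How a sed address range picks lines is easier to see on a tiny log. The sketch below uses six made-up entries and zero-padded days (05 rather than 5, which would also match 15 or 25) and prints what the range selects.

```shell
# Six made-up entries in chronological order (only the date matters here).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Nov/2010:10:00:00 +0000] "GET /a HTTP/1.1" 200 100
10.0.0.1 - - [05/Nov/2010:10:00:00 +0000] "GET /b HTTP/1.1" 200 100
10.0.0.1 - - [20/Nov/2010:10:00:00 +0000] "GET /c HTTP/1.1" 200 100
10.0.0.1 - - [05/Dec/2010:10:00:00 +0000] "GET /d HTTP/1.1" 200 100
10.0.0.1 - - [05/Dec/2010:11:00:00 +0000] "GET /e HTTP/1.1" 200 100
10.0.0.1 - - [06/Dec/2010:10:00:00 +0000] "GET /f HTTP/1.1" 200 100
EOF

# The range opens at the first line matching the start pattern and closes
# at the first later line matching the end pattern.
selected=$(sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' "$log")
echo "$selected"
```

Here the requests for /b, /c and /d are selected. The range closes on the first entry of 05/Dec, so the later request for /e on the same day is left out; keep this in mind when the end date has many entries.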

If we want to preserve only a certain amount of data and recycle storage, we can keep only a certain number of days. For instance, to keep and show the last 5 days:

# goaccess access.log --keep-last=5

Virtual hosts

Assume your log contains the virtual host field. For instance:

vhost.io:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET /shop HTTP/1.1" 200 615 "-" "Googlebot-Image/1.0"

And you would like to append the virtual host to the request in order to see which virtual host the top urls belong to:

# awk '$8=$1$8' access.log | goaccess -a -
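
To make the effect of that awk expression concrete, this small sketch applies it to the sample entry shown above and prints the rewritten line. The variable names are only for illustration.

```shell
# The sample entry from above, with the virtual host as the first field.
line='vhost.io:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET /shop HTTP/1.1" 200 615 "-" "Googlebot-Image/1.0"'

# Field 8 is the request path; prefix it with field 1 (the virtual host).
# The assignment doubles as the awk pattern: its non-empty result is true,
# so the modified line is printed.
rewritten=$(printf '%s\n' "$line" | awk '$8=$1$8')
echo "$rewritten"
```

After the rewrite, the request reads "GET vhost.io:80/shop HTTP/1.1", so identical paths on different virtual hosts show up as separate entries.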

To do the same, but also use real-time filtering and parsing:

# tail -f access.log | unbuffer -p awk '$8=$1$8' | goaccess -a -

To exclude a list of virtual hosts you can do the following:

# grep -v "`cat exclude_vhost_list_file`" vhost_access.log | goaccess -
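
A variation on the same idea is to let grep read the exclusion list itself with -f, adding -F so that each entry is matched as a fixed string. This is an alternative to the backtick form above, not a replacement for it; the host names and log entries below are invented for the example.

```shell
dir=$(mktemp -d)

# Hypothetical exclusion list: one virtual host per line.
printf '%s\n' 'staging.example.com' 'dev.example.com' > "$dir/exclude_vhost_list_file"

# Made-up log with the virtual host as the first field.
cat > "$dir/vhost_access.log" <<'EOF'
www.example.com:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET / HTTP/1.1" 200 615 "-" "curl"
staging.example.com:80 8.8.4.4 - - [02/Mar/2016:08:14:05 -0600] "GET / HTTP/1.1" 200 615 "-" "curl"
dev.example.com:80 8.8.4.4 - - [02/Mar/2016:08:14:06 -0600] "GET / HTTP/1.1" 200 615 "-" "curl"
EOF

# -f reads one pattern per line, -F treats each as a fixed string
# (so the dots are literal), -v keeps only lines matching none of them.
kept=$(grep -v -F -f "$dir/exclude_vhost_list_file" "$dir/vhost_access.log")
echo "$kept"
```

Only the www.example.com entry survives. Be careful with blank lines in the list: an empty pattern matches every line, so everything would be excluded.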

Files, status codes and bots

To parse specific pages, e.g., page views, html, htm, php, etc. within a request:

# awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -

Note: $7 is the request field for the common and combined log formats (without Virtual Host). If your log includes Virtual Host, then you probably want to use $8 instead. It's best to check which field you are after, e.g.:

# tail -10 access.log | awk '{print $8}'

Or to parse a specific status code, e.g., 500 (Internal Server Error):

# awk '$9~/500/' access.log | goaccess -

Or multiple status codes, e.g., all 3xx and 5xx:

# tail -f -n +0 access.log | awk '$9~/3[0-9]{2}|5[0-9]{2}/' | goaccess -o out.html -
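
The status filters above can be tried on a handful of made-up entries before pointing them at a real log. This sketch uses an exact comparison for a single code and an anchored pattern for the 3xx/5xx classes; both are slightly stricter variants of the commands above, chosen here so that only the status field can match.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET /ok HTTP/1.1" 200 615 "-" "curl"
10.0.0.1 - - [02/Mar/2016:08:14:05 -0600] "GET /moved HTTP/1.1" 301 0 "-" "curl"
10.0.0.1 - - [02/Mar/2016:08:14:06 -0600] "GET /boom HTTP/1.1" 500 12 "-" "curl"
10.0.0.1 - - [02/Mar/2016:08:14:07 -0600] "GET /big HTTP/1.1" 200 5003 "-" "curl"
EOF

# Exact match on one status code (field 9 in the combined format).
only_500=$(awk '$9 == 500' "$log")

# All 3xx and 5xx responses, anchored so the whole field must match.
redirects_and_errors=$(awk '$9 ~ /^[35][0-9][0-9]$/' "$log")

echo "$only_500"
echo "$redirects_and_errors"
```

Matching on the field rather than on the whole line matters here: a plain grep for 500 would also pick up the /big entry because of its 5003-byte response size.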

And to get an estimated overview of how many bots (crawlers) are hitting your server:

# tail -F -n +0 access.log | grep -i --line-buffered 'bot' | goaccess -

Tips

Also, it is worth pointing out that if we want to run GoAccess at lower priority, we can run it as:

# nice -n 19 goaccess -f access.log -a

And if you don't want to install it on your server, you can still run it from your local machine!

# ssh -n root@server 'tail -f /var/log/apache2/access.log' | goaccess -

Note: SSH requires -n so GoAccess can read from stdin. Also, make sure to use SSH keys for authentication as it won't work if a passphrase is required.

Troubleshooting

We receive many questions and issues that have been answered previously.

Incremental log processing

GoAccess has the ability to process logs incrementally through its internal storage and dump its data to disk. It works in the following way:

  1. A dataset must be persisted first with --persist, then the same dataset can be loaded with --restore.
  2. If new data is passed (piped or through a log file), it will append it to the original dataset.

NOTES

GoAccess keeps track of the inodes of all the files processed (assuming files will stay on the same partition). In addition, it extracts a snippet of data from the log along with the last line parsed of each file and the timestamp of the last line parsed, e.g., inode:29627417|line:20012|ts:20171231235059

First, it checks whether the snippet matches the log being parsed; if it does, it assumes the log hasn't changed drastically, e.g., hasn't been truncated. If the inode does not match the current file, it parses all lines. If the current file matches the inode, it then reads the remaining lines and updates the count of lines parsed and the timestamp. As an extra precaution, it won't parse log lines with a timestamp ≤ the one stored.

Piped data works based on the timestamp of the last line read. For instance, it will parse and discard all incoming entries until it finds a timestamp ≥ the one stored.
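
The rule for piped data can be pictured with a few lines of shell. This is only an illustration of the behaviour described above, not GoAccess's actual implementation; the stored timestamp reuses the example value, and the incoming entries are made up.

```shell
# Hypothetical stored timestamp, in the same form as the example above.
stored_ts=20171231235059

# Made-up incoming stream: "<timestamp> <request>" in chronological order.
incoming='20171231235058 GET /already-counted
20171231235059 GET /boundary
20180101000001 GET /new'

# Drop entries until one is at or past the stored timestamp,
# then keep that entry and everything after it.
accepted=$(printf '%s\n' "$incoming" |
    awk -v stored="$stored_ts" 'seen || $1 + 0 >= stored + 0 { seen = 1; print }')
echo "$accepted"
```

In this sketch the first entry is discarded because it predates the stored timestamp, while the boundary entry and the newer one are kept.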

Examples

// last month access log
# goaccess access.log.1 --persist

then, load it with

// append this month access log, and preserve new data
# goaccess access.log --restore --persist

To read persisted data only (without parsing new data)

# goaccess --restore

Contributing

Any help on GoAccess is welcome. The most helpful way is to try it out and give feedback. Feel free to use the GitHub issue tracker and pull requests to discuss and submit code changes.

Enjoy!

Comments
  • Duplicated request entries on "Requested Files" panel

    Hello! I'm currently trying to automate the report generation for a project which contains several deployed nodes, these nodes write to different logs, and I'm generating the report for those logs. Each of these logs relates to a day of accesses and they are rotated daily around 3AM.

    # Concatenate log files into a single file
    zcat -f $LOGS | sort -k $SORT > $PARSED_LOG
    
    # Process log variations
    bash $BASEDIR/process-log-variations.sh $PARSED_LOG
    
    # Import log data into BTREE databases
    goaccess --process-and-exit -p $conf $PARSED_LOG
    
    # Generate HTML from BTREE databases
    goaccess -p $conf -o $REPDIR/report.html
    

    My script receives several log files and concatenates them together, sorts them by the date field, and then uses sed to remove variation fields, such as IDs, hashes, and others from the request URLs, replacing them with the word "$var". A single log file is generated from this process, which I then feed to GoAccess, first persisting the data and then updating the report HTML with data from the last log file.

    The problem is that, after loading 4/5 days of logs, in the "Requested Files" panel, the behavior in the following image happens:

    image

    As you can see, there are several "GET /api/auth/token" entries when only one should appear. Does anyone know why this happens? Is it a bug or am I doing something wrong?

    question log-processing 
    opened by rtista 56
  • ```18435 Bus error``` on a 22 Gbyte CLF [fixed/resolved]

    I'm using 0.9.2 & encountering what seems to be a crash?

     18435 Bus error               goaccess -f logs.log --geoip-city ~/GeoLiteCity3.dat >> logs.html
    

    The log file is rather large 22 Gigabytes / 75-million+ records - at the end of the process (@ ~ [57,000/s]) when it comes to write or sometime into this process. FYI I'm using a working config / setting that works with other smaller files.

    Is there a debug build with gdb or any other tool I can use to debug the issue? Or is there some other limit I may be overlooking? Meanwhile I'll try a build of the latest edition. Many thanks in advance.

    PS - It's probably worthwhile mentioning that the system has 32GB of physical RAM as well as another 32GB of swap.

    bug 
    opened by aphorise 54
  • goaccess not running in --real-time-html

    Hello, I have a problem with "--real-time-html"

    When I use the command:

    root@pa:~# goaccess -f /var/log/apache2/access.log -a -o /var/www/html/index.html --real-time-html

    I get the following in the terminal:

    root@pa:~# goaccess -f /var/log/apache2/access.log -a -o /var/www/html/index.html --real-time-html
    WebSocket server ready to accept new client connections

    But my index.html is a static page whose "Last Updated" shows the time I ran the command; it is not a real-time page.

    So the command

    root@pa:~# goaccess -f /var/log/apache2/access.log -a -o /var/www/html/index.html --real-time-html

    gives me the same result as

    root@pa:~# goaccess -f /var/log/apache2/access.log -a -o /var/www/html/index.html

    Please, help me to make it real time

    question html report websocket-server 
    opened by muagkov 42
  • Freebsd 11.2 - Missing development files for libmaxminddb library

    Hello,

    I installed freebsd 11.2 and i am trying to compile goaccess 1.3. My configure command is:

    sudo ./configure --enable-geoip=mmdb

    However i am getting this error:

    checking for MMDB_open in -lmaxminddb... no
    configure: error: 
        *** Missing development files for libmaxminddb library.
    
    

    The pkg info for this library is:

    $ sudo pkg install libmaxminddb
    you have mail
    Password:
    Updating FreeBSD repository catalogue...
    FreeBSD repository is up to date.
    All repositories are up to date.
    Checking integrity... done (0 conflicting)
    The most recent version of packages are already installed
    
    

    Do you know why goaccess is failing to compile?

    Thanks

    build 
    opened by andygr 39
  • didn't enable real-time-html, but comes out: websocket ready to accept new client

    here is my command: goaccess api.crm.51zan.production.access.log -a -o test.html then comes out: WebSocket server ready to accept new client connections

    in the access.conf, I didn't enable real-time too!!! image

    websocket-server command-line options cron 
    opened by saclin 37
  • Web access returns 400 Invalid Request

    Running the following docker run -p 7890:7890 -v "/mnt/user/appdata/goaccess/data":"/srv/data":rw -v "/mnt/user/appdata/goaccess/html":"/srv/report":rw -v "/mnt/user/appdata/letsencrypt/log/nginx":"/srv/logs":ro allinurl/goaccess goaccess /srv/logs/access.log -o report.html --real-time-html --no-global-config --config-file=/srv/data/goaccess.conf parses the access log and returns WebSocket server ready to accept new client connections.

    Unfortunately, accessing the front-end results in an empty page, as curl confirms:

    # curl http://localhost:7890/ -vvvv
    *   Trying localhost...
    * TCP_NODELAY set
    * Connected to localhost (localhost) port 7890 (#0)
    > GET / HTTP/1.1
    > Host: localhost:7890
    > User-Agent: curl/7.57.0
    > Accept: */*
    > 
    < HTTP/1.1 400 Invalid Request
    * no chunk, no close, no size. Assume close to signal end
    < 
    * Closing connection 0
    

    Using different ports for the front-end and the WebSocket server (--ws-url=0.0.0.0:7891 --port=7890) produces the same result.

    unable to replicate websocket-server docker 
    opened by realies 35
  • No time format was found on your conf file.

    Hi,

    I am currently running on GoAccess v0.9.6 and installed on Centos 6.5 (Final) server.

    whenever i try to run my log using "goaccess -f logfile" (without .log)

    I get the error as below:-

    Fatal error has occurred
    Error occured at: src/parser.c - verify_formats - 1938
    No time format was found on your conf file.
    

    I have changed the date-format and log-format respectively:-

    date-format %d/%b/%Y
    log-format %h %^[%d:%^] "%r" %s %b "%R" "%u"
    

    No luck, and I keep getting the same error... Need your kind advice on this.

    Here is my config file settings:-

    ######################################
    # Time Format Options (required)
    ######################################
    #time-format %H:%M:%S
    #time-format %T
    #   
    #   
    #time-format %f
    
    ######################################
    # Date Format Options (required)
    ######################################
    # Apache log date format. The following date format works with any 
    # of the Apache's log formats below.
    #   
    date-format %d/%b/%Y
    #   
    #date-format %Y-%m-%d
    #   
    #   
    #date-format %f
    
    ######################################
    # Log Format Options (required)
    ######################################
    #   
    #   
    # NCSA Combined Log Format
    #   
    #log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
    log-format %h %^[%d:%^] "%r" %s %b "%R" "%u"
    

    Thanks.

    question 
    opened by coverguy 34
  • not able to open report.html

    I am able to see interactive screen with all statistics for my access log but not able to load the html from browser. It says loading but never loads. Please help me out with loading the html report generated.

    command used:

    /usr/local/bin/goaccess /opt/IBMIHS/logs/webqa/access_log.20190321 --log-format='%h %v %U %s [%d:%t %^] %D "%u"' --date-format='%d/%b/%Y' --time-format=%T --enable-panel=REFERRERS --enable-panel=KEYPHRASES

    command used for generating report.html:

    /usr/local/bin/goaccess /opt/IBMIHS/logs/webqa/access_log.20190321 --log-format='%h %v %U %s [%d:%t %^] %D "%u"' --date-format='%d/%b/%Y' --time-format=%T --enable-panel=REFERRERS --enable-panel=KEYPHRASES -o report.html

    html report 
    opened by partsauthority 32
  • Error occured at: goaccess.c - main - Nothing valid to process.

    Hello! I'm using goaccess to track evemilano's log but I have some problem :( Any idea how to solve this?

    GoAccess - version 0.8.3 - Oct 24 2014 17:27:50

    Fatal error has occurred
    Error occured at: goaccess.c - main - 832
    Nothing valid to process.
    

    LOG example:

    162.158.151.53 - - [03/Feb/2016:11:02:50 -0500]  "GET /servizi-web-marketing/ HTTP/1.1" 500 278 "https://www.evemilano.com/servizi-seo/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36" 0.128 0.128 . BYPASS
    141.101.105.70 - - [03/Feb/2016:11:02:57 -0500]  "POST /wp-cron.php?doing_wp_cron=1454515377.2718749046325683593750 HTTP/1.1" 200 31 "-" "WordPress/4.4.2; https://www.evemilano.com" 0.156 0.156 . -
    162.158.151.53 - - [03/Feb/2016:11:02:58 -0500]  "GET / HTTP/1.1" 200 15326 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36" 1.437 1.437 . BYPASS
    
    question 
    opened by evemilano 32
  • GoAccess 0.9.5 crashed by Signal 11

    ==48347== GoAccess 0.9.5 crashed by Signal 11
    ==48347==
    ==48347== VALUES AT CRASH POINT
    ==48347==
    ==48347== Line number: 84899
    ==48347== Offset: 84642
    ==48347== Invalid data: 183
    ==48347== Piping: 0
    ==48347== Response size: 0 bytes
    ==48347==
    ==48347== STACK TRACE:
    ==48347==
    ==48347== 0 goaccess(sigsegv_handler+0x166) [0x408446]
    ==48347== 1 /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x7f95d7f39cb0]
    ==48347== 2 /lib/x86_64-linux-gnu/libc.so.6(+0x1314d7) [0x7f95d7c9d4d7]
    ==48347== 3 goaccess(str2enum+0x39) [0x407679]
    ==48347== 4 goaccess(ignore_panel+0x28) [0x414ab8]
    ==48347== 5 goaccess() [0x40c697]
    ==48347== 6 goaccess() [0x40ccba]
    ==48347== 7 goaccess(main+0x1e0) [0x4055b0]
    ==48347== 8 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f95d7b8d76d]
    ==48347== 9 goaccess() [0x406205]
    ==48347==
    ==48347== Please report it by opening an issue on GitHub:
    ==48347== https://github.com/allinurl/goaccess/issues
    
    bug 
    opened by jps3 32
  • Issue processing multiple logs in real-time in version 1.4.1

    Hi @allinurl .

    I have deployed the new version 1.4.1 on my servers. Just yesterday, I had to do some statistics for a new partner. And everything was ok, for statistical LOGs.

    The good, the bad and the ugly

    However, this morning I checked the statistics for my sites in real-time. In all instances, I was overloaded to run GoAccess. What used to be only 40% of CPU processing in previous version, today is around 100%. The only exception was the WordPress instance, which has only 1 LOG.

    So, I found the prank, in the code below:

    static void
    perform_tail_follow (uint64_t * size1, const char *fn) {
    ...
      len = MIN (glog->snippetlen, size2);
      if ((fread (buf, len, 1, fp)) != 1 && ferror (fp))
        FATAL ("Unable to fread the specified log file '%s'", fn);
    
      if (glog->snippet[0] != '\0' && buf[0] != '\0' && memcmp (glog->snippet, buf, len) != 0)
        *size1 = glog->bytes = 0;
    

    The code looks ok. But, wait... There is only one variable/value glog->snippet and I have 4 LOGs for real-time processing.

    Therefore, GoAccess will continue processing the last LOG. But for the other 3 LOGs it will always reprocess from the beginning. As I write here, my sites are registering more than 600 million requests for each. My Gosh.

    The problem does not occur for the processing of the LOGs in non-real-time, because the read_log routine calls set_initial_persisted_data ~~after each record~~ for each LOG, before processing them. The same does not happen in perform_tail_follow and therefore glog->snippet has the value of only the last LOG. And so, the behaviour will be ok if you have only 1 LOG.

    I hope I was clear. :)

    bug log-processing 
    opened by 0bi-w6n-K3nobi 31
  • Parse custom nginx JSON log

    I have a simple log but I cannot figure out how to parse the datetime component

    sample line

    {"time":1673085237.244,"client":"10.10.10.10","method":"GET","request":"GET /search/? HTTP/1.1","request_length":390,"status":"307","bytes_sent":238,"gzip_ratio":"","body_bytes_sent":0,"referer":"","user_agent":"python-requests/2.28.1","accept_encoding":"gzip","accept":"*/*","request_time":0.002}
    
    goaccess sample_access_logs.txt -o report.html --date-format %* --time-format %* --log-format '{"time":%T,"client":"%h","method":"%m","request":"%r","request_length":%^,"status":"%s","bytes_sent":"%b","gzip_ratio":"%^","body_bytes_sent":%^,"referer":"%R","user_agent":"%u","accept_encoding":"%^","accept":"%^","request_time":%L}'
    

    This produces the output

    A valid date is required
    Format Errors - Verify your log/date/time format
    

    If I change it to

    goaccess sample_access_logs.txt -o report.html --date-format %* --time-format %* --log-format '{"time":%d,"client":"%h","method":"%m","request":"%r","request_length":%^,"status":"%s","bytes_sent":"%b","gzip_ratio":"%^","body_bytes_sent":%^,"referer":"%R","user_agent":"%u","accept_encoding":"%^","accept":"%^","request_time":%L}'
    
    
    Token '1673085237.244' doesn't match specifier '%d'
    
    question log/date/time format 
    opened by mannickutd 1
  • vuln test requests flagged as invalid requests

    requests like the following are ending up in the invalid-requests.log file (goaccess 1.7):

    aaa.bbb.ccc.ddd - - [31/Dec/2022:06:27:37 +0000] "GET /api/?a=proxy:unix:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa|http://reddit.com/? HTTP/1.1" 404 3951 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2866.71 Safari/537.36" "-" 0.706 TLSv1.2
    

    Not a big loss that it isn't counted in the stats, but I don't see anything invalid about it per se...

    log-processing 
    opened by minusf 1
  • Link release notes

    Could you just add the static link to https://goaccess.io/release-notes in every new release (or label, as releases are not used)?

    This way one could easily check what actually has changed in a new version.

    question documentation 
    opened by bcutter 4
  • Multiple log file support

    I love GoAccess so far, and I've set it up via Docker to monitor the log file of an Nginx Proxy Manager instance. 👍

    The instance, however, has multiple log files that use different log formats, and it seems (I could be wrong) that GoAccess can't support that in the goaccess.conf file when using the real-time HTML setup on a hosted port.

    It would be awesome to view multiple logs, perhaps in a tabbed view. I'm more than happy to provide the default log formats that GoAccess can read for this, as I do have it working one at a time but not both.

    duplicate enhancement log/date/time format log-processing 
    opened by JS-E 2
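To make the mixed-format problem in this request concrete, here is a minimal Python sketch that decides, line by line, which of two log formats a line matches. It only illustrates the idea and does not describe how GoAccess parses logs. The two patterns are simplified assumptions modelled on the common and combined access log layouts, and the names `FORMATS`, `detect_format` and `split_by_format` are hypothetical.

```python
import re

# Hypothetical, simplified patterns. Real log formats may carry more fields.
FORMATS = {
    "combined": re.compile(
        r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
        r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
    ),
    "common": re.compile(
        r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
        r'(?P<status>\d{3}) (?P<size>\S+)$'
    ),
}


def detect_format(line):
    """Return the name of the first pattern that matches the line, or None."""
    for name, pattern in FORMATS.items():
        if pattern.match(line):
            return name
    return None


def split_by_format(lines):
    """Group lines by detected format; unmatched lines go under None."""
    groups = {}
    for line in lines:
        groups.setdefault(detect_format(line), []).append(line)
    return groups
```

Each group could then be handed to a separate analysis run with the matching log format, which is roughly what a per-file format setting would have to do.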
  • Treat files as vhosts

    Seeing stats per vhost is useful, especially to identify one site on a server that is being attacked or hogging resources.

    A --files-as-vhosts option, when combined with multiple input files, would effectively give this feature without modifying the log format.

    This would be especially useful on servers using control panels like cPanel or Plesk, where you could glob the input files like goaccess /var/log/apache2/domlogs// or goaccess /var/www/vhosts//logs/accesslog

    enhancement 
    opened by ollybee 6
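The idea in this request can be sketched in a few lines of Python: derive a virtual host label from each input file's name and count hits per label, leaving the log lines themselves untouched. This is an illustration under stated assumptions, not GoAccess behavior. It assumes one log file per site, named after the site with an optional `.log` suffix, and the helper names `vhost_from_path` and `hits_per_vhost` are hypothetical.

```python
import os
from collections import Counter


def vhost_from_path(path):
    """Derive a vhost label from a log file path.

    Assumption: the file name, minus a trailing ".log", names the site.
    """
    name = os.path.basename(path)
    return name[:-4] if name.endswith(".log") else name


def hits_per_vhost(lines_by_file):
    """Count non-empty log lines per derived vhost.

    lines_by_file maps a file path to an iterable of that file's lines.
    """
    counts = Counter()
    for path, lines in lines_by_file.items():
        vhost = vhost_from_path(path)
        counts[vhost] += sum(1 for line in lines if line.strip())
    return counts
```

For layouts where the site name is a directory rather than the file name, `vhost_from_path` would need to pick a different path component.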
  • Added GeoLite2-City.mmdb but showing country GeoLocation

    I am using GoAccess version 1.6, built with the configure arguments --enable-utf8 --enable-geoip=mmdb. I have set geoip-database in the config file and also tried adding --geoip-database=/usr/local/share/GeoIP/GeoLite2-City.mmdb to the command-line arguments, but the HTML page always shows the country database: GEO LOCATION CONTINENT > COUNTRY SORTED BY UNIQUE HITS [, AVGTS, CUMTS, MAXTS]

    duplicate enhancement question 
    opened by joginder89 1
Owner
Gerardo O.
The one behind goaccess & gwsocket.
Real-time metrics for nginx server

ngxtop - real-time metrics for nginx server (and others) ngxtop parses your nginx access log and outputs useful, top-like, metrics of your nginx serve

Binh Le 6.4k Dec 22, 2022
System monitor - A python-based real-time system monitoring tool

System monitor A python-based real-time system monitoring tool Screenshots Installation Run My project with these commands pip install -r requiremen

Sachit Yadav 4 Feb 11, 2022
Was an interactive continuous Python profiler.

☠ This project is not maintained anymore. We highly recommend switching to py-spy which provides better performance and usability. Profiling The profi

What! Studio 3k Dec 27, 2022
Automatically monitor the evolving performance of Flask/Python web services.

Flask Monitoring Dashboard A dashboard for automatic monitoring of Flask web-services. Key Features • How to use • Live Demo • Feedback • Documentatio

null 663 Dec 29, 2022
Yet Another Python Profiler, but this time thread&coroutine&greenlet aware.

Yappi Yet Another Python Profiler, but this time thread&coroutine&greenlet aware. Highlights Fast: Yappi is fast. It is completely written in C and lo

Sümer Cip 1k Jan 1, 2023
🚴 Call stack profiler for Python. Shows you why your code is slow!

pyinstrument Pyinstrument is a Python profiler. A profiler is a tool to help you 'optimize' your code - make it faster. It sounds obvious, but to get

Joe Rickerby 5k Jan 1, 2023
A watchdog providing peace of mind that your Chia farm is running smoothly 24/7.

Photo by Zoltan Tukacs on Unsplash Watchdog for your Chia farm So you've become a Chia farmer and want to maximize the probability of getting a reward

Martin Mihaylov 466 Dec 11, 2022
Watch your Docker registry project size, then monitor it with Grafana.

Watch your Docker registry project size, then monitor it with Grafana.

Nova Kwok 33 Apr 5, 2022
Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat

Tomasz Wójcik 300 Dec 26, 2022
ASGI middleware to record and emit timing metrics (to something like statsd)

timing-asgi This is a timing middleware for ASGI, useful for automatic instrumentation of ASGI endpoints. This was developed at GRID for use with our

Steinn Eldjárn Sigurðarson 99 Nov 21, 2022
Cross-platform lib for process and system monitoring in Python

Home Install Documentation Download Forum Blog Funding What's new Summary psutil (process and system utilities) is a cross-platform library for retrie

Giampaolo Rodola 9k Jan 2, 2023
Development tool to measure, monitor and analyze the memory behavior of Python objects in a running Python application.

README for pympler Before installing Pympler, try it with your Python version: python setup.py try If any errors are reported, check whether your Pyt

null 996 Jan 1, 2023
Scalene: a high-performance, high-precision CPU and memory profiler for Python

scalene: a high-performance CPU and memory profiler for Python by Emery Berger 中文版本 (Chinese version) About Scalene % pip install -U scalene Scalen

Emery Berger 138 Dec 30, 2022
poetry2nix turns Poetry projects into Nix derivations without the need to actually write Nix expressions

poetry2nix poetry2nix turns Poetry projects into Nix derivations without the need to actually write Nix expressions. It does so by parsing pyproject.t

Nix community projects 405 Dec 29, 2022
Minecraft.nix - Command line Minecraft launcher managed by nix

minecraft.nix Inspired by this thread, this flake contains derivations of both v

null 12 Sep 6, 2022
Image-Viewer is a Windows image viewer based on Python 3.

Image-Viewer Hi! Image-Viewer is a Windows image viewer based on Python 3. Using You must download Image-Viewer.exe from the root of the repository. T

null 2 Apr 18, 2022
Napari 3D Ortho Viewer - an ortho viewer for napari for 3D images

napari-3d-ortho-viewer Napari 3D Ortho Viewer - an ortho viewer for napari for 3D images This napari plugin was generated with Cookiecutter using @nap

niklas netter 5 Nov 28, 2022
Vpw analyzer - A visual J1850 VPW analyzer written in Python

VPW Analyzer A visual J1850 VPW analyzer written in Python Requires Tkinter, Pan

null 7 May 1, 2022
lfb (light file browser) is a terminal file browser

lfb (light file browser) is a terminal file browser. The whole program is a mess as of now. In the future I will remove the need for external dependencies, tidy up the code, make an actual readme, add documentation, and change the name.

null 2 Apr 9, 2022