Visualise Ansible execution time across playbooks, tasks, and hosts.

Overview

ansible-trace

Visualise where time is spent in your Ansible playbooks: what tasks, and what hosts, so you can find where to optimise and decrease playbook latency.

An Ansible Callback Function which traces the execution time of Ansible playooks, outputting Chrome's Trace Event Format for visualising in the Perfetto in-browser trace UI.

Here's a trace of me deploying to my home Raspberry Pi cluster, with the default strategy: linear. You can see that now all the tasks are synchronized across hosts, with each host waiting for the slowest host before proceeding to the next task:

Perfetto window showing tasks all happening synchronized

Here's the same playbook ran with strategy: free so fast hosts run to completion without waiting for slow hosts:

Perfetto window showing durations

You can click on tasks to see details about them:

Perfetto window showing details showing arguments and filename of task

Interactive Example

  1. Download (Right-click -> Save Link As) example-trace.json.
  2. Open https://ui.perfetto.dev/, and drag and drop in the downloaded example-trace.json.

Usage

  1. Copy trace.py into your Ansible's callback_plugins directory, or in other positions Ansible accepts, e.g.:

    ansible-root
    ├── ansible.cfg
    ├── site.yml
    └── callback_plugins
        └── trace.py
    
  2. Enable the trace callback plugin in your ansible.cfg:

    [defaults]
    callback_enabled = trace

    Or, enable it at the top of your playbook yml:

    ansible:
      env:
        CALLBACKS_ENABLED: trace
        TRACE_OUTPUT_DIR: .
        TRACE_HIDE_TASK_ARGUMENTS: True
  3. Run your Ansible Playbook:

    $ ansible-playbook site.yml

    This will output trace.json in the TRACE_OUTPUT_DIR, defaulting to your current working directory.

  4. Open https://ui.perfetto.dev/, and drag-and-drop in the trace.json.

    You don't have to wait for the trace to finish; you can open in-progress trace files.

Other Trace Viewers

Perfetto is the most mature trace viewer, but here are some other options:

  • chrome://tracing (aka Catapult Trace Viewer) is the older version of Perfetto. Supports generating a standalone HTML page.
  • Speedscope can open the traces, but only shows one host at a time.
  • Firefox Profiler can open the traces, showing trace spans in the "Marker Chart" tab: example.

Other Ansible Profiling Tools

profile_tasks

ansible.posix.profile_tasks displays task timing as console output, but can't visualise gaps in the timing (e.g. with strategy: linear when fast hosts wait for slow hosts).

ansible-playbook -vvvv

Adding extra vs adds more debug info, -vvvv enables connection debugging.

Mitogen for Ansible

Mitogen promises to speed up your Ansible playbooks with a persistent interpreter. They profile their runs for bandwidth an time by analysing network packet captures.

You need to install from HEAD to support latest Ansible versions, because there hasn't been a tagged release since 2019.

Comments
  • feat(integration tests): Utilities for testing ansible-trace

    feat(integration tests): Utilities for testing ansible-trace

    Hello everyone :wave: !

    This is the first step for #10

    This proposition allow creating tests using pytest framework where we can chose strategy, inventory for the same playbook file (see tests/integration/tests/basic.py) I have also made some utilities functions that parse trace to ensure its integrity. This include also related github action CI.yml (here a sample)

    Waiting for your feedback, Thanks!

    opened by louisquentinjoucla 2
  • Cannot open trace file with either firefox profiler or chrome

    Cannot open trace file with either firefox profiler or chrome

    Actual Behavior

    Chrome:

    Cannot open file. Error Message: image

    Firefox Profiler

    Cannot open file. Error Message: image

    Perfetto UI

    Trace is shown correctly: image

    Expected Behavior

    trace files can be open opened by all viewers that are mentioned in the readme, not just perfetto UI.

    Versions

    • ansible: ansible 4.6.0 ansible-core 2.11.6
    • ansible-trace: main commit 8f9270205172554f760fe8bc61089f6b855fdbef
    • python: Python 3.9.2

    simple playbook used for testing

    ---
    - hosts: localhost
      gather_facts: true
      tasks:
        - name: "output if not debian 11"
          debug:
            msg: "test"
          when: 'not (ansible_distribution == "Debian" and ansible_distribution_major_version == "11")'
    
        - name: "output if debian 11"
          debug:
            msg: "test"
          when: (ansible_distribution == "Debian" and ansible_distribution_major_version == "11")
    

    output of trace callback for test playbook

    trace.json

    [
    {
      "args": {
        "name": "localhost"
      },
      "cat": "process",
      "name": "process_name",
      "ph": "M",
      "pid": 1
    }[
    {
      "args": {
        "args": null,
        "host": "localhost",
        "path": "/home/user/repos/ansible-test/testenv-test.yml:2",
        "task": "Gathering Facts"
      },
      "cat": "runner",
      "id": 6508616308570991776,
      "name": "Gathering Facts",
      "ph": "B",
      "pid": 1,
      "ts": 1634299695390039.8
    }[
    {
      "args": {
        "status": "ok"
      },
      "cat": "runner",
      "id": 6508616308570991776,
      "name": "Gathering Facts",
      "ph": "E",
      "pid": 1,
      "ts": 1634299696186142.0
    }[
    {
      "args": {
        "args": null,
        "host": "localhost",
        "path": "/home/user/repos/ansible-test/testenv-test.yml:5",
        "task": "output if not debian 11"
      },
      "cat": "runner",
      "id": 746051916975514963,
      "name": "output if not debian 11",
      "ph": "B",
      "pid": 1,
      "ts": 1634299696197881.8
    }[
    {
      "args": {
        "status": "skipped"
      },
      "cat": "runner",
      "id": 746051916975514963,
      "name": "output if not debian 11",
      "ph": "E",
      "pid": 1,
      "ts": 1634299696210522.0
    }[
    {
      "args": {
        "args": null,
        "host": "localhost",
        "path": "/home/user/repos/ansible-test/testenv-test.yml:10",
        "task": "output if debian 11"
      },
      "cat": "runner",
      "id": 1296121251497600435,
      "name": "output if debian 11",
      "ph": "B",
      "pid": 1,
      "ts": 1634299696215914.5
    }[
    {
      "args": {
        "status": "ok"
      },
      "cat": "runner",
      "id": 1296121251497600435,
      "name": "output if debian 11",
      "ph": "E",
      "pid": 1,
      "ts": 1634299696230237.2
    }
    ]
    

    modified trace file that i could open with perfettoUI, chrome and firefox profiler

    [
    {
      "args": {
        "name": "localhost"
      },
      "cat": "process",
      "name": "process_name",
      "ph": "M",
      "pid": 1
    },
    {
      "args": {
        "args": null,
        "host": "localhost",
        "path": "/home/user/repos/ansible-test/testenv-test.yml:2",
        "task": "Gathering Facts"
      },
      "cat": "runner",
      "id": 6508616308570991776,
      "name": "Gathering Facts",
      "ph": "B",
      "pid": 1,
      "ts": 1634299695390039.8
    },
    {
      "args": {
        "status": "ok"
      },
      "cat": "runner",
      "id": 6508616308570991776,
      "name": "Gathering Facts",
      "ph": "E",
      "pid": 1,
      "ts": 1634299696186142.0
    },
    {
      "args": {
        "args": null,
        "host": "localhost",
        "path": "/home/user/repos/ansible-test/testenv-test.yml:5",
        "task": "output if not debian 11"
      },
      "cat": "runner",
      "id": 746051916975514963,
      "name": "output if not debian 11",
      "ph": "B",
      "pid": 1,
      "ts": 1634299696197881.8
    },
    {
      "args": {
        "status": "skipped"
      },
      "cat": "runner",
      "id": 746051916975514963,
      "name": "output if not debian 11",
      "ph": "E",
      "pid": 1,
      "ts": 1634299696210522.0
    },
    {
      "args": {
        "args": null,
        "host": "localhost",
        "path": "/home/user/repos/ansible-test/testenv-test.yml:10",
        "task": "output if debian 11"
      },
      "cat": "runner",
      "id": 1296121251497600435,
      "name": "output if debian 11",
      "ph": "B",
      "pid": 1,
      "ts": 1634299696215914.5
    },
    {
      "args": {
        "status": "ok"
      },
      "cat": "runner",
      "id": 1296121251497600435,
      "name": "output if debian 11",
      "ph": "E",
      "pid": 1,
      "ts": 1634299696230237.2
    }
    ]
    
    opened by flodabo 2
  • feat(play): Add play level trace span

    feat(play): Add play level trace span

    Hey :wave: ,

    This PR close #3 ! Here how it does looks like with all playbooks with strategy free: image

    It is not accurate as we lack of callback to achieve this, but here how it is done: The play duration event will only be created right before when a task for a given host will start. At the start of a new play, we add an E event for all the plays we created before.

    We do one as well at the very end to handle the last play condition.

    Even if it is not accurate, it gives a better pictures of what is going on.

    Let me know if it is ok for you!

    opened by louisquentinjoucla 1
  • Handle task name interpolation

    Handle task name interpolation

    Hi everyone :wave: ,

    With Ansible you are able to interpolate task name in certain circumstances. One is when you use include_tasks, you are able to interpolate your tasks name in the included file like the following:

    ---
    
    - name: "Echo {{ include_message }}"
      shell: "echo {{ include_message }}"
    

    Unfortunately the task name is not yet interpolated in hook v2_runner_on_start for linear strategy, which cause B event to have a different name than the E event. As the E & B events name differs, the end is not catch, here is an example:

    image

    Fortunately v2_playbook_on_task_start hook (which is triggered before v2_runner_on_start) returns the interpolated name. We could use that.

    opened by louisquentinjoucla 1
  • Support for python 3.6

    Support for python 3.6

    I have been experiencing the following warning with ansible-trace using python3.6:

    [WARNING]: Failure using method (v2_runner_on_failed) in callback plugin (<ansible_collections.mhansen.ansible_trace.plugins.callback.trace.CallbackModule object at 0x7fcddc736358>):
    module 'time' has no attribute 'time_ns'
    

    It causes not having trace at all, such as:

    [
    {
      "args": {
        "name": "localhost"
      },
      "cat": "process",
      "name": "process_name",
      "ph": "M",
      "pid": 1
    }
    ]
    

    This issue is caused by the fact that python have implemented time.time_ns() since 3.7 only.

    For python 3.6 we could be using time.time() * 1000000 instead of time.time_ns() / 1000 when time_ns is not present in time package, such as:

    "ts": time.time_ns() / 1000 if "time_ns" in time.__dict__ else time.time() * 100000,
    

    for the follwing lines:

    • https://github.com/mhansen/ansible-trace/blob/main/plugins/callback/trace.py#L120
    • https://github.com/mhansen/ansible-trace/blob/main/plugins/callback/trace.py#L142

    The precision level seems to be good enough to don't impact the behavior of ansible-trace:

    image

    opened by Xisabla 1
  • keep multiple traces

    keep multiple traces

    Currently this plugin keep only one trace.json, thus not allowing to run multiple plays and have the trace for each of them.

    I suggest to :

    • change the default output directory to ./trace (or something else) to avoid cluttering the current dir with too many files
    • change the output file to something like play-<timestamp>-<playbook_name>.json
    • add an option to set a rotation for these trace files (ie. keep only 30 of them)

    I already have proof of concept locally, if you are OK with i can submit a pull request that we can discuss further.

    opened by nikaro 0
  • Wrong time compute

    Wrong time compute

    :wave: , To compute the timestamp (in microseconds) we use:

    time.time_ns() / 1000  if "time_ns" in time.__dict__ else time.time() * 100000
    

    While the first one does the right conversion, the other one does not as time.time() gives time as second. So we need 10^6 and not 10^5

    opened by louisquentinjoucla 1
  • feat(spans): Add role level spans, item spans, rework test

    feat(spans): Add role level spans, item spans, rework test

    :wave: !

    Close #18 Close #1 Close #23

    Rework tests to take in account there can be multiple duration events with the same uuid As we are lacking of callback this is mostly hack atm

    Here some examples: image image

    And here an use case:

    image

    Sorry for the noise, with all tests will fix it asap, Waiting for your review !

    opened by louisquentinjoucla 0
  • Add role level

    Add role level

    :wave: everyone !

    I'd like to do like #3 but for roles. I cannot see hooks that would allow us to do that easily, however I think we could manage it with some hacks.

    opened by louisquentinjoucla 1
  • Add trace spans for SSH connection

    Add trace spans for SSH connection

    Judging by how many blog posts there are about pipelining, this is an important aspect of overall latency.

    This definitely needs more callbacks though, we just don't have the hooks to achieve this yet.

    opened by mhansen 0
Owner
Mark Hansen
Mark Hansen
Here I plotted data for the average test scores across schools and class sizes across school districts.

HW_02 Here I plotted data for the average test scores across schools and class sizes across school districts. Average Test Score by Race This graph re

null 7 Oct 27, 2021
A python script to visualise explain plans as a graph using graphviz

README Needs to be improved Prerequisites Need to have graphiz installed on the machine. Refer to https://graphviz.readthedocs.io/en/stable/manual.htm

Edward Mallia 1 Sep 28, 2021
Visualise top-rated GitHub repositories in a barchart by keyword

This python script was written for simple purpose -- to visualise top-rated GitHub repositories in a barchart by keyword. Script generates html-page with barchart and information about repository owner, description, create date and update date.

Cur1iosity 2 Feb 7, 2022
Visualizing weather changes across the world using third party APIs and Python.

WEATHER FORECASTING ACROSS THE WORLD Overview Python scripts were created to visualize the weather for over 500 cities across the world at varying di

G Johnson 0 Jun 12, 2021
Domain Connectivity Analysis Tools to analyze aggregate connectivity patterns across a set of domains during security investigations

DomainCAT (Domain Connectivity Analysis Tool) Domain Connectivity Analysis Tool is used to analyze aggregate connectivity patterns across a set of dom

DomainTools 34 Dec 9, 2022
Glue is a python project to link visualizations of scientific datasets across many files.

Glue Glue is a python project to link visualizations of scientific datasets across many files. Click on the image for a quick demo: Features Interacti

null 675 Dec 9, 2022
A dashboard built using Plotly-Dash for interactive visualization of Dex-connected individuals across the country.

Dashboard For The DexConnect Platform of Dexterity Global Working prototype submission for internship at Dexterity Global Group. Dashboard for real ti

Yashasvi Misra 2 Jun 15, 2021
Friday Night Funkin - converts a chart from 4/4 time to 6/8 time, or from regular to swing tempo.

Chart to swing converter As seen in https://twitter.com/i_winxd/status/1462220493558366214 A program written in python that converts a chart from 4/4

null 5 Dec 23, 2022
Project coded in Python using Pandas to look at changes in chase% for batters facing a pitcher first time through the order vs. thrid time

Project coded in Python using Pandas to look at changes in chase% for batters facing a pitcher first time through the order vs. thrid time

Jason Kraynak 1 Jan 7, 2022
Smoking Simulation is an app to simulate the spreading of smokers and non-smokers, their interactions and population during certain amount of time.

Smoking Simulation is an app to simulate the spreading of smokers and non-smokers, their interactions and population during certain

Bohdan Ruban 5 Nov 8, 2022
LabGraph is a a Python-first framework used to build sophisticated research systems with real-time streaming, graph API, and parallelism.

LabGraph is a a Python-first framework used to build sophisticated research systems with real-time streaming, graph API, and parallelism.

MLH Fellowship 7 Oct 5, 2022
Movies-chart - A CLI app gets the top 250 movies of all time from imdb.com and the top 100 movies from rottentomatoes.com

movies-chart This CLI app gets the top 250 movies of all time from imdb.com and

null 3 Feb 17, 2022
BrowZen correlates your emotional states with the web sites you visit to give you actionable insights about how you spend your time browsing the web.

BrowZen BrowZen correlates your emotional states with the web sites you visit to give you actionable insights about how you spend your time browsing t

Nick Bild 36 Sep 28, 2022
A minimalistic wrapper around PyOpenGL to save development time

glpy glpy is pyOpenGl wrapper which lets you work with pyOpenGl easily.It is not meant to be a replacement for pyOpenGl but runs on top of pyOpenGl to

Abhinav 9 Apr 2, 2022
This plugin plots the time you spent on a tag as a histogram.

This plugin plots the time you spent on a tag as a histogram.

Tom Dörr 7 Sep 9, 2022
The Spectral Diagram (SD) is a new tool for the comparison of time series in the frequency domain

The Spectral Diagram (SD) is a new tool for the comparison of time series in the frequency domain. The SD provides a novel way to display the coherence function, power, amplitude, phase, and skill score of discrete frequencies of two time series. Each SD summarises these quantities in a single plot for multiple targeted frequencies.

Mabel 3 Oct 10, 2022
Python ts2vg package provides high-performance algorithm implementations to build visibility graphs from time series data.

ts2vg: Time series to visibility graphs The Python ts2vg package provides high-performance algorithm implementations to build visibility graphs from t

Carlos Bergillos 26 Dec 17, 2022
Arras.io Highest Scores Over Time Bar Chart Race

Arras.io Highest Scores Over Time Bar Chart Race This repo contains a python script (make_racing_bar_chart.py) that can generate a csv file which can

Road 2 Jan 16, 2022
Time series visualizer is a flexible extension that provides filling world map by country from real data.

Time-series-visualizer Time series visualizer is a flexible extension that provides filling world map by country from csv or json file. You can know d

Long Ng 3 Jul 9, 2021