Diamond is a python daemon that collects system metrics and publishes them to Graphite (and others). It is capable of collecting cpu, memory, network, i/o, load and disk metrics. Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.

Diamond

Getting Started

Steps to getting started:

  • Read the documentation
  • Install via pip install diamond. The releases on GitHub are not recommended for use. Use pypi-install diamond on Debian/Ubuntu systems with python-stdeb installed to build packages.
  • Copy the diamond.conf.example file to diamond.conf.
  • Optional: Run diamond-setup to help set collectors in diamond.conf.
  • Modify diamond.conf for your needs.
  • Run diamond with one of: diamond or initctl start diamond or /etc/init.d/diamond restart.
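The custom collector API mentioned in the introduction follows a subclass-and-override pattern: a collector implements collect() and publishes named values. The sketch below shows that shape only. The Collector class here is a local stand-in written for this example, not Diamond's own base class, and the metric name cpu.count is a hypothetical choice.

```python
import os


class Collector(object):
    """Stand-in for a collector base class (an assumption for this sketch)."""

    def __init__(self):
        # Published (name, value) pairs are kept in memory for inspection.
        self.published = []

    def publish(self, name, value):
        self.published.append((name, value))

    def collect(self):
        # Subclasses are expected to override this method.
        raise NotImplementedError()


class CpuCountCollector(Collector):
    """Hypothetical custom collector that reports the number of CPUs."""

    def collect(self):
        self.publish("cpu.count", os.cpu_count() or 0)


collector = CpuCountCollector()
collector.collect()
print(collector.published)
```

In a real deployment the subclass would derive from the base class that Diamond ships, and the daemon, not the script, would call collect() on each interval.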

Success Stories

  • Diamond has successfully been deployed to a cluster of 1000 machines pushing 3 million points per minute.
  • Diamond is deployed on Fabric's infrastructure, polling hundreds of metric sources and pushing millions of points per minute.
  • Have a story? Please share!

Repos

Historically, Diamond was a Brightcove project hosted at BrightcoveOS. However, none of the active developers are Brightcove employees, so development has moved to python-diamond. We request that any new pull requests and issues be cut against python-diamond. We will keep BrightcoveOS updated and still honor issues/tickets cut on that repo.

Comments
  • Removing dev/proc/sys mount restriction from DiskSpaceCollector

    We're trying to collect data about tmpfs mounts at /dev and under /sys, but diskspace.py skips right over those mount points. Unlike the previous checks, there's no comment explaining why this is done, and given issue #262 I wouldn't be surprised if the original purpose has been lost. We aren't seeing any issues when we manually do the heavy lifting:

    >>> import os
    >>> def test(string):
    ...     stat = os.stat(string)
    ...     major = os.major(stat.st_dev)
    ...     minor = os.minor(stat.st_dev)
    ...     print stat
    ...     print major
    ...     print minor
    ...
    >>> test('/dev')
    posix.stat_result(st_mode=16877, st_ino=1025, st_dev=5L, st_nlink=14, st_uid=0, st_gid=0, st_size=4280, st_atime=1456860792, st_mtime=1460050361, st_ctime=1460050361)
    0
    5
    >>> test('/run')
    posix.stat_result(st_mode=16877, st_ino=1212, st_dev=16L, st_nlink=20, st_uid=0, st_gid=0, st_size=700, st_atime=1456868938, st_mtime=1460151861, st_ctime=1460151861)
    0
    16
    

    This check should be removed if it doesn't have a purpose. If it does then there should at least be a comment or abstraction to explain why this restriction is in place.

    To give some context, these are our tmpfs mount locations that would be great to monitor:

    $ sudo cat /proc/mounts | grep tmpfs
    udev /dev devtmpfs rw,relatime,size=24650316k,nr_inodes=6162579,mode=755 0 0
    tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=4932236k,mode=755 0 0
    none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
    none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
    none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
    none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
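As a rough illustration of what monitoring these mounts would involve, the sketch below picks tmpfs and devtmpfs entries out of /proc/mounts-style text and reads their usage with os.statvfs. The function names and the metric keys (byte_total, byte_free, byte_used) are assumptions made for this example and are not taken from diskspace.py.

```python
import os


def tmpfs_mounts(mounts_text):
    """Return mount points whose filesystem type is tmpfs or devtmpfs."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] in ("tmpfs", "devtmpfs"):
            points.append(fields[1])
    return points


def usage(mount_point):
    """Read size and free space for one mount point via statvfs."""
    stats = os.statvfs(mount_point)
    total = stats.f_blocks * stats.f_frsize
    free = stats.f_bfree * stats.f_frsize
    return {"byte_total": total, "byte_free": free, "byte_used": total - free}


sample = (
    "udev /dev devtmpfs rw,relatime,size=24650316k,mode=755 0 0\n"
    "tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=4932236k,mode=755 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)
print(tmpfs_mounts(sample))
```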
    
    type: enhancement category: collector collector: diskspace 
    opened by JScott 31
  • Add regex bean matching, and regex key search/replace to normalize names

    This patch adds two more config items to JolokiaCollector.conf:

    • mbeansre: works like mbeans: but matches with regular expressions
    • rewrite: allows rewrite pairs in order to rename collected keys before data is passed to the handler
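To make the two options concrete, here is a minimal sketch of regex-based bean matching and ordered search/replace on key names. It is not the patch itself: the function names, the sample bean names, and the rewrite pairs are all hypothetical.

```python
import re


def matching_mbeans(names, patterns):
    """Keep the bean names that match at least one regular expression."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [name for name in names if any(c.search(name) for c in compiled)]


def rewrite_key(key, rewrites):
    """Apply (pattern, replacement) pairs in order to normalize a key."""
    for pattern, replacement in rewrites:
        key = re.sub(pattern, replacement, key)
    return key


beans = ["java.lang:type=Memory", "kafka.server:type=BrokerTopicMetrics"]
print(matching_mbeans(beans, [r"^kafka\."]))
print(rewrite_key("java.lang.type_Memory.HeapMemoryUsage", [(r"type_", "")]))
```

Applying the pairs in order means a later pair sees the output of an earlier one, so the order in the config matters.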

    opened by sbrynen 23
  • python-2.4 is not working

    The docs claim that python-2.4 is supported (eg on RHEL5) but diamond fails with:

    ERROR: Failed to set UID/GID. 'module' object has no attribute 'initgroups'

    /usr/bin/diamond:204: os.initgroups(pwd.getpwuid(uid).pw_name, gid)
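os.initgroups was only added to the standard library in Python 2.7, which is consistent with the error above. One possible workaround is to guard the call and fall back to os.setgroups; the sketch below illustrates the idea and is not the fix that was applied to Diamond. The helper names are hypothetical, and actually changing groups requires root.

```python
import grp
import os


def supplementary_gids(username, primary_gid):
    """Collect the group ids a user belongs to, including the primary one."""
    gids = set([g.gr_gid for g in grp.getgrall() if username in g.gr_mem])
    gids.add(primary_gid)
    return sorted(gids)


def init_groups(username, gid):
    """Use os.initgroups where it exists, else emulate it with setgroups."""
    if hasattr(os, "initgroups"):
        os.initgroups(username, gid)
    else:
        os.setgroups(supplementary_gids(username, gid))
```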

    type: bug 
    opened by bhepple 18
  • Update InfluxDBHandler for post InfluxDB 0.9

    • Fix for ticket #297 where InfluxDBHandler was formatting metrics based on an older version of influxdb-python
    • Maintains support for InfluxDB version 0.8 by way of an additional 'influxdb_version' attribute in the config.
    • Reformats measurement schema to be more useful for InfluxDB 0.9's removal of merge and joins.
    • Adds several test cases for the handler.
    type: enhancement category: handler needs: rebase handler: influxdb 
    opened by cj-dimaggio 17
  • Kafka Collector error with urllib2

    Hi, I am trying to use the Kafka collector with Kafka 0.8.2.

    [2015-07-17 11:44:56,298] [MainThread] ''
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/diamond/collector.py", line 472, in _run
        self.collect()
      File "/usr/share/diamond/collectors/kafkastat/kafkastat.py", line 163, in collect
        match = self.get_mbeans(pattern)
      File "/usr/share/diamond/collectors/kafkastat/kafkastat.py", line 84, in get_mbeans
        mbeans = self._get('/serverbydomain', query_args)
      File "/usr/share/diamond/collectors/kafkastat/kafkastat.py", line 70, in _get
        response = urllib2.urlopen(url)
      File "/usr/lib64/python2.7/urllib2.py", line 127, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib64/python2.7/urllib2.py", line 404, in open
        response = self._open(req, data)
      File "/usr/lib64/python2.7/urllib2.py", line 422, in _open
        '_open', req)
      File "/usr/lib64/python2.7/urllib2.py", line 382, in _call_chain
        result = func(*args)
      File "/usr/lib64/python2.7/urllib2.py", line 1216, in http_open
        return self.do_open(httplib.HTTPConnection, req)
      File "/usr/lib64/python2.7/urllib2.py", line 1189, in do_open
        r = h.getresponse(buffering=True)
      File "/usr/lib64/python2.7/httplib.py", line 1045, in getresponse
        response.begin()
      File "/usr/lib64/python2.7/httplib.py", line 409, in begin
        version, status, reason = self._read_status()
      File "/usr/lib64/python2.7/httplib.py", line 373, in _read_status
        raise BadStatusLine(line)
    BadStatusLine: ''

    Any idea?

    status: fix-provided collector: kafka 
    opened by maauso 17
  • Current release "has problems"

    I had heard good things about Diamond, and I'm not that happy with collectd, so I thought I'd give it a look.

    First problem, there are absolutely no docs. No problem I thought, I'll write a GettingStarted and submit a pullreq for it. Except that I could never get started myself.

    I notice there is a "debian" subdirectory, so I try a debuild, and find that the 4.0 release produces a debian file named "3.1.0". Submitted a PR.

    Then I start up Diamond after copying the example config over to diamond.conf and I'm getting a Traceback that the handlers list object has no split(',') method. Ooook, so that's supposed to be a string instead? I quote it in the config file, but then I'm running into more problems.
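The error described here reads as if the config parser returned the handlers value as a list while the calling code expected a comma-separated string. A defensive way to accept both forms is sketched below; the helper name as_list and the handler names in the example are hypothetical, and this is not how Diamond itself resolved the problem.

```python
def as_list(value):
    """Normalize a config value that may be a string or a list."""
    if isinstance(value, str):
        # "a, b" -> ["a", "b"], ignoring empty items
        return [item.strip() for item in value.split(",") if item.strip()]
    return [str(item).strip() for item in value]


print(as_list("handler.One, handler.Two"))
print(as_list(["handler.One", "handler.Two"]))
```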

    I finally decide to grab the latest code and see if that worked without quotes on the handlers line. It does and at least I have it writing to the archive handler.

    Buuut... The Influxdb handler doesn't seem to be doing anything. No logs I can see to say why, just silently isn't doing anything.

    Recommendations:

    You probably at least need to tag a new release, fix the debian/changelog file, and put out some sort of docs.

    type: bug category: collector status: fix-provided handler: influxdb 
    opened by linsomniac 15
  • Diamond hemorrhages memory when Graphite server is inaccessible

    I recently ran into a situation on one of my hosts where Diamond 4.0 series was consuming over 3 GB of memory after being unable to connect to Graphite for a couple of hours -- RES in htop appears to grow by 2-3 MB per minute. I'm not sure if this is an issue with unbounded write buffer growth or leaked objects in connection handling, but it presents a serious threat to system stability.
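If the cause is unbounded buffer growth (one of the two possibilities raised above, not a confirmed diagnosis), a common mitigation is to cap the buffer and drop the oldest entries while the server is unreachable. The sketch below shows that idea with collections.deque; the class name and the dropped counter are hypothetical and not part of Diamond.

```python
from collections import deque


class BoundedMetricBuffer(object):
    """Holds at most max_size pending metric lines, discarding the oldest."""

    def __init__(self, max_size):
        self._queue = deque(maxlen=max_size)
        self.dropped = 0

    def add(self, metric_line):
        # A full deque silently evicts its oldest item on append,
        # so count the eviction before it happens.
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1
        self._queue.append(metric_line)

    def drain(self):
        """Return all pending lines and empty the buffer."""
        items = list(self._queue)
        self._queue.clear()
        return items


buffer = BoundedMetricBuffer(max_size=3)
for i in range(5):
    buffer.add("servers.host.cpu.total %d 0\n" % i)
print(buffer.dropped, len(buffer.drain()))
```

Memory use is then bounded by max_size regardless of how long the outage lasts, at the cost of losing the oldest points.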

    type: bug 
    opened by jgoldschrafe 15
  • Add Basic/Shield auth to elasticsearch collector

    Current implementation of Elasticsearch collector has no ability to authenticate against Shield plugin for Elasticsearch, so I've added optional parameters to authenticate with. I'm using this version of collector in prod right now and it works just fine.
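For readers unfamiliar with how optional HTTP Basic authentication is usually attached to a request, here is a small sketch using only the standard library. It is not the code from this pull request; the function name, parameter names, and URL are hypothetical.

```python
import base64
import urllib.request


def build_request(url, user=None, password=None):
    """Create a request, adding a Basic auth header only when credentials are set."""
    request = urllib.request.Request(url)
    if user and password:
        credentials = ("%s:%s" % (user, password)).encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request


request = build_request("http://localhost:9200/_nodes/stats", "user", "pass")
print(request.get_header("Authorization"))
```

Leaving both parameters unset produces an unauthenticated request, which keeps the behavior unchanged for clusters without authentication.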

    type: enhancement category: collector collector: elasticsearch 
    opened by okushchenko 14
  • TSDB basic authorization, gzip, batch, prefix

    Added some features to the TSDB Handler:

    • basic authorization: simple header with user and password for firewall protection
    • gzip: added gzip support to compress metrics
    • batch: added support to send metrics in batch
    • prefix: you can add a prefix to all your metrics, like diamond.myhostname.cpu.cpu_count

    All these features can be disabled by not defining or setting a value lower than 1.

    Sending metrics in batch can give you quite a nice performance boost and lower your CPU load. Compressing will reduce the buffer size you need to configure on the tsdb end.
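The batching and compression described above can be sketched as follows. This is an illustration of the general technique, not the handler's code: the function names, the shape of each metric dict, and the header choices are assumptions.

```python
import gzip
import json


def batches(metrics, batch_size):
    """Split a list of metrics into chunks of at most batch_size items."""
    for start in range(0, len(metrics), batch_size):
        yield metrics[start:start + batch_size]


def encode_batch(batch, compress=True):
    """Serialize one batch as JSON and optionally gzip the body."""
    body = json.dumps(batch).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if compress:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return body, headers


metrics = [
    {"metric": "diamond.myhostname.cpu.cpu_count", "timestamp": 1460050361 + i, "value": 4}
    for i in range(5)
]
for batch in batches(metrics, 2):
    body, headers = encode_batch(batch)
    print(len(batch), len(body), headers.get("Content-Encoding"))
```

One request per batch replaces one request per metric, which is where the performance gain comes from.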

    On the negative side, this will break the recently added tests, and I was not able to recreate them. I would need some help, as there is currently no handler (using urllib2) with tests. But I didn't want to keep these changes to myself.

    [image: batch]

    This graph shows how long it took to send a metric to the db.

    type: enhancement handler: tsdb 
    opened by Grotax 13
  • Allow precision to be set in nginx collector

    This patch allows us to set precision in the Nginx collector's config file. Eg:

    enabled = True
    precision = 2
    req_port = 9080
    

    Changing the precision from 0 to 2 resulted in this change on a testing system when viewing the data in Grafana:

    [Grafana screenshot]

    Note that the number of requests/sec hasn't changed, just the precision config value.
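The effect of the precision setting can be shown with a tiny formatting sketch. It assumes the published value is rendered with a fixed number of decimal places, which may not match how the collector formats values internally; the function name is hypothetical.

```python
def format_metric(value, precision=0):
    """Render a value with a fixed number of decimal places."""
    return "%.*f" % (precision, value)


# A low request rate disappears at precision 0 but survives at precision 2.
print(format_metric(0.4, 0))
print(format_metric(0.4, 2))
```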

    When you hover over a datapoint in Grafana:

    • Some point in time before the patch:

      [Grafana screenshot]

    • After the patch and changing the config value (with requests coming in at the same rate):

      [Grafana screenshot]

    type: enhancement category: collector collector: nginx 
    opened by scottcunningham 13
  • SNMP collector not working

    [root@graphite collectors]# cat SNMPInterfaceCollector.conf
    enabled = True
    path_suffix = ""
    retries = 3
    measure_collector_time = False
    byte_unit = byte
    timeout = 5

    path = interface
    interval = 60

    [devices]
    [fw01]]
    host = 192.168.1.1
    port = 161
    community = public

    Here are the logs:

    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/diamond/utils/scheduler.py", line 70, in collector_process
        collector._run()
      File "/usr/lib/python2.6/site-packages/diamond/collector.py", line 472, in _run
        self.collect()
      File "/usr/lib/python2.6/site-packages/diamond/collector.py", line 366, in collect
        raise NotImplementedError()
    NotImplementedError
    [2015-07-08 13:56:56,318] [MainThread] Collector failed!
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/diamond/utils/scheduler.py", line 70, in collector_process
        collector._run()
      File "/usr/lib/python2.6/site-packages/diamond/collector.py", line 472, in _run
        self.collect()
      File "/usr/lib/python2.6/site-packages/diamond/collector.py", line 366, in collect
        raise NotImplementedError()
    NotImplementedError
    [2015-07-08 13:56:58,332] [MainThread] Collector failed!
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/diamond/utils/scheduler.py", line 70, in collector_process
        collector._run()
      File "/usr/lib/python2.6/site-packages/diamond/collector.py", line 472, in _run
        self.collect()
      File "/usr/lib/python2.6/site-packages/diamond/collector.py", line 366, in collect
        raise NotImplementedError()
    NotImplementedError

    type: bug category: collector status: fix-provided 
    opened by harishva 13
  • fix counter metric spike using statsD handler

    • The current implementation in the statsD handler may work fine for the general use case where the counter metric is sent from the application directly to statsD (push). In our prod env, however, a separate stats application pulls metrics from multiple applications and sends them to statsD, similar to how Prometheus works (pull). This causes a spike in the first data point every time the stats application starts, because the automatically calculated difference is wrong if the actual application has already been running for a long time.
    • The fix adds an option for the caller to manually override the counter value using the value attribute, while the internal update to the old value map still uses the raw value attribute.
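The spike and the proposed override can be illustrated with a small delta tracker. This is a sketch under stated assumptions, not the handler's code: the class and parameter names are hypothetical, and it assumes a missing previous value is treated as 0, which is what makes the first data point spike.

```python
class CounterTracker(object):
    """Turns monotonically increasing raw counters into per-interval deltas."""

    def __init__(self):
        self._old_values = {}

    def delta(self, path, raw_value, override=None):
        previous = self._old_values.get(path, 0)
        # The old value map always records the raw counter, so later
        # deltas stay correct even when the caller overrides this one.
        self._old_values[path] = raw_value
        if override is not None:
            return override
        return raw_value - previous


# Without an override, a long-running application produces a huge first delta.
naive = CounterTracker()
print(naive.delta("app.requests", 1000000))

# With an override, the first point is suppressed and later deltas are normal.
fixed = CounterTracker()
print(fixed.delta("app.requests", 1000000, override=0))
print(fixed.delta("app.requests", 1000005))
```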
    opened by cdong8812 0
  • [wip] Python 3

    It felt kinda weird that there was absolutely nothing in this repo about how to run Diamond using Python 3. Like many others I recently upgraded to Ubuntu 20.04 which does not ship with a fully formed python 2.7 environment so I gave a stab at making it work with Python 3.

    This is very much a WIP but I got it running fully for our setup.

    How to install for you:

    pip3 install distro
    pip3 install git+https://github.com/feederco/Diamond.git@d8429765009fdf115c85aca09a5f2c1b570f8078
    

    setup.py relied on the platform module's distribution lookup, which was removed in Python 3.8, so it now uses distro, a pip module that needs to be installed before running pip install diamond.
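A setup script that needs the distribution name on a recent Python can be sketched like this. It prefers the distro package when it is installed and otherwise reads /etc/os-release directly; the function names and the fallback are assumptions for illustration, not what this branch's setup.py does.

```python
def parse_os_release(text):
    """Parse KEY=value lines in the os-release format into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def detect_distro(os_release_path="/etc/os-release"):
    """Return (id, version), using the distro package when available."""
    try:
        import distro  # third-party package, installed with pip
        return distro.id(), distro.version()
    except ImportError:
        pass
    try:
        with open(os_release_path) as handle:
            fields = parse_os_release(handle.read())
    except OSError:
        return "", ""
    return fields.get("ID", ""), fields.get("VERSION_ID", "")


print(detect_distro())
```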

    Todo:

    • Tests still not ported
    • So far only tested in production with MySQL collector and hostedgraphite and archive handler
    • This branch is all-out p3k, so we'd need to figure out how to interop these two languages, or tbh just make a fork, or keep python2 support in a branch, because python 2 seems pretty ded

    References #396

    type: bug type: py3k 
    opened by erkie 4
  • Unable to load second handler metrics

    Hi Team,

    I have one handler in test-agent.conf. When I try to add another handler, [[TestHandler2]], the metrics are not loading. Can someone help me with this?

    [handlers]
    # daemon logging handler(s)
    keys = rotated_file

    # Defaults options for all Handlers
    [[default]]

    [[TestHandler]]
    # abc URL to post the metrics
    url = abac
    # abc Datasource api key
    api_key = *******

    opened by rbellary-vi 0
  • diamond fails to run at boot with "Name or service not known"

    On CentOS7 using diamond-4.0.515 with the included systemd service file, diamond always fails to run at boot but starts fine when launched by hand later.

    The error in the log is:

    
    [2020-10-23 16:07:50,198] [MainThread] Unhandled exception: [Errno -2] Name or service not known
    [2020-10-23 16:07:50,199] [MainThread] traceback: Traceback (most recent call last):
      File "/usr/local/diamond/bin/diamond", line 298, in main
        server.run()
      File "/usr/local/diamond/lib/python2.7/site-packages/diamond/server.py", line 108, in run
        self.handlers = load_handlers(self.config, handlers)
      File "/usr/local/diamond/lib/python2.7/site-packages/diamond/utils/classes.py", line 89, in load_handlers
        h = cls(handler_config)
      File "/usr/local/diamond/lib/python2.7/site-packages/diamond/handler/stats_d.py", line 66, in __init__
        self._connect()
      File "/usr/local/diamond/lib/python2.7/site-packages/diamond/handler/stats_d.py", line 161, in _connect
        port=self.port
      File "/usr/lib/python2.7/site-packages/statsd/client.py", line 139, in __init__
        host, port, fam, socket.SOCK_DGRAM)[0]
    gaierror: [Errno -2] Name or service not known
    

    I tried adding the following to [Unit] in /etc/systemd/system/diamond.service but it made no difference

    After=network.target

    I then changed that to the following and it did work:

    After=network.target remote-fs.target nss-lookup.target
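Besides ordering the unit after name resolution is available, a handler could tolerate a slow resolver by retrying the lookup. The sketch below shows that approach with exponential backoff; it is an alternative idea rather than the patch requested here, and the function name, parameters, and host name are hypothetical.

```python
import socket
import time


def resolve_with_retry(host, port, attempts=5, delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host:port, retrying on lookup failures with growing pauses."""
    last_error = None
    for attempt in range(attempts):
        try:
            return resolver(host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)
        except socket.gaierror as error:
            last_error = error
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay * (2 ** attempt))
    raise last_error
```

The resolver and sleep parameters exist so the retry logic can be exercised without real DNS or real waiting.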

    needs: patch 
    opened by paulraines68 2
  • Problem with GlusterFS

    Hi.

    I just fought with DiskSpaceCollector because it didn't report the disk usage for a fuse.glusterfs mounted directory.

    The first problem (trivial) was that the default config lists "gluster" instead of "fuse.gluster". No problem, corrected.

    But even after correction the metrics were silently dropped. I found that the if at diskspace.py:153 assumes the "device" starts with a '/'. Too bad, usually mountpoints for gluster use the format srv1[,srv2]:volume_name

    It's possible to use /srv1[,srv2]:volume_name (with a leading '/') but I think it's quite uncommon. I made that if a no-op (adding as first condition "(1==1) or" ), but I think it could be better to consider the GlusterFS case and/or give a meaningful message stating why that mountpoint is getting discarded.
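One way to handle the GlusterFS case without disabling the check entirely is to exempt network-style filesystems from the leading-slash test. The sketch below is a hypothetical filter written for this discussion; the function name, the NETWORK_FILESYSTEMS tuple, and the overall structure are assumptions and do not reproduce diskspace.py.

```python
# Filesystem types whose "device" is a server:volume string, not a path.
# This list is an assumption for the sketch.
NETWORK_FILESYSTEMS = ("fuse.glusterfs",)


def should_collect(device, fs_type, wanted_filesystems):
    """Decide whether a mount should be reported."""
    if fs_type not in wanted_filesystems:
        return False
    if fs_type in NETWORK_FILESYSTEMS:
        # e.g. srv1,srv2:volume_name has no leading '/'
        return True
    return device.startswith("/")


wanted = ["ext4", "fuse.glusterfs"]
print(should_collect("srv1,srv2:volume_name", "fuse.glusterfs", wanted))
print(should_collect("/dev/sda1", "ext4", wanted))
```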

    HIH

    opened by NdK73 1