A little logger for machine learning research

Overview

dowel

dowel is a little logger for machine learning research.

Installation

pip install dowel

Usage

import dowel
from dowel import logger, tabular

logger.add_output(dowel.StdOutput())
logger.add_output(dowel.TensorBoardOutput('tensorboard_logdir'))

logger.log('Starting up...')
for i in range(1000):
    logger.push_prefix('itr {}'.format(i))
    logger.log('Running training step')

    tabular.record('itr', i)
    tabular.record('loss', 100.0 / (2 + i))
    logger.log(tabular)

    logger.pop_prefix()
    logger.dump_all()

logger.remove_all()
Comments
  • Add custom x axes to TensorBoard

    This PR adds support for custom x-axes to the TensorBoard scalar plot. The axis names are specified in the tensorboard output constructor.

    class TensorBoardOutput(LogOutput):
        def __init__(self,
                     log_dir,
                     x_axes=None,
                     flush_secs=120,
                     histogram_samples=1e3): 
    

    When x_axes is None, the output falls back to using the iteration count as the x-axis. If any requested x-axis is missing from the scalar tabular data, a warning is logged to the console. If none of the requested x-axes are present, the output likewise falls back to the iteration count.
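
The fallback rule described here can be sketched in plain Python. This is only an illustration and not dowel's actual implementation: the helper name `resolve_x_axes`, its arguments, and the use of `None` as a stand-in for the iteration axis are all assumptions made for the example.

```python
import warnings


def resolve_x_axes(x_axes, scalars, step):
    """Return {axis_name: x_value} for one record (hypothetical helper).

    The key None stands for the plain iteration counter.
    """
    if not x_axes:
        # No custom axes requested: plot against the iteration count.
        return {None: step}
    present = {}
    for axis in x_axes:
        if axis in scalars:
            present[axis] = scalars[axis]
        else:
            warnings.warn('x-axis {!r} not found in tabular data'.format(axis))
    # If every requested axis was missing, fall back to the iteration count.
    return present or {None: step}


record = {'Epoch': 3, 'loss': 0.5}
default_axes = resolve_x_axes(None, record, step=7)
custom_axes = resolve_x_axes(['Epoch', 'TotalEnvSteps'], record, step=7)
```

In this sketch, a record that carries Epoch but not TotalEnvSteps is plotted against Epoch only, and a warning is emitted for the missing axis.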

    Screenshots of an experiment with Epoch and TotalEnvSteps as x-axes: [two screenshots, dated 2019-11-22]

    I will open a separate PR in garage repo to set TotalEnvSteps as the default x-axis.

    opened by naeioi 6
  • Add Python 3.8 build

    Issue #45 is caused by the keys of a TabularInput being inconsistent with the CsvOutput header. To fix this:

    1. Read the existing log file.
    2. Update the csv DictWriter with the new header, computed as a union.
       • The union of the TabularInput keys and the CsvOutput header is used so that all keys from both are captured.
    3. Rewrite the same log file with the new header and the old data. If the value for a new key is missing, the cell is left blank.
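
The read-then-rewrite steps above can be sketched with the standard `csv` module. This is a minimal illustration, not the code from the PR: the function name `rewrite_with_new_keys`, the file name, and the column names are hypothetical.

```python
import csv
import os
import tempfile


def rewrite_with_new_keys(path, new_row):
    """Rewrite the CSV at `path` so its header also covers `new_row`."""
    # Step 1: read the existing log file.
    with open(path, newline='') as f:
        reader = csv.DictReader(f)
        old_header = reader.fieldnames or []
        old_rows = list(reader)

    # Step 2: ordered union, existing columns first, then unseen keys.
    header = list(old_header) + [k for k in new_row if k not in old_header]

    # Step 3: rewrite old data under the new header; restval leaves
    # cells blank for keys a row does not have.
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=header, restval='')
        writer.writeheader()
        writer.writerows(old_rows)
        writer.writerow(new_row)
    return header


path = os.path.join(tempfile.mkdtemp(), 'progress.csv')
with open(path, 'w', newline='') as f:
    f.write('itr,loss\r\n0,50.0\r\n')

header = rewrite_with_new_keys(
    path, {'itr': 1, 'loss': 33.3, 'task_a/return': 1.5})
with open(path, newline='') as f:
    rows = list(csv.DictReader(f))
```

Every call re-reads and rewrites the whole file, which is the I/O cost the PR itself points out.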

    The cons of this solution:

    • The csv log file is read and rewritten whenever the header keys become inconsistent. This may be time-consuming with a large number of csv log files because of the heavy I/O. To keep the I/O as efficient as possible, all file reading and writing is done in RAM and the file is closed after use.

    A better approach can be:

    • Expand the header on the fly as new keys are encountered, and write the log file with the final header exactly once.

    rlworkgroup#45 @avnishn @zequnyu

    opened by irisliucy 3
  • Ignore packages pre-cached by travis

    Travis keeps several packages, such as numpy, pre-installed to speed up builds, but this sometimes leads to the wrong version being picked up by pip.

    This change ignores the pre-installed versions.

    opened by gitanshu 2
  • Fix tests, add Tensorflow 2 compat

    Unfortunately, fixing the tests involves monkey patching unittest, since it has a bug. That bug is fixed in CPython PR #4800, which has gone unmerged for over a year.

    opened by krzentner 2
  • Starter Project for Directed Research in Robotics

    This attempts to allow more robust handling of dynamic fields in dowel. It is implemented with the following logic:

    • The CSV headers are now ordered
    • When a new field is encountered, it is appended to the ordered list of headers
    • The output file gets flushed and re-read to allow header replacement
    • The new records with new fields get written

    To accomplish this, the following changes were made to internal data structures and logic:

    • csv._fieldnames becomes list() instead of set()
    • The output file is now opened in w+ mode to allow both read and write access

    Unit tests were added for the following cases:

    • Adding a new field at the end of the list of fields
    • Adding a new field in the middle of the list of fields

    Sincerely

    opened by late-in-autumn 1
  • Robust handling of inconsistent TabularInput keys

    Currently, CsvOutput emits a warning if the keys of a TabularInput change after the first call to logger.log(TabularInput). A new key not seen before is ignored, and an old key that is absent is left blank. In other words, CsvOutput handles dynamic fieldnames conservatively.

    This behaviour makes it tricky to log the performance of Multi- and Meta-ML algorithms, where there are usually per-task fields but not every task is present in every iteration, so logs for some tasks go missing.

    The desired behaviour for handling inconsistent keys is:

    • When a new key is encountered
      • Expand header with the new key.
      • Expand old rows with empty cells for the new key.
    • If the value of any key is missing, leave the cell blank.
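
A small in-memory sketch of this desired behaviour follows. It is an illustration under stated assumptions rather than a proposed CsvOutput: the class name `GrowingTable` and the key names are hypothetical, and rows are buffered in memory so the expanded header is written once.

```python
import csv
import io


class GrowingTable:
    """Hypothetical table whose header grows as new keys appear."""

    def __init__(self):
        self.header = []  # ordered list of every key seen so far
        self.rows = []

    def add(self, record):
        # Expand the header with any key not seen before.
        for key in record:
            if key not in self.header:
                self.header.append(key)
        self.rows.append(dict(record))

    def to_csv(self):
        buf = io.StringIO()
        # restval='' leaves a blank cell wherever a row lacks a key,
        # which also pads old rows for keys added later.
        writer = csv.DictWriter(buf, fieldnames=self.header,
                                restval='', lineterminator='\n')
        writer.writeheader()
        writer.writerows(self.rows)
        return buf.getvalue()


table = GrowingTable()
table.add({'itr': 0, 'task_a/loss': 0.9})
table.add({'itr': 1, 'task_b/loss': 0.7})
csv_text = table.to_csv()
```

Here the first row gets an empty task_b/loss cell and the second row an empty task_a/loss cell.
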
    opened by naeioi 1
  • Rewrite automatic versioning

    The previous automatic versioning script was flawed. It produced the correct package version for documentation builds and building PyPI distributions, but produced an incorrect version when you run setup.py from the downloaded package. Unfortunately, Python environment managers (e.g. Pipenv, conda) resolve package version by evaluating setup.py, not using the PyPI version.

    This PR makes version generation simpler by reading the version string from a simple file. Automatic versioning from tags is handled by clobbering the version file from within the CI, rather than looking for a CI environment variable on every usage.
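
As a rough sketch of the "read the version from a simple file" idea: the file name `VERSION`, the helper `read_version`, and the placeholder version string below are assumptions for illustration, not details taken from the PR.

```python
import tempfile
from pathlib import Path


def read_version(version_file):
    """Return the version string stored in a plain-text file."""
    return Path(version_file).read_text().strip()


# Demo: a CI job could overwrite this file with the tag name before
# building, so setup.py never has to inspect CI environment variables.
version_path = Path(tempfile.mkdtemp()) / 'VERSION'
version_path.write_text('1.2.3\n')  # placeholder value
version = read_version(version_path)
```

A setup.py could then pass the returned string to `setup(version=...)`, which gives the same answer whether it is evaluated from a source checkout or from a downloaded package.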

    opened by ryanjulian 1
  • Remove Snapshotter

    This was accidentally included during the import from rlworkgroup/garage.

    The Snapshotter is not part of the Logger API, so it doesn't really belong in this package.

    opened by ryanjulian 1
  • Move unit tests to tests/dowel

    This PR moves the unit test modules to tests/dowel, so that the test path for a module is easy to predict. For instance, the tests for the module dowel.csv_output will now live at tests.dowel.test_csv_output rather than tests.test_csv_output.

    opened by ryanjulian 1
  • Add Python 3.8 build

    Inconsistent header keys are handled. When a new key is introduced, the previous data is augmented line by line. In the logger, if data is a TabularInput instance, then it has its values emptied, with its keys remaining so that there is no data bleed when a key is omitted in the future. Tensorboard incompatibility is an issue as tensorboard does not accept the empty character (or string) as a legal numpy scalar.

    Nine tests have been provided to cover the various circumstances:

    1. No change in keys
    2. Single increase in keys with consistent future usage
    3. Multiple increases in keys with consistent future usage
    4. Single increase in keys with inconsistent future usage
    5. Multiple increases in keys with inconsistent future usage
    6. Overlapping increase in keys with immediate inconsistency
    7. Static keys - tensorboard incompatibility test
    8. Dynamic keys - tensorboard incompatibility test
    9. Empty tabulation test

    Note: I have left the comment structure to be consistent with the preexisting code.

    opened by koverman47 0
  • Add Python 3.8 build

    Fix for #45

    Used DictReader to read in old file values and created new DictWriter object to rewrite all records. Old records without values for these new keys will be null as required.

    Also, apologies for the unnecessary mentions to this issue earlier!

    @avnishn @zequnyu

    opened by dxlin17 0
  • Robust handling of inconsistent TabularInput keys

    Introduction

    Dowel is a tool that the garage team uses for logging results from our various reinforcement learning experiments.

    Dowel can log different types of data, such as floats or strings. Logs can be written to stdout (the console), CSV files, and TensorBoard.

    You can check out an example of how Dowel is used here. In fact, almost all parts of the Dowel API are used in this example.

    The problem

    After statistics such as loss have been logged, and a call to logger.dump_all() is made for the first time, new tabular data can’t be written to a CSV output. This is because currently data cannot be inconsistently logged to CSV, meaning that on every single call to dump_all, the same logger keys must appear. Data that is inconsistently logged will not appear in the CSV output. This is a design flaw that we have been able to work around but affects our workflows.

    Your goal is to solve the problem as well as introduce tests into our testing framework in order to verify your solution.

    Some General Instructions

    1. Fork Dowel and install all necessary dependencies.
    2. Take a look at this toy example, which exposes the bug when run, and at the accompanying issue mentioned above.
    3. When you have finished writing your solution and tests, open a PR on your fork, not on the upstream repository.
    4. When you are done, email us back with the link to your pull request.
    5. Follow the rules in CONTRIBUTING.md.

    If you have any questions, open an issue in your fork, and tag @avnishn and @haydenshively. Our preferred mode of communication on any questions that you have is through github issues and pull requests, as this is how the Garage team communicates generally. For this reason, we won’t respond to any direct emails with regards to help with your project. We will however respond to any other questions that you have via email (interview scheduling, etc).

    Best of luck, and let us know as early as possible if there are any issues.

    opened by avnishn 0
  • Mention SSH setup in CONTRIBUTING.md

    When I tried to run the following commands from the "Git recipes" in CONTRIBUTING.md, I got error messages:

    git remote add rlworkgroup git@github.com:rlworkgroup/dowel.git
    
    git reset --hard master rlworkgroup/master
    

    However, the following would work:

    git remote add rlworkgroup https://github.com/rlworkgroup/dowel.git
    
    git checkout master
    git fetch rlworkgroup
    git reset --hard rlworkgroup/master
    

    Should CONTRIBUTING.md be updated?

    opened by GuanyangLuo 1
  • Logging Numpy arrays, Torch Tensors and Tensorflow Tensors

    Hi,

    Thank you for this nice and simple tool for logging machine learning research. I often encounter situations where I would like to save multi-dimensional NumPy arrays, for example the observation at each time-step in a reinforcement learning experiment.

    It would be nice to have an output logger that supports NumPy arrays, PyTorch Tensors and TensorFlow Tensors.

    I have written a simple output logger, NpzOutput, that writes NumPy arrays to a .npz file using NumPy's savez function. It is not optimal (no incremental saving), but I thought I would share it in case somebody is interested.
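
For readers curious what such a logger might look like, here is a minimal sketch built on `numpy.savez`. It is not the NpzOutput mentioned above, which is not shown on this page: the class name `NpzBuffer`, its methods, and the key name are hypothetical, and it does not subclass any dowel type.

```python
import os
import tempfile

import numpy as np


class NpzBuffer:
    """Hypothetical buffer that collects arrays and saves them as .npz."""

    def __init__(self, path):
        self.path = path
        self._arrays = {}

    def record(self, key, array):
        # Keep one list of arrays per key, one entry per call.
        self._arrays.setdefault(key, []).append(np.asarray(array))

    def dump(self):
        # No incremental saving: everything is stacked and written at once.
        stacked = {key: np.stack(values)
                   for key, values in self._arrays.items()}
        np.savez(self.path, **stacked)


path = os.path.join(tempfile.mkdtemp(), 'observations.npz')
buffer = NpzBuffer(path)
for t in range(3):
    buffer.record('observation', np.full((2, 2), t))
buffer.dump()
loaded = np.load(path)
```

Like the logger described above, this sketch only writes on `dump()`, so a long run holds all recorded arrays in memory until then.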

    opened by BartKeulen 2
  • dowel causes the main process to hang forever, if it contains a TensorboardOutput when the process is closing

    This is because it attempts to close the underlying TensorboardX writer in TensorboardOutput.__del__. However, global teardown of the python interpreter has already closed the thread used by TensorboardX.

    opened by krzentner 0
  • Add dots to alternating tabular lines

    Tables with keys of varying lengths are hard to read, since some of the key names end up far from their values. This change adds a sequence of dots on all odd lines, so that lines are easier to match up with their keys.
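
The idea can be sketched as follows. This is an illustration only, not dowel's formatter: the function name `format_table`, the fixed width, and the choice of which alternating lines get dots are assumptions.

```python
def format_table(pairs, width=32):
    """Format key/value pairs, filling every other line with dots."""
    lines = []
    for index, (key, value) in enumerate(pairs):
        text = str(value)
        # Room left between key and value, minus one space on each side.
        gap = max(width - len(key) - len(text) - 2, 0)
        fill = '.' if index % 2 == 1 else ' '
        lines.append('{} {} {}'.format(key, fill * gap, text))
    return lines


lines = format_table([
    ('itr', 1),
    ('AverageDiscountedReturn', 0.5),
    ('loss', 33.3),
])
print('\n'.join(lines))
```

The dotted lines give the eye a track to follow from a short key to a value that sits far to the right.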

    opened by krzentner 5
Owner
Reinforcement Learning Working Group
Coalition of researchers which develop open source reinforcement learning research software