Histogramming for analysis powered by boost-histogram

Overview

Hist is an analyst-friendly front-end for boost-histogram, designed for Python 3.7+ (3.6 users get version 2.4). See what's new.

Slideshow of features. See docs/banner_slides.md for text if the image is not readable.

Installation

You can install this library from PyPI with pip:

python3 -m pip install "hist[plot]"

If you do not need the plotting features, you can skip the [plot] extra.

Features

Hist currently provides everything boost-histogram provides, and the following enhancements:

  • Hist augments axes with names:

    • name= is a unique label describing each axis.
    • label= is an optional string that is used in plotting (defaults to name if not provided).
    • Indexing, projection, and more support named axes.
    • Experimental NamedHist is a Hist that disables most forms of positional access, forcing users to use only names.
  • The Hist class augments bh.Histogram with simpler construction:

    • flow=False is a fast way to turn off flow for the axes on construction.
    • Storages can be given by string.
    • storage= can be omitted, strings and storages can be positional.
    • data= can initialize a histogram with existing data.
    • Hist.from_columns can be used to initialize with a DataFrame or dict.
    • You can cast back and forth with boost-histogram (or any other extensions).
  • Hist supports QuickConstruct, an import-free construction system:

    • Use Hist.new.<axis>().<axis>().<storage>().
    • Axes names can be full (Regular) or short (Reg).
    • Histogram arguments (like data=) can go in the storage.
  • Extended Histogram features:

    • Direct support for .name and .label, like axes.
    • .density() computes the density as an array.
    • .profile(remove_ax) can convert an ND COUNT histogram into a (N-1)D MEAN histogram.
    • .sort(axis) supports sorting a histogram by a categorical axis. Optionally takes a function to sort by.
  • Hist implements UHI+; an extension to the UHI (Unified Histogram Indexing) system designed for import-free interactivity:

    • Uses j suffix to switch to data coordinates in access or slices.
    • Uses j suffix on slices to rebin.
    • Strings can be used directly to index into string category axes.
  • Quick plotting routines encourage exploration:

    • .plot() provides 1D and 2D plots (or use plot1d(), plot2d())
    • .plot2d_full() shows 1D projections around a 2D plot.
    • .plot_ratio(...) makes a ratio plot between the histogram and another histogram or callable.
    • .plot_pull(...) makes a pull plot.
    • .plot_pie() makes a pie plot.
    • .show() provides a nice str printout using Histoprint.
  • Stacks: work with groups of histograms with identical axes

    • Stacks can be created with h.stack(axis), using index or name of an axis (StrCategory axes ideal).
    • You can also create with hist.stacks.Stack(h1, h2, ...), or use from_iter or from_dict.
    • You can index a stack, and set an entry with a matching histogram.
    • Stacks support .plot() and .show(), with names (plot labels default to original axes info).
    • Stacks pass through .project, *, +, and -.
  • New modules

    • intervals supports frequentist coverage intervals.
  • Notebook ready: Hist has gorgeous in-notebook representation.

    • No dependencies required

Usage

from hist import Hist

# Quick construction, no other imports needed:
h = (
  Hist.new
  .Reg(10, 0, 1, name="x", label="x-axis")
  .Var(range(10), name="y", label="y-axis")
  .Int64()
)

# Filling by names is allowed:
h.fill(y=[1, 4, 6], x=[3, 5, 2])

# Names can be used to manipulate the histogram:
h.project("x")
h[{"y": 0.5j + 3, "x": 5j}]

# You can access data coordinates or rebin with a `j` suffix:
h[.3j:, ::2j] # x from .3 to the end, y is rebinned by 2

# Elegant plotting functions:
h.plot()
h.plot2d_full()
h.plot_pull(Callable)

Development

From a git checkout, either use nox, or run:

python -m pip install -e ".[dev]"

See Contributing guidelines for information on setting up a development environment.

Contributors

We would like to acknowledge the contributors that made this project possible (emoji key):


Henry Schreiner

🚧 💻 📖

Nino Lau

🚧 💻 📖

Chris Burr

💻

Nick Amin

💻

Eduardo Rodrigues

💻

Andrzej Novak

💻

Matthew Feickert

💻

Kyle Cranmer

📖

Daniel Antrim

💻

Nicholas Smith

💻

Michael Eliachevitch

💻

Jonas Eschle

📖

This project follows the all-contributors specification.

Talks


Acknowledgements

This library was primarily developed by Henry Schreiner and Nino Lau.

Support for this work was provided by the National Science Foundation cooperative agreement OAC-1836650 (IRIS-HEP) and OAC-1450377 (DIANA/HEP). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Comments
  • feat: Add ratio plot support through .plot_ratio API

    Resolves #148

    This PR adds support for ratio plots to the .plot... API by adding .plot_ratio and refactoring .plot_pull to use .plot_ratio.

    This PR is still very rough, but I thought I'd open it up as there is a minimal working example now and then revise it heavily from there.

    As the commit log is huge and messy I'll rebase and squash things into more reasonable commit sections to make review more digestible.

    Questions to come to consensus on

    (Discussion can happen on Issue #148)

    • Should the interval functions be moved to their own stat_interval module?

    Answer: Yes, this is PR #176.

    TODO before requesting review

    • [x] Finish refactoring .plot_pull
    • [x] Add kwargs and kwargs filtering support
    • [x] Fix typing errors
    • [x] Add tests
    • [x] All tests are passing
    • [x] Docstrings added
    • [x] README updated
    • [x] Changelog updated.

    Suggested squash and merge message

    * Add .plot_ratio to BaseHist API
       - Adds ratio plot support for other Hists and callables
    * Factor out plot_ratio and plot_pull to perform the final subplot plotting
       - Majority of logic is factored out of plot_pull into _plot_ratiolike
       - _plot_ratiolike then calls plot_ratio or plot_pull as needed
    * Add plotting tests and test images for plot_ratio
    * Update test image for plot_pull
    * Update README and Changelog to reflect addition of plot_ratio
    
    Co-authored-by: Henry Schreiner <[email protected]>
    
    enhancement 
    opened by matthewfeickert 34
  • Hist.plot_pull: more suitable bands in the pull bands 1sigma, 2 sigma, etc.

    I was playing with Hist.plot_pull and noticed that the range of the pulls is dynamic: it is calculated from max(np.abs(pulls)) / pp_num in the code. This is not ideal since (1) pull plots are basically always displayed between -5 and 5, and (2) several pull plots shown side by side would each get their own range, which is not ideal.

    This little PR fixes the issue. It also makes sure that the colour bands by default are set at 1/2/.../5 sigma.
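As a rough sketch of the behaviour described above, pulls are the per-bin differences between observed and expected counts in units of the uncertainty, and the band edges can be fixed at whole multiples of sigma instead of being derived from the data. The function name `pulls_and_bands`, the Poisson uncertainty, and the sample numbers are assumptions for illustration, not code from Hist.

```python
import numpy as np

def pulls_and_bands(observed, expected, n_sigma=5):
    """Return per-bin pulls and fixed band edges at 1..n_sigma."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    # Assumed uncertainty: Poisson, sqrt of the observed count.
    sigma = np.sqrt(observed)
    pulls = np.zeros_like(observed)
    filled = sigma > 0  # empty bins keep a pull of 0
    pulls[filled] = (observed[filled] - expected[filled]) / sigma[filled]
    # Fixed bands, independent of the largest pull in this particular plot.
    band_edges = np.arange(1, n_sigma + 1)
    return pulls, band_edges

pulls, band_edges = pulls_and_bands([100, 81, 0, 64], [90, 90, 1, 64])
y_limits = (-band_edges[-1], band_edges[-1])  # same range for every plot
```

With fixed limits, pull panels placed side by side share one scale and can be compared directly.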

    I will post an issue with some other matters related to the function.

    Next week I will be giving some lectures and will be showing Hist. If at all possible it would be great to have this in a release :-).

    opened by eduardo-rodrigues 13
  • Should Every Axis Have Name?

    https://github.com/scikit-hep/hist/blob/be2c8380793807966f1ab58c4e50962fae7e45f9/src/hist/_internal/axis.py#L5-L17

    Should every axis be forced to have a name? I see that title is a required argument (it cannot be omitted), so name should have the same status and be required, too. But the unit tests are expected to pass for axes that have no names, e.g.

    https://github.com/scikit-hep/hist/blob/be2c8380793807966f1ab58c4e50962fae7e45f9/tests/test_general.py#L5

    and

    https://github.com/scikit-hep/hist/blob/be2c8380793807966f1ab58c4e50962fae7e45f9/tests/test_named.py#L6

    P.S. The current practice, i.e., #13, is to modify the tests so that every axis has a name. An alternative is to make both title and name optional, which would conflict with filling by name (that functionality would have to be disabled for histograms with anonymous axes).

    opened by LovelyBuggies 13
  • More flexible fitting function, allow likelihood, remove uncertainties dependency

    Based on the discussion in https://github.com/scikit-hep/hist/issues/146, I added some features to plot_pull. The theme is making fitting more streamlined for exploration.

    • Pull curve_fit initial guess (p0) from default arguments, if they exist
    • Allow a string as an alternative to a lambda function (plot_pull("a+b*x"))
    • Cosmetic change to the band, and embedding fit result into the legend
    • Likelihood fit (plot_pull(..., likelihood=True)) (chi2 by default, as before)
    • Remove uncertainties.numpy dependency and construct band by resampling covariance matrix
    • Introduce iminuit (gets the covariance matrix right, unlike scipy.optimize most of the time), but the initial guesses for iminuit are seeded from scipy
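The first item, taking the initial guess from default arguments, can be sketched with the standard library's `inspect` module. The helper name `initial_guess` and the fallback value of 1.0 for parameters without a default are assumptions here, not the implementation in the PR.

```python
import inspect
import math

def initial_guess(func, fallback=1.0):
    """Build a p0 list from the defaults of every parameter after the first."""
    params = list(inspect.signature(func).parameters.values())[1:]  # skip x
    return [
        fallback if p.default is inspect.Parameter.empty else p.default
        for p in params
    ]

def gauss(x, constant=80, mean=0.0, sigma=1.0):
    return constant * math.exp(-((x - mean) ** 2) / (2 * sigma**2))

def line(x, a, b):
    return a + b * x

p0_gauss = initial_guess(gauss)  # defaults are used as the starting point
p0_line = initial_guess(line)    # no defaults, so the fallback is used
```

A list built this way has the shape that `scipy.optimize.curve_fit` expects for its `p0` argument.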

    Setup

    import numpy as np
    from hist import Hist
    
    np.random.seed(42)
    hh = Hist.new.Reg(50, -5, 5).Double().fill(np.random.normal(0,1,int(1e5)))
    

    Before (including a bug-fix for the variances from the above issue):

    from uncertainties import unumpy as unp
    def func(x, constant, mean, sigma):
        exp = unp.exp if constant.dtype == np.dtype("O") else np.exp
        return constant * exp(-((x - mean) ** 2.0) / (2 * sigma ** 2))
    
    hh.plot_pull(func)
    

    image

    After:

    # as before, but no need for `uncertainties.numpy` as the error band comes
    # from resampling the covariance matrix
    def func(x, constant, mean, sigma):
        return constant * np.exp(-((x - mean) ** 2.0) / (2 * sigma ** 2))
    hh.plot_pull(func)
    
    # `curve_fit` `p0` extracted from defaults, if any
    def func(x, constant=80, mean=0., sigma=1.):
        return constant * np.exp(-((x - mean) ** 2.0) / (2 * sigma ** 2))
    hh.plot_pull(func)
    
    # strings are allowed to allow for more compactness than a lambda
    # x is assumed to be the main variable
    hh.plot_pull("constant*np.exp(-(x-mean)**2. / (2*sigma**2))")
    
    # gaussian is a common/special function, so this also works
    # reasonable guesses are made for constant/mean/sigma
    hh.plot_pull("gaus")
    

    image

    # chi2 puts `a` around 5, but likelihood puts `a` around 1e3/50 = 20
    hh.plot_pull("a+b*x", likelihood=True)
    

    image

    opened by aminnj 12
  • [FEATURE] Typing Hints for `get_item` and `set_item`

    Describe the problem, if any, that your feature request is related to

    I don't know how to add type hints for these two functions.

    https://github.com/scikit-hep/hist/blob/dc9b209dfa5d2fa934a803c9cb589b414314fc13/src/hist/named.py#L50

    https://github.com/scikit-hep/hist/blob/dc9b209dfa5d2fa934a803c9cb589b414314fc13/src/hist/named.py#L66

    enhancement 
    opened by LovelyBuggies 11
  • feat: axis sort

    When completed, closes #222

    To implement:

    • sort by passing index
    • sort by passing lambda to sorted
    • add tests

    @henryiii could you take a look? I had to abuse some design decisions of hist to get this work, so maybe you can recommend better workarounds.

    • AxisNamedTuple (and I assume axes) as well cannot be assigned to, so I had to recreate the sorted axis
    • Couldn't easily copy all meta/traits from old axis to new - hence the helper with inspect (name/label don't seem to be part of traits/metadata) - also cannot create axis and edit metadata

    Here are a couple of examples:

    • 2D - yax (image)
    • 2D - xax (image)
    • 1D (image)
    • 3D - sort on not-projected axis (image)

    opened by andrzejnovak 10
  • feat: adding something like the classic hist

    Addressing #35.

    Currently it seems that axes without names are broken in Hist. If I do a new pip install -e .[dev], then try, from a command line:

    hist
    1
    2
    3
    2
    

    Then press control-D, I get:

    Traceback (most recent call last):
      File "/Users/henryschreiner/git/scikit-hep/hist/.env/bin/hist", line 33, in <module>
        sys.exit(load_entry_point('hist', 'console_scripts', 'hist')())
      File "/Users/henryschreiner/git/scikit-hep/hist/src/hist/classic_hist.py", line 22, in main
        h = bh.numpy.histogram(values, bins=args.buckets, histogram=hist.Hist)
      File "/Users/henryschreiner/git/scikit-hep/hist/.env/lib/python3.8/site-packages/boost_histogram/numpy.py", line 111, in histogram
        return histogramdd((a,), (bins,), (range,), normed, weights, density, **kwargs)
      File "/Users/henryschreiner/git/scikit-hep/hist/.env/lib/python3.8/site-packages/boost_histogram/numpy.py", line 74, in histogramdd
        hist = cls(*axs, storage=bh_storage).fill(*a, weight=weights, threads=threads)
      File "/Users/henryschreiner/git/scikit-hep/hist/src/hist/core.py", line 38, in __init__
        if ax.name in self.names:
      File "/Users/henryschreiner/git/scikit-hep/hist/src/hist/_internal/axis.py", line 52, in name
        return self.metadata["name"]
    TypeError: 'NoneType' object is not subscriptable
    

    @LovelyBuggies, could you take a look? This is really just running bh.numpy.histogram([1,2,3,2], histogram=hist.Hist).

    opened by henryiii 9
  • [SUPPORT] How to Check Test files by Mypy CI

    Describe your questions

    I have used mypy according to https://scikit-hep.org/developer/style#type-checking-new. My pre-commit config file looks like this:

    repos:
    - repo: https://github.com/psf/black
      rev: 19.10b0
      hooks:
      - id: black
    - repo: https://github.com/pre-commit/pre-commit-hooks
      rev: v2.5.0
      hooks:
      - id: check-added-large-files
      - id: mixed-line-ending
      - id: trailing-whitespace
      - id: check-merge-conflict
      - id: check-case-conflict
      - id: check-symlinks
      - id: check-yaml
    - repo: https://github.com/pre-commit/mirrors-mypy
      rev: v0.770
      hooks:
      - id: mypy
        files: all  # I tried tests and src, but none worked
    

    I deliberately put a type error in a test file, but pre-commit did not find it. How to make mypy CI work for unit tests?
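One thing worth checking, offered as a possible contributing factor rather than a diagnosis: pre-commit treats `files` as a Python regular expression searched against each file path, so `files: all` selects only paths that contain the text `all`. The sketch below mimics that matching with `re.search`; the example paths and the alternative pattern `^(src|tests)/` are assumptions for illustration.

```python
import re

paths = [
    "src/hist/named.py",
    "tests/test_general.py",
    "docs/installation.rst",
]

def selected(pattern, candidates):
    """Return the paths a `files:` pattern would select (re.search semantics)."""
    return [p for p in candidates if re.search(pattern, p)]

# "all" only matches paths that contain those three letters in a row.
only_all = selected("all", paths)

# An anchored alternation selects everything under src/ and tests/.
src_and_tests = selected(r"^(src|tests)/", paths)
```

Separately, mypy by default does not check the bodies of functions without type annotations unless `--check-untyped-defs` is set, which may matter for unannotated test functions.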

    question 
    opened by LovelyBuggies 9
  • feat: adding profile for COUNT -> MEAN conversion

    Closes #156, adds a .profile method. Based heavily on @aminnj's example.

    TODO:

    • [x] Needs tests
    • This probably could be a WeightedMean, probably give a choice? Just got the basics working for now. (Can revisit later)
    • [x] Should hide or handle warnings (true for some of our tests, too)

    Followup:

    • histoprint should handle showing kind=MEAN histograms (it doesn't like the NaN's, I think) CC @ast0815 (edit: done!)
    • mplhep should show a better plot style for kind=MEAN, and should also ensure it handles kind=MEAN's requirements (we could also dispatch differently, but the best fix would be in mplhep) CC @andrzejnovak
    • Boost-histogram should describe accumulators/storages a bit more in the docs, especially setting with a stack, including the correct order. CC Me

    If kind=MEAN, you are only supposed to plot variances if counts() > 1, and values if counts() >= 1.
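What a COUNT to MEAN conversion computes can be sketched with plain NumPy: for each bin of the kept axis, take the count-weighted mean of the removed axis' bin centers, then apply the masking rule above. The function name `profile_from_counts` and the toy numbers are assumptions; this is not the code behind `.profile`.

```python
import numpy as np

def profile_from_counts(counts, removed_centers):
    """Collapse the last axis of a 2D count array into per-bin means."""
    counts = np.asarray(counts, dtype=float)
    centers = np.asarray(removed_centers, dtype=float)
    n = counts.sum(axis=1)  # entries per kept bin
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = (counts * centers).sum(axis=1) / n
        # Sample variance of the removed coordinate within each kept bin.
        var = (counts * (centers - mean[:, None]) ** 2).sum(axis=1) / (n - 1)
    # Masking rule for MEAN histograms: values need n >= 1, variances n > 1.
    mean = np.where(n >= 1, mean, np.nan)
    var = np.where(n > 1, var, np.nan)
    return n, mean, var

counts = [
    [2, 0, 2],  # two entries at y=0.5 and two at y=2.5
    [0, 1, 0],  # a single entry: mean defined, variance not
    [0, 0, 0],  # empty bin: neither is defined
]
n, mean, var = profile_from_counts(counts, removed_centers=[0.5, 1.5, 2.5])
```

Bins that fail the rule come back as NaN, which a plotting layer can then skip.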

    opened by henryiii 7
  • Fix Hist and NamedHist

    • Does named indexing work on all Hist's? I don't see the code for it in BaseHist, only in NamedHist. Something like this: h[{"x":2}] should be converted into h[{0:2}] assuming that axis 0 has name "x".
    • NamedHist should simply verify that the index item is a dict, and that it has only string keys. Then it should just call BaseHist's indexing.
    • NamedHist's fill should simply be: def fill(self, **kwargs): return super().fill(**kwargs) (maybe with a few more keyword arguments listed to be nice to the user's inspection tools). BaseHist should be able to fill via kwargs or via position.

    Overall, NamedHist should be pretty short, it is just taking away functionality (position-based access) from BaseHist. I'm not sure how strict we want to be - we can also have an OnlyNamedAxesTuple too.

    Also we should make sure you can cast between Hist and NamedHist. Hist(NamedHist(...)), etc. I can do that one.
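The translation described in the first bullet, and the check described in the second, can be sketched in pure Python. The helper names `to_positional` and `check_named_only` and the error handling are assumptions for illustration; they are not the code in BaseHist or NamedHist.

```python
def to_positional(index, axis_names):
    """Turn {"x": 2} into {0: 2}, given the axis names in axis order."""
    positions = {name: i for i, name in enumerate(axis_names)}
    converted = {}
    for key, value in index.items():
        if isinstance(key, str):
            if key not in positions:
                raise KeyError(f"no axis named {key!r}")
            key = positions[key]
        converted[key] = value
    return converted

def check_named_only(index):
    """NamedHist-style check: the index must be a dict with string keys only."""
    if not isinstance(index, dict) or not all(isinstance(k, str) for k in index):
        raise TypeError("index must be a dict keyed by axis names")
    return index

positional = to_positional({"x": 2, "y": slice(0, 3)}, axis_names=("x", "y"))
```

Keeping the translation in the base class and only the check in the strict class matches the idea that NamedHist removes functionality rather than adding it.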

    bug 
    opened by henryiii 7
  • [BUG] Uncertainty bands in efficiency ratios go above unity

    Describe the bug

    The error bars produced in the ratio of an efficiency style plot (hist.plot_ratio(..., rp_uncertainty_type = "poisson-ratio", ...)) can extend above unity. This is unexpected and not meaningful if the numerator is a true subset of the denominator.

    Steps to reproduce

    This is observed using hist==2.4.0.

    Following the example from the docs,

    hist0 = hist.Hist(hist.axis.Regular(50, -5, 5), underflow = False, overflow = False).fill(np.random.normal(size=1700))
    hist1 = hist0.copy() * 0.98
    
    fig = plt.figure(figsize = (10,8))
    _ = hist1.plot_ratio(
        hist0,
        rp_ylabel="Efficiency",
        rp_num_label="hist1",
        rp_denom_label="hist0",
        rp_uncert_draw_type="line",
        rp_uncertainty_type="poisson-ratio",
        rp_ylim=[0.9,1.1],
    )
    

    This produces something like the following, image

    where we see the error bars going above unity.

    If I take the same data used in hist0 and hist1 above, and produce a TEfficiency object in ROOT, I get something like the following, image which is something more like what I would expect since I believe hist is using a Clopper-Pearson interval (default in ROOT).
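For comparison, a Clopper-Pearson interval for k passes out of n trials can be written directly with the beta distribution. This is a minimal sketch, assuming SciPy is available; the function name `clopper_pearson` and the 68.27% default coverage are choices made for illustration, and it is not the code in hist's intervals module. By construction both bounds stay inside [0, 1], which is the behaviour expected of a true efficiency.

```python
import numpy as np
from scipy import stats

def clopper_pearson(k, n, coverage=0.6827):
    """Central Clopper-Pearson interval for k successes out of n trials."""
    k = np.asarray(k, dtype=float)
    n = np.asarray(n, dtype=float)
    alpha = 1.0 - coverage
    # Placeholder shape parameters keep beta.ppf valid at the edges;
    # those entries are overwritten by the np.where calls below.
    a_low = np.where(k == 0, 1.0, k)
    b_up = np.where(k == n, 1.0, n - k)
    lower = np.where(k == 0, 0.0, stats.beta.ppf(alpha / 2, a_low, n - k + 1))
    upper = np.where(k == n, 1.0, stats.beta.ppf(1 - alpha / 2, k + 1, b_up))
    return lower, upper

passed = np.array([0, 49, 50])
total = np.array([50, 50, 50])
low, high = clopper_pearson(passed, total)
```

The inputs here are integer pass and total counts; a histogram scaled by 0.98 no longer has that form, so the comparison with TEfficiency is only meaningful for genuinely binomial data.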

    bug 
    opened by dantrim 6
  • [FEATURE] dask histogram backend

    Describe the problem, if any, that your feature request is related to

    Right now the https://github.com/dask-contrib/dask-histogram package exposes an interface that is very nearly 1:1 with boost histogram, which I think is the correct scope for this package. However, many users of columnar analysis tools in HEP prefer hist since its UI is more intuitive and comfortable. It would be best to have a sub-module within hist that exposed a dask-histogram backed interface and suite of tools.

    Describe the feature you'd like

    I would like users to be able to do something like:

    import hist.dask as hist
    

    or

    import hist
    
    h = hist.Hist(**axes).as_dask()
    

    and have access to the hist interface with (nearly) all of its features but backed by computation in dask.

    This is very close to practice since dask_histogram already very closely follows (and mostly directly uses) the boost-histogram interface, which means that most of what hist does should be easy to adapt.

    This will require a few pieces of interface to be smoothened out in dask_histogram since it's not 1:1 with boost-histogram in some cases. (e.g. what is mentioned in the bottom of the comment here: https://github.com/dask-contrib/dask-histogram/issues/35#issuecomment-1368108690)

    Describe alternatives, if any, you've considered

    The alternative is using the dask histogram interface directly, which while effective, is significantly less widely used and less pleasant than hist. This would require significant readoption and generate attrition risk in our user base.

    enhancement 
    opened by lgray 0
  • chore(deps): bump pypa/gh-action-pypi-publish from 1.5.1 to 1.6.4

    Bumps pypa/gh-action-pypi-publish from 1.5.1 to 1.6.4.

    Release notes

    Sourced from pypa/gh-action-pypi-publish's releases.

    v1.6.4

    oh, boi! again?

    This is the last one tonight, promise! It fixes this embarrassing bug that was actually caught by the CI but got overlooked due to the lack of sleep. TL;DR GH passed $HOME from the external env into the container and that tricked the Python's site module to think that the home directory is elsewhere, adding non-existent paths to the env vars. See #115.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.3...v1.6.4

    v1.6.3

    Another Release!? Why?

    In pypa/gh-action-pypi-publish#112, it was discovered that passing a $PATH variable even breaks the shebang. So this version adds more safeguards to make sure it keeps working with a fully broken $PATH.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.2...v1.6.3

    v1.6.2

    What's Fixed

    • Made the $PATH and $PYTHONPATH environment variables resilient to broken values passed from the host runner environment, which previously allowed the users to accidentally break the container's internal runtime as reported in pypa/gh-action-pypi-publish#112

    Internal Maintenance Improvements

    New Contributors

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.1...v1.6.2

    v1.6.1

    What's happened?!

    There was a sneaky bug in v1.6.0 which caused Twine to be outside the import path in the Python runtime. It is fixed in v1.6.1 by updating $PYTHONPATH to point to a correct location of the user-global site-packages/ directory.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.0...v1.6.1

    v1.6.0

    Anything's changed?

    The only update is that the Python runtime has been upgraded from 3.9 to 3.11. There are no functional changes in this release.

    Full Changelog: https://github.com/pypa/gh-action-pypi-publish/compare/v1.5.2...v1.6.0

    v1.5.2

    What's Improved

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.5.1...v1.5.2

    Commits
    • c7f29f7 🐛 Override $HOME in the container with /root
    • 644926c 🧪 Always run smoke testing in debug mode
    • e71a4a4 Add support for verbose bash execusion w/ $DEBUG
    • e56e821 🐛 Make id always available in twine-upload
    • c879b84 🐛 Use full path to bash in shebang
    • 57e7d53 🐛Ensure the default $PATH value is pre-loaded
    • ce291dc 🎨🐛Fix the branch @ pre-commit.ci badge links
    • 102d8ab 🐛 Rehardcode devpi port for GHA srv container
    • 3a9eaef 🐛Use different ports in/out of GHA containers
    • a01fa74 🐛 Use localhost @ GHA outside the containers
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Make it easier to plot ratio of multiple histograms [FEATURE]

    In:

    import matplotlib.pyplot as plt
    import numpy
    from   hist import Hist
    
    #------------------------------------------------------
    def get_hist():
        h = (
          Hist.new
          .Reg(10, -2, 2, name='x', label='x-axis')
          .Int64()
        )
        data=numpy.random.normal(size=10000)
        h.fill(data)
    
        return h
    #------------------------------------------------------
    def main():
        h_1 = get_hist()
        h_2 = get_hist()
        h_3 = get_hist()
    
        fig, (ax, rax) = plt.subplots(nrows=2, gridspec_kw={"height_ratios": (3, 1)})
        axs = {"main_ax": ax, "ratio_ax": rax}
    
        h_2.plot_ratio(h_1,
            rp_ylabel          ='Ratio',
            rp_num_label       ='data',
            rp_denom_label     ='sim 1',
            rp_uncert_draw_type='bar',
            ax_dict = axs,
        )
    
        h_3.plot_ratio(h_1,
            rp_ylabel          ='Ratio',
            rp_num_label       ='data',
            rp_denom_label     ='sim 2',
            rp_uncert_draw_type='bar',
            ax_dict = axs,
        )
    
        plt.show()
    #------------------------------------------------------
    if __name__ == '__main__':
        main()
    

    The code is unable to:

    1. Plot the ratio of data to both simulations and there is no easy way to make this work. I see repeated histograms.
    2. The axes are not aligned by default
    3. I do not see documentation (or at least it is not easy to find) for the rp_* arguments.

    Describe the feature you'd like

    The user should be able to do:

    h_1 = get_hist()
    h_2 = get_hist()
    h_3 = get_hist()

    h_1.plot_ratio([h_2, h_3])

    plt.show()
    

    and we should get two figures, upper one with h_* overlaid. Lower with two ratios, to h_2 and h_3. The axes should be aligned, the labels and legends should be taken from the histograms themselves and we should not have to do any manipulation of artists.

    Describe alternatives, if any, you've considered

    The way the code is implemented is bad, it's too complicated, and I do not have time to make it work the way I need to, so I am moving back to pure matplotlib. The plots I need do not need to be perfect and matplotlib is good enough for me. It would be nice if hist can do quickly what we need though.

    Cheers.

    enhancement 
    opened by angelcampoverde 0
  • [BUG] Raise an error when adding hists of different storage types.

    Describe the bug

    Adding two hists of different storage types just results in having an empty hist without raising an error:

    Steps to reproduce

    import numpy as np
    from hist import Hist
    import hist
    a_hist = hist.Hist(hist.axis.Regular(3, -1, 1), storage=hist.storage.Int64())
    a_hist.sum()
    

    0.0

    b_hist = hist.Hist(hist.axis.Regular(3, -1, 1), storage=hist.storage.Double())
    b_hist.fill(np.random.normal(size=1000))
    b_hist.sum()
    

    681.0

    a_hist += b_hist
    a_hist.sum()
    

    0.0

    opened by JohanWulff 0
  • [FEATURE] plot_ratio() to support Weighted histograms

    Describe the problem, if any, that your feature request is related to

    I believe the current plot_ratio() method does not take into account weights of the histograms when calculating the errors.

    Describe the feature you'd like

    Propagate errors correctly into the ratio uncertainty, taking into account the weights
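One common way to propagate weighted uncertainties into a ratio is first-order (Gaussian) error propagation, with the variance of each bin taken as the sum of squared weights. The sketch below assumes independent numerator and denominator histograms; the function name `weighted_ratio` and the sample numbers are illustrative, and this is not what `plot_ratio` currently does.

```python
import numpy as np

def weighted_ratio(num, num_var, den, den_var):
    """Ratio of two independent weighted histograms with propagated errors."""
    num, num_var = np.asarray(num, float), np.asarray(num_var, float)
    den, den_var = np.asarray(den, float), np.asarray(den_var, float)
    valid = (num > 0) & (den > 0)
    ratio = np.full(num.shape, np.nan)
    error = np.full(num.shape, np.nan)
    ratio[valid] = num[valid] / den[valid]
    # (sigma_r / r)^2 = var_num / num^2 + var_den / den^2
    rel_var = num_var[valid] / num[valid] ** 2 + den_var[valid] / den[valid] ** 2
    error[valid] = ratio[valid] * np.sqrt(rel_var)
    return ratio, error

# Bin contents are sums of weights, variances are sums of squared weights.
ratio, error = weighted_ratio(
    num=[50.0, 0.0], num_var=[25.0, 0.0],
    den=[100.0, 10.0], den_var=[100.0, 10.0],
)
```

For histograms with a Weight storage, the inputs would come from `.values()` and `.variances()`.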

    Describe alternatives, if any, you've considered

    Using coffea.hist.plotratio instead

    enhancement 
    opened by andreypz 0
Releases(v2.6.2)
  • v2.6.2(Sep 20, 2022)

  • v2.6.1(Mar 10, 2022)

  • v2.6.0(Feb 16, 2022)

  • v2.5.2(Nov 18, 2021)

  • v2.5.1(Sep 21, 2021)

  • v2.5.0(Sep 21, 2021)

    • Stacks support axes, math operations, projection, setting items, and iter/dict construction. They also support histogram titles in legends. Added histoprint support for Stacks. #291 #315 #317 #318
    • Added name= and label= to histograms, include Hist arguments in QuickConstruct. #297
    • AxesTuple now supports bulk name setting, h.axes.name = ("a", "b", ...). #288
    • Added hist.new alias for hist.Hist.new. #296
    • Added "efficiency" uncertainty_type option for ratio_plot API. #266 #278

    Smaller features or fixes:

    • Dropped Python 3.6 support. #194
    • Uses boost-histogram 1.2.x series, includes all features and fixes, and Python 3.10 support.
    • No longer require scipy or iminuit unless actually needed. #316
    • Improve and clarify treatment of confidence intervals in intervals submodule. #281
    • Use NumPy 1.21 for static typing. #285
    • Support running tests without plotting requirements. #321
  • v2.4.0(Jul 7, 2021)

    • Support .stack(axis) and stacked histograms. #244 #257 #258
    • Support selection lists (experimental with boost-histogram 1.1.0). #255
    • Support full names for QuickConstruct, and support mistaken usage in constructor. #256
    • Add .sort(axis) for quickly sorting a categorical axis. #243

    Smaller features or fixes:

    • Support nox for easier contributor setup. #228
    • Better name axis error. #232
    • Fix for issue plotting size 0 axes. #238
    • Fix issues with repr information missing. #241
    • Fix issues with wrong plot shortcut being triggered by Integer axes. #247
    • Warn and better error if overlapping keyword used as axis name. #250

    Along with lots of smaller docs updates.

  • v2.3.0(Apr 12, 2021)

    • Add plot_ratio to the public API, which allows for making ratio plots between the histogram and either another histogram or a callable. #161
    • Add .profile to compute a (N-1)D profile histogram. #160
    • Support plot1d / plot on Histograms with a categorical axis. #174
    • Add frequentist coverage interval support in the intervals module. #176
    • Allow plot_pull to take a more generic callable or a string as a fitting function. Introduce an option to perform a likelihood fit. Write fit parameters' values and uncertainties in the legend. #149
    • Add fit_fmt= to plot_pull to control display of fit params. #168
    • Support <prefix>_kw arguments for setting each axis in plotting. #193
    • Cleaner IPython completion for Python 3.7+. #179
  • v2.2.1(Mar 17, 2021)

    • Fix bug with plot_pull missing a sqrt. #150
    • Fix static typing with ellipses. #145
    • Require boost-histogram 1.0.1+, fixing typing related issues, allowing subclassing Hist without a family and including an important Mean/WeightedMean summing fix. #151
  • v2.2.0(Mar 9, 2021)

    • Support boost-histogram 1.0. Better plain reprs. Full Static Typing. #137

    • Support data= when constructing a histogram to copy in initial data. #142

    • Support Hist.from_columns, for simple conversion of DataFrames and similar structures #140

    • Support .plot_pie for quick pie plots #140

  • v2.1.1(Mar 4, 2021)

  • v2.1.0(Feb 20, 2021)

    This version provides many new features from boost-histogram 0.12 and 0.13; see the changelog in boost-histogram for details.

    • Support shortcuts for setting storages by string or position #129

    Updated dependencies:

    • boost-histogram 0.11.0 to 0.13.0.
      • Major new features, including PlottableProtocol
    • histoprint >=1.4 to >=1.6.
    • mplhep >=0.2.16 when [plot] given
  • v2.0.1(Oct 11, 2020)

    Hist version 2.0.1 has been released with the following fixes:

    • Make sum of bins explicit in notebook representations. #106
    • Fixed plot2d_full incorrectly mirroring the y-axis. #105
    • Hist.plot_pull: more suitable pull bands (1 sigma, 2 sigma, etc.). #102
    • Fixed classichist's usage of get_terminal_size to support not running in a terminal #99
  • v2.0.0(Sep 28, 2020)

    Final 2.0 release of Hist! Since beta 1, the following changes were made:

    • Based on boost-histogram 0.11; now supports two way conversion without metadata issues.
    • mplhep is now used for all plotting. Return types changed; fig dropped, new figures only created if needed.
    • QuickConstruct was rewritten, uses new.Reg(...).Double(); not as magical but clearer types and usage.
    • Plotting dependencies are no longer installed by default; use the [plot] extra to request them.

    The following new features were added:

    • Jupyter HTML reprs were added.
    • flow=False shortcut added.
    • Static type checker support for dependent projects.

    The following fixes were applied:

    • .fill was broken for WeightedMean storage.
  • v2.0.0b1(Sep 8, 2020)

    First beta release. title has been renamed to label. Significant improvements to documentation, and a bug fix for plotting (#87). Uses boost-histogram 0.10+.

  • v2.0.0a5(Jul 17, 2020)

  • v2.0.0a4(Jul 16, 2020)

  • v2.0.0a3(Jul 16, 2020)

  • v2.0.0a2(Jul 12, 2020)

  • v2.0.0a1(Jul 12, 2020)

Owner
Scikit-HEP Project