Joyplots in Python with matplotlib & pandas

Overview

JoyPy

JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots).

A joyplot.

The code for JoyPy borrows from the code for KDE plots in pandas.plotting, and uses a couple of utility functions from it.

What are joyplots?

Joyplots are stacked, partially overlapping density plots, simple as that. They are a nice way to visually compare distributions, especially those that change across one dimension (e.g., over time). Though hardly a new technique, they have become very popular lately thanks to the R package ggjoy, which is much better developed and maintained than this one; I strongly suggest you use it if you can use R and ggplot. Update: the ggjoy package has now been renamed ggridges.

Why are they called joyplots?

If you don't know Joy Division, you are lucky: you can still listen to them for the first time! Here's a hint: google "Unknown Pleasures". This kind of plot is now also known as a ridgeline plot, since the original name is controversial.

Documentation and examples

JoyPy has no real documentation. You're strongly encouraged to take a look at this Jupyter notebook with a growing number of examples. Similarly, GitHub issues may contain some wisdom :-)

A minimal example is the following:

import joypy
import pandas as pd

iris = pd.read_csv("data/iris.csv")
fig, axes = joypy.joyplot(iris)

By default, joypy.joyplot() will draw a joyplot with a density subplot for each numeric column in the dataframe. The density is obtained with the gaussian_kde function of scipy.
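To make the density step concrete, here is a minimal pure-Python sketch of a one-dimensional Gaussian kernel density estimate. It is not scipy's implementation: the function name `gaussian_kde_1d` is hypothetical, and the bandwidth is chosen with Scott's rule (sample standard deviation times n^(-1/5)), which mirrors scipy's default for one-dimensional data.

```python
import math
import statistics


def gaussian_kde_1d(samples, grid):
    """Evaluate a Gaussian kernel density estimate of `samples` at each point of `grid`."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a bandwidth")
    # Scott's rule for 1-D data: bandwidth = sample std (ddof=1) * n^(-1/5)
    bandwidth = statistics.stdev(samples) * n ** (-1 / 5)
    if bandwidth == 0:
        raise ValueError("all samples are identical; the density is degenerate")
    # Normalisation so that the sum of n Gaussian bumps integrates to 1
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Made-up sample values, used only to exercise the function
samples = [4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.4, 6.9, 7.0, 7.1]
grid = [2.0 + 0.05 * i for i in range(161)]  # 2.0, 2.05, ..., 10.0
density = gaussian_kde_1d(samples, grid)
```

Each row of a joyplot is essentially one such curve, drawn on its own axis and shifted vertically so that neighbouring rows partially overlap.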

Note: joyplot() returns n+1 axes, where n is the number of visible rows (subplots). Each subplot has its own axis, while the last axis (axes[-1]) is the one that is used for things such as plotting the background or changing xticks, and is the one you might need to play with in case you want to manually tweak something.
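A small sketch of how the n+1 layout described above could be unpacked. The helper name `split_joyplot_axes` is hypothetical (it is not part of JoyPy), and the usage is left as comments because it needs joypy and a dataframe; `set_xlabel` and `set_xticks` are ordinary matplotlib Axes methods, and they should take effect on the last axis if it behaves as the note says.

```python
def split_joyplot_axes(axes):
    """Split the list returned by joyplot() into (row_axes, shared_axis).

    Assumes the layout described in the note: one axis per visible row,
    followed by one extra axis that carries the background and the xticks.
    """
    if len(axes) < 2:
        raise ValueError("expected at least one row axis plus the shared axis")
    return list(axes[:-1]), axes[-1]


# Intended usage (requires joypy and a dataframe, so shown as comments only):
#   fig, axes = joypy.joyplot(iris)
#   row_axes, shared = split_joyplot_axes(axes)
#   shared.set_xlabel("measurement")      # tweak the axis that owns the xticks
#   shared.set_xticks([0, 2, 4, 6, 8])
```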

Dependencies

  • Python 3.5+
    Compatibility with Python 2.7 was dropped in release 0.2.0.

  • numpy

  • scipy >= 0.11

  • matplotlib

  • pandas >= 0.20
    Warning: compatibility with pandas >= 0.25 requires joypy >= 0.2.1.

I am not sure what the oldest supported versions are. As long as you have reasonably recent versions, you should be fine.
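If you want to check what is actually installed, something like the following sketch may help. It uses only the standard library (`importlib.metadata`, which needs Python 3.8+, newer than the minimum listed above). The helper names are hypothetical, and the version parsing is deliberately naive: it only reads the leading numeric components.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def release_tuple(version_string):
    """Parse the leading numeric parts of a version, e.g. '0.25.3' -> (0, 25, 3)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def joypy_too_old_for_pandas(pandas_version, joypy_version):
    """True when pandas >= 0.25 is paired with joypy < 0.2.1 (the warning above)."""
    return (release_tuple(pandas_version) >= (0, 25)
            and release_tuple(joypy_version) < (0, 2, 1))


print(installed_versions(["numpy", "scipy", "matplotlib", "pandas", "joypy"]))
```

A proper comparison would use a real version parser; the tuple comparison here is only meant to encode the single constraint stated in the dependency list.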

Installation

It's actually on PyPI, because why not:

pip install joypy

To install from github, run:

git clone git@github.com:sbebo/joypy.git
cd joypy
pip install .

License

Released under the MIT license.

Disclaimer + contributing

This is just a Sunday afternoon hack, so no guarantees! If you want to contribute or just copy/fork, feel free to.

Comments
  • Adding an option to input manually a list of colors for the different density plots

    With a large number of labels, there are not enough colors to draw all the density plots (you get the error: plt.rcParams['axes.prop_cycle'].by_key()['color'][j] list index out of range). This change lets you manually pass a list of colors to the JoyPy function, with as many colors as there are labels (all(len(g) == len(color) for g in data)).

    opened by yaront 9
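The index error described in this issue comes from indexing a fixed-length color list with a position that exceeds its length. One generic way around that, sketched here independently of the change proposed in the issue, is to repeat the palette as often as needed. `DEFAULT_CYCLE` is a stand-in for the rcParams lookup quoted above (the values are matplotlib's default ten colors), and `colors_for` is a hypothetical helper, not a JoyPy function.

```python
from itertools import cycle, islice

# Stand-in for plt.rcParams['axes.prop_cycle'].by_key()['color'];
# these are matplotlib's default "tab10" hex codes.
DEFAULT_CYCLE = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def colors_for(n_series, palette=DEFAULT_CYCLE):
    """Return n_series colors, repeating the palette when it is too short."""
    if not palette:
        raise ValueError("palette must contain at least one color")
    return list(islice(cycle(palette), n_series))


# 13 series with a 10-color palette: the 11th series reuses the first color
colors = colors_for(13)
```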
  • Unexpected behaviour of figsize and overlap

    Hi @sbebo, thanks very much for the fantastic module. I encountered some strange behaviour when trying to control figsize and overlap, which I assume is a bug. Here's a minimal code example:

    # load modules and create a list of random data
    import numpy as np
    import joypy as jp
    
    dist_list = []
    for i in range(20):
        dist_list.append(np.random.normal(loc=np.random.random(), size=100))
    

    With these arguments the plot is created normally:

    fig, ax = jp.joyplot(dist_list, overlap=1, figsize=(8,3.3))
    fig.show()
    

    But for any height value of figsize that is higher than 3.3 in this case, all density plots appear extremely flattened, and changing the overlap has no effect at all:

    fig, ax = jp.joyplot(dist_list, overlap=1, figsize=(8,3.4))
    fig.show()
    

    I noticed this when working on some larger data, where the "threshold" of this sudden change was at a different figsize height. Is there a way to have a larger figure but prevent this from happening? Thanks!

    opened by W-L 6
  • Calling `xlim` has unexpected effect

    If you create a joyplot, and then subsequently set the xlimits by calling matplotlib.pyplot.xlim(xmin, xmax), the x-axis labels shift but the data does not shift to reflect the new x range.

    This can easily lead to incorrectly plotted data because no warning is given -- the axis labels just stop reflecting the actual data.

    For instance, below is the exact same data plotted with joyplot, but in the first case there was a subsequent call to xlim to set a narrower xrange and in the second case there was not. As you can see, the xticks shifted but not the data.

    I'm guessing the preferred way to set the xlimits is instead via the xrange parameter of joyplot, but I still worry the behavior described in this issue could easily lead to incorrect plots.

    [Screenshot: the same data plotted twice, with and without the subsequent xlim call]

    opened by jbloom 5
  • Different issue on import for 0.1.4

    I saw the other import error in the solved issues, but this one is different. I'm using Pandas 0.20.3

    File "myscript.py", line 17, in <module>
      import joypy
    File "/home/user/anaconda2/lib/python2.7/site-packages/joypy/__init__.py", line 1, in <module>
      from .joyplot import joyplot, plot_density, _joyplot
    File "/home/user/anaconda2/lib/python2.7/site-packages/joypy/joyplot.py", line 221
      raise ValueError("At least a column/group haNo numeric values found. Joyplot requires at least a numeric column/group.")

    opened by WMGoBuffs 4
  • Aggregate data in a specific column

    Hi,

    I saw #17 explaining how to use kind="values" and x_range in order to use aggregated data, but I'm finding it hard to combine this functionality with setting a specific column for output. I've got a dataset with a set of locations (lat & long), with a measurement taken at each point giving a numerical value.

    lat        long       val
    001.012    001.01     11
    001.431    004.49     25
    001.769    008.72     04
    002.100    001.32     03
    002.504    003.49     17
    ...
    

    (rubbish formatting but it gives you the right idea)

    I've binned the y-values (lat) using pd.cut to give a lat_bins column and plotted this using:

    fig, axes = jp.joyplot(df, by="lat_bins", column="long")

    But what I'd like to have is something like:

    fig, axes = jp.joyplot(df, by="lat_bins", column="long", kind="values", x_range=<some range>)

    This displays a plot without any data, and no error appears when running the code, so I don't know what to fix. The goal is to have a plot at each latitude showing the distribution according to longitude (i.e., the dataset shown above would have the first layer showing 11 - 25 - 4 going across from 001 to 008 lat).

    opened by JKalnins 3
  • Colourmap depending upon a third variable or passing a list of colors

    I created the joyplot below.

    [image]

    However, for every one of the 20 teams on the y-axis, I want to colour them according to a Pandas dataframe column which contains a total value for every team. I was wondering if the colourmap could be customised to colour the plots according to a third data variable - either a Pandas Series, numpy array, or something similar. So for my dataframe the Total would look like this:

       Team     Total
    Angers      356
    Monaco      452
    PSG         501
    Lyon        369
    

    Here's what works as expected:

    fig, axes = joypy.joyplot(df, by="Team", column="Minute", figsize =(10,16), x_range = [0,94], linewidth = 1,
    legend=False, colormap = color ,  title= "When do teams in Ligue 1 take their shots?")
    

    Here are a few solutions I tried and failed at. Firstly, I created a colormap for the entire series depending on the Total data using this -

    norm = plt.Normalize(group_df["Total"].min(), group_df["Total"].max())
    cmap = plt.cm.viridis
    sm = matplotlib.cm.ScalarMappable(cmap=cmap, norm=norm)
    ar = np.array(group_df["Total"])
    Cm = cmap(norm(ar))
    sm.set_array([])
    

    1 - Passing the customised colourmap in the function

    fig, axes = joypy.joyplot(df, by="Team", column="Minute", figsize =(10,16), x_range = [0,94], linewidth = 1,
    legend=False, colormap = Cm ,  title= "When do teams in Ligue 1 take their shots?")
    

    2 - Passing the array of RGBA tuples to the colour

    color = np.array(tuple(map(tuple,Cm)))
    fig, axes = joypy.joyplot(df, by="Team", column="Minute", figsize =(10,16), x_range = [0,94], linewidth = 1,
    legend=False, color = color ,  title= "When do teams in Ligue 1 take their shots?")
    

    3 - Using the axes handles to set_facecolor - I saw this here

    for col, ax in zip(Cm, axes):
            ax.patch.set_facecolor(col)
    

    1 and 2 threw up an invalid RGBA argument and an AssertionError respectively. 3 raised no error, but the change was not reflected in the resulting plot. All I could gather is that there is currently no way to pass a list of colors (with a size matching the grouped y-axis) or a customised colormap. If that's true, is there a work-around I could implement? Or could a new feature be added?

    opened by sharmaabhishekk 3
  • Getting y values of each subplot

    Is there a way to get the y values of each subplot? I would like to plot several vertical lines corresponding to different quantities (for example, mean) and would like the line to go from the bottom of the plot up to the "top edge" of the plot (i.e. the y value).

    For example, in the plot below I wanted to plot the mean, but the vertical line overshoots the plot (because I plotted until y=1).

    [image]

    opened by IreneCrisologo 3
  • Titles in X and Y axis

    Hi, I have a question about Y labels and believe this functionality exists, but I cannot get it right. E.g.: I can put Y axis labels on every subplot something like this:

    fig, axes = joyplot(...)
    for ax in axes:
        ax.set_ylabel("mylab")
    

    This creates a label for every density plot. Can I somehow add a "global" Y axis label instead of a Y axis label for every subplot? I suppose I can do that by providing an ax argument to joyplot, but it does not seem to work.

    Thanks for your help, Simon

    opened by dirmeier 3
  • Can't get colormap to work properly

    I'm trying to make a joyplot and when I set colormap=cm.autumn_r , my plot only uses the first color in the colormap. I wanted to use cm.tab20c and I eventually got it working by hacking the _get_color function to return colormap(i % len(colormap.colors)) . I don't think this is the right general solution but I'm not sure what to fix before submitting a pull request. Any ideas what might be going wrong? I think there might also be some weird interaction between gradient and named-color colormaps too.

    Also, this package is amazing! Thank you so much for putting it together.

    opened by davidjurgens 3
  • Entering the name of a non-numeric column raises "min() arg is an empty sequence"

    If you enter the name of a column which corresponds to non-numeric data, the error message given is:

    ~/miniconda3/lib/python3.6/site-packages/joypy/joyplot.py in _joyplot(data, grid, labels, sublabels, xlabels, xlabelsize, xrot, ylabelsize, yrot, ax, figsize, hist, bins, fade, xlim, ylim, fill, linecolor, overlap, background, range_style, x_range, tails, title, legend, loc, colormap, color, **kwargs)
        352     else:
        353         global_x_range = _x_range(x_range, 0.0)
    --> 354     global_x_min, global_x_max = min(global_x_range), max(global_x_range)
        355 
        356     # Each plot will have its own axis
    
    ValueError: min() arg is an empty sequence
    

    This was encountered when trying to make plots of datetime values. The error message does not make it clear that the problem is datetime not being a numeric type. Perhaps in _grouped_df_to_standard, when you check if the column is numeric, you can raise a warning or error if no numeric columns are returned.

    opened by jeffreyliu 3
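As an illustration of the kind of early check suggested in this issue, here is a minimal pure-Python sketch. `require_numeric` is a hypothetical helper, not part of JoyPy; with pandas data one would more likely inspect dtypes (for instance with `pandas.api.types.is_numeric_dtype`) than test individual values.

```python
import datetime
import numbers


def require_numeric(values, name):
    """Raise a descriptive error unless `values` holds at least one real number."""
    numeric = [v for v in values
               if isinstance(v, numbers.Real) and not isinstance(v, bool)]
    if not numeric:
        # Report which types were found, so the cause is visible in the message
        kinds = sorted({type(v).__name__ for v in values}) or ["no values"]
        raise ValueError(
            f"column {name!r} has no numeric values (found: {', '.join(kinds)}); "
            "a joyplot needs at least one numeric column/group"
        )
    return numeric


# A column of dates fails early, with a message that names the offending type
try:
    require_numeric([datetime.date(2017, 11, 24)], "when")
except ValueError as err:
    message = str(err)
```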
  • Github Tags/Releases?

    Hi,

    The latest tag on GitHub here is still 0.2.2, whilst the version has progressed to 0.2.5. I maintain this package in Debian, and we take the source package from GitHub (not PyPI, for several reasons), so it'd be very helpful if you could push the missing tag(s) (at least the latest one).

    Could you please push the relevant tag?

    =============================================================================================

    CC: @leotac

    opened by nileshpatra 2
  • Fix ValueError when trying to plot with `hist=True` and color as list of colors.

    Hi, I tried to plot a table with joypy with hist=True and "color" being a list of colors and got an error from matplotlib (v3.6.0).

    Error message:

    "ValueError: The 'color' keyword argument must have one color per dataset, but 1 datasets and 2 colors were provided"

    How to reproduce:

    import numpy as np
    import pandas as pd
    import joypy
    
    table = pd.DataFrame({'one': np.random.randn(1000),
                          'two': np.random.randn(1000) + 5})
    
    joypy.joyplot(table, color=['skyblue', 'fuchsia'], hist=True)
    
    opened by abrazhe 0
  • Passing axis parameter into new joyplot function call?

    Hi!

    I'm using joyplot with raw data (loving it so far). Attached are two figures I've produced. I would like to pass the axis generated from the first one as an argument into the second, in order to plot them on the same figure. However, I get the error below. Is there an easy work-around, without putting everything into dataframes?

    [three screenshots]

    opened by claysmyth 0
  • Is it possible to inverse the plot?

    Is it possible to inverse the plot? E.g.: [image]

    The plot above is grouped by the Target field, with every predictor on the same line; if there are many features, the plot looks too cluttered.

    Is it possible to split the plot by predictor, with each line showing the target distribution, like ggridges in R?

    ggplot(gather.data.train,aes(y = variable, fill = Target, x = value, alpha=0.4)) +  geom_density_ridges(scale = 5)
    

    [image]

    Thanks

    opened by hanzigs 0
  • Joypy and gridspec

    I'd like to add a plot, side-by-side the joyplot, showing all of the distributions on one plot. I am successful at doing that, but something is weird with the xticks. I've been looking through the code at how the xticks are set and I don't see why what is in your package would be wrong.

    The problem can be seen in the figure below. It looks like the xticks are being set relative to the figure size and not the last axes instance. Would you happen to know what's going on here? Because I sure don't. Any thoughts would be appreciated.

    The figure: [image]

    The code:

    from scipy.stats import gaussian_kde
    import numpy as np
    import pandas as pd
    from sklearn import datasets
    
    import joypy
    import matplotlib.gridspec as gridspec
    import matplotlib.pyplot as plt
    from matplotlib import cm
    from matplotlib.colors import rgb2hex
    
    iris = datasets.load_iris()
    df=pd.DataFrame(iris.data,columns=iris.feature_names)
    
    fig = plt.figure(dpi=300)
    gs = fig.add_gridspec(nrows=4, ncols=2)
    
    ax1 = fig.add_subplot(gs[0, 0])
    ax2 = fig.add_subplot(gs[1, 0])
    ax3 = fig.add_subplot(gs[2, 0])
    ax4 = fig.add_subplot(gs[3, 0])
    
    ax5 = fig.add_subplot(gs[:, 1])
    
    axes = [ax1, ax2, ax3, ax4]
    joypy.joyplot(df, ax=axes, colormap=cm.autumn_r, linewidth=0.5)
    
    limits = ax1.get_xlim()
    x = np.arange(limits[0], limits[1], 0.01)
    ncols = len(df.columns)
    
    for i, (col, vals) in enumerate(df.items()):
        gkde = gaussian_kde(vals, bw_method=None)
        y = gkde.evaluate(x)
        ax5.plot(x, y, lw=0.5, color='k')
        ax5.fill_between(x, y, color=rgb2hex(cm.autumn_r(i / ncols)))
    
    opened by K20shores 2
Owner
Leonardo Taccari