POT : Python Optimal Transport

Overview

This open source Python library provides several solvers for optimization problems related to Optimal Transport for signal processing, image processing and machine learning.

Website and documentation: https://PythonOT.github.io/

Source Code (MIT): https://github.com/PythonOT/POT

POT provides the following generic OT solvers (links to examples):

POT provides the following Machine Learning related solvers:

Some other examples are available in the documentation.

Using and citing the toolbox

If you use this toolbox in your research and find it useful, please cite POT using the following reference from our JMLR paper:

Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, Titouan Vayer,
POT Python Optimal Transport library,
Journal of Machine Learning Research, 22(78):1−8, 2021.
Website: https://pythonot.github.io/

In Bibtex format:

@article{flamary2021pot,
  author  = {R{\'e}mi Flamary and Nicolas Courty and Alexandre Gramfort and Mokhtar Z. Alaya and Aur{\'e}lie Boisbunon and Stanislas Chambon and Laetitia Chapel and Adrien Corenflos and Kilian Fatras and Nemo Fournier and L{\'e}o Gautheron and Nathalie T.H. Gayraud and Hicham Janati and Alain Rakotomamonjy and Ievgen Redko and Antoine Rolet and Antony Schutz and Vivien Seguy and Danica J. Sutherland and Romain Tavenard and Alexander Tong and Titouan Vayer},
  title   = {POT: Python Optimal Transport},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {78},
  pages   = {1-8},
  url     = {http://jmlr.org/papers/v22/20-451.html}
}

Installation

The library has been tested on Linux, MacOSX and Windows. It requires a C++ compiler for building/installing the EMD solver and relies on the following Python modules:

  • Numpy (>=1.16)
  • Scipy (>=1.0)
  • Cython (>=0.23)
  • Matplotlib (>=1.5)

Pip installation

Note that due to a limitation of pip, cython and numpy need to be installed prior to installing POT. This can be done easily with

pip install numpy cython

You can install the toolbox through PyPI with:

pip install POT

or get the very latest version by running:

pip install -U https://github.com/PythonOT/POT/archive/master.zip # with --user for user install (no root)

Anaconda installation with conda-forge

If you use the Anaconda python distribution, POT is available in conda-forge. To install it and the required dependencies:

conda install -c conda-forge pot

Post installation check

After a correct installation, you should be able to import the module without errors:

import ot

Note that, for easier access, the module is named ot instead of pot.
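
The same check can be scripted without importing the module. The sketch below uses only the standard library; the helper name pot_install_status is hypothetical and not part of POT, and the version lookup assumes the distribution is registered under the name POT.

```python
import importlib.util
from importlib import metadata


def pot_install_status():
    """Return (importable, version) for POT without importing it."""
    # The importable module is named `ot`; the PyPI distribution is `POT`.
    importable = importlib.util.find_spec("ot") is not None
    try:
        version = metadata.version("POT")
    except metadata.PackageNotFoundError:
        # Distribution metadata not found under that name.
        version = None
    return importable, version


importable, version = pot_install_status()
print("ot importable:", importable, "| POT version:", version)
```

If importable is False, the installation steps above did not complete in the active environment.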

Dependencies

Some sub-modules require additional dependencies, which are discussed below:

  • ot.dr (Wasserstein dimensionality reduction) depends on autograd and pymanopt, which can be installed with:
pip install pymanopt autograd
  • ot.gpu (GPU accelerated OT) depends on cupy, which has to be installed following the instructions on this page. You will also need CUDA installed and a compatible GPU.

Examples

Short examples

  • Import the toolbox
import ot
  • Compute Wasserstein distances
# a,b are 1D histograms (sum to 1 and positive)
# M is the ground cost matrix
Wd = ot.emd2(a, b, M) # exact linear program
Wd_reg = ot.sinkhorn2(a, b, M, reg) # entropic regularized OT
# if b is a matrix compute all distances to a and return a vector
  • Compute OT matrix
# a,b are 1D histograms (sum to 1 and positive)
# M is the ground cost matrix
T = ot.emd(a, b, M) # exact linear program
T_reg = ot.sinkhorn(a, b, M, reg) # entropic regularized OT
  • Compute Wasserstein barycenter
# A is an n*d matrix containing d 1D histograms
# M is the ground cost matrix
ba = ot.barycenter(A, M, reg) # reg is regularization parameter
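
To make the inputs a, b, M and reg above concrete, here is a minimal NumPy sketch of entropic regularized OT using plain Sinkhorn iterations. It is an illustration under stated assumptions, not POT's implementation: the function name sinkhorn_sketch, the toy data and the stopping rule are all hypothetical.

```python
import numpy as np


def sinkhorn_sketch(a, b, M, reg, n_iter=1000, tol=1e-9):
    """Plain Sinkhorn iterations (illustrative only, not POT's code)."""
    K = np.exp(-M / reg)                      # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                     # scale columns toward b
        u = a / (K @ v)                       # scale rows to match a
        T = u[:, None] * K * v[None, :]       # current transport plan
        if np.abs(T.sum(axis=0) - b).max() < tol:
            break
    return T


rng = np.random.RandomState(0)
xs = rng.randn(5, 2)                          # source samples
xt = rng.randn(4, 2) + 1.0                    # target samples
a = np.full(5, 1 / 5)                         # histograms: positive, sum to 1
b = np.full(4, 1 / 4)
M = ((xs[:, None, :] - xt[None, :, :]) ** 2).sum(axis=2)  # squared Euclidean cost
M /= M.max()                                  # keep exp(-M / reg) well scaled
T = sinkhorn_sketch(a, b, M, reg=0.5)
cost = float((T * M).sum())                   # transport cost <T, M>
print(T.shape, cost)
```

The rows of T sum to a and the columns to b (up to the tolerance), which is the constraint every OT matrix returned by the solvers above is expected to satisfy.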

Examples and Notebooks

The examples folder contains several examples and use cases for the library. The full documentation with examples and output is available at https://PythonOT.github.io/.

Acknowledgements

This toolbox has been created and is maintained by

The contributors to this library are

This toolbox benefits a lot from open source research and we would like to thank the following people for providing some code (in various languages):

Contributions and code of conduct

Every contribution is welcome and should respect the contribution guidelines. Each member of the project is expected to follow the code of conduct.

Support

You can ask questions and join the development discussion:

You can also post bug reports and feature requests in Github issues. Make sure to read our guidelines first.

References

[1] Bonneel, N., Van De Panne, M., Paris, S., & Heidrich, W. (2011, December). Displacement interpolation using Lagrangian mass transport. In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.

[2] Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems (pp. 2292-2300).

[3] Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.

[4] S. Nakhostin, N. Courty, R. Flamary, D. Tuia, T. Corpetti, Supervised planetary unmixing with optimal transport, Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2016.

[5] N. Courty; R. Flamary; D. Tuia; A. Rakotomamonjy, Optimal Transport for Domain Adaptation, in IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.PP, no.99, pp.1-1

[6] Ferradans, S., Papadakis, N., Peyré, G., & Aujol, J. F. (2014). Regularized discrete optimal transport. SIAM Journal on Imaging Sciences, 7(3), 1853-1882.

[7] Rakotomamonjy, A., Flamary, R., & Courty, N. (2015). Generalized conditional gradient: analysis of convergence and applications. arXiv preprint arXiv:1510.06567.

[8] M. Perrot, N. Courty, R. Flamary, A. Habrard (2016), Mapping estimation for discrete optimal transport, Neural Information Processing Systems (NIPS).

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.

[11] Flamary, R., Cuturi, M., Courty, N., & Rakotomamonjy, A. (2016). Wasserstein Discriminant Analysis. arXiv preprint arXiv:1608.08063.

[12] Gabriel Peyré, Marco Cuturi, and Justin Solomon (2016), Gromov-Wasserstein averaging of kernel and distance matrices International Conference on Machine Learning (ICML).

[13] Mémoli, Facundo (2011). Gromov–Wasserstein distances and the metric approach to object matching. Foundations of computational mathematics 11.4 : 417-487.

[14] Knott, M. and Smith, C. S. (1984). On the optimal mapping of distributions, Journal of Optimization Theory and Applications Vol 43.

[15] Peyré, G., & Cuturi, M. (2018). Computational Optimal Transport.

[16] Agueh, M., & Carlier, G. (2011). Barycenters in the Wasserstein space. SIAM Journal on Mathematical Analysis, 43(2), 904-924.

[17] Blondel, M., Seguy, V., & Rolet, A. (2018). Smooth and Sparse Optimal Transport. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics (AISTATS).

[18] Genevay, A., Cuturi, M., Peyré, G. & Bach, F. (2016) Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).

[19] Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

[20] Cuturi, M. and Doucet, A. (2014) Fast Computation of Wasserstein Barycenters. International Conference in Machine Learning

[21] Solomon, J., De Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A. & Guibas, L. (2015). Convolutional wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (TOG), 34(4), 66.

[22] J. Altschuler, J.Weed, P. Rigollet, (2017) Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, Advances in Neural Information Processing Systems (NIPS) 31

[23] Aude, G., Peyré, G., Cuturi, M., Learning Generative Models with Sinkhorn Divergences, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, (AISTATS) 21, 2018

[24] Vayer, T., Chapel, L., Flamary, R., Tavenard, R. and Courty, N. (2019). Optimal Transport for structured data with application on graphs Proceedings of the 36th International Conference on Machine Learning (ICML).

[25] Frogner C., Zhang C., Mobahi H., Araya-Polo M., Poggio T. (2015). Learning with a Wasserstein Loss Advances in Neural Information Processing Systems (NIPS).

[26] Alaya M. Z., Bérar M., Gasso G., Rakotomamonjy A. (2019). Screening Sinkhorn Algorithm for Regularized Optimal Transport, Advances in Neural Information Processing Systems 33 (NeurIPS).

[27] Redko I., Courty N., Flamary R., Tuia D. (2019). Optimal Transport for Multi-source Domain Adaptation under Target Shift, Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (AISTATS) 22, 2019.

[28] Caffarelli, L. A., McCann, R. J. (2010). Free boundaries in optimal transport and Monge-Ampere obstacle problems, Annals of mathematics, 673-730.

[29] Chapel, L., Alaya, M., Gasso, G. (2020). Partial Optimal Transport with Applications on Positive-Unlabeled Learning, Advances in Neural Information Processing Systems (NeurIPS), 2020.

[30] Flamary R., Courty N., Tuia D., Rakotomamonjy A. (2014). Optimal transport with Laplacian regularization: Applications to domain adaptation and shape matching, NIPS Workshop on Optimal Transport and Machine Learning OTML, 2014.

[31] Bonneel, Nicolas, et al. Sliced and radon wasserstein barycenters of measures, Journal of Mathematical Imaging and Vision 51.1 (2015): 22-45

Issues
  •  [WIP] small tentative for EMD 1D in torch

    I saw the torch branch for LP stuff. Would you be interested in my implementation for the 1d EMD (and the sliced wasserstein with it)?

    I'm not a huge fan of Pytorch so I can't vouch that what I'm doing here is the best implementation, but it feels to me like it should be fairly ok for batched inputs which is what you want for slice stuff anyway.

    opened by AdrienCorenflos 47
  • GPU changes:

    • Replace cudamat by cupy for GPU implementations (cupy is still in active development, while cudamat is not)
    • Use the new DA class instead of the old deprecated one

    TODO for another PR:

    • Performances are still a bit lower than with cudamat (even if better than CPU for large matrices). Some speedups should be possible by tweaking the code
    opened by toto6 38
  • Domain adaptation Classes

    • first proposal of DA class structure
    • BaseEstimator: OTDA wrapper (does not work as a stand-alone but implements the methods common to any OTDA algorithm)
    • SinkhornTransport: implements Sinkhorn algorithm for OTDA
    • try doc strings compliant with numpy requirements
    opened by Slasnista 29
  • Domain adaptation Classes

    We should change the domain adaptation Classes to be more sklearn compliant.

    Main issues:

    • Use CamelCase for classes
    • Use __init__ for setting parameters instead of fit.

    @agramfort proposed to create new classes with proper names and begin deprecating the old classes.

    I think it is a good move.
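
A minimal sketch of the convention being proposed, assuming scikit-learn style: a CamelCase class whose hyperparameters are set in __init__ and whose fit only consumes data. The class SketchTransport and its attributes are hypothetical and are not POT classes.

```python
class SketchTransport:
    """Hypothetical estimator illustrating the convention (not a POT class)."""

    def __init__(self, reg=1.0, max_iter=1000):
        # Hyperparameters are stored unchanged in __init__.
        self.reg = reg
        self.max_iter = max_iter

    def get_params(self):
        """Return the hyperparameters as a dict, as sklearn estimators do."""
        return {"reg": self.reg, "max_iter": self.max_iter}

    def fit(self, Xs, Xt):
        """Consume data only; learned attributes get a trailing underscore."""
        self.n_source_ = len(Xs)
        self.n_target_ = len(Xt)
        return self  # returning self allows call chaining


est = SketchTransport(reg=0.5).fit([[0.0], [1.0]], [[2.0]])
print(est.get_params(), est.n_source_, est.n_target_)
```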

    enhancement 
    opened by rflamary 28
  • Not in simplex -- two sets of largely different sizes

    I am trying to calculate the EMD of two sets. When one set has a few hundred entries and the other has only 2, the EMD calculation fails and returns Problem Infeasible.

    Steps to reproduce the behavior: ** SEE BELOW COMMENT FOR FIXED SCRIPT **

    Expected behavior: should return an EMD around 1; instead it says that the sets spherEng1 and pencilEnergy are not in the simplex.
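
For context, "in the simplex" means the weight vector is non-negative and sums to 1. The sketch below only illustrates that condition for two sets of very different sizes; the helper in_simplex and the tolerance are hypothetical, and the actual cause of the failure reported here is not established by this snippet.

```python
import numpy as np


def in_simplex(w, tol=1e-9):
    """True if w is non-negative and sums to 1 within tol (assumed tolerance)."""
    w = np.asarray(w, dtype=float)
    return bool(np.all(w >= 0) and abs(w.sum() - 1.0) <= tol)


n_large, n_small = 300, 2
a = np.full(n_large, 1.0 / n_large)   # uniform weights on a few hundred entries
b = np.full(n_small, 1.0 / n_small)   # uniform weights on two entries
raw = np.array([3.0, 1.0])            # unnormalized weights, e.g. raw energies
b_from_raw = raw / raw.sum()          # normalize before passing to a solver

print(in_simplex(a), in_simplex(b), in_simplex(raw), in_simplex(b_from_raw))
```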

    Screenshots: here is a comparison of the EMDs calculated from the least densely tiled to the most densely tiled (number of particles = number of segments) with the two-element set.

    Desktop (please complete the following information):

    • OS: [MacOSX]
    • Python version [3.6]
    • POT installed with pip

    import platform; print(platform.platform())
    Darwin-16.7.0-x86_64-i386-64bit
    import sys; print("Python", sys.version)
    ('Python', '2.7.15 |Anaconda, Inc.| (default, Dec 14 2018, 13:10:39) \n[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]')
    import numpy; print("NumPy", numpy.__version__)
    ('NumPy', '1.15.4')
    import scipy; print("SciPy", scipy.__version__)
    ('SciPy', '1.1.0')
    import ot; print("POT", ot.__version__)
    ('POT', '0.5.1')

    opened by caricesarotti 25
  • [MRG] Sliced wasserstein

    Types of changes

    • [ ] Docs change / refactoring / dependency upgrade
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [X] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Motivation and context / Related issue

    Implement SWD: https://github.com/PythonOT/POT/issues/202

    How has this been tested (if it applies)

    Added specific tests (positive definiteness + matching the EMD in the 1D case)
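
The property "matching the EMD in the 1D case" can be sketched in NumPy as follows. This is not the code from the PR: the function names are hypothetical, both samples are assumed to have uniform weights and equal size, and the functions return W_p^p rather than its p-th root.

```python
import numpy as np


def wasserstein_1d_sorted(x, y, p=2):
    """W_p^p between two equal-size uniform 1D samples, via sorting."""
    x, y = np.sort(x), np.sort(y)
    return float(np.mean(np.abs(x - y) ** p))


def sliced_wasserstein_sketch(X, Y, n_projections=50, p=2, seed=0):
    """Monte Carlo average of 1D W_p^p over random unit directions."""
    rng = np.random.RandomState(seed)
    dirs = rng.randn(n_projections, X.shape[1])
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit vectors
    vals = [wasserstein_1d_sorted(X @ t, Y @ t, p) for t in dirs]
    return float(np.mean(vals))


rng = np.random.RandomState(1)
x = rng.randn(20)
y = rng.randn(20) + 2.0

exact_1d = wasserstein_1d_sorted(x, y)
# In 1D every unit direction is +1 or -1, so each slice equals the exact value.
sliced_1d = sliced_wasserstein_sketch(x[:, None], y[:, None])
print(exact_1d, sliced_1d)
```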

    Checklist

    • [X] The documentation is up-to-date with the changes I made.
    • [X] I have read the CONTRIBUTING document.
    • [x] All tests passed, and additional code has been covered with new tests.

    Not sure why yet but the stuff doesn't build.

    I'm publishing this as a draft as I have some other changes in my branch that are pending for another merge (cf this: https://github.com/PythonOT/POT/issues/200)

    opened by AdrienCorenflos 19
  • [WIP] torch implementation of the Sliced Wasserstein Distance

    Types of changes

    • [ ] Docs change / refactoring / dependency upgrade
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Motivation and context / Related issue

    Torch implementation of the SWD (or sliced OT loss? how do you want to call it?)

    https://github.com/PythonOT/POT/issues/225

    How has this been tested (if it applies)

    Added a few unittests, needs to be tested further (WIP)

    Checklist

    • [ ] The documentation is up-to-date with the changes I made.
    • [x] I have read the CONTRIBUTING document.
    • [ ] All tests passed, and additional code has been covered with new tests.
    opened by AdrienCorenflos 16
  • fail when using "pip install POT"

    When I ran "pip install POT", it failed. The package depends on Cython, but it does not seem to declare that dependency to pip.

    I worked around this by installing Cython first. However, if both Cython and POT are listed in requirements.txt, the installation still fails.

    Could anyone solve that?

    documentation 
    opened by Adoni 15
  • [MRG] Improved docs and changed scipy version

    I changed the scipy version requirements since version scipy 1.2.1 made my POT crash (cannot remember on which call, sorry, it happened while building the docs) and the issue was fixed when upgrading to scipy 1.3

    Apart from that, the main goal of this PR is to homogenize a bit the presentation in the docs.

    opened by rtavenar 15
  • [MRG] Add Unbalanced KL Wasserstein distance + barycenter

    new unbalanced module for UOT with KL relaxation with the funcs:

    • sinkhorn_unbalanced: generalized Sinkhorn to compute W

    • barycenter_unbalanced: unbalanced Wasserstein barycenter

    • Tests of convergence for both algorithms

    • Examples plot_UOT_1D and plot_UOT_barycenter_1D with unbalanced gaussian distributions.

    new feature 
    opened by hichamjanati 14
  •  pot installation fails with numpy < 1.20

    Describe the bug

    import ot fails after pip install pot with numpy<=1.19.5

    
    ---> 22 from .emd_wrap import emd_c, check_result, emd_1d_sorted
         23 from .solver_1d import emd_1d, emd2_1d, wasserstein_1d
         24 
    
    ot/lp/emd_wrap.pyx in init ot.lp.emd_wrap()
    
    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
    

    Fixes

    Either we re-do the wheels with earlier versions of numpy or we require numpy >= 1.20. Since 1.20 however, numpy has dropped support of Python 3.6.
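
The second option amounts to a version gate. A minimal standard-library sketch of such a check follows; the helper names are hypothetical and the (1, 20) threshold is simply the value suggested in this report, not a verified requirement.

```python
import re


def major_minor(version):
    """Extract (major, minor) as integers from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def numpy_is_compatible(numpy_version, minimum=(1, 20)):
    """True if numpy_version is at least `minimum` (assumed threshold)."""
    return major_minor(numpy_version) >= minimum


for candidate in ("1.19.5", "1.20.0", "1.21.2"):
    print(candidate, numpy_is_compatible(candidate))
```

In practice the string would come from numpy.__version__ in the environment where the wheel is installed.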

    bug help wanted 
    opened by hichamjanati 2
  • License terms for files in data subdirectory unclear

    The only licensing terms given for POT is the MIT license in LICENSE. It seems unclear to me whether this license applies to the data files in the data directory. This should be made explicit.

    Notes seem to indicate that this file comes from a Google search, casting the licensing situation into doubt.

    bug help wanted 
    opened by gspr 0
  • POT doesn't build with poetry and Python 3.9

    Describe the bug

    poetry add POT fails at build time due to missing build dependency "Cython"

    I reported the bug to the poetry developers who diagnosed the issue: https://github.com/python-poetry/poetry/issues/4543#issuecomment-925158474

    these packages doesn't provide a wheel in all cases. So it's necessary to build them from the sdist files. The build step happens in a temporary build environment. That's why you don't have success by installing the build-dependency manually. All these package doesn't provide a pyproject.toml containing build-system requirements according to PEP 518. Please report it to the maintainer of the projects.

    It looks like POT provides wheel files for Python < 3.9, so there is no need to build the package for those versions.

    Would it be possible to specify the build-time requirements? Or alternatively wheels for Python 3.9?

    Many thanks

    bug help wanted 
    opened by mbahri 1
  • Implement set_gradient for JAX backend

    One can implement it easily by using compensating sums.

    opened by AdrienCorenflos 0
  • Build warnings on 0.7.0

    Describe the bug

    When building POT using Spack I get a bunch of warnings:

    ot/lp/EMD_wrapper.cpp: In function 'int EMD_wrap(int, int, double*, double*, double*, double*, double*, double*, double*, int)':
    ot/lp/EMD_wrapper.cpp:21:16: warning: unused variable 'i' [-Wunused-variable]
       21 |      int n, m, i, cur;
          |                ^
    In file included from ot/lp/full_bipartitegraph.h:29,
                     from ot/lp/network_simplex_simple.h:64,
                     from ot/lp/EMD.h:21,
                     from ot/lp/EMD_wrapper.cpp:15:
    ot/lp/core.h:85:25: warning: typedef 'Node' locally defined but not used [-Wunused-local-typedefs]
       85 |   typedef Digraph::Node Node;                                           \
          |                         ^~~~
    ot/lp/EMD_wrapper.cpp:24:5: note: in expansion of macro 'DIGRAPH_TYPEDEFS'
       24 |     DIGRAPH_TYPEDEFS(FullBipartiteDigraph);
          |     ^~~~~~~~~~~~~~~~
    In file included from ot/lp/EMD.h:21,
                     from ot/lp/EMD_wrapper.cpp:15:
    ot/lp/network_simplex_simple.h: In instantiation of 'lemon::NetworkSimplexSimple<GR, V, C, NodesType>::NetworkSimplexSimple(const GR&, bool, int, long long int, int) [with GR = lemon::FullBipartiteDigraph; V = double; C = double; NodesType = unsigned int]':
    ot/lp/EMD_wrapper.cpp:51:94:   required from here
    ot/lp/network_simplex_simple.h:519:19: warning: 'lemon::NetworkSimplexSimple<lemon::FullBipartiteDigraph, double, double, unsigned int>::_init_nb_arcs' will be initialized after [-Wreorder]
      519 |         long long _init_nb_arcs;
          |                   ^~~~~~~~~~~~~
    In file included from ot/lp/EMD.h:21,
                     from ot/lp/EMD_wrapper.cpp:15:
    ot/lp/network_simplex_simple.h:360:21: warning:   'const Value lemon::NetworkSimplexSimple<lemon::FullBipartiteDigraph, double, double, unsigned int>::MAX' [-Wreorder]
      360 |         const Value MAX;
          |                     ^~~
    ot/lp/network_simplex_simple.h:231:9: warning:   when initialized here [-Wreorder]
      231 |         NetworkSimplexSimple(const GR& graph, bool arc_mixing, int nbnodes, long long nb_arcs,int maxiters) :
          |         ^~~~~~~~~~~~~~~~~~~~
    In file included from /spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/py-numpy-1.21.2-jiksbcztk2f7z5squ72oqa5vxroxz662/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1969,
                     from /spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/py-numpy-1.21.2-jiksbcztk2f7z5squ72oqa5vxroxz662/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                     from /spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/py-numpy-1.21.2-jiksbcztk2f7z5squ72oqa5vxroxz662/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                     from ot/lp/emd_wrap.cpp:664:
    /spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/py-numpy-1.21.2-jiksbcztk2f7z5squ72oqa5vxroxz662/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
       17 | #warning "Using deprecated NumPy API, disable it with " \
          |  ^~~~~~~
    /spack/lib/spack/env/gcc/g++ -shared build/temp.linux-x86_64-3.8/ot/lp/emd_wrap.o build/temp.linux-x86_64-3.8/ot/lp/EMD_wrapper.o -L/spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/python-3.8.11-4li4ogjlwogc5ng2osdiwmytichl2sdl/lib -o build/lib.linux-x86_64-3.8/ot/lp/emd_wrap.cpython-38-x86_64-linux-gnu.so
    /spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/py-cython-0.29.24-dewyxvex7bz2xr2lc2rfq37fb4idyftn/lib/python3.8/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/harmen/spack-stage/spack-stage-py-pot-0.7.0-r4n3aswykr2uhg3fn4iw34lahuewarad/spack-src/ot/lp/emd_wrap.pyx
      tree = Parsing.p_module(s, pxd, full_module_name)
    /spack/opt/spack/linux-ubuntu20.04-zen2/gcc-10.3.0/py-setuptools-57.4.0-folza24teugwuwimv3z7oxkft6afl4ng/lib/python3.8/site-packages/setuptools/dist.py:697: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
      warnings.warn(
    

    Environment

    Screenshot from 2021-08-31 13-00-06

    bug help wanted 
    opened by haampie 2
  • Linear OT mapping across different spaces

    🚀 Feature

    Would it be possible to perform linear domain adaptation (like is done by ot.da.LinearTransport) but across different spaces ?

    Motivation

    The LinearTransport is very useful to align distributions without losing information from the original geometry. It would be great to have the same feature but for distributions in different spaces / in the Gromov-Wasserstein context

    enhancement 
    opened by j-bac 0
  • nan values and division by zero warnings on stochastic solvers

    Description

    The following output showed up when playing with the codes on the docs of the stochastic sub-module. I think the output of the following code snippet is clear enough to show where the problem is but I didn't want to put a PR request directly since I am really new to these topics.

    To Reproduce

    Below are the same code samples as in the docs, except that n_source is significantly higher.

    import numpy as np
    import ot
    n_source = 70000
    n_target = 100
    reg = 1
    numItermax = 100000
    lr = 0.1
    batch_size = 3
    log = True
    
    a = ot.utils.unif(n_source)
    b = ot.utils.unif(n_target)
    
    rng = np.random.RandomState(0)
    X_source = rng.randn(n_source, 2)
    Y_target = rng.randn(n_target, 2)
    M = ot.dist(X_source, Y_target)
    
    method = "ASGD"
    asgd_pi, log_asgd = ot.stochastic.solve_semi_dual_entropic(a, b, M, reg, method, numItermax, log=log)
    print(log_asgd['alpha'], log_asgd['beta'])
    print(asgd_pi)
    
    /home/selman/anaconda3/envs/sd/lib/python3.9/site-packages/ot/stochastic.py:85: RuntimeWarning: overflow encountered in exp
      exp_beta = np.exp(-r / reg) * b
    /home/selman/anaconda3/envs/sd/lib/python3.9/site-packages/ot/stochastic.py:86: RuntimeWarning: invalid value encountered in true_divide
      khi = exp_beta / (np.sum(exp_beta))
    [nan nan nan ... nan nan nan] [nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
     nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
     nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
     nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
     nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
     nan nan nan nan nan nan nan nan nan nan]
    [[nan nan nan ... nan nan nan]
     [nan nan nan ... nan nan nan]
     [nan nan nan ... nan nan nan]
     ...
     [nan nan nan ... nan nan nan]
     [nan nan nan ... nan nan nan]
     [nan nan nan ... nan nan nan]]
    

    Additional Context

    Right now I use the stochastic.py file separately in my own project because of this problem. I added a small value to the divisions and it seems to work fine, but I am not sure if it is an appropriate approach. For example:

    khi = exp_beta / (np.sum(exp_beta) + 1e-8)
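
An alternative to adding a small constant is to shift the exponent before calling exp, since the RuntimeWarning above is raised on np.exp(-r / reg). The sketch below reuses the names r, b, reg and khi from the traceback, but the function stable_khi is hypothetical and is not a patch to POT. It assumes b is strictly positive, and it has not been checked against the rest of the solver.

```python
import numpy as np


def stable_khi(r, b, reg):
    """Compute exp(-r / reg) * b, normalized to sum to 1, without overflow."""
    z = -np.asarray(r, dtype=float) / reg
    z = z - z.max()              # largest exponent becomes 0, so exp(z) <= 1
    w = np.exp(z) * b
    return w / w.sum()           # the shift cancels in the ratio


r = np.array([-2000.0, -1990.0, 5.0])   # exp(2000) would overflow in float64
b = np.full(3, 1.0 / 3)
khi = stable_khi(r, b, reg=1.0)
print(khi, khi.sum())
```

Because the same factor is removed from numerator and denominator, the result matches the original formula whenever that formula does not overflow.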
    

    Environment:

    Linux-5.10.42-1-MANJARO-x86_64-with-glibc2.33
    Python 3.9.5 (default, Jun  4 2021, 12:28:51) 
    [GCC 7.5.0]
    NumPy 1.20.2
    SciPy 1.6.2
    POT 0.7.0
    
    bug help wanted 
    opened by syelman 3
  • dim>1 for ot.bregman.barycenter

    🚀 Feature

    Extend ot.bregman.barycenter to higher dimensions (>1D data)

    Motivation

    Computing barycenters for color images requires at least 3 dimensions (rgb or lab).

    Pitch

    Recent papers have suggested this for data augmentation

    enhancement 
    opened by gabrieldernbach 3
  • Sinkhorn Divergence code does not match referenced paper

    Describe the bug

    I'm opening this as a bug, but the code actually does work. The issue is the definition used for the Sinkhorn Divergence. Basically, the Sinkhorn Divergence code does not match the formula in the referenced paper ("Learning Generative Models with Sinkhorn Divergences", 2017, https://arxiv.org/pdf/1706.00292.pdf).

    Code sample and Expected behavior

    Here is a piece of the code from empirical_sinkhorn_divergence: sinkhorn_div = sinkhorn_loss_ab - 0.5 * (sinkhorn_loss_a + sinkhorn_loss_b).

    To match the sinkhorn divergence formula from the paper, the code should probably be: sinkhorn_div = 2* sinkhorn_loss_ab - (sinkhorn_loss_a + sinkhorn_loss_b).

    This is a minor issue, but perhaps the documentation should address this difference.
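
The two expressions quoted above differ only by a factor of 2, which the following sketch makes explicit. The function names are hypothetical; the formulas are copied from this report and are not checked here against the library source or the paper.

```python
def sinkhorn_divergence_halved(loss_ab, loss_a, loss_b):
    """Form quoted above from empirical_sinkhorn_divergence."""
    return loss_ab - 0.5 * (loss_a + loss_b)


def sinkhorn_divergence_doubled(loss_ab, loss_a, loss_b):
    """Form this report attributes to the referenced paper."""
    return 2 * loss_ab - (loss_a + loss_b)


# Arbitrary illustrative loss values, not outputs of any solver.
loss_ab, loss_a, loss_b = 1.5, 0.5, 0.25
halved = sinkhorn_divergence_halved(loss_ab, loss_a, loss_b)
doubled = sinkhorn_divergence_doubled(loss_ab, loss_a, loss_b)
print(halved, doubled, doubled / halved)
```

Since the ratio is constant, the two forms rank pairs of distributions identically; only the scale of the reported value changes.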

    Another issue is that sinkhorn_loss returns

    W = \min_\gamma \langle \gamma, M \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)

    while in the paper the Sinkhorn cost is only

    W = \langle \gamma^*, M \rangle_F,

    where \gamma^* is the optimal plan for the regularized problem. In other words, the regularization term is only used to find the optimal plan and is then discarded.
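
The difference between the two quantities can be sketched for a fixed plan. The regularizer Omega is not defined in this report; the sketch assumes Omega(gamma) = sum of gamma_ij * log(gamma_ij), and the function names and toy values are hypothetical.

```python
import numpy as np


def linear_cost(T, M):
    """Transport cost <T, M>_F, with the regularization term discarded."""
    return float(np.sum(T * M))


def regularized_objective(T, M, reg):
    """<T, M>_F + reg * Omega(T), assuming Omega(T) = sum T log T."""
    nz = T > 0                                    # convention: 0 * log 0 = 0
    omega = float(np.sum(T[nz] * np.log(T[nz])))
    return linear_cost(T, M) + reg * omega


T = np.full((2, 2), 0.25)                         # independent coupling of uniform marginals
M = np.array([[0.0, 1.0], [1.0, 0.0]])
print(linear_cost(T, M), regularized_objective(T, M, reg=0.1))
```

With this sign convention Omega is non-positive for a plan that sums to 1, so the regularized value is never larger than the linear cost of the same plan.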

    help wanted documentation 
    opened by davibarreira 3
  • Interpolated/partial transform?

    🚀 Feature

    transform (and perhaps inverse_transform) should allow for interpolation (partial) transformation, given lambda (default=1).

    Motivation

    Interpolation allows one to seamlessly "morph" between two distributions.

    Pitch

    This is useful for all kinds of image processing tasks, where one does not want to fully transform a distribution, but gradually or partially transform the distribution.

    For example, consider this blog post where your toolkit is used to transform the color map from a day image onto a night image. Interpolated transport would allow this transformation to happen gradually and generate a video.

    Alternatives

    Kludging the matrix and then running transform. Any guidance on how to do this would be greatly appreciated.

    Additional context

    Roma et al. (2020) describe this process for audio morphing: "Displacement interpolation is then accomplished by sliding through the non-zero entries of the transport matrix: given an interpolation parameter λ, each pair of masses in the matrix are interpolated to (1 − λ)xi + λyi and added to the output spectrum."
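
The quoted procedure can be sketched in NumPy as below. This is an illustration of displacement interpolation over the non-zero entries of a transport matrix, not an existing POT API: the function name is hypothetical, and the sketch pairs x[i] with y[j] for each non-zero T[i, j].

```python
import numpy as np


def displacement_interpolation(x, y, T, lam):
    """Positions and masses of the measure interpolated at parameter lam.

    Each non-zero entry T[i, j] carries mass from x[i] to y[j]; at
    parameter lam that mass sits at (1 - lam) * x[i] + lam * y[j].
    """
    i, j = np.nonzero(T)
    positions = (1.0 - lam) * x[i] + lam * y[j]
    return positions, T[i, j]


x = np.array([0.0, 1.0])                 # source support
y = np.array([2.0, 4.0])                 # target support
T = np.array([[0.5, 0.0],
              [0.0, 0.5]])               # toy transport plan
positions, masses = displacement_interpolation(x, y, T, lam=0.5)
print(positions, masses)
```

Setting lam to 0 recovers the source positions and lam to 1 the target positions, so sweeping lam from 0 to 1 produces the gradual morph described above.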

    Attached is an image from their work:

    image

    enhancement 
    opened by turian 0