A high-performance topological machine learning toolbox in Python

Overview

giotto-tda

giotto-tda is a high-performance topological machine learning toolbox in Python built on top of scikit-learn and is distributed under the GNU AGPLv3 license. It is part of the Giotto family of open-source projects.

Project genesis

giotto-tda is the result of a collaborative effort between L2F SA, the Laboratory for Topology and Neuroscience at EPFL, and the Institute of Reconfigurable & Embedded Digital Systems (REDS) of HEIG-VD.

License

giotto-tda is distributed under the AGPLv3 license. If you need a different distribution license, please contact the L2F team.

Documentation

Please visit https://giotto-ai.github.io/gtda-docs and navigate to the version you are interested in.

Installation

Dependencies

The latest stable version of giotto-tda requires:

  • Python (>= 3.6)
  • NumPy (>= 1.19.1)
  • SciPy (>= 1.5.0)
  • joblib (>= 0.16.0)
  • scikit-learn (>= 0.23.1)
  • pyflagser (>= 0.4.3)
  • python-igraph (>= 0.8.2)
  • plotly (>= 4.8.2)
  • ipywidgets (>= 7.5.1)

To run the examples, jupyter is required.
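As a quick sanity check before installing, the minimum versions listed above can be compared against what is already present in the environment. The following is only a sketch: it assumes Python 3.8 or later (for `importlib.metadata`), the helper names `as_tuple` and `check_dependencies` are hypothetical, and the distribution names are assumed to match the names in the list (some packages may be published under a different name in newer releases).

```python
import itertools
from importlib import metadata  # standard library in Python >= 3.8

# Minimum versions copied from the dependency list above.
MINIMUM_VERSIONS = {
    "numpy": "1.19.1",
    "scipy": "1.5.0",
    "joblib": "0.16.0",
    "scikit-learn": "0.23.1",
    "pyflagser": "0.4.3",
    "python-igraph": "0.8.2",
    "plotly": "4.8.2",
    "ipywidgets": "7.5.1",
}


def as_tuple(version):
    """Turn '1.19.1' into (1, 19, 1); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        leading = "".join(itertools.takewhile(str.isdigit, piece))
        if not leading:
            break
        parts.append(int(leading))
    return tuple(parts)


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        is_recent = as_tuple(installed) >= as_tuple(minimum)
        report[name] = "ok" if is_recent else "too old"
    return report


if __name__ == "__main__":
    for package, status in check_dependencies().items():
        print(f"{package}: {status}")
```

Since pip resolves these dependencies automatically, a check like this is mainly useful for diagnosing an existing environment.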

User installation

The simplest way to install giotto-tda is with pip:

python -m pip install -U giotto-tda

If necessary, this will also automatically install all the above dependencies. Note: we recommend upgrading pip to a recent version as the above may fail on very old versions.

Pre-release, experimental builds containing recently added features and/or bug fixes can be installed by running:

python -m pip install -U giotto-tda-nightly

The main difference between giotto-tda-nightly and the developer installation (see the section on contributing, below) is that the former is shipped with pre-compiled wheels (similarly to the stable release) and hence does not require any C++ dependencies. As the main library module is called gtda in both the stable and nightly versions, giotto-tda and giotto-tda-nightly should not be installed in the same environment.
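One way to detect the conflict described above is to ask the packaging metadata which of the two distributions are installed. This is a minimal sketch assuming Python 3.8 or later; the function names are hypothetical and not part of giotto-tda.

```python
from importlib import metadata  # standard library in Python >= 3.8


def installed_gtda_distributions(names=("giotto-tda", "giotto-tda-nightly")):
    """Return {distribution name: version} for the distributions found."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed in this environment
    return found


def has_conflict(found):
    """Both distributions ship a module called gtda, so more than one is a conflict."""
    return len(found) > 1


if __name__ == "__main__":
    found = installed_gtda_distributions()
    if has_conflict(found):
        print("Both stable and nightly builds are installed:", found)
    else:
        print("No conflict detected:", found)
```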

Developer installation

Please consult the dedicated page for detailed instructions on how to build giotto-tda from sources across different platforms.

Contributing

We welcome new contributors of all experience levels. The Giotto community goals are to be helpful, welcoming, and effective. To learn more about making a contribution to giotto-tda, please consult the relevant page.

Testing

After developer installation, you can launch the test suite from outside the source directory:

pytest gtda

Citing giotto-tda

If you use giotto-tda in a scientific publication, we would appreciate citations to the following paper:

giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration, Tauzin et al, J. Mach. Learn. Res. 22.39 (2021): 1-6.

You can use the following BibTeX entry:

@article{giotto-tda,
  author  = {Guillaume Tauzin and Umberto Lupo and Lewis Tunstall and Julian Burella P\'{e}rez and Matteo Caorsi and Anibal M. Medina-Mardones and Alberto Dassatti and Kathryn Hess},
  title   = {giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {39},
  pages   = {1-6},
  url     = {http://jmlr.org/papers/v22/20-325.html}
}

Community

giotto-ai Slack workspace: https://slack.giotto.ai/

Contacts

[email protected]

Comments
  • Add notebook for topological time series example

    Reference issues/PRs

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Description

    This PR adds a Jupyter notebook example that showcases how giotto-tda can be used for time series analysis. It focuses on two main ideas:

    • Applying Takens' embedding theorem
    • Calculating persistence diagrams from the embeddings

    I decided against showcasing the resampling tricks, since I felt this would distract from the core aim of "getting started".
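To make the first idea concrete, a time-delay (Takens) embedding turns a scalar series into a point cloud of delay vectors, from which a persistence diagram can then be computed. The sketch below is a plain-Python illustration and not the notebook's code or giotto-tda's implementation; the function name `takens_embedding` and the parameter values are chosen only for this example.

```python
import math


def takens_embedding(series, dimension, time_delay):
    """Return the delay vectors (x[i], x[i + tau], ..., x[i + (d - 1) * tau])."""
    if dimension < 1 or time_delay < 1:
        raise ValueError("dimension and time_delay must be positive integers")
    span = (dimension - 1) * time_delay  # offset of the last coordinate
    return [
        tuple(series[i + k * time_delay] for k in range(dimension))
        for i in range(len(series) - span)
    ]


# A periodic signal traces out a closed loop in the embedded space.
signal = [math.sin(0.1 * t) for t in range(200)]
point_cloud = takens_embedding(signal, dimension=2, time_delay=15)
print(len(point_cloud))  # 185 points, each with 2 coordinates
```

The resulting point cloud is the kind of input on which a persistence diagram would be computed in the second step.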

    Screenshots (if appropriate)

    Any other comments?

    I have not yet verified whether the resulting docs from this notebook look good or not. Will do it once I sync with @wreise

    Checklist

    • [x] I have read the guidelines for contributing.
    • [ ] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [x] My change requires a change to the documentation.
    • [ ] I have updated the documentation accordingly.
    • [ ] I have added tests to cover my changes.
    • [x] All new and existing tests passed. I used pytest to check this on Python tests.
    enhancement 
    opened by lewtun 47
  • [BUG] Dev install fails on macOS Catalina with Python 3.8

    Describe the bug

    I am running into C++ errors during the developer install (stack trace below) 😭

    To reproduce

    1. Create a conda env with Python 3.8:
    conda create python=3.8 --name gtda && conda activate gtda

    2. Install the external dependencies:
    brew install gcc cmake boost

    3. Install the remaining dependencies:
    python -m pip install -e ".[dev]"
    

    Expected behavior

    Install step passes without error.

    Actual behaviour

      Running setup.py develop for giotto-tda
    ERROR: Command errored out with exit status 1:
         command: /Users/lewtun/miniconda3/envs/gtda/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/Users/lewtun/git/giotto-tda/setup.py'"'"'; __file__='"'"'/Users/lewtun/git/giotto-tda/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps
             cwd: /Users/lewtun/git/giotto-tda/
        Complete output (145 lines):
        running develop
        running egg_info
        writing giotto_tda.egg-info/PKG-INFO
        writing dependency_links to giotto_tda.egg-info/dependency_links.txt
        writing requirements to giotto_tda.egg-info/requires.txt
        writing top-level names to giotto_tda.egg-info/top_level.txt
        reading manifest file 'giotto_tda.egg-info/SOURCES.txt'
        reading manifest template 'MANIFEST.in'
        writing manifest file 'giotto_tda.egg-info/SOURCES.txt'
        running build_ext
        -- pybind11 v2.6.0 dev
        -- Configuring done
        -- Generating done
        -- Build files have been written to: /Users/lewtun/git/giotto-tda/build/temp.macosx-10.9-x86_64-3.8
        [  9%] Built target gtda_cech_complex
        [ 13%] Building CXX object CMakeFiles/gtda_wasserstein.dir/gtda/externals/bindings/wasserstein_bindings.cpp.o
        Scanning dependencies of target gtda_periodic_cubical_complex
        [ 18%] Building CXX object CMakeFiles/gtda_periodic_cubical_complex.dir/gtda/externals/bindings/periodic_cubical_complex_bindings.cpp.o
        clang: warning: optimization flag '-frounding-math' is not supported [-Wignored-optimization-argument]
        clang: warning: argument unused during compilation: '-shared' [-Wunused-command-line-argument]
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/wasserstein_bindings.cpp:6:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/wasserstein.h:39:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_runner_gs.h:36:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle.h:36:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle_kdtree_restricted.h:44:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/dnn/geometry/euclidean-fixed.h:16:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/dnn/geometry/../parallel/tbb.h:138:
        /usr/local/include/boost/progress.hpp:23:1: warning: This header is deprecated. Use the facilities in <boost/timer/timer.hpp> or <boost/timer/progress_display.hpp> instead. [-W#pragma-messages]
        BOOST_HEADER_DEPRECATED( "the facilities in <boost/timer/timer.hpp> or <boost/timer/progress_display.hpp>" )
        ^
        /usr/local/include/boost/config/header_deprecated.hpp:23:37: note: expanded from macro 'BOOST_HEADER_DEPRECATED'
        # define BOOST_HEADER_DEPRECATED(a) BOOST_PRAGMA_MESSAGE("This header is deprecated. Use " a " instead.")
                                            ^
        /usr/local/include/boost/config/pragma_message.hpp:24:34: note: expanded from macro 'BOOST_PRAGMA_MESSAGE'
        # define BOOST_PRAGMA_MESSAGE(x) _Pragma(BOOST_STRINGIZE(message(x)))
                                         ^
        <scratch space>:141:2: note: expanded from here
         message("This header is deprecated. Use " "the facilities in <boost/timer/timer.hpp> or <boost/timer/progress_display.hpp>" " instead.")
         ^
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/wasserstein_bindings.cpp:6:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/wasserstein.h:39:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_runner_gs.h:36:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle.h:36:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle_kdtree_restricted.h:44:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/dnn/geometry/euclidean-fixed.h:16:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/dnn/geometry/../parallel/tbb.h:138:
        In file included from /usr/local/include/boost/progress.hpp:25:
        /usr/local/include/boost/timer.hpp:21:1: warning: This header is deprecated. Use the facilities in <boost/timer/timer.hpp> instead. [-W#pragma-messages]
        BOOST_HEADER_DEPRECATED( "the facilities in <boost/timer/timer.hpp>" )
        ^
        /usr/local/include/boost/config/header_deprecated.hpp:23:37: note: expanded from macro 'BOOST_HEADER_DEPRECATED'
        # define BOOST_HEADER_DEPRECATED(a) BOOST_PRAGMA_MESSAGE("This header is deprecated. Use " a " instead.")
                                            ^
        /usr/local/include/boost/config/pragma_message.hpp:24:34: note: expanded from macro 'BOOST_PRAGMA_MESSAGE'
        # define BOOST_PRAGMA_MESSAGE(x) _Pragma(BOOST_STRINGIZE(message(x)))
                                         ^
        <scratch space>:144:2: note: expanded from here
         message("This header is deprecated. Use " "the facilities in <boost/timer/timer.hpp>" " instead.")
         ^
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/wasserstein_bindings.cpp:6:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/wasserstein.h:39:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_runner_gs.h:36:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle.h:37:
        /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle_kdtree_single_diag.h:89:14: warning: 'decltype(auto)' type specifier is a C++14 extension [-Wc++14-extensions]
            decltype(auto) emplace(Args&&... args)
                     ^
        /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle_kdtree_single_diag.h:89:5: error: deduced return types are a C++14 extension
            decltype(auto) emplace(Args&&... args)
            ^
        /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle_kdtree_single_diag.h:98:14: warning: 'decltype(auto)' type specifier is a C++14 extension [-Wc++14-extensions]
            decltype(auto) insert(const ItemSliceR& item) { return keeper.insert(item); }
                     ^
        /Users/lewtun/git/giotto-tda/gtda/externals/bindings/../hera/wasserstein/auction_oracle_kdtree_single_diag.h:98:5: error: deduced return types are a C++14 extension
            decltype(auto) insert(const ItemSliceR& item) { return keeper.insert(item); }
            ^
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/periodic_cubical_complex_bindings.cpp:6:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/gudhi-devel/src/python/include/Cubical_complex_interface.h:14:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/gudhi-devel/src/Bitmap_cubical_complex/include/gudhi/Bitmap_cubical_complex.h:15:
        /Users/lewtun/git/giotto-tda/gtda/externals/gudhi-devel/src/Bitmap_cubical_complex/include/gudhi/Bitmap_cubical_complex_periodic_boundary_conditions_base.h:117:15: warning: 'Gudhi::cubical_complex::Bitmap_cubical_complex_periodic_boundary_conditions_base<double>::compute_incidence_between_cells' hides overloaded virtual function [-Woverloaded-virtual]
          virtual int compute_incidence_between_cells(std::size_t coface, std::size_t face) {
                      ^
        /Users/lewtun/git/giotto-tda/gtda/externals/gudhi-devel/src/Bitmap_cubical_complex/include/gudhi/Bitmap_cubical_complex.h:48:39: note: in instantiation of template class 'Gudhi::cubical_complex::Bitmap_cubical_complex_periodic_boundary_conditions_base<double>' requested here
        class Bitmap_cubical_complex : public T {
                                              ^
        /Users/lewtun/git/giotto-tda/gtda/externals/gudhi-devel/src/python/include/Cubical_complex_interface.h:27:42: note: in instantiation of template class 'Gudhi::cubical_complex::Bitmap_cubical_complex<Gudhi::cubical_complex::Bitmap_cubical_complex_periodic_boundary_conditions_base<double> >' requested here
        class Cubical_complex_interface : public Bitmap_cubical_complex<CubicalComplexOptions> {
                                                 ^
        /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/type_traits:1661:38: note: in instantiation of template class 'Gudhi::cubical_complex::Cubical_complex_interface<Gudhi::cubical_complex::Bitmap_cubical_complex_periodic_boundary_conditions_base<double> >' requested here
            : public integral_constant<bool, __is_polymorphic(_Tp)> {};
                                             ^
        /Users/lewtun/git/giotto-tda/gtda/externals/pybind11/include/pybind11/pybind11.h:1096:38: note: in instantiation of template class 'std::__1::is_polymorphic<Gudhi::cubical_complex::Cubical_complex_interface<Gudhi::cubical_complex::Bitmap_cubical_complex_periodic_boundary_conditions_base<double> > >' requested here
            static_assert(!has_alias || std::is_polymorphic<type>::value,
                                             ^
        /Users/lewtun/git/giotto-tda/gtda/externals/bindings/periodic_cubical_complex_bindings.cpp:26:3: note: in instantiation of template class 'pybind11::class_<Gudhi::cubical_complex::Cubical_complex_interface<Gudhi::cubical_complex::Bitmap_cubical_complex_periodic_boundary_conditions_base<double> >>' requested here
          py::class_<Periodic_cubical_complex_inst>(
          ^
        /Users/lewtun/git/giotto-tda/gtda/externals/gudhi-devel/src/Bitmap_cubical_complex/include/gudhi/Bitmap_cubical_complex_base.h:131:15: note: hidden overloaded virtual function 'Gudhi::cubical_complex::Bitmap_cubical_complex_base<double>::compute_incidence_between_cells' declared here: different qualifiers ('const' vs unqualified)
          virtual int compute_incidence_between_cells(std::size_t coface, std::size_t face) const {
                      ^
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/bindings/periodic_cubical_complex_bindings.cpp:16:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/pybind11/include/pybind11/pybind11.h:44:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/pybind11/include/pybind11/detail/../attr.h:13:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/pybind11/include/pybind11/cast.h:13:
        In file included from /Users/lewtun/git/giotto-tda/gtda/externals/pybind11/include/pybind11/detail/../pytypes.h:12:
        /Users/lewtun/git/giotto-tda/gtda/externals/pybind11/include/pybind11/detail/common.h:789:5: error: static_assert failed due to requirement 'detail::integral_constant<bool, false>::value' "pybind11::overload_cast<...> requires compiling in C++14 mode"
            static_assert(detail::deferred_t<std::false_type, Args...>::value,
            ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        /Users/lewtun/git/giotto-tda/gtda/externals/bindings/periodic_cubical_complex_bindings.cpp:34:12: note: in instantiation of template class 'pybind11::overload_cast<>' requested here
                   py::overload_cast<>(&Periodic_cubical_complex_inst::dimension,
                   ^
        1 warning and 1 error generated.
        make[2]: *** [CMakeFiles/gtda_periodic_cubical_complex.dir/gtda/externals/bindings/periodic_cubical_complex_bindings.cpp.o] Error 1
        make[1]: *** [CMakeFiles/gtda_periodic_cubical_complex.dir/all] Error 2
        make[1]: *** Waiting for unfinished jobs....
        4 warnings and 2 errors generated.
        make[2]: *** [CMakeFiles/gtda_wasserstein.dir/gtda/externals/bindings/wasserstein_bindings.cpp.o] Error 1
        make[1]: *** [CMakeFiles/gtda_wasserstein.dir/all] Error 2
        make: *** [all] Error 2
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/Users/lewtun/git/giotto-tda/setup.py", line 152, in <module>
            setup(name=DISTNAME,
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/site-packages/setuptools/__init__.py", line 165, in setup
            return distutils.core.setup(**attrs)
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/distutils/core.py", line 148, in setup
            dist.run_commands()
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/distutils/dist.py", line 966, in run_commands
            self.run_command(cmd)
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/distutils/dist.py", line 985, in run_command
            cmd_obj.run()
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/site-packages/setuptools/command/develop.py", line 38, in run
            self.install_for_development()
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/site-packages/setuptools/command/develop.py", line 140, in install_for_development
            self.run_command('build_ext')
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/distutils/cmd.py", line 313, in run_command
            self.distribution.run_command(command)
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/distutils/dist.py", line 985, in run_command
            cmd_obj.run()
          File "/Users/lewtun/git/giotto-tda/setup.py", line 106, in run
            self.build_extension(ext)
          File "/Users/lewtun/git/giotto-tda/setup.py", line 148, in build_extension
            subprocess.check_call(['cmake', '--build', '.'] + build_args,
          File "/Users/lewtun/miniconda3/envs/gtda/lib/python3.8/subprocess.py", line 364, in check_call
            raise CalledProcessError(retcode, cmd)
        subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j2']' returned non-zero exit status 2.
        ----------------------------------------
    ERROR: Command errored out with exit status 1: /Users/lewtun/miniconda3/envs/gtda/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/Users/lewtun/git/giotto-tda/setup.py'"'"'; __file__='"'"'/Users/lewtun/git/giotto-tda/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
    

    Versions

    • macOS-10.15.5-x86_64-i386-64bit
    • Python 3.8.3 (default, Jul 2 2020, 11:26:31) [Clang 10.0.0 ]
    • NumPy 1.19.1
    • SciPy 1.5.2
    • Joblib 0.16.0
    • Scikit-learn 0.23.1
    • Giotto-tda 0.1.4

    Additional context

    @MonkeyBreaker any ideas / tips on what's going wrong?

    bug 
    opened by lewtun 38
  • :art: Clean up shape classification tutorial

    Reference issues/PRs

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Description

    • Cleans up the shape classification tutorial to make it a) more pedagogical and b) more focused.
    • Some of the persistence diagram (PD) calculations can take O(minutes), which may be a concern if we eventually build the docs with every push.

    Screenshots (if appropriate)

    Any other comments?

    Checklist

    • [x] I have read the guidelines for contributing.
    • [x] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [x] My change requires a change to the documentation.
    • [ ] I have updated the documentation accordingly.
    • [ ] I have added tests to cover my changes.
    • [ ] All new and existing tests passed. I used pytest to check this on Python tests.
    opened by lewtun 35
  • CI Always exit on failure

    Follow up on https://github.com/giotto-ai/giotto-tda/pull/229

    The standard behavior of Azure Pipelines is to fail a script step only if its last command fails (see https://github.com/Microsoft/azure-pipelines-tasks/issues/10125).

    A way to avoid this on Unix is to put set -e at the top of the script; on Windows, the script has to raise the error explicitly.

    This also means that steps with a single command don't need failOnStderr: true and should work as expected.

    Not too critical, but it's an easy fix, and compared to failOnStderr: true it should be more tolerant of warnings.
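The effect of set -e can be reproduced outside of any CI system. The sketch below assumes a POSIX `sh` is available on the PATH; it runs the same two-command script with and without the `-e` option and compares exit codes. The helper name `exit_code` is hypothetical.

```python
import subprocess

# The first command fails, the last one succeeds.
SCRIPT = "false\ntrue\n"


def exit_code(script, exit_on_error):
    """Run the script with sh, optionally with -e (the same effect as set -e)."""
    args = ["sh"] + (["-e"] if exit_on_error else []) + ["-c", script]
    return subprocess.run(args).returncode


# Without -e, only the last command decides the exit status, so the failure is hidden.
print(exit_code(SCRIPT, exit_on_error=False))
# With -e, the shell stops at the first failing command and reports a non-zero status.
print(exit_code(SCRIPT, exit_on_error=True))
```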

    I also want to check that all steps fail on an expected failure, so please do not merge yet.

    enhancement CI 
    opened by rth 30
  • Dev installation fail within Conda env

    Description

    Pip installation from source fails. I used to have both the PyPI version and the dev version installed in the same pyenv. These envs are now deleted.

    Steps/Code to Reproduce

    Create a new Conda env with Python 3.7 and install CMake and Boost as described in the README. Clone Giotto-learn from GitHub, then run pip install --log 'mylogfile.txt' -e . from the repository directory.

    Expected Results

    Completed installation

    Actual Results

    https://gist.github.com/torlarse/52fd212fd7c62cd558434e0c0b597002

    Versions

    • Windows-10-10.0.18362-SP0
    • Python 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
    • NumPy 1.18.1
    • SciPy 1.4.1
    • joblib 0.14.1
    • Scikit-Learn 0.22

    opened by torlarse 28
  • Adds FlagserPersistence

    What does this implement/fix? Explain your changes.

    • Adds FlagserPersistence class and pyflagser as a library requirement.
    • Fixes docstrings for some classes in simplicial.py.
    • Fixes a bug leading to features with negative lifetime in persistent homology transformers when infinity_values is set too low.
    • Cleans up code in _utils.py.
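To illustrate the negative-lifetime issue mentioned in the list above: if infinite deaths are replaced by a finite infinity_values that is smaller than some birth value, the resulting bar ends before it starts. The following sketch is a hypothetical plain-Python illustration of one possible remedy (dropping such bars), not the code in this PR; diagrams are assumed to be lists of (birth, death, dimension) triples.

```python
import math


def replace_infinite_deaths(diagram, infinity_value):
    """Replace infinite deaths by infinity_value and drop negative-lifetime bars."""
    cleaned = []
    for birth, death, dim in diagram:
        if math.isinf(death):
            death = infinity_value
        if death >= birth:  # a bar with death < birth would have negative lifetime
            cleaned.append((birth, death, dim))
    return cleaned


diagram = [(0.0, math.inf, 0), (2.0, math.inf, 0), (0.5, 1.0, 1)]
# With infinity_value=1.5 the bar born at 2.0 would end at 1.5, so it is dropped.
print(replace_infinite_deaths(diagram, infinity_value=1.5))
```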
    opened by gtauzin 27
  • Mapper / Scikit-learn error from precomputed clustering

    Description

    I have a point cloud of 1920 nine-dimensional points. When applying Mapper with DBSCAN clustering as in the Christmas Santa notebook, everything works fine. When I apply my own clustering algorithm with a precomputed distance matrix, I get an error. With Kepler Mapper I can make this work by setting the parameter precomputed=True when calling mapper.map().

    PS! I used the color function from the Santa .csv file as a hack to make the code run. It worked for the basic clustering method.

    UPDATE: I added point_cloud.csv to the gist, I hope it works for reproduction.

    Steps/Code to Reproduce

    https://gist.github.com/torlarse/43604dd09a98cc3f69166659cd6ddf9e

    Expected Results

    A Mapper complex :)

    Actual Results

    Please see gist for traceback.

    Versions

    • Python 3.7.5 (tags/v3.7.5:5c02a39a0b, Oct 15 2019, 00:11:34) [MSC v.1916 64 bit (AMD64)]
    • NumPy 1.17.4
    • SciPy 1.3.3
    • joblib 0.14.0
    • Scikit-Learn 0.22
    • giotto-Learn 0.1.3

    enhancement good first issue 
    opened by torlarse 24
  • Modify check_point_clouds to allow for sparse input

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Description

    Following the addition of FlagserPersistence in #339 and the added support for list input in the simplicial homology transformers, it is a shame to keep excluding lists of sparse matrices as input, especially since the performance gains can be considerable in situations where several edges are infinitely valued (as @lewtun found out recently when using ripser). This PR modifies the check_point_clouds validation function to accept lists of sparse matrices as input, while still performing important checks. Furthermore, some docstrings have been edited for clarity/consistency.

    Any other comments?

    ~~I tried to extend the changes to SparseRipsPersistence but got a mysterious error thrown when trying it on a single 2x2 sparse matrix with ones in the off-diagonal entries. @MonkeyBreaker @gtauzin, do you know why this might be happening?~~ UPDATE: It is now clear that SparseRipsPersistence is not meant to receive sparse input yet.

    Checklist

    • [x] I have read the guidelines for contributing.
    • [x] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [x] My change requires a change to the documentation.
    • [x] I have updated the documentation accordingly.
    • [x] I have added tests to cover my changes.
    • [x] All new and existing tests passed. I used pytest to check this on Python tests.
    opened by ulupo 23
  • Add MNIST classification example notebook

    Reference issues/PRs

    Reopening of PR #442.

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Description

    Adds the full-blown MNIST ML example.

    Checklist

    • [x] I have read the guidelines for contributing.
    • [x] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [x] My change requires a change to the documentation.
    • [x] I have updated the documentation accordingly.
    • [ ] I have added tests to cover my changes.
    • [x] All new and existing tests passed. I used pytest to check this on Python tests.
    opened by gtauzin 22
  • Persistence Diagrams fail for reconstructed MNIST images

    I trained a simple Autoencoder on MNIST and wanted to compare the persistence diagrams of the reconstructions to those of the inputs.

    For some of the reconstructed images it raised the following error:

    Code:

    from gtda.homology import CubicalPersistence

    cubical_persistence = CubicalPersistence(n_jobs=-1)
    im7_r_cubical = cubical_persistence.fit_transform(im7_r_filtration)
    im7_r_cubical.shape
    cubical_persistence.plot(im7_r_cubical)

    Error

    ValueError Traceback (most recent call last)
          3 im7_r_cubical=cubical_persistence.fit_transform(im7_r_filtration)
          4 im7_r_cubical.shape
    ----> 5 cubical_persistence.plot(im7_r_cubical)

    /usr/local/lib/python3.6/dist-packages/gtda/homology/cubical.py in plot(Xt, sample, homology_dimensions, plotly_params)
        256 return plot_diagram(
        257 Xt[sample], homology_dimensions=homology_dimensions,
    --> 258 plotly_params=plotly_params
        259 )

    /usr/local/lib/python3.6/dist-packages/gtda/plotting/persistence_diagrams.py in plot_diagram(diagram, homology_dimensions, plotly_params)
         41 posinfinite_mask = np.isposinf(diagram_no_dims)
         42 neginfinite_mask = np.isneginf(diagram_no_dims)
    ---> 43 max_val = np.max(np.where(posinfinite_mask, -np.inf, diagram_no_dims))
         44 min_val = np.min(np.where(neginfinite_mask, np.inf, diagram_no_dims))
         45 parameter_range = max_val - min_val

    <__array_function__ internals> in amax(*args, **kwargs)

    /usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py in amax(a, axis, out, keepdims, initial, where)
       2704 """
       2705 return _wrapreduction(a, np.maximum, 'max', axis, None, out,
    -> 2706 keepdims=keepdims, initial=initial, where=where)
       2707
       2708

    /usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
         85 return reduction(axis=axis, out=out, **passkwargs)
         86
    ---> 87 return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
         88
         89

    ValueError: zero-size array to reduction operation maximum which has no identity
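The error message is what a maximum over an empty array produces, which would be consistent with an empty diagram reaching the plotting code. As a hedged, plain-Python sketch of a possible guard (not the fix adopted in giotto-tda; the helper name `finite_value_range` is hypothetical and diagrams are simplified to lists of (birth, death) pairs):

```python
import math


def finite_value_range(diagram):
    """Return (min, max) over the finite births and deaths, or None if there are none."""
    values = [v for pair in diagram for v in pair if math.isfinite(v)]
    if not values:
        # Calling min()/max() on an empty sequence would raise a ValueError,
        # much like the reduction over a zero-size array in the traceback.
        return None
    return min(values), max(values)


print(finite_value_range([(0.0, math.inf), (1.0, 3.0)]))  # (0.0, 3.0)
print(finite_value_range([]))  # None
```

A caller can then skip plotting, or draw an empty figure, when the result is None.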

    opened by unJul 21
  • Add reduced_homology kwarg to control dropping of infinite bar in H0, fix treatment of infinity_values in CubicalPersistence, uniformize postprocessing of diagrams in gtda.homology

    Reference issues/PRs

    Documentation-wise, this is related to #92 but applies to CubicalPersistence.

    Types of changes

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [x] Breaking change (fix or feature that would cause existing functionality to change)

    Description

    At the user level, this PR achieves three things:

    • [new feature] Adds a reduced_homology kwarg to all simplicial and cubical transformers. When True, we remove one infinite bar in H0 for the user automatically. Default is always True and corresponds to what we've been doing so far in all simplicial transformers. However, in the cubical case this causes a breaking change (see below).
    • [bug fix] Extends to CubicalPersistence the removal of potential negative-lifetime bars, which can appear if infinity_values is set too low.
    • [breaking change] Extends to CubicalPersistence the removal of one infinite bar in dimension 0 by default.

    At the code level, the postprocessing of diagrams is unified between the simplicial and cubical transformers and between gudhi-style and ripser/flagser-style output formats. This is achieved by refactoring _postprocess_diagrams and adding a new argument "format" to it.
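The behaviour controlled by reduced_homology can be pictured with a small plain-Python sketch. This is an illustration of the idea only, not the refactored _postprocess_diagrams; the function name is hypothetical and diagrams are assumed to be lists of (birth, death, dimension) triples.

```python
import math


def drop_one_infinite_h0_bar(diagram, reduced_homology=True):
    """Remove a single infinite bar in dimension 0 when reduced_homology is True."""
    if not reduced_homology:
        return list(diagram)
    result = []
    dropped = False
    for birth, death, dim in diagram:
        if not dropped and dim == 0 and math.isinf(death):
            dropped = True  # skip exactly one infinite H0 bar
            continue
        result.append((birth, death, dim))
    return result


diagram = [(0.0, math.inf, 0), (0.0, 1.0, 0), (0.3, 0.8, 1)]
print(drop_one_infinite_h0_bar(diagram))  # [(0.0, 1.0, 0), (0.3, 0.8, 1)]
```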

    Checklist

    • [x] I have read the guidelines for contributing.
    • [x] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [x] My change requires a change to the documentation.
    • [x] I have updated the documentation accordingly.
    • [x] I have ~~added~~ modified tests to cover my changes.
    • [x] All new and existing tests passed. I used pytest to check this on Python tests.
    opened by ulupo 20
  • [DOCS] Unclear documentation for VietorisRipsPersistence padding

    The documentation for the padding used in VietorisRipsPersistence is unclear. It says that diagrams may be padded with some points on the diagonal, but it does not say what the padding values are or how they are chosen. This can be confusing for users trying to understand the output of the persistence algorithm.

    https://github.com/giotto-ai/giotto-tda/blob/8d09a39403ca11b50605bf466c1aa9f4f3876e5f/gtda/homology/_utils.py#L63 explains that the padding points are chosen as the minimum birth ever observed in that homology dimension, but this is not clear from the documentation.

    It would be helpful if the documentation for VietorisRipsPersistence clarified the padding strategy used or if the code were changed to use a more standard padding strategy such as padding with zeros.
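To make the behaviour under discussion concrete, here is a minimal sketch of padding within one homology dimension, assuming the strategy described above (padding points placed on the diagonal at the minimum birth observed in that dimension). The name `pad_subdiagrams` is hypothetical and this is not the code in `gtda/homology/_utils.py`.

```python
import numpy as np


def pad_subdiagrams(subdiagrams, dim):
    """Hypothetical illustration: pad (birth, death) arrays for one dimension.

    subdiagrams: list of arrays of shape (n_points_i, 2), one per sample.
    Returns an array of shape (n_samples, max_points, 3).
    """
    max_points = max(len(sub) for sub in subdiagrams)
    # Minimum birth observed across all samples in this homology dimension.
    births = [sub[:, 0].min() for sub in subdiagrams if len(sub)]
    min_birth = min(births) if births else 0.0

    padded = np.empty((len(subdiagrams), max_points, 3))
    padded[:, :, 2] = dim
    # Start from diagonal points (min_birth, min_birth), i.e. zero lifetime.
    padded[:, :, :2] = min_birth
    for i, sub in enumerate(subdiagrams):
        padded[i, : len(sub), :2] = sub
    return padded
```

In this sketch the padding points always have zero lifetime, but their coordinates depend on the whole collection of diagrams rather than on a fixed constant such as zero.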

    documentation 
    opened by raphaelreinauer 3
  • CI use GitHub actions

    Reference issues/PRs

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Description

    This PR migrates the Azure pipeline to GitHub Actions. It deploys two workflows:

    1. a CI workflow that runs on each PR; it builds and runs the tests on Linux, macOS and Windows for Python 3.6 to 3.9
    2. a wheel workflow that generates the wheels on demand

    Unfortunately, until we merge this PR on main, we won't be able to see CI results on this repository. But they can be observed on my fork.

    Screenshots (if appropriate)

    Any other comments?

    Checklist

    • [x] I have read the guidelines for contributing.
    • [x] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [ ] My change requires a change to the documentation.
    • [ ] I have updated the documentation accordingly.
    • [ ] I have added tests to cover my changes.
    • [x] All new and existing tests passed. I used pytest to check this on Python tests.

    The current status of the PR does the following:

    • [x] Build and Test for any PR
      • [x] : cache python requirements
      • [x] : cache C++ compilation
      • [x] : ~Nightly build~
      • [x] : About the notebook tests: they currently fail, but I allowed the job to pass on failure; this shall be fixed before merging. UPDATE: The tests are now enabled again, but they run every time and are quite time consuming. I needed to disable one of the notebooks, MNIST_Classification, because it was taking too much time.
        • [x] : Maybe add the possibility to trigger the notebook test with a variable ? Done but same remark as nightly build, I cannot know if it works until this PR is merged on main.
      • [x] : cache boost installation to speed-up building time
      • [x] : Document ?
    • [x] Generates wheel on demand
      • [x] : Find a way to not hardcode the test to perform, the reason is that currently, for whatever reason, it ignores the setup.cfg file. UPDATE: I did not find a way, it seems from the documentation of cibuildwheel that you need to provide which test to perform. I dunno why ...
      • [x] : Document ?
      • [ ] : Automatic publication of the generated wheels: see the documentation on how to do it, but do we want to do it?
      • [x] : cache boost installation to speed up building time. The cache is disabled on Windows because, for a reason I cannot understand, the compilation would fail after loading the cache because some headers were not found. I don't know how to debug this, so I decided to disable the cache on Windows.
      • [x] : Nightly build, now an external variable can be set before triggering the workflow in order to rename for a nightly release.
        • Before merging into main, I cannot test it, because to trigger workflow_dispatch events, you need to have the workflow available on main
    enhancement help wanted CI 
    opened by MonkeyBreaker 2
  • Takens Parameter Search - Potential bug/enhancement

    Hello,

    I am currently experimenting with time series classifiers using the Takens embedding. The function takens_embedding_optimal_parameters is behaving somewhat differently than I expected. I am getting the following error.

    ValueError: Not enough time stamps (176) to produce at least one 7-dimensional vector under the current choice of time delay (30).

    I understand why this is happening, but it seems like this parameter combination should just be skipped instead of raising an error. I would just reduce the size of max dimension and/or max delay, but both are valid for lower values of the other.

    It looks like the issue arises here: https://github.com/giotto-ai/giotto-tda/blob/eaa1dd0c301192f57a6b1de8e2bcee90c96ae1aa/gtda/time_series/_utils.py#L56

    Maybe there can be a check on line 55, or potentially a flag for when _time_delay_embedding is being used in a parameter search.

    Thanks, Joe
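The skipping behaviour suggested in this issue can be sketched as follows. This is only an illustration, not how takens_embedding_optimal_parameters is implemented: the helper names are hypothetical, and the sketch assumes that a series of n time stamps yields n - time_delay * (dimension - 1) delay vectors, which is consistent with the error message above (176 - 30 * 6 = -4).

```python
from itertools import product


def n_embedded_points(n_timestamps, dimension, time_delay):
    """Number of delay vectors obtainable from a series of n_timestamps values."""
    return n_timestamps - time_delay * (dimension - 1)


def valid_parameter_grid(n_timestamps, max_dimension, max_time_delay):
    """Yield only the (dimension, time_delay) pairs giving at least one vector."""
    for dimension, time_delay in product(
        range(1, max_dimension + 1), range(1, max_time_delay + 1)
    ):
        if n_embedded_points(n_timestamps, dimension, time_delay) >= 1:
            yield dimension, time_delay
```

With 176 time stamps, the pair (7, 30) is skipped while (7, 29) and (6, 30) are kept, so both maxima remain usable for lower values of the other parameter.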

    discussion 
    opened by jcoll3 5
  • [WIP] Extended persistence and lower star filtrations

    Reference issues/PRs #337 #546

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Description Begins adding support for extended persistence via a new class LowerStarFlagPersistence and a new plotting function plot_extended_diagram. Does not yet address downstream processing of extended diagrams. There is also a new data structure for extended persistence diagrams. An extended persistence diagram is a 2D ndarray of shape (n_features, 4) where the first 3 columns are as for ordinary persistence (birth-death-dimension), and the fourth is either 1 or -1: 1 when the feature was born and died during the same "sweep", -1 otherwise. This makes it possible to partition the extended diagram into the usual 4 portions:

    • birth < death and same sweep
    • birth < death and different sweep
    • birth > death and same sweep
    • birth > death and different sweep
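A minimal sketch of this partition, assuming the (n_features, 4) layout described above; the function name `partition_extended_diagram` is hypothetical and not part of the PR.

```python
import numpy as np


def partition_extended_diagram(diagram):
    """Split an (n_features, 4) extended diagram into the four portions.

    Columns are assumed to be (birth, death, dimension, sweep flag), with the
    flag equal to 1 for "same sweep" and -1 otherwise. Points with
    birth == death fall in none of the four portions.
    """
    diagram = np.asarray(diagram, dtype=float)
    birth, death, flag = diagram[:, 0], diagram[:, 1], diagram[:, 3]
    same = flag == 1
    return {
        ("birth < death", "same sweep"): diagram[(birth < death) & same],
        ("birth < death", "different sweep"): diagram[(birth < death) & ~same],
        ("birth > death", "same sweep"): diagram[(birth > death) & same],
        ("birth > death", "different sweep"): diagram[(birth > death) & ~same],
    }
```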

    The extended persistence diagrams are obtained by "coning".

    Numerical stability issues have not yet been addressed.

    Checklist

    • [x] I have read the guidelines for contributing.
    • [x] My code follows the code style of this project. I used flake8 to check my Python changes.
    • [x] My change requires a change to the documentation.
    • [ ] I have updated the documentation accordingly.
    • [ ] I have added tests to cover my changes.
    • [ ] All new and existing tests passed. I used pytest to check this on Python tests.
    opened by ulupo 6
  • No figure captions in MNIST_classification notebook and docs

    Describe the bug Currently, the figure captions are visible neither in the notebook nor in the documentation. The previous approach consisted of using raw HTML, but it didn't work once pulled through Sphinx. I tried to put captions as the alt text (drawing inspiration from persistent_homology_on_graphs.ipynb), but I get an "invalid option block" error (even when replacing the ":" with "-").

    Expected behavior A caption beneath the figures.

    Actual behaviour No caption.

    documentation 
    opened by wreise 0
Releases(v0.6.0)
  • v0.6.0(Aug 27, 2022)

    This is a major release including a new local homology subpackage, a new backend for computing Vietoris–Rips barcodes, wheels for Python 3.10 and Apple Silicon systems, and end of support for Python 3.6.

    Major Features and Improvements

    • A new local_homology subpackage containing scikit-learn–compatible transformers for the extraction of local homology features has been added (#602). A tutorial and an example notebook explain it.
    • Wheels for Python 3.10 are now available (#644 and #646).
    • Wheels for Apple Silicon systems are now available for Python versions 3.8, 3.9 and 3.10 (#646).
    • giotto-ph is now the backend for the computation of Vietoris–Rips barcodes, replacing ripser.py (#614).
    • The documentation has been improved (#609).

    Bug Fixes

    • A bug involving tests for the mapper subpackage has been fixed (#638).

    Backwards-Incompatible Changes

    • Python 3.6 is no longer supported, and the manylinux standard has been bumped from manylinux2010 to manylinux2014 (#644 and #646).
    • The python-igraph requirement has been replaced with igraph >= 0.9.8 (#616).

    Thanks to our Contributors

    This release contains contributions from:

    Umberto Lupo, Jacob Bamberger, Wojciech Reise, Julián Burella Pérez, and Anibal Medina-Mardones

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.6.0-cp310-cp310-macosx_10_9_universal2.whl(1.58 MB)
    giotto_tda-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl(1.24 MB)
    giotto_tda-0.6.0-cp310-cp310-win_amd64.whl(1.28 MB)
    giotto_tda-0.6.0-cp37-cp37m-macosx_10_9_x86_64.whl(1.20 MB)
    giotto_tda-0.6.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl(1.25 MB)
    giotto_tda-0.6.0-cp37-cp37m-win_amd64.whl(1.31 MB)
    giotto_tda-0.6.0-cp38-cp38-macosx_10_9_universal2.whl(1.58 MB)
    giotto_tda-0.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl(1.24 MB)
    giotto_tda-0.6.0-cp38-cp38-win_amd64.whl(1.28 MB)
    giotto_tda-0.6.0-cp39-cp39-macosx_10_9_universal2.whl(1.58 MB)
    giotto_tda-0.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl(1.24 MB)
    giotto_tda-0.6.0-cp39-cp39-win_amd64.whl(1.28 MB)
  • v0.5.0(Jul 8, 2021)

    Major Features and Improvements

    • An object-oriented API for interactive plotting of Mapper graphs has been added with the MapperInteractivePlotter (#586). This is intended to supersede plot_interactive_mapper_graph as it allows for inspection of the current state of the objects changed by interactivity. See also "Backwards-Incompatible Changes" below.
    • Further citations have been added to the mathematical glossary (#564).

    Bug Fixes

    • A bug preventing EuclideanCechPersistence from working correctly on point clouds in more than 2 dimensions has been fixed (#588).
    • A validation bug preventing VietorisRipsPersistence and WeightedRipsPersistence from accepting non-empty dictionaries as metric_params has been fixed (#590).
    • A bug causing an exception to be raised when node_color_statistic was passed as a numpy array in plot_static_mapper_graph has been fixed (#576).

    Backwards-Incompatible Changes

    • A major change to the behaviour of the (static and interactive) Mapper plotting functions plot_static_mapper_graph and plot_interactive_mapper_graph was introduced in #584. The new MapperInteractivePlotter class (see "Major Features and Improvements" above) also follows this new API. The main changes are as follows:

      • color_by_columns_dropdown has been eliminated.
      • color_variable has been renamed to color_features (but cannot be an array).
      • An additional keyword argument color_data has been added to more clearly separate the input data to the Mapper pipeline from the data to be used for coloring.
      • node_color_statistic is now applied column by column -- previously it could end up being applied to 2d arrays as a whole.
      • The defaults for color-related arguments lead to index values instead of the mean of the data.
    • The default for weight_params in WeightedRipsPersistence is now the empty dictionary, and None is no longer allowed (#595).

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Wojciech Reise, Julian Burella Pérez, Sean Law, Anibal Medina-Mardones, and Lewis Tunstall

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.5.0-cp36-cp36m-macosx_10_15_x86_64.whl(1.18 MB)
    giotto_tda-0.5.0-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.5.0-cp36-cp36m-win_amd64.whl(1.24 MB)
    giotto_tda-0.5.0-cp37-cp37m-macosx_10_15_x86_64.whl(1.18 MB)
    giotto_tda-0.5.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.5.0-cp37-cp37m-win_amd64.whl(1.24 MB)
    giotto_tda-0.5.0-cp38-cp38-macosx_10_15_x86_64.whl(1.19 MB)
    giotto_tda-0.5.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.5.0-cp38-cp38-win_amd64.whl(1.23 MB)
    giotto_tda-0.5.0-cp39-cp39-macosx_10_15_x86_64.whl(1.19 MB)
    giotto_tda-0.5.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.5.0-cp39-cp39-win_amd64.whl(1.23 MB)
  • v0.4.0(Jan 13, 2021)

    Major Features and Improvements

    • Wheels for Python 3.9 have been added (#528).
    • Weighted Rips filtrations, and in particular distance-to-measure (DTM) based filtrations, are now supported in ripser and by the new WeightedRipsPersistence transformer (#541).
    • See "Backwards-Incompatible Changes" for major improvements to ParallelClustering and therefore make_mapper_pipeline which are also major breaking changes.
    • GraphGeodesicDistance can now take rectangular input (the number of vertices is inferred to be max(x.shape)), and KNeighborsGraph can now take sparse input (#537).
    • VietorisRipsPersistence now takes a metric_params parameter (#541).

    Bug Fixes

    • A documentation bug affecting plots from DensityFiltration has been fixed (#540).
    • A bug affecting the bindings for GUDHI's edge collapser, which incorrectly did not ignore lower diagonal entries, has been fixed (#538).
    • Symmetry conflicts in the case of sparse input to ripser and VietorisRipsPersistence are now handled in a way true to the documentation, i.e. by favouring upper diagonal entries if different values in transpose positions are also stored (#537).

    Backwards-Incompatible Changes

    • The minimum required version of pyflagser is now 0.4.3 (#537).
    • ParallelClustering.fit_transform now outputs one array of cluster labels per sample, bringing it closer to scikit-learn convention for clusterers, and the fitted single clusterers are no longer stored in the clusterers_ attribute of the fitted object (#535 and #552).

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Julian Burella Pérez, and Wojciech Reise.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.4.0-cp36-cp36m-macosx_10_15_x86_64.whl(1.17 MB)
    giotto_tda-0.4.0-cp36-cp36m-manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.4.0-cp36-cp36m-win_amd64.whl(1.24 MB)
    giotto_tda-0.4.0-cp37-cp37m-macosx_10_15_x86_64.whl(1.17 MB)
    giotto_tda-0.4.0-cp37-cp37m-manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.4.0-cp37-cp37m-win_amd64.whl(1.24 MB)
    giotto_tda-0.4.0-cp38-cp38-macosx_10_15_x86_64.whl(1.18 MB)
    giotto_tda-0.4.0-cp38-cp38-manylinux2010_x86_64.whl(1.44 MB)
    giotto_tda-0.4.0-cp38-cp38-win_amd64.whl(1.22 MB)
    giotto_tda-0.4.0-cp39-cp39-macosx_10_15_x86_64.whl(1.18 MB)
    giotto_tda-0.4.0-cp39-cp39-manylinux2010_x86_64.whl(1.45 MB)
    giotto_tda-0.4.0-cp39-cp39-win_amd64.whl(1.22 MB)
  • v0.3.1(Nov 20, 2020)

    Major Features and Improvements

    • The latest changes made to the ripser.py submodule have been pulled (#530, see also #532). This includes in particular the performance improvements to the C++ backend submitted by Julian Burella Pérez via scikit-tda/ripser.py#106. The developer installation now includes a new dependency in robinhood hashmap. These changes do not affect functionality.
    • The example notebook classifying_shapes.ipynb has been modified and improved (#523).
    • The tutorial previously called time_series_classification.ipynb has been split into an introductory tutorial on the Takens embedding ideas (topology_time_series.ipynb) and an example notebook on gravitational wave detection (gravitational_waves_detection.ipynb) which presents a time series classification task (#529).
    • The documentation for PairwiseDistance has been improved (#525).

    Bug Fixes

    • Timeout deadlines for some of the hypothesis tests have been increased to make them less flaky (#531).

    Backwards-Incompatible Changes

    • Due to poor support for brew in the macOS 10.14 virtual machines by Azure, the CI for macOS systems is now run on 10.15 virtual machines and 10.14 is no longer supported by the wheels (#527).

    Thanks to our Contributors

    This release contains contributions from many people:

    Julian Burella Pérez, Umberto Lupo, Lewis Tunstall, Wojciech Reise, and Rayna Andreeva.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.3.1-cp36-cp36m-macosx_10_15_x86_64.whl(1.17 MB)
    giotto_tda-0.3.1-cp36-cp36m-manylinux2010_x86_64.whl(1.44 MB)
    giotto_tda-0.3.1-cp36-cp36m-win_amd64.whl(1.23 MB)
    giotto_tda-0.3.1-cp37-cp37m-macosx_10_15_x86_64.whl(1.17 MB)
    giotto_tda-0.3.1-cp37-cp37m-manylinux2010_x86_64.whl(1.44 MB)
    giotto_tda-0.3.1-cp37-cp37m-win_amd64.whl(1.23 MB)
    giotto_tda-0.3.1-cp38-cp38-macosx_10_15_x86_64.whl(1.18 MB)
    giotto_tda-0.3.1-cp38-cp38-manylinux2010_x86_64.whl(1.44 MB)
    giotto_tda-0.3.1-cp38-cp38-win_amd64.whl(1.22 MB)
  • v0.3.0(Oct 9, 2020)

    Major Features and Improvements

    This is a major release which adds substantial new functionality and introduces several improvements.

    Persistent homology of directed flag complexes via pyflagser

    • The pyflagser package (source, docs) is now an official dependency of giotto-tda.
    • The FlagserPersistence transformer has been added to gtda.homology (#339). It wraps pyflagser.flagser_weighted to allow for computations of persistence diagrams from directed or undirected weighted graphs. A new notebook demonstrates its use.

    Edge collapsing and performance improvements for persistent homology

    • GUDHI C++ components have been updated to the state of GUDHI v3.3.0, yielding performance improvements in SparseRipsPersistence, EuclideanCechPersistence and CubicalPersistence (#468).
    • Bindings for GUDHI's edge collapser have been created and can now be used as an optional preprocessing step via the optional keyword argument collapse_edges in VietorisRipsPersistence and in gtda.externals.ripser (#469 and #483). When collapse_edges=True, and the input data and/or number of required homology dimensions is sufficiently large, the resulting runtimes for Vietoris–Rips persistent homology are state of the art.
    • The performance of the Ripser bindings has otherwise been improved by avoiding unnecessary data copies, better managing the memory, and using more efficient matrix routines (#501 and #507).

    New transformers and functionality in gtda.homology

    • The WeakAlphaPersistence transformer has been added to gtda.homology (#464). Like VietorisRipsPersistence, SparseRipsPersistence and EuclideanCechPersistence, it computes persistent homology from point clouds, but its runtime can scale much better with size in low dimensions.
    • VietorisRipsPersistence now accepts sparse input when metric="precomputed" (#424).
    • CubicalPersistence now accepts lists of 2D arrays (#503).
    • A reduced_homology parameter has been added to all persistent homology transformers. When True, one infinite bar in the H0 barcode is removed for the user automatically. Previously, it was not possible to keep these bars in the simplicial homology transformers. The default is always True, which implies a breaking change in the case of CubicalPersistence (#467).

    Persistence diagrams

    • A ComplexPolynomial feature extraction transformer has been added (#479).
    • A NumberOfPoints feature extraction transformer has been added (#496).
    • An option to normalize the entropy in PersistenceEntropy according to a heuristic has been added, and a nan_fill_value parameter allows replacing any NaN produced by the entropy calculation with a fixed constant (#450).
    • The computations in HeatKernel, PersistenceImage and in the pairwise distances and amplitudes related to them have been changed to yield the continuum limit when n_bins tends to infinity; sigma is now measured in the same units as the filtration parameter and defaults to 0.1 (#454).
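As an illustration of the entropy item in the list above, here is a minimal sketch of a persistence entropy for a single subdiagram: the base-2 Shannon entropy of the normalized lifetimes, with a fill value returned where the computation would otherwise give NaN. The function is a hypothetical stand-in, not the PersistenceEntropy transformer; the normalization heuristic is deliberately omitted, and the base 2 and the fill value of -1 are taken from the "Backwards-Incompatible Changes" of this release.

```python
import numpy as np


def persistence_entropy(subdiagram, nan_fill_value=-1.0):
    """Base-2 Shannon entropy of the normalized lifetimes of one subdiagram.

    subdiagram: array-like of shape (n_points, 2) holding (birth, death) pairs.
    """
    subdiagram = np.asarray(subdiagram, dtype=float).reshape(-1, 2)
    lifetimes = subdiagram[:, 1] - subdiagram[:, 0]
    lifetimes = lifetimes[lifetimes > 0]  # zero-lifetime points contribute nothing
    total = lifetimes.sum()
    if total == 0:
        # The entropy is undefined (0/0) here; return the fill value instead.
        return nan_fill_value
    p = lifetimes / total
    return float(-(p * np.log2(p)).sum())
```

Two bars of equal lifetime give an entropy of one bit, while an empty or purely diagonal subdiagram gives the fill value.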

    New curves subpackage

    A new curves subpackage has been added to preprocess, and extract features from, collections of multi-channel curves such as returned by BettiCurve, PersistenceLandscape and Silhouette (#480). It contains:

    • A StandardFeatures transformer that can extract features channel-wise in a generic way.
    • A Derivative transformer that computes channel-wise derivatives of any order by discrete differences (#492).
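The idea of channel-wise derivatives by discrete differences can be sketched with NumPy alone. This is not the Derivative transformer itself: the function name is hypothetical, unit spacing between bins is assumed, and the (n_samples, n_channels, n_bins) layout is an assumption about how multi-channel curves are stored.

```python
import numpy as np


def channelwise_derivative(curves, order=1):
    """Finite-difference derivative along the last axis.

    curves: array-like of shape (n_samples, n_channels, n_bins).
    Each differencing step shortens the last axis by one.
    """
    curves = np.asarray(curves, dtype=float)
    return np.diff(curves, n=order, axis=-1)
```

For a single channel sampling x squared at 0, 1, 2, 3, 4, the first-order differences are 1, 3, 5, 7 and the second-order differences are constant and equal to 2.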

    New metaestimators subpackage

    A new metaestimators subpackage has been added with a CollectionTransformer meta-estimator which converts any transformer instance into a fit-transformer acting on collections (#495).

    Images

    • A DensityFiltration for collections of binary images has been added (#473).
    • Padder and Inverter have been extended to greyscale images (#489).

    Time series

    • TakensEmbedding is now a new transformer acting on collections of time series (#460).
    • The former TakensEmbedding acting on a single time series has been renamed to SingleTakensEmbedding transformer, and the internal logic employed in its fit for computing optimal hyperparameters is now available via a takens_embedding_optimal_parameters convenience function (#460).
    • The _slice_windows method of SlidingWindow has been made public and renamed to slice_windows (#460).

    Graphs

    • GraphGeodesicDistance has been improved as follows (#422):

      • The new parameters directed, unweighted and method have been added.
      • The rules on the role of zero entries, infinity entries, and non-stored values have been made clearer.
      • Masked arrays are now supported.
    • A mode parameter has been added to KNeighborsGraph; as in scikit-learn, it can be set to either "distance" or "connectivity" (#478).

    • List input is now accepted by all transformers in gtda.graphs, and outputs are consistently either lists or 3D arrays (#478).

    • Sparse matrices returned by KNeighborsGraph and TransitionGraph now have int dtype (0-1 adjacency matrices), and are necessarily symmetric (#478).

    Mapper

    • Pullback cover set labels and partial cluster labels have been added to Mapper node hovertexts (#445).

    • The functionality of Nerve and make_mapper_pipeline has been greatly extended (#447 and #456):

      • Node and edge metadata are now accessible in output igraph.Graph objects by means of the VertexSeq and EdgeSeq attributes vs and es (respectively). Graph-level dictionaries are no longer used.
      • Available node metadata can be accessed by graph.vs[attr_name] where attr_name is one of "pullback_set_label", "partial_cluster_label", or "node_elements".
      • Sizes of intersections are automatically stored as edge weights, accessible by graph.es["weight"].
      • A "store_intersections" keyword argument has been added to Nerve and make_mapper_pipeline to allow storing the indices defining node intersections as edge attributes, accessible via graph.es["edge_elements"].
      • A contract_nodes optional parameter has been added to both Nerve and make_mapper_pipeline; nodes which are subsets of other nodes are thrown away from the graph when this parameter is set to True.
      • A graph_ attribute is stored during Nerve.fit.
    • Two of the Nerve parameters (min_intersection and the new contract_nodes) are now available in the widgets generated by plot_interactive_mapper_graph, and the layout of these widgets has been improved (#456).

    • ParallelClustering and Nerve have been exposed in the documentation and in gtda.mapper's __init__ (#447).
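The roles of intersection sizes as edge weights, of min_intersection, and of contract_nodes can be illustrated with a plain-Python sketch. It does not use Nerve or igraph: `nerve_graph` is a hypothetical helper, nodes are represented as sets of sample indices, and "subsets" is interpreted here as proper subsets, which is an assumption.

```python
from itertools import combinations


def nerve_graph(node_elements, min_intersection=1, contract_nodes=False):
    """Hypothetical sketch of a Mapper nerve as plain Python data.

    node_elements: list of collections of sample indices, one per node.
    Returns (nodes, edges), where edges maps a pair of node indices to the
    size of the intersection of the two nodes (the edge weight).
    """
    nodes = [frozenset(elements) for elements in node_elements]
    if contract_nodes:
        # Discard nodes that are proper subsets of another node.
        nodes = [
            node for node in nodes
            if not any(node < other for other in nodes)
        ]
    edges = {}
    for (i, a), (j, b) in combinations(enumerate(nodes), 2):
        weight = len(a & b)
        if weight >= min_intersection:
            edges[(i, j)] = weight
    return nodes, edges
```

For example, with nodes {0, 1, 2}, {2, 3} and {3}, the third node is dropped when contract_nodes is True because it is contained in the second one.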

    Plotting

    • A plot_params kwarg is available in plotting functions and methods throughout to allow user customisability of output figures. The user must pass a dictionary with keys "layout" and/or "trace" (or "traces" in some cases) (#441).
    • Several plots produced by plot class methods now have default titles (#453).
    • Infinite deaths are now plotted by plot_diagrams (#461).
    • Possible multiplicities of persistence pairs in persistence diagram plots are now indicated in the hovertext (#454).
    • plot_heatmap now accepts boolean array input (#444).

    New tutorials and examples

    The following new tutorials have been added:

    • Topology of time series, which explains the theory of the Takens time-delay embedding and its use with persistent homology, demonstrates the new API of several components in gtda.time_series, and shows how to construct time series classification pipelines in giotto-tda by partially reproducing arXiv:1910.08245.
    • Topology in time series forecasting, which explains how to set up time series forecasting pipelines in giotto-tda via TransformerResamplerMixins and the giotto-tda Pipeline class.
    • Topological feature extraction from graphs, which explains what the features extracted from directed or undirected graphs by VietorisRipsPersistence, SparseRipsPersistence and FlagserPersistence are.
    • Classifying handwritten digits, which presents a fully-fledged machine learning pipeline in which cubical persistent homology is applied to the classification of handwritten images from the MNIST dataset, partially reproducing arXiv:1910.08345.

    Utils

    • A check_collection input validation function has been added (#491).
    • validate_params now accepts "in" and "of" keys simultaneously in the references dictionaries, with "in" used for non-list-like types and "of" otherwise (#502).

    Installation improvements

    • pybind11 is now treated as a standard git submodule in the developer installation (#459).
    • pandas is now part of the testing requirements when installing from source (#508).

    Bug Fixes

    • A bug has been fixed which could lead to features with negative lifetime in persistent homology transformers when infinity_values was set too low (#339).
    • By relying on scipy's shortest_path instead of scikit-learn's graph_shortest_path, some errors in computing GraphGeodesicDistance (e.g. when some edges are zero) have been fixed (#422).
    • A bug in the handling of COO matrices by the ripser interface has been fixed (#465).
    • A bug which led to the incorrect handling of the homology_dimensions parameter in Filtering has been fixed (#439).
    • An issue with the use of joblib.Parallel, which led to errors when attempting to run HeatKernel, PersistenceImage, and the corresponding amplitudes and distances on large datasets, has been fixed (#428 and #481).
    • A bug leading to plots of persistence diagrams not showing points with negative births or deaths has been fixed, as has a bug with the computation of the range to be shown in the plot (#437).
    • A bug in the handling of persistence pairs with negative death values by Filtering has been fixed (#436).
    • A bug in the handling of homology_dimension_ix (now renamed to homology_dimension_idx) in the plot methods of HeatKernel and PersistenceImage has been fixed (#452).
    • A bug in the labelling of axes in HeatKernel and PersistenceImage plots has been fixed (#453 and #454).
    • PersistenceLandscape plots now show all homology dimensions, instead of just the first (#454).
    • A bug in the computation of amplitudes and pairwise distances based on persistence images has been fixed (#454).
    • Silhouette now does not create NaNs when a subdiagram is trivial (#454).
    • CubicalPersistence now does not create pairs with negative persistence when infinity_values is set too low (#467).
    • Warnings are no longer thrown by KNeighborsGraph when metric="precomputed" (#506).
    • A bug in Labeller.resample affecting cases in which n_steps_future >= size - 1, has been fixed (#460).
    • A bug in validate_params, affecting the case of tuples of allowed types, has been fixed (#502).

    Backwards-Incompatible Changes

    • The minimum required versions of most of the dependencies have been bumped. The updated dependencies are numpy >= 1.19.1, scipy >= 1.5.0, joblib >= 0.16.0, scikit-learn >= 0.23.1, python-igraph >= 0.8.2, plotly >= 4.8.2, and pyflagser >= 0.4.1 (#457).
    • GraphGeodesicDistance now returns either lists or 3D dense ndarrays for compatibility with the homology transformers. By relying on scipy's shortest_path instead of scikit-learn's graph_shortest_path, some errors in computing GraphGeodesicDistance (e.g. when some edges are zero) have been fixed (#422).
    • The output of PairwiseDistance has been transposed to match scikit-learn convention (n_samples_transform, n_samples_fit) (#420).
    • plot class methods now return figures instead of showing them (#441).
    • Mapper node and edge attributes are no longer stored as graph-level dictionaries, "node_id" is no longer an available node attribute, and the attributes nodes_ and edges_ previously stored by Nerve.fit have been removed in favour of a graph_ attribute (#447).
    • The homology_dimension_ix parameter available in some transformers in gtda.diagrams has been renamed to homology_dimensions_idx (#452).
    • The base of the logarithm used by PersistenceEntropy is now 2 instead of e, and NaN values are replaced with -1 instead of 0 by default (#450 and #474).
    • The outputs of PersistenceImage, HeatKernel and of the pairwise distances and amplitudes based on them are now different due to the improvements described above.
    • Weights are no longer stored in the effective_metric_params_ attribute of PairwiseDistance, Amplitude and Scaler objects when the metric is persistence-image–based; only the weight function is (#454).
    • The homology_dimensions_ attributes of several transformers have been converted from lists to tuples. When possible, homology dimensions stored as parts of attributes are now presented as ints (#454).
    • gaussian_filter (used to make heat– and persistence-image–based representations/pairwise distances/amplitudes) is now called with mode="constant" instead of "reflect" (#454).
    • The default value of order in Amplitude has been changed from 2. to None, giving vector instead of scalar features (#454).
    • The meaning of the default None for weight_function in PersistenceImage (and in Amplitude and PairwiseDistance when metric="persistence_image") has been changed from the identity function to the function returning a vector of ones (#454).
    • Due to the updates in the GUDHI components, some of the bindings and Python interfaces to the GUDHI C++ components in gtda.externals have changed (#468).
    • Labeller.transform now returns a 1D array instead of a column array (#475).
    • PersistenceLandscape now returns 3D arrays instead of 4D ones, for compatibility with the new curves subpackage (#480).
    • By default, CubicalPersistence now removes one infinite bar in H0 (#467, and see above).
    • The former width parameter in SlidingWindow and Labeller has been replaced with a more intuitive size parameter. The relation between the two is: size = width + 1 (#460).
    • clusterer is now a required parameter in ParallelClustering (#508).
    • The max_fraction parameter in FirstSimpleGap and FirstHistogramGap now indicates the floor of max_fraction * n_samples; its default value has been changed from None to 1 (#412).
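    Three of the changes above are purely arithmetic, and the following sketch spells them out: the size = width + 1 relation, the rescaling of an entropy value when the logarithm base changes from e to 2, and the floor of max_fraction * n_samples. All function names are hypothetical helpers written for illustration; none of them is part of giotto-tda, and the entropy helper only assumes the usual Shannon entropy of normalised positive numbers.

```python
import math


def width_to_size(width):
    """Translate a former `width` value into the new `size` (size = width + 1)."""
    return width + 1


def shannon_entropy(lifetimes, base=2.0):
    """Shannon entropy of positive numbers after normalising them to sum to 1."""
    total = sum(lifetimes)
    probabilities = [value / total for value in lifetimes]
    return -sum(p * math.log(p) for p in probabilities if p > 0) / math.log(base)


def entropy_nats_to_bits(entropy_nats):
    """Convert an entropy computed with the natural logarithm to base 2."""
    return entropy_nats / math.log(2)


def floor_of_fraction(max_fraction, n_samples):
    """Floor of max_fraction * n_samples, as in the note on the gap clusterers."""
    return math.floor(max_fraction * n_samples)
```

    Under these assumptions, a value produced with the old base-e convention can be compared with a new one by dividing it by ln 2. The change in NaN replacement (from 0 to -1) is a separate matter that this sketch does not cover.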

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Guillaume Tauzin, Julian Burella Pérez, Wojciech Reise, Lewis Tunstall, Nick Sale, and Anibal Medina-Mardones.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.3.0-cp36-cp36m-macosx_10_14_x86_64.whl(1.15 MB)
    giotto_tda-0.3.0-cp36-cp36m-manylinux2010_x86_64.whl(1.43 MB)
    giotto_tda-0.3.0-cp36-cp36m-win_amd64.whl(1.23 MB)
    giotto_tda-0.3.0-cp37-cp37m-macosx_10_14_x86_64.whl(1.15 MB)
    giotto_tda-0.3.0-cp37-cp37m-manylinux2010_x86_64.whl(1.43 MB)
    giotto_tda-0.3.0-cp37-cp37m-win_amd64.whl(1.23 MB)
    giotto_tda-0.3.0-cp38-cp38-macosx_10_14_x86_64.whl(1.16 MB)
    giotto_tda-0.3.0-cp38-cp38-manylinux2010_x86_64.whl(1.43 MB)
    giotto_tda-0.3.0-cp38-cp38-win_amd64.whl(1.21 MB)
  • v0.2.2(Jun 2, 2020)

    Major Features and Improvements

    • The documentation for gtda.mapper.utils.decorators.method_to_transform has been improved.

    • A table of contents has been added to the theory glossary.

    • The theory glossary has been restructured by including a section titled "Analysis". Entries for l^p norms, L^p norms and heat vectorization have been added.

    • The project's Azure CI for Windows versions has been sped up by ensuring that the locally installed boost version is detected.

    • Several python bindings to external code from GUDHI, ripser.py and Hera have been made public: specifically, from gtda.externals import * now gives power users access to:

      • bottleneck_distance,
      • wasserstein_distance,
      • ripser,
      • SparseRipsComplex,
      • CechComplex,
      • CubicalComplex,
      • PeriodicCubicalComplex,
      • SimplexTree,
      • WitnessComplex,
      • StrongWitnessComplex.

      However, these functionalities are still undocumented.

    • The gtda.mapper.visualisation and gtda.mapper.utils._visualisation modules have been thoroughly refactored to improve code clarity, add functionality, change behaviour and fix bugs. Specifically, in figures generated by both plot_static_mapper_graph and plot_interactive_mapper_graph:

      • The colorbar no longer shows values rescaled to the interval [0, 1]. Instead, it always shows the true range of node summary statistics.
      • The values of the node summary statistics are now displayed in the hovertext boxes. A new keyword argument n_sig_figs controls their rounding (3 is the default).
      • plotly_kwargs has been renamed to plotly_params (see "Backwards-Incompatible Changes" below).
      • The dependency on matplotlib's rgb2hex and get_cmap functions has been removed. As no other component in giotto-tda required matplotlib, the dependency on this library has been removed completely.
      • A node_scale keyword argument has been added which can be used to control the size of nodes (see "Backwards-Incompatible Changes" below).
      • The overall look of Mapper graphs has been improved by increasing the opacity of node colors so that edges do not hide them, and by reducing the thickness of marker lines.

      Furthermore, a clone_pipeline keyword argument has been added to plot_interactive_mapper_graph, which when set to False allows the user to mutate the input pipeline via the interactive widget.
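    To make the colorbar and hovertext changes concrete, here is a minimal sketch of the two operations involved: rounding a value to a given number of significant figures, and the min-max rescaling to [0, 1] that the colorbar no longer displays. Both helpers are hypothetical and written only for illustration; whether giotto-tda rounds in exactly this way is an assumption.

```python
def round_sig(value, n_sig_figs=3):
    """Round `value` to `n_sig_figs` significant figures."""
    return float(f"{value:.{n_sig_figs}g}")


def rescale_to_unit_interval(values):
    """Min-max rescaling of `values` to the interval [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant list has no spread, so map everything to 0.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]
```

    Showing the raw values with limited significant figures keeps the hovertext readable without discarding the true range, which a [0, 1] rescaling would hide.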

    • The docstrings of plot_static_mapper_graph, plot_interactive_mapper_graph and make_mapper_pipeline have been improved.

    Bug Fixes

    • A CI bug introduced by an update to the XCode compiler installed on the Azure Mac machines has been fixed.
    • A bug afflicting Mapper colors, which was due to an incorrect rescaling to [0, 1], has been fixed.

    Backwards-Incompatible Changes

    • The keyword parameter plotly_kwargs in plot_static_mapper_graph and plot_interactive_mapper_graph has been renamed to plotly_params and now has slightly different specifications. A new logic controls how the information contained in plotly_params is used to update plotly figures.
    • The function get_node_sizeref in gtda.mapper.utils.visualization has been hidden by renaming it to _get_node_sizeref. Its main intended use is subsumed by the new node_scale parameter of plot_static_mapper_graph and plot_interactive_mapper_graph.

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Julian Burella Pérez, Anibal Medina-Mardones, Wojciech Reise and Guillaume Tauzin.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.2.2-cp36-cp36m-macosx_10_14_x86_64.whl(1.01 MB)
    giotto_tda-0.2.2-cp36-cp36m-manylinux2010_x86_64.whl(1.31 MB)
    giotto_tda-0.2.2-cp36-cp36m-win_amd64.whl(1.12 MB)
    giotto_tda-0.2.2-cp37-cp37m-macosx_10_14_x86_64.whl(1.02 MB)
    giotto_tda-0.2.2-cp37-cp37m-manylinux2010_x86_64.whl(1.31 MB)
    giotto_tda-0.2.2-cp37-cp37m-win_amd64.whl(1.12 MB)
    giotto_tda-0.2.2-cp38-cp38-macosx_10_14_x86_64.whl(1.02 MB)
    giotto_tda-0.2.2-cp38-cp38-manylinux2010_x86_64.whl(1.30 MB)
    giotto_tda-0.2.2-cp38-cp38-win_amd64.whl(1.10 MB)
  • v0.2.1(Apr 8, 2020)

    Major Features and Improvements

    • The theory glossary has been improved to include the notions of vectorization, kernel and amplitude for persistence diagrams.
    • The ripser function in gtda.externals.python.ripser_interface no longer uses scikit-learn's pairwise_distances when metric is 'precomputed', thus allowing square arrays with negative entries or infinities to be passed.
    • check_point_clouds in gtda.utils.validation now checks for square array input when the input should be a collection of distance-type matrices. Warnings guide the user to correctly setting the distance_matrices parameter. force_all_finite=False no longer means accepting NaN input (only infinite input is accepted).
    • VietorisRipsPersistence in gtda.homology.simplicial no longer masks out infinite entries in the input to be fed to ripser.
    • The docstrings for check_point_clouds and VietorisRipsPersistence have been improved to reflect these changes and the extra level of generality for ripser.
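    The validation rules described above (square input, infinities and negative entries allowed, NaN rejected) can be sketched in plain Python as follows. The function name and its allow_infinite flag are hypothetical; this is not the code of check_point_clouds, only an illustration of the stated rules on a matrix given as a list of rows.

```python
import math


def is_acceptable_distance_matrix(matrix, allow_infinite=True):
    """Return True if `matrix` is square, NaN-free and (optionally) may hold infinities."""
    n_rows = len(matrix)
    # Distance-type input must be square.
    if any(len(row) != n_rows for row in matrix):
        return False
    for row in matrix:
        for entry in row:
            # NaN is never accepted, whatever the finiteness setting.
            if math.isnan(entry):
                return False
            # Infinite entries are accepted only when explicitly allowed.
            if math.isinf(entry) and not allow_infinite:
                return False
    return True
```

    Negative entries pass this check on purpose, mirroring the note that square arrays with negative entries or infinities can now be passed when the metric is 'precomputed'.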

    Bug Fixes

    • The variable used to indicate the location of Boost headers has been renamed from Boost_INCLUDE_DIR to Boost_INCLUDE_DIRS to address developer installation issues in some Linux systems.

    Backwards-Incompatible Changes

    • The keyword parameter distance_matrix in check_point_clouds has been renamed to distance_matrices.

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Anibal Medina-Mardones, Julian Burella Pérez, Guillaume Tauzin, and Wojciech Reise.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.2.1-cp36-cp36m-macosx_10_14_x86_64.whl(1.01 MB)
    giotto_tda-0.2.1-cp36-cp36m-manylinux2010_x86_64.whl(1.30 MB)
    giotto_tda-0.2.1-cp36-cp36m-win_amd64.whl(1.11 MB)
    giotto_tda-0.2.1-cp37-cp37m-macosx_10_14_x86_64.whl(1.01 MB)
    giotto_tda-0.2.1-cp37-cp37m-manylinux2010_x86_64.whl(1.31 MB)
    giotto_tda-0.2.1-cp37-cp37m-win_amd64.whl(1.11 MB)
    giotto_tda-0.2.1-cp38-cp38-macosx_10_14_x86_64.whl(1.02 MB)
    giotto_tda-0.2.1-cp38-cp38-manylinux2010_x86_64.whl(1.30 MB)
    giotto_tda-0.2.1-cp38-cp38-win_amd64.whl(1.10 MB)
  • v0.2.0(Mar 23, 2020)

    Major Features and Improvements

    This is a major release which substantially broadens the scope of giotto-tda and introduces several improvements.

    The library's documentation has been greatly improved and is now hosted via GitHub pages. It includes rendered jupyter notebooks from the repository's examples folder, as well as an improved theory glossary, more detailed installation instructions, improved guidelines for contributing, and an FAQ.

    Plotting functions and plotting API

    This version introduces built-in plotting capabilities to giotto-tda. These come in the form of:

    • a new plotting subpackage populated with plotting functions for common data structures;
    • a new PlotterMixin and a class-level plotting API based on newly introduced plot, transform_plot and fit_transform_plot methods which are now available in several of giotto-tda's transformers.
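    As a rough illustration of how a class-level plotting API of this kind can be layered on top of fit and transform, the sketch below defines a toy mixin. ToyPlotterMixin and ToyScaler are hypothetical, the sample and plot_params arguments are assumptions, and printing stands in for producing a figure; this is not giotto-tda's PlotterMixin.

```python
class ToyPlotterMixin:
    """Hypothetical mixin adding plot-aware variants of transform and fit_transform."""

    def transform_plot(self, X, sample=0, **plot_params):
        # Transform, plot one sample of the result, and return the result.
        Xt = self.transform(X)
        self.plot(Xt, sample=sample, **plot_params)
        return Xt

    def fit_transform_plot(self, X, y=None, sample=0, **plot_params):
        # Same as above, but fitting on X first.
        Xt = self.fit(X, y).transform(X)
        self.plot(Xt, sample=sample, **plot_params)
        return Xt


class ToyScaler(ToyPlotterMixin):
    """Toy transformer dividing every entry by the largest absolute value seen in fit."""

    def fit(self, X, y=None):
        self.scale_ = max(abs(value) for row in X for value in row) or 1.0
        return self

    def transform(self, X):
        return [[value / self.scale_ for value in row] for row in X]

    @staticmethod
    def plot(Xt, sample=0, **plot_params):
        # A real implementation would build a figure; text keeps the sketch dependency-free.
        print(f"sample {sample}: {Xt[sample]}")
```

    The point of the design is that any transformer providing transform and plot obtains the combined methods for free by inheriting from the mixin.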

    Changes and additions to gtda.homology

    The internal structure of this subpackage has been changed. ConsistentRescaling has been moved to a new point_clouds subpackage (see below), and gtda.homology no longer contains a point_clouds submodule. Instead, it contains two submodules, simplicial and cubical. simplicial contains the VietorisRipsPersistence class as well as the following new classes:

    • SparseRipsPersistence,
    • EuclideanCechPersistence.

    The cubical submodule contains CubicalPersistence, a new class for computing persistent homology of filtered cubical complexes such as those coming from 2D or 3D greyscale images.

    New images subpackage

    The new gtda.images subpackage contains classes which, together with gtda.homology.CubicalPersistence, extend the capabilities of giotto-tda to computer vision by handling binary or greyscale 2D/3D images represented as arrays.

    The classes in gtda.images.filtrations are responsible for converting binary image input into greyscale images in a variety of ways. The greyscale output can then be fed to gtda.homology.CubicalPersistence to extract topological signatures in the form of persistence diagrams. These classes are:

    • HeightFiltration,
    • RadialFiltration,
    • DilationFiltration,
    • ErosionFiltration,
    • SignedDistanceFiltration.
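    To give a feel for how a binary image can be turned into a greyscale one, here is a simplified sketch in the spirit of a height filtration: each activated pixel receives its height along a chosen direction, and every deactivated pixel receives a value larger than any height. The function name, the (row, column) direction convention and the "maximum height plus one" background rule are assumptions made for illustration; this is not the implementation of HeightFiltration.

```python
import math


def height_filtration(binary_image, direction=(1.0, 0.0)):
    """Greyscale image from a 2D binary one, using heights along `direction`."""
    norm = math.hypot(*direction)
    d_row, d_col = direction[0] / norm, direction[1] / norm
    n_rows, n_cols = len(binary_image), len(binary_image[0])
    # Height of every pixel: projection of its (row, column) position on the direction.
    heights = [[i * d_row + j * d_col for j in range(n_cols)] for i in range(n_rows)]
    lowest = min(min(row) for row in heights)
    highest = max(max(row) for row in heights)
    # Deactivated pixels get a value strictly above every possible height.
    background = (highest - lowest) + 1.0
    return [
        [heights[i][j] - lowest if binary_image[i][j] else background
         for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

    With direction (1.0, 0.0), activated pixels in the first row get value 0 and values grow row by row, so a sublevel-set filtration of the output sweeps the shape from top to bottom.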

    The classes in gtda.images.preprocessing perform a variety of preprocessing steps on either binary or greyscale image input, as well as conversion to point cloud format. They are:

    • Binarizer,
    • Inverter,
    • Padder,
    • ImageToPointCloud.

    New point_clouds subpackage

    ConsistentRescaling is no longer placed in gtda.homology. Instead, it is now in a point_clouds subpackage containing classes which process or modify the geometry of point cloud data. gtda.point_clouds also contains the new class ConsecutiveRescaling, written with time series applications in mind.

    List of point cloud input

    All classes in the homology subpackage (VietorisRipsPersistence, SparseRipsPersistence, and EuclideanCechPersistence) can now take as inputs to the fit and transform methods lists of 2D arrays instead of simply 3D arrays. In this way, collections of point clouds with varying numbers of points can be processed.

    Changes and additions to gtda.diagrams

    The diagrams subpackage contains the following new classes:

    • PersistenceImage
    • Silhouette

    Additionally, the subpackage has been reorganised as follows:

    • The features submodule now only contains the scalar feature generation classes Amplitude (moved there from distance) and PersistenceEntropy.
    • Classes which produce vector representations from persistence diagrams have been moved to the new representations submodule.

    Changes and additions to gtda.utils

    • validate_params has been thoroughly refactored, documented and exposed for the benefit of developers.
    • check_diagrams has been modified, documented and exposed for the benefit of developers.
    • The new check_point_clouds performs validation of inputs consisting of collections of point clouds or distance matrices. It accepts both lists of 2D ndarrays and 3D ndarrays, and is used in the fit and transform methods of classes in gtda.homology.simplicial to allow for list input (see above).

    External modules and HPC improvements

    A substantial effort has been put into improving the quality of the high-performance components contained in gtda.externals. The end result is a cleaner packaging as well as faster execution of C++ functions due to improved bindings. In particular:

    • Two binaries are now shipped for ripser, one of them being optimised for calculations with mod 2 coefficients.
    • Recent improvements by the authors of the hera C++ library have been integrated in giotto-tda.
    • Compiler optimisations for Windows-based systems have been added.
    • The integration of pybind11 has been improved and several issues arising with CMake and boost during developer installations have been addressed.

    Bug Fixes

    • Fixed a bug with TakensEmbedding's algorithm for search of optimal parameters.
    • Inconsistencies between the meaning of "bottleneck amplitude" in the theory and in the code have been ironed out. The code has been modified to agree with the theory glossary. The outputs of the gtda.diagrams classes Amplitude, Scaler and Filtering are affected.
    • Fixed bugs affecting color normalization in Mapper graph plots.

    Backwards-Incompatible Changes

    • Python 3.5 is no longer supported.
    • Mac OS X versions below 10.14 are no longer supported by the wheels shipped via PyPI.
    • ConsistentRescaling is no longer found in gtda.homology and is now part of gtda.point_clouds.
    • The outputs of the gtda.diagrams classes Amplitude, Scaler and Filtering have changed due to sqrt(2) factors (see Bug Fixes).
    • The meta_transformers module has been removed.
    • The plotting module has been removed from the examples folder of the repository.

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Guillaume Tauzin, Wojciech Reise, Julian Burella Pérez, Roman Yurchak, Lewis Tunstall, Anibal Medina-Mardones, and Adélie Garin.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.2.0-cp36-cp36m-macosx_10_14_x86_64.whl(1.01 MB)
    giotto_tda-0.2.0-cp36-cp36m-manylinux2010_x86_64.whl(1.30 MB)
    giotto_tda-0.2.0-cp36-cp36m-win_amd64.whl(1.11 MB)
    giotto_tda-0.2.0-cp37-cp37m-macosx_10_14_x86_64.whl(1.01 MB)
    giotto_tda-0.2.0-cp37-cp37m-manylinux2010_x86_64.whl(1.30 MB)
    giotto_tda-0.2.0-cp37-cp37m-win_amd64.whl(1.11 MB)
    giotto_tda-0.2.0-cp38-cp38-macosx_10_14_x86_64.whl(1.01 MB)
    giotto_tda-0.2.0-cp38-cp38-manylinux2010_x86_64.whl(1.29 MB)
    giotto_tda-0.2.0-cp38-cp38-win_amd64.whl(1.09 MB)
  • v0.1.4(Jan 24, 2020)

    Library name change

    The library and GitHub repository have been renamed to giotto-tda! While the new name is meant to better convey the library's focus on Topology-powered machine learning and Data Analysis, the commitment to seamless integration with scikit-learn will remain just as strong and a defining feature of the project. Concurrently, the main module has been renamed from giotto to gtda in this version. giotto-learn will remain on PyPI as a legacy package (stuck at v0.1.3) until we have ensured that users and developers have fully migrated. The new PyPI package giotto-tda will start at v0.1.4 for project continuity.

    Short summary: install via

        pip install -U giotto-tda
    

    and import gtda in your scripts or notebooks!

    Change of license

    The license changes from Apache 2.0 to GNU AGPLv3 from this release on.

    Major Features and Improvements

    • Added a mapper submodule implementing the Mapper algorithm of Singh, Mémoli and Carlsson. The main tools are the functions make_mapper_pipeline, plot_static_mapper_graph and plot_interactive_mapper_graph.

      The first creates an object of class MapperPipeline which can be fit-transformed to data to create a Mapper graph in the form of an igraph.Graph object (see below). The MapperPipeline class itself is a simple subclass of scikit-learn's Pipeline which is adapted to the precise structure of the Mapper algorithm, so that a MapperPipeline object can be used as part of even larger scikit-learn pipelines, inside a meta-estimator, in a grid search, etc. One also has access to other important features of scikit-learn's Pipeline, such as memory caching to avoid unnecessary recomputation of early steps when parameters involved in later steps are changed. The clustering step can be parallelised over the pullback cover sets via joblib -- though this can actually lower performance in small- and medium-size datasets. A range of pre-defined filter functions are also included, as well as covers in one and several dimensions, agglomerative clustering algorithms based on stopping rules to create flat cuts, and utilities for making transformers out of callables or out of other classes which have no transform method.

      plot_static_mapper_graph allows the user to visualise (in 2D or 3D) the Mapper graph arising from fit-transforming a MapperPipeline to data, and offers a range of colouring options to correlate the graph's structure with exogenous or endogenous information. It relies on plotly for plotting and displaying metadata.

      plot_interactive_mapper_graph adds interactivity to this, via ipywidgets: specifically, the user can fine-tune some parameters involved in the definition of the Mapper pipeline, and observe in real time how the structure of the graph changes as a result. In this release, all hyperparameters involved in the covering and clustering steps are supported. The ability to fine-tune other hyperparameters will be considered for future versions.
    • Added support for Python 3.8.
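    The Mapper construction described in the first item above (filter, cover, clustering on pullback cover sets, then a graph recording overlaps) can be illustrated with a toy, dependency-free sketch. Everything here is hypothetical: the function names, the uniform interval cover, and the threshold-based single-linkage clustering are simple stand-ins chosen for illustration, and the result is a pair of plain lists rather than an igraph.Graph. It is not how giotto-tda implements Mapper.

```python
import math
from itertools import combinations


def interval_cover(values, n_intervals, overlap_frac):
    """Cover [min, max] of `values` with equal-length, overlapping intervals."""
    lowest, highest = min(values), max(values)
    length = (highest - lowest) / n_intervals
    pad = length * overlap_frac / 2
    return [(lowest + k * length - pad, lowest + (k + 1) * length + pad)
            for k in range(n_intervals)]


def cluster_by_threshold(points, indices, threshold):
    """Connected components of the graph joining points at distance <= threshold."""
    clusters, unvisited = [], set(indices)
    while unvisited:
        stack = [unvisited.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= threshold}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def toy_mapper(points, filter_fn, n_intervals=3, overlap_frac=0.5, threshold=1.0):
    """Return (nodes, edges): nodes are sets of point indices, edges join overlapping nodes."""
    filter_values = [filter_fn(point) for point in points]
    nodes = []
    for start, end in interval_cover(filter_values, n_intervals, overlap_frac):
        # Pullback cover set: points whose filter value falls in this interval.
        pullback = [i for i, value in enumerate(filter_values) if start <= value <= end]
        nodes.extend(cluster_by_threshold(points, pullback, threshold))
    # Two nodes are linked when their clusters share at least one point.
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges
```

    For seven equally spaced points on a line, filtered by their first coordinate, this yields three overlapping clusters joined in a path, which is the expected Mapper summary of a line segment.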

    Bug Fixes

    • Fixed consistently incorrect documentation for the fit_transform methods. This has been achieved by introducing a class decorator adapt_fit_transform_docs which is defined in the newly introduced gtda.utils._docs.py.

    Backwards-Incompatible Changes

    • The library name change and the change in the name of the main module giotto are important major changes.
    • There are now additional dependencies on the python-igraph, matplotlib, plotly, and ipywidgets libraries.

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Lewis Tunstall, Guillaume Tauzin, Philipp Weiler, Julian Burella Pérez.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions. In particular, we would like to thank Martino Milani, who worked on an early prototype of a Mapper implementation; although very different from the current one, it adopted an early form of caching to avoid recomputation in refitting, which was an inspiration for this implementation.

    Source code(tar.gz)
    Source code(zip)
    giotto_tda-0.1.4-cp35-cp35m-macosx_10_13_x86_64.whl(915.95 KB)
    giotto_tda-0.1.4-cp35-cp35m-manylinux2010_x86_64.whl(1.39 MB)
    giotto_tda-0.1.4-cp35-cp35m-win_amd64.whl(1006.04 KB)
    giotto_tda-0.1.4-cp36-cp36m-macosx_10_13_x86_64.whl(915.96 KB)
    giotto_tda-0.1.4-cp36-cp36m-manylinux2010_x86_64.whl(1.40 MB)
    giotto_tda-0.1.4-cp36-cp36m-win_amd64.whl(1006.01 KB)
    giotto_tda-0.1.4-cp37-cp37m-macosx_10_13_x86_64.whl(916.40 KB)
    giotto_tda-0.1.4-cp37-cp37m-manylinux2010_x86_64.whl(1.39 MB)
    giotto_tda-0.1.4-cp37-cp37m-win_amd64.whl(1006.08 KB)
    giotto_tda-0.1.4-cp38-cp38-macosx_10_13_x86_64.whl(923.05 KB)
    giotto_tda-0.1.4-cp38-cp38-manylinux2010_x86_64.whl(1.40 MB)
    giotto_tda-0.1.4-cp38-cp38-win_amd64.whl(994.43 KB)
  • v0.1.3(Nov 8, 2019)

    Major Features and Improvements

    None

    Bug Fixes

    • Fixed a bug in diagrams.Amplitude causing the transformed array to be wrongly filled, and added an adequate test.

    Backwards-Incompatible Changes

    None.

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_learn-0.1.3-cp35-cp35m-linux_x86_64.whl(836.89 KB)
    giotto_learn-0.1.3-cp35-cp35m-macosx_10_13_x86_64.whl(575.16 KB)
    giotto_learn-0.1.3-cp35-cp35m-win_amd64.whl(610.90 KB)
    giotto_learn-0.1.3-cp36-cp36m-linux_x86_64.whl(836.88 KB)
    giotto_learn-0.1.3-cp36-cp36m-macosx_10_13_x86_64.whl(575.16 KB)
    giotto_learn-0.1.3-cp36-cp36m-win_amd64.whl(610.91 KB)
    giotto_learn-0.1.3-cp37-cp37m-linux_x86_64.whl(837.28 KB)
    giotto_learn-0.1.3-cp37-cp37m-macosx_10_13_x86_64.whl(575.26 KB)
    giotto_learn-0.1.3-cp37-cp37m-win_amd64.whl(610.93 KB)
  • v0.1.2(Nov 5, 2019)

    Major Features and Improvements

    • Added support for Python 3.5.

    Bug Fixes

    None.

    Backwards-Incompatible Changes

    None.

    Thanks to our Contributors

    This release contains contributions from many people:

    Matteo Caorsi, Henry Tom (@henrytomsf), Guillaume Tauzin.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_learn-0.1.2-cp35-cp35m-linux_x86_64.whl(834.92 KB)
    giotto_learn-0.1.2-cp35-cp35m-macosx_10_14_x86_64.whl(573.40 KB)
    giotto_learn-0.1.2-cp35-cp35m-win_amd64.whl(608.58 KB)
    giotto_learn-0.1.2-cp36-cp36m-linux_x86_64.whl(834.92 KB)
    giotto_learn-0.1.2-cp36-cp36m-macosx_10_14_x86_64.whl(573.40 KB)
    giotto_learn-0.1.2-cp36-cp36m-win_amd64.whl(608.58 KB)
    giotto_learn-0.1.2-cp37-cp37m-linux_x86_64.whl(835.23 KB)
    giotto_learn-0.1.2-cp37-cp37m-macosx_10_13_x86_64.whl(573.51 KB)
    giotto_learn-0.1.2-cp37-cp37m-win_amd64.whl(608.61 KB)
  • v0.1.1(Oct 21, 2019)

    Major Features and Improvements

    • Improved documentation.
    • Improved features of class Labeller.
    • Improved features of class PearsonDissimilarities.
    • Improved GitHub files.
    • Improved CI.

    Bug Fixes

    Fixed minor bugs from the first release.

    Backwards-Incompatible Changes

    The following class was renamed:

    • class PearsonCorrelation was renamed to class PearsonDissimilarities

    Thanks to our Contributors

    This release contains contributions from many people:

    Umberto Lupo, Guillaume Tauzin, Matteo Caorsi, Olivier Morel.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_learn-0.1.1-cp36-cp36m-linux_x86_64.whl(834.53 KB)
    giotto_learn-0.1.1-cp36-cp36m-macosx_10_14_x86_64.whl(574.27 KB)
    giotto_learn-0.1.1-cp36-cp36m-win_amd64.whl(608.23 KB)
    giotto_learn-0.1.1-cp37-cp37m-linux_x86_64.whl(834.89 KB)
    giotto_learn-0.1.1-cp37-cp37m-macosx_10_14_x86_64.whl(574.25 KB)
    giotto_learn-0.1.1-cp37-cp37m-win_amd64.whl(608.29 KB)
  • v0.1.0(Oct 18, 2019)

    Major Features and Improvements

    The following submodules were added:

    • giotto.homology implements transformers to modify metric spaces or generate persistence diagrams.
    • giotto.diagrams implements transformers to preprocess persistence diagrams or extract features from them.
    • giotto.time_series implements transformers to preprocess time series or embed them in a higher dimensional space for persistent homology.
    • giotto.graphs implements transformers to create graphs or extract metric spaces from graphs.
    • giotto.meta_transformers implements convenience giotto.Pipeline transformers for direct topological feature generation.
    • giotto.utils implements hyperparameters and input validation functions.
    • giotto.base implements a TransformerResamplerMixin for transformers that have a resample method.
    • giotto.pipeline extends scikit-learn's module by defining Pipelines that include TransformerResamplers.
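    As an illustration of what embedding a time series in a higher-dimensional space can look like, the sketch below implements a plain time-delay embedding, a standard construction of this kind. The function name and the dimension and time_delay parameters are hypothetical; the sketch does not reproduce any class from giotto.time_series.

```python
def time_delay_embedding(series, dimension=3, time_delay=1):
    """Map a scalar series to points (x[t], x[t + tau], ..., x[t + (d - 1) * tau])."""
    n_points = len(series) - (dimension - 1) * time_delay
    if n_points <= 0:
        raise ValueError("Series too short for this dimension and time delay.")
    return [
        tuple(series[t + k * time_delay] for k in range(dimension))
        for t in range(n_points)
    ]
```

    The resulting list of points is a point cloud, which is the kind of input from which persistence diagrams are then computed.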

    Bug Fixes

    Backwards-Incompatible Changes

    Thanks to our Contributors

    This release contains contributions from many people:

    Guillaume Tauzin, Umberto Lupo, Philippe Nguyen, Matteo Caorsi, Julian Burella Pérez, Alessio Ghiraldello.

    We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.

    Source code(tar.gz)
    Source code(zip)
    giotto_learn-0.1.0-cp36-cp36m-linux_x86_64.whl(834.54 KB)
    giotto_learn-0.1.0-cp36-cp36m-win_amd64.whl(608.23 KB)
    giotto_learn-0.1.0-cp37-cp37m-linux_x86_64.whl(834.89 KB)
    giotto_learn-0.1.0-cp37-cp37m-win_amd64.whl(608.30 KB)
Owner
giotto.ai
Adding a third dimension to AI