Profile and test to gain insights into the performance of your beautiful Python code

View Demo - Report Bug - Request Feature


QuickPotato in a nutshell

QuickPotato is a Python library that aims to make it easier to rapidly profile your software and produce powerful code visualizations that enable you to quickly investigate where potential performance bottlenecks are hidden.

QuickPotato also aims to provide a path for adding an automated performance testing angle to your regular unit tests or test-driven development test cases, allowing you to test your code early in the development life cycle in a simple, reliable, and fast way.

Installation

Install using pip or download the source code from GitHub.

pip install QuickPotato

Generating Flame Graphs

Example of a Python flame graph

How to interpret the Flame Graphs generated by QuickPotato together with d3-flame-graph:

  • Each box is a function in the stack.
  • The y-axis shows the stack depth; the top box shows what was on-CPU.
  • The x-axis does not show time but spans the population and is ordered alphabetically.
  • The width of a box shows how long it was on-CPU or was part of a parent function that was on-CPU.

If you are unfamiliar with Flame Graphs, the best place to read about them is Brendan Gregg's website.

You can generate a Python flame graph with QuickPotato in the following way:

from examples.example_code import FancyCode
from QuickPotato import performance_test as pt
from QuickPotato.statistical.visualizations import FlameGraph

# Create a test case
pt.test_case_name = "FlameGraph"

pt.measure_method_performance(
    method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
    arguments=["joey hendricks"],  # <-- Your arguments go here.
    iteration=10,  # <-- The number of times you want to execute this method.
    pacing=0  # <-- How many seconds you want to wait between iterations.
)

# Generate the flame graph visualizations to analyse your code performance.
FlameGraph(pt.test_case_name, test_id=pt.current_test_id).export("C:\\temp\\")

Generating Heatmaps (Beta)

Example of a Python heatmap

How a D3 heatmap generated by QuickPotato works:

  • Every box in the heatmap is a function.
  • The y-axis is made up of functions ordered by their latency.
  • The x-axis spans the number of samples (one sample is one execution of your code) and is separated into columns by test id (one test id is one completely executed test).
  • The color shows the speed of the function: the redder a box is, the more time was spent in it.
  • All boxes are clickable and will give you information about that particular function.

You can generate a Python heatmap with QuickPotato in the following way:

from examples.example_code import FancyCode
from QuickPotato import performance_test as pt
from QuickPotato.statistical.visualizations import HeatMap

# Create a test case
pt.test_case_name = "Heatmap"

# Attach the method you want to performance test
pt.measure_method_performance(
    method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
    arguments=["joey hendricks"],  # <-- Your arguments go here.
    iteration=10,  # <-- The number of times you want to execute this method.
    pacing=0  # <-- How many seconds you want to wait between iterations.
)

# Generate the heatmap visualizations to analyse your code performance.
HeatMap(pt.test_case_name, test_ids=[pt.current_test_id, pt.previous_test_id]).export("C:\\temp\\")

This D3 visualization is still being tweaked and improved; if you encounter any problem with it, please open an issue. (Your feedback is appreciated!)

Generating a CSV file

Example of a csv file

You can generate a CSV export in the following way:

from examples.example_code import FancyCode
from QuickPotato import performance_test as pt
from QuickPotato.statistical.visualizations import CsvFile

# Create a test case
pt.test_case_name = "exporting to csv"

# Attach the method you want to performance test
pt.measure_method_performance(
    method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
    arguments=["joey hendricks"],  # <-- Your arguments go here.
    iteration=10,  # <-- The number of times you want to execute this method.
    pacing=0  # <-- How many seconds you want to wait between iterations.
)

# Export the samples into a csv file for further analysis
CsvFile(pt.test_case_name, test_id=pt.current_test_id).export("C:\\temp\\")

Generating a Bar Chart

Example of a bar chart

How to interpret the bar chart generated by QuickPotato:

  • Each color is a method executed in your performance test.
  • The graph is ordered by latency from slowest to fastest (this can be disabled).
  • The y-axis is the time spent per method.
  • The x-axis is made up of samples and divided per test id.
  • You can exclude a method by clicking the method name in the legend. Also, by double clicking a method name you can deselect all other methods.
  • On the top right-hand side Plotly's control bar can be found to further interact with the graph.

You can generate a simple interactive bar chart in the following way:

from examples.example_code import FancyCode
from QuickPotato import performance_test as pt
from QuickPotato.statistical.visualizations import BarChart

# Create a test case
pt.test_case_name = "bar chart"

# Attach the method you want to performance test
pt.measure_method_performance(
    method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
    arguments=["joey hendricks"],  # <-- Your arguments go here.
    iteration=10,  # <-- The number of times you want to execute this method.
    pacing=0  # <-- How many seconds you want to wait between iterations.
)

# Generate visualizations to analyse your code.
BarChart(pt.test_case_name, test_ids=[pt.current_test_id, pt.previous_test_id]).export("C:\\temp\\")

Boundary testing

Within QuickPotato, it is possible to create a performance test that validates whether your code breaches any defined boundary. An example of this sort of test can be found in the snippet below:

from QuickPotato import performance_test as pt
from examples.example_code import FancyCode

# Create a test case
pt.test_case_name = "test_performance"  # <-- Define test case name

# Defining the boundaries
pt.max_and_min_boundary_for_average = {"max": 1, "min": 0.001}

# Execute your code in a non-intrusive way
pt.measure_method_performance(
    method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
    arguments=["joey hendricks"],  # <-- Your arguments go here.
    iteration=10,  # <-- The number of times you want to execute this method.
    pacing=0  # <-- How many seconds you want to wait between iterations.
)

# Check the benchmark against the defined boundaries (True if no boundary is breached, otherwise False)
results = pt.verify_benchmark_against_set_boundaries()

Regression testing

It is also possible to verify that there is no regression between the current benchmark and a previous baseline. The method for creating such a test can also be found in the snippet below:

from QuickPotato import performance_test as pt
from QuickPotato.configuration.management import options
from examples.example_code import FancyCode

# Disabling this setting will filter untested or failed test ids out of your baseline selection.
options.enable_the_selection_of_untested_or_failed_test_ids = False

# Create a test case
pt.test_case_name = "test_performance"  # <-- Define test case name

# Execute your code in a non-intrusive way
pt.measure_method_performance(
  method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
  arguments=["joey hendricks"],  # <-- Your arguments go here.
  iteration=10,  # <-- The number of times you want to execute this method.
  pacing=0  # <-- How many seconds you want to wait between iterations.
)

# Compare the current benchmark against the previous baseline (True if there is no regression, otherwise False)
results = pt.verify_benchmark_against_previous_baseline()

Integrating with unit testing frameworks

Uplifting basic performance tests into a test framework is easy with QuickPotato and can be achieved in the following way:

from QuickPotato import performance_test as pt
from QuickPotato.configuration.management import options
from examples.example_code import *
import unittest


class TestPerformance(unittest.TestCase):

    def setUp(self):
        """
        Disable the selection of failed or untested test results.
        This will make sure QuickPotato will only compare your tests against a valid baseline.
        """
        options.enable_the_selection_of_untested_or_failed_test_ids = False

    def tearDown(self):
        """
        Enable the selection of failed or untested test results again.
        We re-enable this setting after the test so it will not bother you when quick profiling.
        """
        options.enable_the_selection_of_untested_or_failed_test_ids = True

    def test_performance_of_method(self):
        """
        Your performance test.
        """
        # Create a test case
        pt.test_case_name = "test_performance"  # <-- Define test case name

        # Defining the boundaries
        pt.max_and_min_boundary_for_average = {"max": 10, "min": 0.001}

        # Execute your code in a non-intrusive way
        pt.measure_method_performance(
            method=FancyCode().say_my_name_and_more,  # <-- The method you want to test.
            arguments=["joey hendricks"],  # <-- Your arguments go here.
            iteration=10,  # <-- The number of times you want to execute this method.
            pacing=0  # <-- How many seconds you want to wait between iterations.
        )

        # Pass or fail the performance test
        self.assertTrue(pt.verify_benchmark_against_previous_baseline())
        self.assertTrue(pt.verify_benchmark_against_set_boundaries())

Coming soon

Some features which I am planning to add to QuickPotato soon:

  • Improving the heatmap
  • Scatter plot
  • Creating a virtual map of your code
  • Time Line (Showing from left to right the time spent per action in your code.)

Learn more about QuickPotato

If you want to learn more about test-driven performance testing and want to see how this project reached its current state, then I would encourage you to check out the following resources:

SonarCloud

Comments
  • Inconsistent tests: have I caused this problem, or is this expected behavior?

    Hi Joey,

    I have implemented my ideas on the observer_pattern branch of my fork. You can examine the diff here, although be warned that it is rather large.

    I'm most confident in the changes I made to QuickPotato.profiling.intrusive and QuickPotato.profiling.interpreters. I'm least confident in any of the changes I made to the other test infrastructure.

    If I run test_pt_boundary_testing, all tests pass.

    If I run test_pt_visualizations, the test fails, although I'm not sure it was ever supposed to succeed in the first place, as FlameGraph() does not receive a test_id.

    I'm most confused about the test_pt_regression_testing. If I run the tests again and again and again without making any changes to the code, I get different results. Observe the following screenshots, and note that they were all taken successively with no changes to code.

    Screenshot from 2021-07-30 11-59-48

    Screenshot from 2021-07-30 12-00-47

    Screenshot from 2021-07-30 12-02-08

    Screenshot from 2021-07-30 12-02-53

    I'm confused about why this happens. Is this expected behavior, or have I messed something up?


    Edit: In case you are confused about how my PerformanceBreakpoint decorator should be used, see the following example:

    # standard library
    from time import sleep
    from math import log10
    
    # QuickPotato
    from QuickPotato.profiling.intrusive import PerformanceBreakpoint
    from QuickPotato.profiling.interpreters import SimpleInterpreter, StatisticsInterpreter
    
    @PerformanceBreakpoint
    def function_1():
        sleep(1)
        return len(str(log10(7**7**7)))
    
    @PerformanceBreakpoint(observers=[StatisticsInterpreter])
    def function_2():
        sleep(1)
        return len(str(log10(7**7**7)))
    
    @PerformanceBreakpoint(observers=[SimpleInterpreter])
    def function_3():
        sleep(1)
        return len(str(log10(7**7**7)))
    
    # --- and now in console ---
    
    >>> from QuickPotato import performance_test
    ... from ??? import function_1, function_2, function_3
    
    # required in order to set performance_test.test_id
    >>> performance_test.test_case_name = 'MyDemonstrativeTest'
    
    >>> function_3()
    # runs the SimpleInterpreter, which just prints to console, and returns the function's value
    SimpleInterpreter
     ├─ self.method_name='function_3'
     ├─ self.test_id='GU98BK70CBI9'
     ├─ self.sample_id='BHVO2ZDN'
     ├─ self.database_name='QuickProfiling'
     ├─ subject.profiler.total_response_time=1.1437058448791504
    17
    
    >>> function_1()
    # simply runs the function without profiling
    17
    
    >>> function_2()
    # runs the StatisticsInterpreter, which logs to the database, and returns the function's value
    17
    
    

    ~I've not tested the execution_wrapper parameter yet, but I think I need to use that with Dagster. When I used JoeyHendricks/QuickPotato:master with Dagster:~

    • ~profiling a Dagster pipeline only captured the pipeline setup, not its full run~
    • ~profiling a Dagster solid captured nothing and wrote None for every column of the performance_statistics database.~

    ~Based on that experience, I think I will need to respectively pass the following to execution_wrapper:~


    Edit 2 (re Dagster): Nope. I'm going to have to experiment a bit more with decorator voodoo.

    question 
    opened by afparsons 3
  • Great work! Here are some discussion topics...

    This is a nice library. I had some time to play with it this afternoon (we previously exchanged a few messages on Reddit) and I am impressed.

    If you don't mind, I am going to use this GitHub issue to maintain a dialogue with you since it appears that you have not enabled GitHub Discussions. If I develop any concrete feature requests or specific questions or bug reports, I will spin those out as separate and discrete GitHub issues.


    I intend to use QuickPotato to find slow parts of my Dagster pipelines.

    Dagster describes itself in the following way:

    Dagster is a data orchestrator for machine learning, analytics, and ETL

    Essentially, Dagster is a competitor/spiritual successor to Apache Airflow. It is used as follows:

    • a data engineer defines discrete actions as Python functions. Each one of these Python functions is called a Solid.
    • solids can be chained together to form a Pipeline, which is analogous to an Apache Airflow DAG ("directed acyclic graph").
    • Dagster offers many other non-executable constructs, like Resource, Mode, Schedules, Sensor, IOManager, etc.; it really is a well thought out system, but for now, those aren't relevant to my questions.
    • a Pipeline can be executed automatically via Schedule (think cron job), triggered by a Sensor ("run this pipeline when a new file appears in the S3 bucket"), run via the Dagster CLI, or run through the Dagit UI.

    I was most attracted to this library because of the intrusive profiling this library offers. The other alternatives I looked at (py-spy, pyflame, etc.) require providing a specific Python script as an argument to the profiler. That approach simply doesn't work when integrating with Dagster! Being able to wrap a function and to know that the function will be profiled whenever it is called (and enabled=True) is really awesome. Congratulations :grinning:

    I have provided the context above just so you understand my use cases.

    I've watched your NeotysPAC demonstration. Please correct me if I am wrong, but it seems like your initial idea was to build the intrusive profiler, and you have subsequently added the unit test workflows (for example, the pt.measure_method_performance(...) construct). At this point, it seems like the performance_breakpoint decorator is not feature complete.

    Below are my thoughts after experimenting with QuickPotato for about an hour. Please note that I have not yet looked through the codebase extensively and I might hold some misconceptions.

    • I would like to be able to provide more configuration arguments to performance_breakpoint. It looks like it inherits the performance_test.test_case_name from scope, but being able to pass this in, or dynamically set it, or generate it using some sort of rule (lambda?) would be great. Likewise with a database name/connection.

    • It took me a while to figure out why QuickPotato wasn't writing to my PostgreSQL server (docker container). I finally saw that the database name is set by the test_case_name. This was not intuitive for me. It would be helpful if you could document how QuickPotato organizes its recordings, i.e. that each test case gets its own database table.

    # QuickPotato/profiling/intrusive.py
    database_name=performance_test.test_case_name
    
    # QuickPotato/database/operations.py
    return f"{self.URL}/{database_name}"
    
    • I would like a way to link a specific execution of a Dagster Pipeline to its QuickPotato profile record. Each run of a Dagster Pipeline is automatically assigned a UUID, and its metadata, logs, etc. are stored in a database keyed by this UUID. It looks like each QuickPotato benchmark has both a test_id and a sample_id. It would be great to be able to get those from the StatisticsInterpreter before/during/after it writes to the database so I can then match the Dagster and QuickPotato IDs. I don't know how to do that though. Maybe modifying the decorator to work with the observer pattern could work? The test_id and sample_id could be given to some other function which handles matching the profile statistics with the Dagster run_id. A rough sketch of what I mean is below.
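
    Purely to illustrate that idea, here is a rough, hypothetical sketch of such an observer. None of this is existing QuickPotato API: the class name, the update hook, and the attributes it reads (test_id and sample_id, mirroring what SimpleInterpreter prints above) are all assumptions.

    class DagsterRunLinker:
        """Hypothetical observer collecting (test_id, sample_id) pairs for one Dagster run_id."""

        def __init__(self, run_id: str):
            self.run_id = run_id  # the Dagster-assigned UUID for this pipeline run
            self.links = []

        def update(self, subject) -> None:
            # Assumed hook: called after profiling finishes with an object exposing
            # test_id and sample_id (as SimpleInterpreter prints them above).
            self.links.append(
                {
                    "dagster_run_id": self.run_id,
                    "quickpotato_test_id": getattr(subject, "test_id", None),
                    "quickpotato_sample_id": getattr(subject, "sample_id", None),
                }
            )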

    Anyway, great work so far!

    enhancement 
    opened by afparsons 3
  • strange behaviour

    Hi, I've just discovered QuickPotato and on first use I've encountered some strange behaviour.

    This is the very simple test I've used (screenshot):

    First point: the method pt.measure_method_performance has no named_arguments param; I looked in the source code and it appears that named parameters are not considered.

    (screenshot)

    Second point: data.average_response_time raises a ZeroDivisionError because data._response_times is an empty list.

    best regards

    bug 
    opened by larrieu-olivier 3
  • QuickPotato crashes when large payload are sent to database (SQLite Only)

    The why:

    When running in SQLite mode (not attached to a proper database server), QuickPotato will crash when the insert payload contains more than 999 SQL variables. This crash can happen whenever system resource collection is turned on or when your stacks are large.

    Work around:

    Turn off system resource collection in the options file and, for now, refrain from measuring large and complex pieces of code when running in SQLite mode. Alternatively, connect to a database server like MySQL where the problem does not exist.

    Reason:

    SQLite cannot handle an insert with more than 999 SQL variables and therefore spits out the following error:

    "too many SQL variables"

    Link to problem description on stack overflow:

    • https://stackoverflow.com/questions/7106016/too-many-sql-variables-error-in-django-with-sqlite3

    Solution

    Within the interpreter classes I need to make changes so the following actions can be performed:

    • [x] Check if SQLite is used as a database server within the interpreter class
    • [x] Split the payload into multiple chunks (if the payload is over 999 variables and SQLite is used), as sketched below.
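
    As a minimal sketch of the chunking idea (the helper name and batching arithmetic are illustrative, not the actual implementation), assuming the payload is held as a list of row dictionaries before being handed to the database layer:

    SQLITE_MAX_VARIABLES = 999  # hard limit on SQL variables per statement in older SQLite builds

    def split_payload_into_chunks(payload, columns_per_row):
        """Yield slices of the payload small enough to stay under the SQLite variable limit."""
        rows_per_chunk = max(1, SQLITE_MAX_VARIABLES // columns_per_row)
        for start in range(0, len(payload), rows_per_chunk):
            yield payload[start:start + rows_per_chunk]

    # Hypothetical usage inside an interpreter class:
    # for chunk in split_payload_into_chunks(payload, columns_per_row=len(payload[0])):
    #     connection.execute(table.insert(), chunk)
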
    bug 
    opened by JoeyHendricks 2
  • Add better statistical detection to QuickPotato

    Will be adding the regression algorithms I have developed to detect relevant changes between software versions. More info:

    https://github.com/JoeyHendricks/automated-performance-test-result-analysis

    enhancement 
    opened by JoeyHendricks 1
  • problems when using sqlite3 with large profiled samples and deprecation warning

    Got the following error in one of my projects; I will fix this and update QuickPotato.

    C:\Program Files\Python38\lib\copyreg.py:91: SADeprecationWarning: Calling URL() directly is deprecated and will be disabled in a future release.  The public constructor for URL is now the URL.create() method.
      return cls.__new__(cls, *args)
    
    
    Ran 1 test in 1.273s
    
    FAILED (errors=1)
    
    Error
    Traceback (most recent call last):
      File "C:\Users\user\OneDrive\My Documents\PycharmProject\venster\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1802, in _execute_context
        self.dialect.do_execute(
      File "C:\Users\user\OneDrive \My Documents\PycharmProject\venster\venv\lib\site-packages\sqlalchemy\engine\default.py", line 719, in do_execute
        cursor.execute(statement, parameters)
    sqlite3.OperationalError: too many SQL variables
    
    The above exception was the direct cause of the following exception:
    
    bug 
    opened by JoeyHendricks 1
  • To spawn a database the configuration class must be imported and used.

    The why:

    I saw that when I was not using the configuration class to define settings in the YAML file, QuickPotato would refuse to spawn a database.

    Solution:

    I need to look into why this happens and come up with a fix; I will update this issue as soon as possible.

    bug 
    opened by JoeyHendricks 1
  • Cannot collect multiple samples into one html flame graph report when quick profiling

    The why:

    When you're testing code performance outside of a "unit performance test" it is not possible to get all collected samples into one HTML report. This happens because the test id keeps changing every time the function is executed, which prevents you from rapidly capturing information about your code.

    Work around:

    Use the unit performance test class to manually set a test case name; this forces QuickPotato to not generate a new test id, as sketched below.
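
    As a rough sketch of the workaround, using only the documented calls from the README above (so it approximates the quick-profiling flow rather than reproducing it exactly; the test case name is illustrative):

    from QuickPotato import performance_test as pt
    from QuickPotato.statistical.visualizations import FlameGraph
    from examples.example_code import FancyCode

    # Pin the test case name up front so QuickPotato keeps reusing the same test id.
    pt.test_case_name = "quick_profiling_session"

    pt.measure_method_performance(
        method=FancyCode().say_my_name_and_more,
        arguments=["joey hendricks"],
        iteration=10,
        pacing=0
    )

    # All samples collected above share one test id and end up in one report.
    FlameGraph(pt.test_case_name, test_id=pt.current_test_id).export("C:\\temp\\")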

    Solution:

    Investigate options to allow multiple executions to appear in the HTML test report.

    bug 
    opened by JoeyHendricks 1
  • Reworking intrusive profiling decorator

    QuickPotato can intrusively profile code. This feature is overshadowed by the benchmarking object and needs to be reworked and properly documented.

    enhancement 
    opened by JoeyHendricks 0
  • Reworking Database object

    Will be improving how databases are handled in QuickPotato; for that, the micro-benchmarking core also needs to be reworked. The idea behind this change is that QuickPotato works within one database, using the test case name as an identifier.

    Because of this change, QuickPotato can better integrate within a corporate setting and will become more friendly to database vendors like Oracle. It also means that the database layer behind QuickPotato becomes less complicated and relies more on SQLAlchemy Core than in the previous iteration.

    This change will also clean up when and how data is stored in the benchmarking database, which makes the benchmarking object less complicated with regard to saving data.
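
    Purely as an illustration of the "one database, test case name as identifier" idea (this is not the actual QuickPotato schema), a single shared table in SQLAlchemy Core could look roughly like this:

    from sqlalchemy import Column, Float, MetaData, String, Table, create_engine

    metadata = MetaData()

    # Hypothetical shared table: every test case writes to the same table and is
    # distinguished by its test_case_name instead of getting its own database.
    performance_statistics = Table(
        "performance_statistics",
        metadata,
        Column("test_case_name", String, index=True),
        Column("test_id", String),
        Column("sample_id", String),
        Column("method_name", String),
        Column("total_response_time", Float),
    )

    engine = create_engine("sqlite:///quickpotato.db")  # connection string chosen by the user
    metadata.create_all(engine)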

    Summarized this issue will improve the following things:

    • [x] Reduce complexity around databases
    • [x] No more multiple databases by default (the user can still opt for multiple databases by overwriting the database connection string).
    • [x] Better save mechanisms around test evidence.
    • [x] Improvements to how it is possible to interact with statistical measurements from the benchmark object.
    • [x] Overall structural improvements to make room for the above changes.
    • [x] Reworked parts contain new or better docstrings.

    Commits will be pushed to this repo when the changes are complete.

    enhancement 
    opened by JoeyHendricks 0
Releases (v1.0.3)

Owner

Joey Hendricks
I am an eccentric dumpling-loving performance engineer that is enthusiastic about making software applications lightning fast!