Quantify the difference between two arbitrary curves in space

Overview

similaritymeasures

Quantify the difference between two arbitrary curves

Curves in this case are:

  • discretized by individual data points
  • ordered from a beginning to an ending

Consider the following two curves. We want to quantify how different the Numerical curve is from the Experimental curve. Notice that the two curves share no concurrent Stress or Strain values. Additionally, one curve has more data points than the other.

Image of two different curves

In the ideal case the Numerical curve would match the Experimental curve exactly. This means that the two curves would appear directly on top of each other. Our measures of similarity would return a zero distance between two curves that were on top of each other.

Methods covered

This library includes the following methods to quantify the difference (or similarity) between two curves:

  • Partial Curve Mapping^x (PCM) method: Matches the area of a subset between the two curves [1]
  • Area method^x: An algorithm for calculating the area between two curves in 2D space [2]
  • Discrete Frechet distance^y: The shortest distance between two curves, where you are allowed to vary the speed at which you travel along each curve independently (walking dog problem) [3, 4, 5, 6, 7, 8]
  • Curve Length^x method: Assumes that the only true independent variable of the curves is the arc-length distance along the curve from the origin [9, 10]
  • Dynamic Time Warping^y (DTW): A non-metric distance between two time-series curves that has proven useful for a variety of applications [11, 12, 13, 14, 15, 16]

x denotes methods created specifically for material parameter identification

y denotes that the method implemented in this library supports N-D data!
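
To make the DTW entry in the list above more concrete, here is a minimal pure-Python sketch of the classic cumulative-cost recurrence with a Euclidean point distance. It is an illustration under those assumptions, not the library's implementation, and the function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(p, q):
    """Cumulative DTW cost between two ordered point sequences.

    p and q are sequences of points (tuples or lists) of equal
    dimension; they may contain different numbers of points.
    """
    n, m = len(p), len(q)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning p[:i+1] with q[:j+1]
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            # Cheapest of the three allowed predecessor cells
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best
    return cost[-1][-1]


curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (0.5, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(dtw_distance(curve_a, curve_a))  # identical curves give 0.0
print(dtw_distance(curve_a, curve_b))
```

Because every point is a plain tuple, the same sketch works for N-D data as long as both curves use the same dimension.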

Installation

Install with pip

[sudo] pip install similaritymeasures

or clone and install from source.

git clone https://github.com/cjekel/similarity_measures
[sudo] pip install ./similarity_measures

Example usage

This shows you how to compute the various similarity measures

import numpy as np
import similaritymeasures
import matplotlib.pyplot as plt

# Generate random experimental data
x = np.random.random(100)
y = np.random.random(100)
exp_data = np.zeros((100, 2))
exp_data[:, 0] = x
exp_data[:, 1] = y

# Generate random numerical data
x = np.random.random(100)
y = np.random.random(100)
num_data = np.zeros((100, 2))
num_data[:, 0] = x
num_data[:, 1] = y

# quantify the difference between the two curves using PCM
pcm = similaritymeasures.pcm(exp_data, num_data)

# quantify the difference between the two curves using
# Discrete Frechet distance
df = similaritymeasures.frechet_dist(exp_data, num_data)

# quantify the difference between the two curves using
# area between two curves
area = similaritymeasures.area_between_two_curves(exp_data, num_data)

# quantify the difference between the two curves using
# Curve Length based similarity measure
cl = similaritymeasures.curve_length_measure(exp_data, num_data)

# quantify the difference between the two curves using
# Dynamic Time Warping distance
dtw, d = similaritymeasures.dtw(exp_data, num_data)

# print the results
print(pcm, df, area, cl, dtw)

# plot the data
plt.figure()
plt.plot(exp_data[:, 0], exp_data[:, 1])
plt.plot(num_data[:, 0], num_data[:, 1])
plt.show()

If you are interested in setting up an optimization problem using these measures, check out this Jupyter Notebook which replicates Section 3.2 from [2].

Changelog

Version 0.3.0: Frechet distance now supports N-D data! See CHANGELOG.md for full details.

Documentation

Each function includes a descriptive docstring, which you can view online here.

References

[1] Katharina Witowski and Nielen Stander. Parameter Identification of Hysteretic Models Using Partial Curve Mapping. 12th AIAA Aviation Technology, Integration, and Operations (ATIO) Conference and 14th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, sep 2012. doi: 10.2514/6.2012-5580.

[2] Jekel, C. F., Venter, G., Venter, M. P., Stander, N., & Haftka, R. T. (2018). Similarity measures for identifying material parameters from hysteresis loops using inverse analysis. International Journal of Material Forming. https://doi.org/10.1007/s12289-018-1421-8

[3] M Maurice Frechet. Sur quelques points du calcul fonctionnel. Rendiconti del Circolo Matematico di Palermo (1884-1940), 22(1):1–72, 1906.

[4] Thomas Eiter and Heikki Mannila. Computing discrete Frechet distance. Technical report, 1994.

[5] Anne Driemel, Sariel Har-Peled, and Carola Wenk. Approximating the Frechet Distance for Realistic Curves in Near Linear Time. Discrete & Computational Geometry, 48(1): 94–127, 2012. ISSN 1432-0444. doi: 10.1007/s00454-012-9402-z. URL http://dx.doi.org/10.1007/s00454-012-9402-z.

[6] K Bringmann. Why Walking the Dog Takes Time: Frechet Distance Has No Strongly Subquadratic Algorithms Unless SETH Fails, 2014.

[7] Sean L Seyler, Avishek Kumar, M F Thorpe, and Oliver Beckstein. Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways. PLOS Computational Biology, 11(10):1–37, 2015. doi: 10.1371/journal.pcbi.1004568. URL https://doi.org/10.1371/journal.pcbi.1004568.

[8] Helmut Alt and Michael Godau. Computing the Frechet Distance Between Two Polygonal Curves. International Journal of Computational Geometry & Applications, 05 (01n02):75–91, 1995. doi: 10.1142/S0218195995000064.

[9] A Andrade-Campos, R De-Carvalho, and R A F Valente. Novel criteria for determination of material model parameters. International Journal of Mechanical Sciences, 54 (1):294–305, 2012. ISSN 0020-7403. doi: https://doi.org/10.1016/j.ijmecsci.2011.11.010. URL http://www.sciencedirect.com/science/article/pii/S0020740311002451.

[10] J Cao and J Lin. A study on formulation of objective functions for determining material models. International Journal of Mechanical Sciences, 50(2):193–204, 2008. ISSN 0020-7403. doi: https://doi.org/10.1016/j.ijmecsci.2007.07.003. URL http://www.sciencedirect.com/science/article/pii/S0020740307001178.

[11] Donald J Berndt and James Clifford. Using Dynamic Time Warping to Find Patterns in Time Series. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, AAAIWS’94, pages 359–370. AAAI Press, 1994. URL http://dl.acm.org/citation.cfm?id=3000850.3000887.

[12] François Petitjean, Alain Ketterlin, and Pierre Gançarski. A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44 (3):678–693, 2011. ISSN 0031-3203. doi: https://doi.org/10.1016/j.patcog.2010.09.013. URL http://www.sciencedirect.com/science/article/pii/S003132031000453X.

[13] Toni Giorgino. Computing and Visualizing Dynamic Time Warping Alignments in R: The dtw Package. Journal of Statistical Software, Vol 31, Issue 7, aug 2009. URL http://dx.doi.org/10.18637/jss.v031.i07.

[14] Stan Salvador and Philip Chan. Toward Accurate Dynamic Time Warping in Linear Time and Space. Intell. Data Anal., 11(5):561–580, oct 2007. ISSN 1088-467X. URL http://dl.acm.org/citation.cfm?id=1367985.1367993.

[15] Paolo Tormene, Toni Giorgino, Silvana Quaglini, and Mario Stefanelli. Matching incomplete time series with dynamic time warping: an algorithm and an application to post-stroke rehabilitation. Artificial Intelligence in Medicine, 45(1):11–34, 2009. ISSN 0933-3657. doi: https://doi.org/10.1016/j.artmed.2008.11.007. URL http://www.sciencedirect.com/science/article/pii/S0933365708001772.

[16] Senin, P., 2008. Dynamic time warping algorithm review. Information and Computer Science Department University of Hawaii at Manoa Honolulu, USA, 855, pp.1-23. http://seninp.github.io/assets/pubs/senin_dtw_litreview_2008.pdf

Contributions welcome!

This is by no means a complete list of all possible similarity measures. For instance the SciPy Hausdorff distance is an alternative similarity measure useful if you don't know the beginning and ending of each curve. There are many more possible functions out there. Feel free to send PRs for other functions in literature!
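
As a rough illustration of why the Hausdorff distance does not need a known beginning and ending, the following pure-Python sketch computes a symmetric Hausdorff distance between two point sets. It is not SciPy's implementation, and the function names are hypothetical.

```python
import math


def directed_hausdorff_sketch(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff_sketch(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff_sketch(a, b), directed_hausdorff_sketch(b, a))


curve = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
shifted = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
# Reversing the point order of one curve does not change the result
print(hausdorff_sketch(curve, shifted), hausdorff_sketch(curve, shifted[::-1]))
```

The measure only looks at nearest neighbours, so it ignores the ordering that the other methods in this library rely on.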

Requirements for adding a new method to this library:

  • all methods should be able to quantify the difference between two curves
  • method must support the case where each curve may have a different number of data points
  • follow the style of existing functions
  • reference to method details, or descriptive docstring of the method
  • include test(s) for your new method
  • minimum Python dependencies (try to stick to SciPy/numpy functions if possible)

Please cite

If you've found this information or library helpful please cite the following paper. You should also cite the papers of any methods that you have used.

Jekel, C. F., Venter, G., Venter, M. P., Stander, N., & Haftka, R. T. (2018). Similarity measures for identifying material parameters from hysteresis loops using inverse analysis. International Journal of Material Forming. https://doi.org/10.1007/s12289-018-1421-8

@article{Jekel2019,
author = {Jekel, Charles F and Venter, Gerhard and Venter, Martin P and Stander, Nielen and Haftka, Raphael T},
doi = {10.1007/s12289-018-1421-8},
issn = {1960-6214},
journal = {International Journal of Material Forming},
month = {may},
title = {{Similarity measures for identifying material parameters from hysteresis loops using inverse analysis}},
url = {https://doi.org/10.1007/s12289-018-1421-8},
year = {2019}
}
Comments
  • frechet_dist input size is bounded by maximum recursion depth

    Consider the following:

    from similaritymeasures import frechet_dist

    max_len = 1000
    a = [[1, 2, 3] for i in range(max_len)]
    b = [[1, 6, 3] for i in range(max_len)]
    frechet_dist(a, b)

    Running this code on a machine with 32 GB of RAM raises a stack-overflow error. I would suggest switching from the recursion-based computation to an iterative one using queues.

    Is anyone currently working on optimizing the memory usage of frechet_dist?

    Thank you for your work, Arbel Amir

    opened by ArbelAmir 8
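
One way the recursion limit raised in this issue could be avoided is to fill the discrete Frechet coupling table with two nested loops. The sketch below is a pure-Python illustration of that idea using the standard dynamic-programming recurrence; it is not the library's code, and the name `discrete_frechet_iterative` is hypothetical.

```python
import math


def discrete_frechet_iterative(p, q):
    """Discrete Frechet distance computed without recursion.

    Only the previous row of the coupling table is kept, so memory
    grows with len(q) rather than len(p) * len(q).
    """
    m = len(q)
    prev = [0.0] * m
    # First row: the walker on p stays at p[0] while q advances
    prev[0] = math.dist(p[0], q[0])
    for j in range(1, m):
        prev[j] = max(prev[j - 1], math.dist(p[0], q[j]))
    for i in range(1, len(p)):
        curr = [0.0] * m
        # First column: the walker on q stays at q[0] while p advances
        curr[0] = max(prev[0], math.dist(p[i], q[0]))
        for j in range(1, m):
            # Best leash length needed to reach any predecessor cell
            reach = min(prev[j], prev[j - 1], curr[j - 1])
            curr[j] = max(reach, math.dist(p[i], q[j]))
        prev = curr
    return prev[-1]


max_len = 1000
a = [[1, 2, 3] for _ in range(max_len)]
b = [[1, 6, 3] for _ in range(max_len)]
print(discrete_frechet_iterative(a, b))  # 4.0
```

Every pair of points in this example is exactly 4 apart, so the result is 4.0 regardless of the curve length.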
  • discrete Frechet distance between lists or 1D arrays

    My question might sound a little dumb, but is it possible to use similaritymeasures.frechet_dist() for lists or 1D arrays? I tried to calculate the similarity between a list and multiple other lists (which include the first list), but the most similar result was not the first list, which I expected since the two are exactly the same. It does work with real coordinates such as lat/lon trajectories, where the most similar result is the first argument. I'm trying to use factors other than distance to calculate the similarity between two things. Those parameters are just numerical values, and I'm wondering how I can use frechet_dist for lists.

    opened by miladad8 6
  • Regarding code update in "is_simple_quad" function on Aug 18, 2019

    Dear Authors,

    Thanks for your contribution in the form of the "similaritymeasures" library for quantifying the difference between curves. I have been using it to find the area between curves. But since your update to the check of whether a quadrilateral is simple (in the "is_simple_quad" function on Aug 18, 2019), the output for the area between the curves is not correct. (If I use the previous code, the returned area is correct.) Specifically, the "if" condition that checks the number of cross products with the same sign should be sum(crossTF) > 2 instead of sum(crossTF) == 2.

    This can be checked with the following code, which tries to find the area between two simple curves. Running it prints "area1 : 0.0", while the previous code gives the correct area (4 in this case).

    import matplotlib.pyplot as plt
    import numpy as np
    import similaritymeasures

    # Two horizontal lines, one unit apart, sampled at the same x values
    xaxis = [0, 1, 2, 3, 4]
    curve1 = [0, 0, 0, 0, 0]
    curve2 = [1, 1, 1, 1, 1]
    exp_data = np.zeros((len(xaxis), 2))
    num_data = np.zeros((len(xaxis), 2))

    exp_data[:, 0] = xaxis
    exp_data[:, 1] = curve1
    num_data[:, 0] = xaxis
    num_data[:, 1] = curve2

    plt.figure()
    plt.scatter(xaxis, curve1)
    plt.scatter(xaxis, curve2)
    plt.show()

    area1 = similaritymeasures.area_between_two_curves(exp_data, num_data)
    print("area1 : " + str(area1))
    opened by aanchalMongia 5
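
The expected value of 4 quoted in this issue can be checked independently of the library. The sketch below applies the trapezoidal rule to the gap between the two curves, assuming both are sampled at the same x values and do not cross between samples; the helper name `area_between_shared_x` is hypothetical and is not part of similaritymeasures.

```python
def area_between_shared_x(x, y1, y2):
    """Trapezoidal area between two curves sampled at the same x values."""
    total = 0.0
    for k in range(len(x) - 1):
        gap_left = abs(y2[k] - y1[k])
        gap_right = abs(y2[k + 1] - y1[k + 1])
        # Area of the trapezoid spanned by the two gaps over this interval
        total += 0.5 * (gap_left + gap_right) * (x[k + 1] - x[k])
    return total


xaxis = [0, 1, 2, 3, 4]
curve1 = [0, 0, 0, 0, 0]
curve2 = [1, 1, 1, 1, 1]
print(area_between_shared_x(xaxis, curve1, curve2))  # 4.0
```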
  • Problem during pip install: "UnicodeDecodeError: 'gbk' codec can't decode byte"

    pip install similaritymeasures gives

    (base) D:\repositories\joinTracks>pip install similaritymeasures
    Collecting similaritymeasures
      Using cached similaritymeasures-0.4.3.tar.gz (397 kB)
        ERROR: Command errored out with exit status 1:
         command: 'C:\Users\s00557672\Anaconda3\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\S00557~1\\AppData\\Local\\Temp\\pip-install-d5tndazp\\similaritymeasures\\setup.py'"'"';
    __file__='"'"'C:\\Users\\S00557~1\\AppData\\Local\\Temp\\pip-install-d5tndazp\\similaritymeasures\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'
    "');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\S00557~1\AppData\Local\Temp\pip-install-d5tndazp\similaritymeasures\pip-egg-info'
             cwd: C:\Users\S00557~1\AppData\Local\Temp\pip-install-d5tndazp\similaritymeasures\
        Complete output (5 lines):
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "C:\Users\S00557~1\AppData\Local\Temp\pip-install-d5tndazp\similaritymeasures\setup.py", line 12, in <module>
            long_description=open('README.md').read(),
        UnicodeDecodeError: 'gbk' codec can't decode byte 0x93 in position 5204: illegal multibyte sequence
        ----------------------------------------
    ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
    
    

    How can I fix it?

    opened by sergorl 4
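
The traceback points at `open('README.md').read()` in setup.py, which decodes the file with the platform's default encoding (gbk on this machine). A likely remedy, offered here as an assumption rather than a confirmed fix from the maintainers, is to read the file with an explicit UTF-8 encoding. The sketch uses a temporary stand-in file instead of the real README.

```python
import os
import tempfile

# Stand-in for README.md: UTF-8 text containing a non-ASCII en dash,
# similar to the page ranges in the reference list.
sample = "pages 1\u201372"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # An explicit encoding makes the read independent of the locale default
    with open(path, encoding="utf-8") as f:
        long_description = f.read()

print(long_description == sample)  # True
```

Without changing the package, setting the environment variable PYTHONUTF8=1 before running pip (Python 3.7 or later) may also work around the problem.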
  • Added MAE and MSE

    Added extra functions to find the Mean Absolute Distance (MAE) and the Mean Squared Distance (MSE) between the two curves. It works with all the distance measures in scipy.spatial.distance.cdist.

    opened by HarshRaoD 2
  • Similarity between two curves which have different number of data points

    Hello, I would like to know how to compute the similarity of two curves that have different numbers of data points, like the ones shown on the main page. Which methods support this, and what do the results mean? Looking forward to your reply.

    opened by xiaobrnbrn 2
  • pcm may be wrong

    A user has pointed out that I potentially have an incorrect pcm implementation because I divide the distances by a max value.

    It is possible that it is a mistake on my part, where I was trying to combine code for the curve_length and pcm methods. It is also possible that I thought xmax and ymax would always be one, so it wouldn't matter. The curve_length method needs the max values because there is no other normalization.

    The line in question: https://github.com/cjekel/similarity_measures/blob/master/similaritymeasures/similaritymeasures.py#L352

    I'm pretty sure that line is correct for the curve_length_measure method.

    bug help wanted 
    opened by cjekel 0
  • Improv perf

    Good afternoon, I hope this message finds you well, and I compliment and thank you for the code.

    I worked with frechet and dtw, and I found that the computational performance of the frechet function was subpar since it didn't use the cdist function from scipy (which I found to perform far better than the minkowski_distance one).

    I took the liberty of changing the code and proposing a pull request (I also modified the test code so that it does not import the already installed package).

    In my day-to-day job I also use cython to improve the performance of Python programs, and I was thinking that it might benefit some of the loops in the code.

    Best of wishes for whatever!

    Nuc

    opened by nucccc 3
  • Similarity between two curves using PyTorch

    Hi guys,

    I want to implement in my trainer a measure of similarity between my predicted trajectory and the GT trajectory. Here is an example:

    [image]

    The GT is the red line, my observation is the yellow line (almost hidden by the other participants) and the green line is my prediction. The other agents are not used at this moment.

    Now, in order to train my DL based Motion Prediction algorithm I am using the ADE, FDE and NLL losses w.r.t. the GT. Nevertheless, I think that if my prediction does not match exactly the GT but it is in the same centerline (but driving with a different velocity, for example) it will be better. E.g.

    [image]

    This prediction does not match the GT (until the red diamond at the bottom), but at least the shapes of both curves are more or less the same.

    How could I do that?

    opened by Cram3r95 20
  • Incorrect measurement of area between intersecting 2D curves

    It appears that either my understanding of the area between the curves or its calculation in the library is incorrect (the referenced paper is paywalled). In the following example, I have two plots, where the grey line is the original data and there are two different blue splines. Visually, you can clearly see that the area between the two lines on the left plot is several times larger than the area on the right plot, but the calculation with similaritymeasures.area_between_two_curves shows only a 2x difference. [two images]

    Here is the GitHub gist where I present the calculation (the GDrive zip with airfoils is public, so the whole thing can be run in Colab or elsewhere if you modify the path in the 2nd cell): https://gist.github.com/rafalszulejko/2c9ff645b448d60d857975a8f7965045#file-wing-optimization2-ipynb

    opened by rafalszulejko 11
  • Faster DTW

    Hello,

    Thanks for a really nice repo with an easy-to-use API for quickly generating some metrics on curve similarities. I just thought I would let you know that there is a much faster DTW implementation than the one you are using in this repo. If it covers your needs, you should consider replacing the current implementation with it:

    Link to faster DTW implementation

    Carry on the great work! :)

    opened by vancromy 2
  • Add other interpolation methods to the area between curves method

    Right now the area between curves method uses bisection of largest gap to add artificial data points. This method was used to minimize the number of artificial quads/points. However, this can have some negative effects in some cases, specifically when the sampling rate is artificial and does not match (e.g. one curve is just a straight line with few points).

    A potential alternative is to use the arc length projection of one curve's points onto the other. This would preserve the sampling rate, and may make for more uniform quads. This is similar to what's done in the PCM method.

    When another interpolation method is added, give users the choice of which interpolation method to use. Changing the interpolation method is anticipated to change the results.

    opened by cjekel 0
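
The arc length projection proposed in this issue could look roughly like the sketch below: each point of one curve is mapped to the position on the other curve that lies at the same fraction of total arc length. This is a pure-Python illustration under the assumption of piecewise-linear curves with nonzero length; the function names are hypothetical and none of this is existing library code.

```python
import math
from bisect import bisect_right


def cumulative_arc_length(curve):
    """Arc length from the first point to every point of the curve."""
    lengths = [0.0]
    for a, b in zip(curve, curve[1:]):
        lengths.append(lengths[-1] + math.dist(a, b))
    return lengths


def project_by_arc_length(source, target):
    """Points on target at the normalized arc-length positions of source."""
    s_len = cumulative_arc_length(source)
    t_len = cumulative_arc_length(target)
    new_points = []
    for s in s_len:
        # Arc length along target at the same fraction of the total length
        goal = s / s_len[-1] * t_len[-1]
        # Index of the end point of the target segment that contains goal
        k = max(1, min(bisect_right(t_len, goal), len(t_len) - 1))
        seg = t_len[k] - t_len[k - 1]
        w = (goal - t_len[k - 1]) / seg if seg > 0 else 0.0
        a, b = target[k - 1], target[k]
        # Linear interpolation within the segment
        new_points.append(tuple(ai + w * (bi - ai) for ai, bi in zip(a, b)))
    return new_points


dense = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
sparse = [(0.0, 1.0), (4.0, 1.0)]  # a straight line with few points
print(project_by_arc_length(dense, sparse))
```

The projected points inherit the sampling of the dense curve, which is the property the issue hopes will give more uniform quads.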
Releases(0.6.0)
  • 0.6.0(Oct 8, 2022)

    • similaritymeasures.pcm now produces different values! This was done to better follow the original algorithm. To get the same results from previous versions, set norm_seg_length=True. What this option does is scale each segment length by the maximum values of the curve (borrowed from the curve_length_measure). This scaling should not be needed with the PCM method because both curves are always scaled initially.
    • Fix docstring documentation for returns in similaritymeasures.dtw and similaritymeasures.curve_length_measure
  • v0.5.0(Aug 6, 2022)

Owner
Charles Jekel