A library for pattern matching on symbolic expressions in Python.

Overview

MatchPy

MatchPy is a library for pattern matching on symbolic expressions in Python.

Work in progress

Latest version released on PyPi Latest version released via conda-forge Test coverage Build status of the master branch Documentation Status The Journal of Open Source Software Digital Object Identifier

Installation

MatchPy is available via PyPI, and for Conda via conda-forge. It can be installed with pip install matchpy or conda install -c conda-forge matchpy.

Overview

This package implements pattern matching in Python. Pattern matching is a powerful tool for symbolic computations, operating on symbolic expressions. Given a pattern and an expression (which is usually called subject), the goal of pattern matching is to find a substitution for all the variables in the pattern such that the pattern becomes the subject. As an example, consider the pattern f(x), where f is a function and x is a variable, and the subject f(a), where a is a constant symbol. Then the substitution that replaces x with a is a match. MatchPy supports associative and/or commutative function symbols, as well as sequence variables, similar to pattern matching in Mathematica.

A detailed example of how to use MatchPy can be found here.

MatchPy supports both one-to-one and many-to-one pattern matching. The latter makes use of similarities between patterns to efficiently find matches for multiple patterns at the same time.

A list of publications about MatchPy can be found below.

Expressions

Expressions are tree-like data structures, consisting of operations (functions, internal nodes) and symbols (constants, leaves):

>>> from matchpy import Operation, Symbol, Arity
>>> f = Operation.new('f', Arity.binary)
>>> a = Symbol('a')
>>> print(f(a, a))
f(a, a)

Patterns are expressions which may contain wildcards (variables):

>>> from matchpy import Pattern, Wildcard
>>> x = Wildcard.dot('x')
>>> print(Pattern(f(a, x)))
f(a, x_)

In the previous example, x is the name of the variable. However, it is also possible to use wildcards without names:

>>> w = Wildcard.dot()
>>> print(Pattern(f(w, w)))
f(_, _)

It is also possible to assign variable names to entire subexpressions:

>>> print(Pattern(f(w, a, variable_name='y')))
y: f(_, a)

Pattern Matching

Given a pattern and an expression (which is usually called subject), the idea of pattern matching is to find a substitution that maps wildcards to expressions such that the pattern becomes the subject. In MatchPy, a substitution is a dict that maps variable names to expressions.

>>> from matchpy import match
>>> y = Wildcard.dot('y')
>>> b = Symbol('b')
>>> subject = f(a, b)
>>> pattern = Pattern(f(x, y))
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{x ↦ a, y ↦ b}

Applying the substitution to the pattern results in the original expression.

>>> from matchpy import substitute
>>> print(substitute(pattern, substitution))
f(a, b)

Sequence Wildcards

Sequence wildcards are wildcards that can match a sequence of expressions instead of just a single expression:

>>> z = Wildcard.plus('z')
>>> pattern = Pattern(f(z))
>>> subject = f(a, b)
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{z ↦ (a, b)}

Associativity and Commutativity

MatchPy natively supports associative and/or commutative operations. Nested associative operators are automatically flattened, the operands in commutative operations are sorted:

>>> g = Operation.new('g', Arity.polyadic, associative=True, commutative=True)
>>> print(g(a, g(b, a)))
g(a, a, b)

Associativity and commutativity is also considered for pattern matching:

>>> pattern = Pattern(g(b, x))
>>> subject = g(a, a, b)
>>> print(next(match(subject, pattern)))
{x ↦ g(a, a)}
>>> h = Operation.new('h', Arity.polyadic)
>>> pattern = Pattern(h(b, x))
>>> subject = h(a, a, b)
>>> list(match(subject, pattern))
[]

Many-to-One Matching

When a fixed set of patterns is matched repeatedly against different subjects, matching can be sped up significantly by using many-to-one matching. The idea of many-to-one matching is to construct a so called discrimination net, a data structure similar to a decision tree or a finite automaton that exploits similarities between patterns. In MatchPy, there are two such data structures, implemented as classes: DiscriminationNet and ManyToOneMatcher. The DiscriminationNet class only supports syntactic pattern matching, that is, operations are neither associative nor commutative. Sequence variables are not supported either. The ManyToOneMatcher class supports associative and/or commutative matching with sequence variables. For syntactic pattern matching, the DiscriminationNet should be used, as it is usually faster.

>>> pattern1 = Pattern(f(a, x))
>>> pattern2 = Pattern(f(y, b))
>>> matcher = ManyToOneMatcher(pattern1, pattern2)
>>> subject = f(a, b)
>>> matches = matcher.match(subject)
>>> for matched_pattern, substitution in sorted(map(lambda m: (str(m[0]), str(m[1])), matches)):
...     print('{} matched with {}'.format(matched_pattern, substitution))
f(a, x_) matched with {x ↦ b}
f(y_, b) matched with {y ↦ a}

Roadmap

Besides the existing features, we plan on adding the following to MatchPy:

  • Support for Mathematica's Alternatives: For example f(a | b) would match either f(a) or f(b).
  • Support for Mathematica's Repeated: For example f(a..) would match f(a), f(a, a), f(a, a, a), etc.
  • Support pattern sequences (PatternSequence in Mathematica). These are mainly useful in combination with Alternatives or Repeated, e.g. f(a | (b, c)) would match either f(a) or f(b, c). f((a a)..) would match any f with an even number of a arguments.
  • All these additional pattern features need to be supported in the ManyToOneMatcher as well.
  • Better integration with existing types such as dict.
  • Code generation for both one-to-one and many-to-one matching. There is already an experimental implementation, but it still has some dependencies on MatchPy which can probably be removed.
  • Improving the documentation with more examples.
  • Better test coverage with more randomized tests.
  • Implementation of the matching algorithms in a lower-level language, for example C, both for performance and to make MatchPy's functionality available in other languages.

Contributing

If you have some issue or want to contribute, please feel free to open an issue or create a pull request. Help is always appreciated!

The Makefile has several tasks to help development:

  • To install all needed packages, you can use make init .
  • To run the tests you can use make test. The tests use pytest.
  • To generate the documentation you can use make docs .
  • To run the style checker (pylint) you can use make check .

If you have any questions or need help with setting things up, please open an issue and we will try the best to assist you.

Publications

Manuel Krebber and Henrik Barthels
Journal of Open Source Software, Volume 3(26), pp. 2, June 2018.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 7th Workshop on Python for High-Performance and Scientific Computing, November 2017.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 15th Python in Science Conference, July 2017.

Manuel Krebber
Master Thesis, RWTH Aachen University, May 2017

If you want to cite MatchPy, please reference the JOSS paper:

@article{krebber2018,
    author    = {Manuel Krebber and Henrik Barthels},
    title     = {{M}atch{P}y: {P}attern {M}atching in {P}ython},
    journal   = {Journal of Open Source Software},
    year      = 2018,
    pages     = 2,
    month     = jun,
    volume    = {3},
    number    = {26},
    doi       = "10.21105/joss.00670",
    web       = "http://joss.theoj.org/papers/10.21105/joss.00670",
}
Comments
  • `ManyToOneReplacer` slow for small expressions

    `ManyToOneReplacer` slow for small expressions

    I have added ~1700 ReplacementRules in ManyToOneReplacer. The .replace() has become very slow even for very small expressions(like 2*x).

    File: https://github.com/parsoyaarihant/sympy/blob/502fd311d64155560383a5f4f3b2710c318dc410/sympy/integrals/rubi/patterns.py

    opened by parsoyaarihant 38
  • Added internal implementation of the Hopcroft-Karp algorithm

    Added internal implementation of the Hopcroft-Karp algorithm

    Added Hopcroft-Karp algorithm in order to free MatchPy of its GPL dependency. Readapted from the C++ implementation in SymEngine.

    Translating the C++ implementation in SymEngine back to Python.

    I'm the author of the C++ implementation so no copyright notice mentioning SymEngine is necessary.

    opened by Upabjojr 23
  • `ManyToOneReplacer` for Rubi

    `ManyToOneReplacer` for Rubi

    Hi, I implemented ~80 rules using ManyToOneReplacer for Rubi integration and MatchPy was able to match the subjects very quickly. However, when I added ~550 rules, there was significant decrease in speed while matching even smaller subjects.

    It would be great if you could advice us on how to solve this issue.

    Here are the files for our project:

    Patterns: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py Constraint: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/constraint.py All Rubi Files: https://github.com/parsoyaarihant/sympy/tree/rubi4/sympy/rubi

    opened by parsoyaarihant 21
  • Optional arguments in machpy

    Optional arguments in machpy

    I wanted to know if matchpy supports optional arguments during match.

    Mathematica: https://reference.wolfram.com/language/ref/Optional.html

    Example:

    >>> x = Symbol('x')
    >>> a_ = Wildcard.dot('a', optional=1)
    >>> a_free = CustomConstraint(lambda a, x: str(x) not in a.symbols)
    >>> x_ = Wildcard.dot('x')
    >>> pattern = Pattern(Mul(a_, x_), a_free)
    >>> next(match(x, pattern))
    {a: 1}
    
    opened by parsoyaarihant 20
  • Fixing the code generator

    Fixing the code generator

    The code generator enforces the reading of all variables whenever a constraint has been encountered. In this PR, the constraint is check only if all variables have been read.

    Furthermore, indentation has been changes to four spaces and a related bug forcing to always use bugs has been fixed as well.

    opened by Upabjojr 15
  • Repeated definition of the same CustomConstraint

    Repeated definition of the same CustomConstraint

    In the current RUBI rules patterns in SymPy, the same constraint is defined many times. For example: CustomConstraint(lambda a, x: FreeQ(a, x)) has its definition repeated more and more, like in:

    https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L22

    and in the next rule the same constraint is redefined: https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L26

    Python defines a new object for every new lambda definition, even if the functions return the same identical value.

    Does this practice have some potential negative consequences, like:

    • the size of the generated decision tree?
    • the overall speed of MatchPy?
    opened by Upabjojr 13
  • using `int` as Wildcard

    using `int` as Wildcard

    Hi, I am trying to define class ConstantWild (similar to ConstantSymbol you suggested earlier).

    Here is the code:

    from matchpy import Wildcard
    
    class ConstantWild(Wildcard):
        def __init__(self, value):
            super(self.__class__, self).__init__(min_count=1, fixed_size=True, variable_name=str(value))
            self.value = value
    
        def __str__(self):
            return str(self.value)
    

    The above code have same attributes as Wildcard.dot but I don't understand why it is giving errors.

    opened by parsoyaarihant 9
  • Custom sorting key for substitute( )

    Custom sorting key for substitute( )

    Substitute now accepts custom sorting key to specify custom sorting for elements in Multiset.

    This should solve the compatibility problem with SymPy which does not allow comparison operators (>, >=, <, <=) to be evaluated to booleans if the truth of the expression cannot be easily determined.

    opened by Upabjojr 8
  • Need Help in code generation.

    Need Help in code generation.

    I could not find any documentation related to code generation. Last year @wheerd had written a code generation script for rubi in sympy. https://gist.github.com/wheerd/13ac5a0e9b560db6201d96136fa0bbcc

    But code generated was huge so, It cannot be used. This year I have removed the repeated definition of constraints. So code generation should work probably.

    However, I am unable to write the code generation script. Here is the new structure of rubi in sympy. Can someone update the gist accordingly?

    rules = [] #list to keep track which rules has been applied
    def cons_f1(m, x):
        return FreeQ(m, x)
    cons1 = CustomConstraint(cons_f1)
    
    def cons_f2(m):
        return NonzeroQ(m + S(1))
    cons2 = CustomConstraint(cons_f2)
    
    pattern1 = Pattern(Integral(x_**WC('m', S(1)), x_), cons1, cons2)
    def replacement1(m, x):
        rules.append(1)
        return Simp(x**(m + S(1))/(m + S(1)), x)
    rule1 = ReplacementRule(pattern1, replacement1)
    
    help wanted question 
    opened by ashishkg0022 7
  • Alternate way to define constraint

    Alternate way to define constraint

    I wanted to know if there is alternatative way to define constraint in matchpy. Example:

    
    def FreeQ(vars, x): # returns True of any `var` contains `x`
    	if any(i.contains(x) for i in vars):
    		return False
    	return True
    
    def NonzeroQ(e): # returns True if `e` is not zero.
    	return e != 0
    
    pattern = Pattern(Int(Pow(Add(a_, Mul(b_, x_)), m_), x_), FreeQ(list(a, b, m), x), NonzeroQ(Add(m, 1)))
    

    Thanks.

    opened by parsoyaarihant 7
  • GPL'd dependency: hopcroftkarp library

    GPL'd dependency: hopcroftkarp library

    The hopcroftkarp library upon which this project depends is GPL'd.

    I have re-implemented the Hopcroft-Karp algorithm in C++ as part of an attempt to make MatchPy usable in SymEngine (that is, SymPy's core rewritten in C++). I didn't perform extensive testing, so there may be bugs. That code can be translated into Python.

    enhancement 
    opened by Upabjojr 6
  • FreeQ equivalent in MatchPy?

    FreeQ equivalent in MatchPy?

    Mathematica has FreeQ to test whether an expression contains a symbol. This is very useful in pattern matching for equations as you can specify that, for example, in a * x + b == 0 the variables a and b should not contain the variable x.

    In SymPy we are currently using things like CustomConstraint(lambda a, x: not a.has(x)), where a.has(x) is a SymPy expression that tells you if x is contained in the expression tree of a.

    Would it make sense to add an optimized FreeQ-like tester that checks whether the variable is contained in the expression during the matching iteration of MatchPy?

    opened by Upabjojr 1
  • Optional wildcards to match the identity element of the current node

    Optional wildcards to match the identity element of the current node

    When using the optional=value argument in Wildcard, one can specify which value to get in case no matching is found. The value has to be given for every wildcard.

    In case of addition and multiplication, the usual optional values are the identity elements, zero and one respectively.

    It would be nice to have the possibility to have the optional value to return the identity element of the current node, if available.

    enhancement 
    opened by Upabjojr 7
  • substitute not working with SymPy

    substitute not working with SymPy

    Calling sorted( ) in https://github.com/HPAC/matchpy/blob/5cae3f275e3a1f725518516bf3c700dc3be03a56/matchpy/functions.py#L87 fails with SymPy, as SymPy objects are not comparable with operators (<, <=, >, >=). Operators are overloaded in SymPy to create instances of inequality objects.

    SymPy has a function called default_sort_key meant to deal with this problem:

    from sympy import symbols
    x, y, z = symbols("x y z")
    from sympy.core.compatibility import default_sort_key
    sorted([z, x, y], key=default_sort_key)
    

    The question is, is it possible to modify MatchPy to specify a custom sorting key so that SymPy can specify the way its objects should be sorted?

    Or is it an issue with Multiset?

    opened by Upabjojr 11
  • One-to-one matching: symbols with variable_name do not work in commutative functions

    One-to-one matching: symbols with variable_name do not work in commutative functions

    In one-to-one matching, symbols that have a variable_name do not work in commutative functions. They seem to work correctly in many-to-one matching. Example:

    from matchpy import Operation, Symbol, Arity, Pattern, match, ManyToOneMatcher
    
    f = Operation.new('f', Arity.binary, commutative=True)
    a_x = Symbol('a', variable_name='x')
    a = Symbol('a')
    b = Symbol('b')
    
    subject = f(a, b)
    pattern = Pattern(f(a_x, b))
    
    print(list(match(subject, pattern)))
    
    matcher = ManyToOneMatcher(pattern)
    print(list(matcher.match(subject)))
    

    This results in

    []
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    but it should result in

    [{'x': Symbol('a')}]
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    So far, we don't have any tests that verify this functionality. Once it is fixed, we should add some.

    bug 
    opened by hbarthels 0
  • Match arbitrary function

    Match arbitrary function

    Is there any way to match a function with a certain number of arguments? Suppose I've defined a binary function sum, is there a way to create a pattern f(a, b) that matches sum(x, y)?

    enhancement 
    opened by sidmani 4
  • Improve checking correctness of input

    Improve checking correctness of input

    To avoid issues such as #60, we should check that objects are created and functions are used correctly, at least in those cases where so far, incorrect input does not produce any errors.

    enhancement 
    opened by hbarthels 0
Releases(0.5.5)
Owner
High-Performance and Automatic Computing
Umeå University & RWTH Aachen
High-Performance and Automatic Computing
Pattern Matching for Python 3.7+ in a simple, yet powerful, extensible manner.

Awesome Pattern Matching (apm) for Python pip install awesome-pattern-matching Simple Powerful Extensible Composable Functional Python 3.7+, PyPy3.7+

Julian Fleischer 97 Nov 3, 2022
Pampy: The Pattern Matching for Python you always dreamed of.

Pampy: Pattern Matching for Python Pampy is pretty small (150 lines), reasonably fast, and often makes your code more readable and hence easier to rea

Claudio Santini 3.5k Dec 30, 2022
Minimal, super readable string pattern matching for python.

simplematch Minimal, super readable string pattern matching for python. import simplematch simplematch.match("He* {planet}!", "Hello World!") >>> {"p

Thomas Feldmann 147 Dec 1, 2022
Really bad lisp implementation. Fun with pattern matching.

Lisp-py This is a horrible, ugly interpreter for a trivial lisp. Don't use it. It was written as an excuse to mess around with the new pattern matchin

Erik Derohanian 1 Nov 23, 2021
A simple script written using symbolic python that takes as input a desired metric and automatically calculates and outputs the Christoffel Pseudo-Tensor, Riemann Curvature Tensor, Ricci Tensor, Scalar Curvature and the Kretschmann Scalar

A simple script written using symbolic python that takes as input a desired metric and automatically calculates and outputs the Christoffel Pseudo-Tensor, Riemann Curvature Tensor, Ricci Tensor, Scalar Curvature and the Kretschmann Scalar

null 2 Nov 27, 2021
A collection of common regular expressions bundled with an easy to use interface.

CommonRegex Find all times, dates, links, phone numbers, emails, ip addresses, prices, hex colors, and credit card numbers in a string. We did the har

Madison May 1.5k Dec 31, 2022
poetry2nix turns Poetry projects into Nix derivations without the need to actually write Nix expressions

poetry2nix poetry2nix turns Poetry projects into Nix derivations without the need to actually write Nix expressions. It does so by parsing pyproject.t

Nix community projects 405 Dec 29, 2022
Python-Kite: Simple python code to make kite pattern

Python-Kite Simple python code to make kite pattern. Getting Started These instr

Anoint 0 Mar 22, 2022
ripgrep recursively searches directories for a regex pattern while respecting your gitignore

ripgrep (rg) ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern. By default, ripgrep will resp

Andrew Gallant 35k Dec 31, 2022
Hacktoberfest2021 🥳- Contribute Any Pattern In Any Language😎 Every PR will be accepted Pls contribute

✨ Hacktober Fest 2021 ✨ ?? All Contributors are requested to star this repo and follow me for a successful merge of pull request. ?? ?? Add any patter

Md. Almas Ali 103 Jan 7, 2023
A repo to record how I prepare my Interview, and really hope it can help you as well. Really appreciate Kieran's help in the pattern's part.

Project Overview The purpose of this repo is to help others to find solutions and explaintion I will commit a solution and explanation to every proble

Vincent Zhenhao ZHAO 1 Nov 29, 2021
🍕 A small app with capabilities ordering food and listing them with pub/sub pattern

food-ordering A small app with capabilities ordering food and listing them. Prerequisites Docker Run Tests docker-compose run --rm web ./manage.py tes

Muhammet Mücahit 1 Jan 14, 2022
KiCad bus length matching script.

KiBus length matching script This script implements way to monitor multiple nets, combined into a bus that needs to be length matched

Piotr Esden-Tempski 22 Mar 17, 2022
JLC2KICAD_lib is a python script that generate a component library for KiCad from the JLCPCB/easyEDA library.

JLC2KiCad_lib is a python script that generate a component library (schematic, footprint and 3D model) for KiCad from the JLCPCB/easyEDA library. This script requires Python 3.6 or higher.

Nicolas Toussaint 73 Dec 26, 2022
K2HASH Python library - NoSQL Key Value Store(KVS) library

k2hash_python Overview k2hash_python is an official python driver for k2hash. Install Firstly you must install the k2hash shared library: curl -o- htt

Yahoo! JAPAN 3 Oct 19, 2022
Msgpack serialization/deserialization library for Python, written in Rust using PyO3 and rust-msgpack. Reboot of orjson. msgpack.org[Python]

ormsgpack ormsgpack is a fast msgpack library for Python. It is a fork/reboot of orjson It serializes faster than msgpack-python and deserializes a bi

Aviram Hassan 139 Dec 30, 2022
PyPIContents is an application that generates a Module Index from the Python Package Index (PyPI) and also from various versions of the Python Standard Library.

PyPIContents is an application that generates a Module Index from the Python Package Index (PyPI) and also from various versions of the Python Standar

Collage Labs 10 Nov 19, 2022
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis

Mahmoud Hashemi 6k Jan 6, 2023