Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing

Overview

PySR

(pronounced like py as in python, and then sur as in surface)

Parallelized symbolic regression built on Julia, and interfaced by Python. Uses regularized evolution, simulated annealing, and gradient-free optimization.

Cite this software

Documentation

Check out SymbolicRegression.jl for the pure-Julia backend of this package.

Symbolic regression is a very interpretable machine learning algorithm for low-dimensional problems: these tools search equation space to find algebraic relations that approximate a dataset.

One can also extend these approaches to higher-dimensional spaces by using a neural network as proxy, as explained in arXiv:2006.11287, where we apply it to N-body problems. Here, one essentially uses symbolic regression to convert a neural net to an analytic equation. Thus, these tools simultaneously present an explicit and powerful way to interpret deep models.

Backstory:

Previously, we used Eureqa, a very efficient and user-friendly tool. However, Eureqa is GUI-only, doesn't allow user-defined operators, has no distributed capabilities, and has become proprietary (and was recently merged into an online service). Thus, the goal of this package is to provide an open-source symbolic regression tool as efficient as Eureqa, while also exposing a configurable Python interface.

Installation

PySR uses both Julia and Python, so you need to have both installed.

Install Julia - see downloads, and then instructions for mac and linux. (Don't use the conda-forge version; it doesn't seem to work properly.)

You can install PySR with:

pip install pysr

The first launch will automatically install the Julia packages required.

Quickstart

Here is some demo code (also found in example.py):

import numpy as np
from pysr import pysr, best

# Dataset
X = 2*np.random.randn(100, 5)
y = 2*np.cos(X[:, 3]) + X[:, 0]**2 - 2

# Learn equations
equations = pysr(X, y, niterations=5,
    binary_operators=["plus", "mult"],
    unary_operators=[
      "cos", "exp", "sin", #Pre-defined library of operators (see https://pysr.readthedocs.io/en/latest/docs/operators/)
      "inv(x) = 1/x"]) # Define your own operator! (Julia syntax)

# ... (you can press Ctrl-C to exit early)

print(best(equations))

which gives:

x0**2 + 2.000016*cos(x3) - 1.9999845

One can also use best_tex to get the LaTeX form, or best_callable to get a function you can call. This uses a score which balances complexity and error; however, one can see the full list of equations with:

print(equations)

This is a pandas table, with additional columns:

  • MSE - the mean square error of the formula
  • score - a metric akin to Occam's razor; you should use this to help select the "true" equation.
  • sympy_format - sympy equation.
  • lambda_format - a lambda function for that equation, that you can pass values through.

Comments
  • Add Support for Arbitrary Precision Arithmetic with BigFloat

    Is your feature request related to a problem? Please describe.

    I tried running 'pysr' on a 1,000-row array with 4 integer input variables and one integer output variable: a Goedel number.

    From Mathematica:

    GoedelNumber[l_List] := Times @@ MapIndexed[Prime[First[#2]]^#1 &, l]
    

    E.g.

    Data file:
    # 7	1	5	8	6917761200000
    
    julia> 2^7*3^1*5^5*7^8
    6917761200000
    

    The model returned:

    Complexity  Loss       Score     Equation
    1           Inf       NaN       0.22984365
    
    

    I am just learning 'pysr' and maybe it's just 'user error'. However, Inf and NaN suggest that Goedel numbers may exceed Float64.

    Describe the solution you'd like

    Not sure what happened, because the largest Goedel number in the input is: 1.6679880978201e+23
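
As a hedged sanity check of the numbers above, the sketch below recomputes the example Goedel number with exact Python integers and compares the reported maximum against common numeric limits. The helper name `goedel_number` is hypothetical, and the 32-bit case is only an assumption about how the data might be stored; nothing here confirms it as the cause of the Inf loss.

```python
import numpy as np

PRIMES = [2, 3, 5, 7]  # first four primes, one per input column


def goedel_number(exponents):
    """Product of the i-th prime raised to the i-th exponent (exact ints)."""
    result = 1
    for prime, exponent in zip(PRIMES, exponents):
        result *= prime ** exponent
    return result


example = goedel_number([7, 1, 5, 8])  # the row quoted from the data file

largest = 1.6679880978201e+23  # largest target value reported in the issue
int64_max = np.iinfo(np.int64).max
fits_int64 = largest <= int64_max

# Square the value, as a squared-error loss would, in two precisions.
# (Assumption: the search might run in 32-bit floats.)
with np.errstate(over="ignore"):
    squared_f32 = np.float32(largest) * np.float32(largest)
    squared_f64 = np.float64(largest) * np.float64(largest)

print(example, fits_int64, squared_f32, squared_f64)
```

The reported maximum is far below the Float64 limit, but it does not fit in a 64-bit integer, and its square overflows a 32-bit float while remaining finite in 64 bits.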

    Additional context

    I didn't see any parameters to set 'verbose' mode or 'debugging' information.

    GoedelTableFourParameters.txt

    enhancement good first issue 
    opened by dbl001 35
  • [Windows] : Couldn't find equation file!

    Hi Miles,

    I've installed PySR alongside Julia on Windows 10. It runs until it crashes with the following message:

    File "C:\Users\Matthieu\anaconda3\lib\site-packages\pysr\sr.py", line 774, in get_hof raise RuntimeError("Couldn't find equation file! The equation search likely exited before a single iteration completed.")

    RuntimeError: Couldn't find equation file! The equation search likely exited before a single iteration completed.

    In the latest run, it reached 38% progress.

    I have to say that sometimes (not often) the process does complete.

    What is the reason for this?

    Also, is there a forum, or did I post in the right place?

    I thank you for your help.

    Regards

    Magaud

    bug 
    opened by Magaud59 27
  • JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions / prior installation with conda

    Describe the bug

    JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions [a40a106e]:
     DynamicExpressions [a40a106e] log:
     ├─DynamicExpressions [a40a106e] has no known versions!
     └─restricted to versions 0.4 by SymbolicRegression [8254be44] — no versions left
       └─SymbolicRegression [8254be44] log:
         ├─possible versions are: 0.14.4 or uninstalled
         └─SymbolicRegression [8254be44] is fixed to version 0.14.4' occurred while calling julia code:
    Pkg.add([sr_spec, clustermanagers_spec], io=stderr)
    

    Version (please include the following information):

    • OS: macOS Ventura 13.0.1 (22A400)
    • Julia version (from julia --version): 1.8.3
    • Python version (from python --version): 3.8.13
    • Installed with pip or conda: pip

    $ conda list pysr
    # packages in environment at /Users/davidlaxer/anaconda3/envs/ai:
    #
    # Name                    Version                   Build  Channel
    pysr                      0.11.11                  pypi_0    pypi
    
    % pip show pysr
    Name: pysr
    Version: 0.11.11
    Summary: Simple and efficient symbolic regression
    Home-page: https://github.com/MilesCranmer/pysr
    Author: Miles Cranmer
    Author-email: [email protected]
    License: 
    Location: /Users/davidlaxer/anaconda3/envs/ai/lib/python3.8/site-packages
    Requires: julia, numpy, pandas, scikit-learn, sympy
    Required-by: 
    
    • PySR version [Run python -c 'import pysr; print(pysr.__version__)']
    • 0.9.1
    • Does the bug still appear with the latest version of PySR?

    Julia Version 1.8.3
    Commit 0434deb161e (2022-11-14 20:14 UTC)
    Platform Info:
      OS: macOS (x86_64-apple-darwin21.4.0)
      uname: Darwin 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct  9 20:14:54 PDT 2022; root:xnu-8792.41.9~2/RELEASE_X86_64 x86_64 i386
      CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz: 
                     speed         user         nice          sys         idle          irq
           #1-16  3800 MHz    7543546 s          0 s    3955434 s   72076495 s          0 s
      Memory: 128.0 GB (32470.4921875 MB free)
      Uptime: 951050.0 sec
      Load Avg:  8.20068359375  5.13525390625  4.3212890625
      WORD_SIZE: 64
      LIBM: libopenlibm
      LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
      Threads: 1 on 16 virtual cores
    Environment:
      JULIA_DEPOT_PATH_BACKUP = 
      JULIA_PROJECT_BACKUP = 
      JULIA_LOAD_PATH_BACKUP = 
      JULIA_DEPOT_PATH = /Users/davidlaxer/anaconda3/envs/ai/share/julia:
      JULIA_SSL_CA_ROOTS_PATH_BACKUP = 
      JULIA_SSL_CA_ROOTS_PATH = 
      JULIA_PROJECT = @pysr-0.11.11
      TERM = xterm-color
      PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/bin:/Users/davidlaxer/.juliaup/bin:/Users/davidlaxer/.cabal/bin:/Users/davidlaxer/.ghcup/bin:/Users/davidlaxer/anaconda3/envs/ai/bin:/Users/davidlaxer/anaconda3/condabin:/opt/local/bin:/opt/local/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin:/Users/davidlaxer/.cargo/bin:/Users/jetbrains/.local/bin
      XPC_FLAGS = 0x0
      HOME = /Users/davidlaxer
      JAVA_HOME = :-
      JAVA_LD_LIBRARY_PATH = :-
      CAML_LD_LIBRARY_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/stublibs:/Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/ocaml/stublibs:/Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/ocaml
      OCAML_TOPLEVEL_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/toplevel
      PKG_CONFIG_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/pkgconfig:
      CONDA_BACKUP_FFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
      CONDA_BACKUP_FORTRANFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
      CONDA_BACKUP_DEBUG_FFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
      CONDA_BACKUP_DEBUG_FORTRANFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
    [ Info: Julia version info
    [ Info: Julia executable: /Users/davidlaxer/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/bin/julia
    [ Info: Trying to import PyCall...
    ┌ Info: PyCall is already installed and compatible with Python executable.
    │ 
    │ PyCall:
    │     python: /Users/davidlaxer/anaconda3/envs/ai/bin/python
    │     libpython: /Users/davidlaxer/anaconda3/envs/ai/lib/libpython3.8.dylib
    │ Python:
    │     python: /Users/davidlaxer/anaconda3/envs/ai/bin/python
    └     libpython: 
       Resolving package versions...
    ---------------------------------------------------------------------------
    JuliaError                                Traceback (most recent call last)
    Input In [5], in <cell line: 4>()
          1 get_ipython().system('export JULIA_SSL_CA_ROOTS_PATH=""')
          2 import pysr
    ----> 4 pysr.install()
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/pysr/julia_helpers.py:87, in install(julia_project, quiet)
         83 io_arg = _get_io_arg(quiet)
         85 if is_shared:
         86     # Install SymbolicRegression.jl:
    ---> 87     _add_sr_to_julia_project(Main, io_arg)
         89 Main.eval("using Pkg")
         90 Main.eval(f"Pkg.instantiate({io_arg})")
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/pysr/julia_helpers.py:240, in _add_sr_to_julia_project(Main, io_arg)
        230 Main.sr_spec = Main.PackageSpec(
        231     name="SymbolicRegression",
        232     url="https://github.com/MilesCranmer/SymbolicRegression.jl",
        233     rev="v" + __symbolic_regression_jl_version__,
        234 )
        235 Main.clustermanagers_spec = Main.PackageSpec(
        236     name="ClusterManagers",
        237     url="https://github.com/JuliaParallel/ClusterManagers.jl",
        238     rev="14e7302f068794099344d5d93f71979aaf4fbeb3",
        239 )
    --> 240 Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:627, in Julia.eval(self, src)
        625 if src is None:
        626     return None
    --> 627 ans = self._call(src)
        628 if not ans:
        629     return None
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:555, in Julia._call(self, src)
        553 # logger.debug("_call(%s)", src)
        554 ans = self.api.jl_eval_string(src.encode('utf-8'))
    --> 555 self.check_exception(src)
        557 return ans
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:609, in Julia.check_exception(self, src)
        607 else:
        608     exception = sprint(showerror, self._as_pyobj(res))
    --> 609 raise JuliaError(u'Exception \'{}\' occurred while calling julia code:\n{}'
        610                  .format(exception, src))
    
    JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions [a40a106e]:
     DynamicExpressions [a40a106e] log:
     ├─DynamicExpressions [a40a106e] has no known versions!
     └─restricted to versions 0.4 by SymbolicRegression [8254be44] — no versions left
       └─SymbolicRegression [8254be44] log:
         ├─possible versions are: 0.14.4 or uninstalled
         └─SymbolicRegression [8254be44] is fixed to version 0.14.4' occurred while calling julia code:
    Pkg.add([sr_spec, clustermanagers_spec], io=stderr)
    
     % julia
                   _
       _       _ _(_)_     |  Documentation: https://docs.julialang.org
      (_)     | (_) (_)    |
       _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
      | | | | | | |/ _` |  |
      | | |_| | | | (_| |  |  Version 1.8.3 (2022-11-14)
     _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
    |__/                   |
    
    julia> using Pkg
    
    julia> Pkg.add("DynamicExpressions")
    ERROR: The following package names could not be resolved:
     * DynamicExpressions (not found in project, manifest or registry)
    Stacktrace:
      [1] pkgerror(msg::String)
        @ Pkg.Types ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/Types.jl:67
      [2] ensure_resolved(ctx::Pkg.Types.Context, manifest::Pkg.Types.Manifest, pkgs::Vector{Pkg.Types.PackageSpec}; registry::Bool)
        @ Pkg.Types ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/Types.jl:952
      [3] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:264
      [4] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:156
      [5] add(pkgs::Vector{Pkg.Types.PackageSpec})
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:145
      [6] #add#27
        @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
      [7] add
        @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
      [8] #add#26
        @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:143 [inlined]
      [9] add(pkg::String)
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:143
     [10] top-level scope
        @ REPL[2]:1
    
    julia> 
    
    

    The code works properly on Google Colab.

    bug 
    opened by dbl001 26
  • Refactor of PySRRegressor

    Re Issue #143

    Compatibility with scikit-learn should be improved.

    Notable breaking change for users: PySRRegressor.equations is now called PySRRegressor.equations_.
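
The rename presumably follows the scikit-learn convention that attributes learned during fit carry a trailing underscore. A minimal sketch of that convention in plain Python, using a hypothetical `ToyRegressor` class that is not part of PySR:

```python
class ToyRegressor:
    """Hypothetical estimator illustrating the trailing-underscore convention."""

    def __init__(self, niterations=5):
        # Hyperparameters are stored under their constructor names.
        self.niterations = niterations

    def fit(self, X, y):
        # Anything learned from data gets a trailing underscore.
        mean_y = sum(y) / len(y)
        self.equations_ = [f"{mean_y:.3f}"]  # placeholder "equation": a constant
        return self


model = ToyRegressor()
fitted_before = hasattr(model, "equations_")  # attribute absent before fit
model.fit([[0.0], [1.0]], [1.0, 3.0])
fitted_after = hasattr(model, "equations_")   # attribute present after fit
print(fitted_before, fitted_after, model.equations_)
```

With this naming, code can tell an unfitted estimator from a fitted one by checking for the underscore attributes.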

    Tests have been updated to allow compatibility with the refactored code but still assess the same functionality. All tests should pass.

    Please let me know if there are any concerns or if you would like me to document/explain any of the changes in detail.

    opened by tttc3 24
  • [BUG] conda version breaking

    Edit: If you are seeing issues with the conda version, try updating PySR with conda update pysr. The new version fixes an issue related to automatic updating of Julia packages.


    The conda-forge jobs which test conda install -c conda-forge pysr are currently breaking. This is even with repeat attempts: https://github.com/MilesCranmer/PySR/actions/workflows/CI_conda_forge.yml. The error:

    ImportError: 
        Required dependencies are not installed or built.  Run the following code in the Python REPL:
    

    I find this strange, since the underlying feedstock has not changed in the meantime, and it seems like the julia feedstock hasn't been updated recently either.

    FYI @mkitti @ngam. I will try to look into this a bit later today.

    bug 
    opened by MilesCranmer 23
  • [Errno 2] No such file or directory

    I have installed pysr-0.6.12.post1 and have been trying to run example.py, but after working through some previously closed bug reports, a FileNotFoundError occurs. I'm using Windows 10 and Python 3.7; the Julia version is 1.6.2. The error message is the following:

    FileNotFoundError: [Errno 2] No such file or directory: 'hall_of_fame_2021-08-04_230410.180.csv.bkup'

    bug 
    opened by jzsmoreno 21
  • Performance speed-up options?

    Hello Miles! Thank you for open-sourcing this powerful tool! I am working on including PySR in my own research, and running into some performance bottlenecks.

    I found that regressing a simple equation (e.g. the quick-start example) takes roughly 2 minutes. Ideally, I am aiming to reduce that time to ~30 seconds. Would you give me some pointers on this? Meanwhile, I will try to break down the challenge into several pieces:

    1. Activating a new environment at each API call: I noticed that a new Julia (?) environment is created each time I call the pysr() API (see terminal output below). Could we keep the environment up so we can skip this process for subsequent calls?
    Running on julia -O3 /tmp/tmpe5qmgemh/runfile.jl
      Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
        Updating registry at `~/.julia/registries/General`
      No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Manifest.toml`
    Activating environment on workers.
      Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      Activating  Activating  environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
    environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      Activating  Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml` 
    environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
    Importing installed module on workers...Finished!
    Started!
    
    2. If the above wouldn't work, then allowing y to be vector-valued (as mentioned in #35) would be a second-best option! Even better would be a "batched" version of the pysr(X, y) API, pysr_batched(X, y), where X and y are Python lists and the results are returned in a list as well, so that we only generate one Julia script and call os.system() once to keep the Julia environment up.

    3. Multi-threading: I noticed that increasing procs from 4 to 8 resulted in a slightly longer running time. I am running on an 8-core, 16-thread CPU. Did I do something dumb?

    4. I went into pysr/sr.py and added the runtests=false flag on lines 438 and 440. That saved ~20 seconds.

    opened by yxie20 20
  • [Feature] LaTeX table generator

    This generates a booktabs-style LaTeX table for a subset of equations. Here is an example:

    import numpy as np
    from pysr import PySRRegressor
    
    X = 2 * np.random.randn(100, 5)
    y = 2.5382 * np.cos(X[:, 3]) + X[:, 0] ** 2 - 0.5
    
    model = PySRRegressor(
        niterations=80,
        binary_operators=["+", "*"],
        unary_operators=["cos"],
        model_selection="best",
        loss="loss(x, y) = (x - y)^2",  # Custom loss function (julia syntax)
        maxsize=11,
    )
    
    model.fit(X, y)
    
    print(model.latex_table(precision=3, include_score=True))
    

    The output of this is:

    \begin{table}[h]
    \begin{center}
    \begin{tabular}{@{}lccc@{}}
    \toprule
    Equation & Complexity & Loss & Score \\
    \midrule
    $3.9$ & 1 & 38.9 & 0 \\
    $x_{0}^{2}$ & 3 & 3.16 & 1.26 \\
    $x_{0}^{2} - 0.257$ & 5 & 3.09 & 0.0105 \\
    $x_{0}^{2} + \cos{\left(x_{3} \right)}$ & 6 & 1.26 & 0.898 \\
    $x_{0}^{2} + 2.44 \cos{\left(x_{3} \right)}$ & 8 & 0.245 & 0.818 \\
    $x_{0}^{2} + 2.54 \cos{\left(x_{3} \right)} - 0.5$ & 10 & 2.28e-13 & 13.9 \\
    \bottomrule
    \end{tabular}
    \end{center}
    \end{table}
    

    Leaving include_score set to False will leave out the Score column. Precision can be adjusted to have more or less precise constants.
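
The Score column in the table above appears consistent with the negative change in log-loss per unit of added complexity between consecutive rows. The sketch below checks that reading against the rounded values printed in the table; the formula is an assumption inferred from these numbers, not taken from the PySR source, and the function name is hypothetical.

```python
import math

# (complexity, loss, score) as printed in the LaTeX table above
rows = [
    (1, 38.9, 0.0),
    (3, 3.16, 1.26),
    (5, 3.09, 0.0105),
    (6, 1.26, 0.898),
    (8, 0.245, 0.818),
    (10, 2.28e-13, 13.9),
]


def log_loss_drop_per_complexity(prev, curr):
    """Assumed score: -(ln(loss_curr) - ln(loss_prev)) / (complexity_curr - complexity_prev)."""
    (c0, l0, _), (c1, l1, _) = prev, curr
    return -(math.log(l1) - math.log(l0)) / (c1 - c0)


# The first row has no predecessor, so its score is taken as 0.
recomputed = [0.0] + [
    log_loss_drop_per_complexity(prev, curr) for prev, curr in zip(rows, rows[1:])
]

for (complexity, _, printed), value in zip(rows, recomputed):
    print(f"complexity {complexity:2d}: printed {printed:<7g} recomputed {value:.3g}")
```

Small differences between the printed and recomputed values are expected, because the losses in the table are rounded to three significant digits.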

    One can render only a subset of equations by using latex_table([1, 4]) which only includes the 1st and 4th equation in model.equations_.


    Edit: it now renders the e-13 as \cdot 10^{-13}

    opened by MilesCranmer 19
  • Set JULIA_PROJECT, use Pkg.add once

    • Sets JULIA_PROJECT before loading pyjulia so that PyCall.jl can be contained within the pysr environment
    • Also use Pkg.add in a single step to add both SymbolicRegression.jl and ClusterManagers.jl to the environment at the same time

    I likely advised against using the environment variable JULIA_PROJECT in the past. However, I think this may be necessary to avoid interference from other projects if installed within the same environment.
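
A hedged sketch of the ordering described above: the environment variable has to be set before the Julia runtime is initialized. The project name `@pysr-0.11.11` mirrors the JULIA_PROJECT value shown in an environment dump elsewhere on this page; the helper `set_julia_project` is hypothetical, and the import is left commented out so the snippet runs without Julia installed.

```python
import os


def set_julia_project(pysr_version):
    """Point Julia at a named shared environment before pyjulia is loaded."""
    project = f"@pysr-{pysr_version}"
    os.environ["JULIA_PROJECT"] = project
    return project


project = set_julia_project("0.11.11")

# Only after the variable is set would the Julia bridge be imported, e.g.:
# from julia import Main  # requires pyjulia and a Julia installation
print(os.environ["JULIA_PROJECT"])
```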

    opened by mkitti 15
  • Windows support

    Hi Miles,

    first of all, this is awesome. Thanks so much for making this.

    A student I'm working with is trying to run PySR under Windows. Is that in principle supported?

    PySR's dependencies don't seem to have any issues with Windows, but pysr.pysr throws a FileNotFoundError when accessing '/tmp/.hyperparams_{rand_string}.hl'. This seems to be because of the different file system structure under Windows. If this is the only issue, how would you feel about using something like tempfile to generate temporary files in a more OS-independent way?
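
A minimal sketch of the tempfile idea, assuming the hyperparameter file only needs to live for the duration of one run. The file name and contents are hypothetical placeholders, not PySR's actual files.

```python
import tempfile
from pathlib import Path

# TemporaryDirectory picks a platform-appropriate location
# (e.g. /tmp on Linux, %TEMP% on Windows) and cleans up on exit.
with tempfile.TemporaryDirectory() as tmpdir:
    hyperparams_path = Path(tmpdir) / "hyperparams.jl"  # hypothetical name
    hyperparams_path.write_text("# placeholder contents\n")
    existed_during_run = hyperparams_path.exists()

removed_after_run = not hyperparams_path.exists()
print(existed_during_run, removed_after_run)
```

Because the directory is chosen by the standard library, no path such as /tmp is hard-coded.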

    I am happy to try this and open a PR once it works.

    Cheers, Johann

    implemented 
    opened by johannbrehmer 15
  • [Windows] Always returning the same equation?

    I don't know if this is a Windows issue or what (I work on a Linux partition, but I just wanted to play around with this; I haven't done serious work on Windows for 7 years or so, so I'm at a loss), but after fitting one equation, it always returns that equation, even with different data in a different notebook.

    I've looked to see if I could find the julia file it creates - nope. And they're different files every time.

    Any ideas?

    opened by JQVeenstra 14
  • [BUG] Pickling error on use of ReLU

    I see this error when I try to use the ReLU operator:

    PicklingError: Can't pickle relu: attribute lookup relu on __main__ failed
    

    seems like it's implemented in a way that can't be pickled. Should be an easy fix.

    bug 
    opened by MilesCranmer 1
  • [BUG] *Windows SystemError: <PyCall.jlwrap on basic example*

    I have done a fresh installation on windows (with pip) and I am running the basic example provided in the Introduction. I am getting a JULIA error. Thanks in advance for any help!

    Version:

    • Julia version [1.8.3]
    • Python version [3.10.6]
    • PySR version [0.11.11]

    Error message

    C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py:1257: UserWarning: Note: it looks like you are running in Jupyter. The progress bar will be turned off.
      warnings.warn(
    Traceback (most recent call last):
      File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
        exec(code, globals, locals)
      File "c:\users\gorth\untitled0.py", line 25, in <module>
        model.fit(X, y)
      File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py", line 1792, in fit
        self._run(X, y, mutated_params, weights=weights, seed=seed)
      File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py", line 1652, in run
        self.raw_julia_state = SymbolicRegression.EquationSearch(

    SystemError: <PyCall.jlwrap (in a Julia function called from Python)
    JULIA: SystemError: opening file "hall_of_fame_2022-12-17_011150.694.csv": Invalid argument
    Stacktrace:
      [1] systemerror(p::String, errno::Int32; extrainfo::Nothing)
        @ Base .\error.jl:176
      [2] #systemerror#80
        @ .\error.jl:175 [inlined]
      [3] systemerror
        @ .\error.jl:175 [inlined]
      [4] open(fname::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Bool, append::Nothing)
        @ Base .\iostream.jl:293
      [5] open(fname::String, mode::String; lock::Bool)
        @ Base .\iostream.jl:356
      [6] open(fname::String, mode::String)
        @ Base .\iostream.jl:355
      [7] open(::SymbolicRegression.var"#48#77"{Options{typeof(loss), Int64, 0.86, 10}, Vector{PopMember{Float32}}, SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}, ::String, ::Vararg{String}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
        @ Base .\io.jl:382
      [8] open
        @ .\io.jl:381 [inlined]
      [9] EquationSearch(::SymbolicRegression.CoreModule.ProgramConstantsModule.SRThreaded, datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{typeof(loss), Int64, 0.86, 10}, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing)
        @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:751
     [10] EquationSearch(datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{typeof(loss), Int64, 0.86, 10}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing)
        @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:383
     [11] EquationSearch(X::Matrix{Float32}, y::Matrix{Float32}; niterations::Int64, weights::Nothing, varMap::Vector{String}, options::Options{typeof(loss), Int64, 0.86, 10}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing, multithreaded::Nothing)
        @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:320
     [12] #EquationSearch#21
        @ C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:345 [inlined]
     [13] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Any, NTuple{8, Symbol}, NamedTuple{(:weights, :niterations, :varMap, :options, :numprocs, :parallelism, :saved_state, :addprocs_function), Tuple{Nothing, Int64, Vector{String}, Options{typeof(loss), Int64, 0.86, 10}, Nothing, String, Nothing, Nothing}}})
        @ Base .\essentials.jl:731
     [14] pyjlwrap_call(f::Function, args::Ptr{PyCall.PyObject_struct}, kw::Ptr{PyCall.PyObject_struct})
        @ PyCall C:\Users\gorth.julia\packages\PyCall\ygXW2\src\callback.jl:32
     [15] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
        @ PyCall C:\Users\gorth.julia\packages\PyCall\ygXW2\src\callback.jl:44>

    bug 
    opened by trifinos 13
  • Repeated CI failures on Windows

    Many of the Windows tests are now failing with various segmentation faults, which appear to be randomly triggered:

    • Nightly action: https://github.com/MilesCranmer/PySR/actions/workflows/CI_large_nightly.yml
    • PR action: https://github.com/MilesCranmer/PySR/pull/237

    They seem to occur more frequently on older versions of Julia, and rarely on Julia 1.8.3. Regardless, a segfault anywhere is cause for concern and should be tracked down.

    The errors include:

    1. Early segmentation fault (Julia 1.6.7) at first run, segfault during noise test (Julia 1.6.7 and others), as well as segfaults during warm start test.

    e.g., Windows:

     D:\a\_temp\221410f9-8bf7-4099-901d-eb9813d86c45.sh: line 1:  1098 Segmentation fault      python -m pysr.test main
    Started!
    
    also occurs on Ubuntu sometimes:
    signal (11): Segmentation fault
    in expression starting at none:0
    unknown function (ip: 0x7fd6a19bc215)
    unknown function (ip: 0x7fd6a19947ff)
    macro expansion at /home/runner/.julia/packages/PyCall/ygXW2/src/exception.jl:95 [inlined]
    convert at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:94
    pyjlwrap_getattr at /home/runner/.julia/packages/PyCall/ygXW2/src/pytype.jl:378
    unknown function (ip: 0x7fd68d30b1bd)
    unknown function (ip: 0x7fd6a19babda)
    unknown function (ip: 0x7fd6a198e9d4)
    pyisinstance at /home/runner/.julia/packages/PyCall/ygXW2/src/PyCall.jl:170 [inlined]
    pysequence_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:752
    pytype_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:773
    pytype_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:806 [inlined]
    convert at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:831
    julia_kwarg at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:19 [inlined]
    #57 at ./none:0 [inlined]
    iterate at ./generator.jl:47 [inlined]
    collect_to! at ./array.jl:728
    unknown function (ip: 0x7fd68d341d9a)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    collect_to! at ./array.jl:736
    unknown function (ip: 0x7fd68d33e35a)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    collect_to! at ./array.jl:736
    collect_to_with_first! at ./array.jl:706
    unknown function (ip: 0x7fd68d33d775)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    collect at ./array.jl:687
    unknown function (ip: 0x7fd68d33afb4)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    _pyjlwrap_call at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:31
    unknown function (ip: 0x7fd68d3348d5)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    pyjlwrap_call at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:44
    unknown function (ip: 0x7fd68d30aeee)
    unknown function (ip: 0x7fd6a19980c7)
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined]
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined]
    PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined]
    call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined]
    _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3537
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a199a1e0)
    unknown function (ip: 0x7fd6a19ed97b)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a19ecdf6)
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a199a1e0)
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a19ecdf6)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a199a28d)
    unknown function (ip: 0x7fd6a19ef9b1)
    unknown function (ip: 0x7fd6a19ebbb7)
    unknown function (ip: 0x7fd6a1997d4c)
    unknown function (ip: 0x7fd6a1998f2b)
    unknown function (ip: 0x7fd6a1a46421)
    unknown function (ip: 0x7fd6a199802f)
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined]
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined]
    PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined]
    call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined]
    _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3520
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a199a28d)
    unknown function (ip: 0x7fd6a19ef9b1)
    unknown function (ip: 0x7fd6a19ebbb7)
    unknown function (ip: 0x7fd6a1997d4c)
    unknown function (ip: 0x7fd6a1998f2b)
    unknown function (ip: 0x7fd6a1a46421)
    unknown function (ip: 0x7fd6a199802f)
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined]
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined]
    PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined]
    call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined]
    _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3520
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a19ecdf6)
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyEval_EvalCodeWithName at /home/runner/work/_temp/SourceCode/Python/ceval.c:4361
    unknown function (ip: 0x7fd6a19eb876)
    PyEval_EvalCode at /home/runner/work/_temp/SourceCode/Python/ceval.c:828
    unknown function (ip: 0x7fd6a1a6399f)
    cfunction_vectorcall_FASTCALL at /home/runner/work/_temp/SourceCode/Objects/methodobject.c:430
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a1a7fdd6)
    unknown function (ip: 0x7fd6a1a7faae)
    Py_BytesMain at /home/runner/work/_temp/SourceCode/Modules/main.c:731
    unknown function (ip: 0x7fd6a1642d8f)
    __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
    _start at python (unknown line)
    Allocations: 185387713 (Pool: 185351460; Big: 36253); GC: 470
    /home/runner/work/_temp/bdd49862-48fd-4e82-bed8-685329606248.sh: line 1:  2324 Segmentation fault      (core dumped) python -m pysr.test main
    
    1. Git errors: (Julia 1.8.2)
    PyCall is installed and built successfully.
         Cloning git-repo `https://github.com/MilesCranmer/SymbolicRegression.jl`
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/Users/runner/work/PySR/PySR/pysr/julia_helpers.py", line 87, in install
        _add_sr_to_julia_project(Main, io_arg)
      File "/Users/runner/work/PySR/PySR/pysr/julia_helpers.py", line 240, in _add_sr_to_julia_project
        Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")
      File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 627, in eval
        ans = self._call(src)
      File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 555, in _call
        self.check_exception(src)
      File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 609, in check_exception
        raise JuliaError(u'Exception \'{}\' occurred while calling julia code:\n{}'
    julia.core.JuliaError: Exception 'failed to clone from https://github.com/MilesCranmer/SymbolicRegression.jl, error: GitError(Code:ERROR, Class:Net, SecureTransport error: connection closed via error)' occurred while calling julia code:
    Pkg.add([sr_spec, clustermanagers_spec], io=stderr)
    
    1. Access errors during scikit-learn tests (these ones don't even fail the CI, which is a bit worrisome)

    e.g.,

    Failed check_fit2d_predict1d with:
        Traceback (most recent call last):
          File "D:\a\PySR\PySR\pysr\test\test.py", line 671, in test_scikit_learn_compatibility
            check(model)
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\sklearn\utils\_testing.py", line 188, in wrapper
            return fn(*args, **kwargs)
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\sklearn\utils\estimator_checks.py", line 1300, in check_fit2d_predict1d
            estimator.fit(X, y)
          File "D:\a\PySR\PySR\pysr\sr.py", line 1792, in fit
            self._run(X, y, mutated_params, weights=weights, seed=seed)
          File "D:\a\PySR\PySR\pysr\sr.py", line 1493, in _run
            Main = init_julia(self.julia_project, julia_kwargs=julia_kwargs)
          File "D:\a\PySR\PySR\pysr\julia_helpers.py", line 180, in init_julia
            Julia(**julia_kwargs)
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\julia\core.py", line 519, in __init__
            self._call("const PyCall = Base.require({0})".format(PYCALL_PKGID))
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\julia\core.py", line 554, in _call
            ans = self.api.jl_eval_string(src.encode('utf-8'))
        OSError: exception: access violation reading 0x000001BC1C501000
    
    1. Torch errors.

    One other curious thing is that this error is raised on some Windows tests (https://github.com/MilesCranmer/PySR/actions/runs/3664894286/jobs/6195713513). But, this should not take place...

    Run python -m pysr.test torch
    D:\a\PySR\PySR\pysr\julia_helpers.py:139: UserWarning: `torch` was loaded before the Julia instance started. This may cause a segfault when running `PySRRegressor.fit`. To avoid this, please run `pysr.julia_helpers.init_julia()` *before* importing `torch`. For updates, see https://github.com/pytorch/pytorch/issues/78829
      warnings.warn(
    D:\a\_temp\8727c9f4-d0f6-4345-84e6-e774762771ab.sh: line 1:   258 Segmentation fault      python -m pysr.test torch
    Started!
    
    opened by MilesCranmer 11
  • Raise warning on statically-linked Python binaries

    Time-to-first-search is very slow on statically-linked versions of Python (such as those packaged with conda), because precompiled code cannot be used and everything is compiled from scratch. I think this adds some friction to the user experience, so this PR introduces a warning that recommends the user try pyenv if startup time is important.

    When https://github.com/JuliaPy/pyjulia/issues/496 is solved, this warning is no longer needed.

    See https://github.com/conda-forge/python-feedstock/issues/222 for the discussion on the conda page.

    opened by MilesCranmer 3
  • [Feature] Install with CLI

    Right now you install SymbolicRegression.jl using python -c 'import pysr; pysr.install()'. However, this is a bit of spooky action at a distance, because you can't quite be sure which pysr is actually being called. Thus, it would be great if there were a CLI, similar to how testing is done with python -m pysr.test main. For example:

    python -m pysr.install
    

    If anybody wants to add this, I'd be more than happy to accept a PR!
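A minimal sketch of what such an entry point could look like, assuming it lives in a hypothetical pysr/install.py. Only the install function in pysr/julia_helpers.py is taken from the tracebacks on this page; the module layout, the name main, and the installer parameter are assumptions made for illustration.

```python
"""Hypothetical pysr/install.py, so that `python -m pysr.install` would work."""


def main(installer=None):
    """Run the installer and return a process exit code.

    `installer` is an injection point (an assumption of this sketch) so the
    entry point can be exercised without a Julia installation.
    """
    if installer is None:
        # Imported lazily, so it resolves against the same interpreter that
        # was invoked with `-m`, which is the point of the proposal.
        from pysr.julia_helpers import install

        installer = install
    installer()
    return 0


# In the real module one would finish with:
#     if __name__ == "__main__":
#         raise SystemExit(main())
```

Because the import happens inside the running interpreter, there is no ambiguity about which pysr installation is being set up.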

    enhancement 
    opened by MilesCranmer 0
Releases(v0.11.11)
  • v0.11.11(Nov 22, 2022)

    What's Changed

    • Make Julia startup options configurable; set optimize=3 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/228

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.10...v0.11.11

  • v0.11.10(Nov 21, 2022)

    What's Changed

    • Clean up dockerfile by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/223
    • Update backend version with improved resource monitoring by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/227

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.9...v0.11.10

  • v0.11.9(Nov 5, 2022)

    What's Changed

    • Refactor testing suite to have CLI by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/221

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.8...v0.11.9

  • v0.11.8(Nov 4, 2022)

    What's Changed

    • Fix PyCall not giving traceback by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/218
    • Fixed safe operators; make progress bar print to stderr by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/219

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.7...v0.11.8

  • v0.11.7(Nov 4, 2022)

    What's Changed

    • Expand nightly conda-forge tests to other Python versions by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/212
    • Clean up parameter groupings in docs by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/214
    • Add optimization-as-mutation, and adaptive parsimony by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/217

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.6...v0.11.7

  • v0.11.6(Oct 31, 2022)

    What's Changed

    • Speed up evaluation with turbo parameter by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/208

    https://user-images.githubusercontent.com/7593028/199054602-7ad19e87-19ff-4440-aa09-da6d7b6175d5.mp4

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.5...v0.11.6

  • v0.11.5(Oct 24, 2022)

    What's Changed

    • 30-50% Faster evaluation, and perform explicit version assertion for backend by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/205

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.4...v0.11.5

  • v0.11.4(Oct 10, 2022)

    What's Changed

    • Fix conda forge installs by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/202

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.3...v0.11.4

  • v0.11.3(Oct 6, 2022)

    What's Changed

    • Faster evaluation for constant sub-expressions (SymbolicRegression.jl#129)
    • Will now check variable names for spaces and other non-alphanumeric characters, aside from underscores. Before this would only raise an issue after a search, when trying to pickle the saved data.
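The kind of check described above can be sketched with a regular expression. This is an illustration only, not PySR's actual code: the function name invalid_variable_names is hypothetical, and restricting names to ASCII letters, digits, and underscores is an assumption based on the wording of the note.

```python
import re

# ASCII letters, digits, and underscores only, per the release note.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def invalid_variable_names(names):
    """Return the names that contain anything besides alphanumerics or '_'."""
    return [name for name in names if not _VALID_NAME.fullmatch(name)]
```

Running such a check before the search starts surfaces a bad name immediately, instead of only when the results are pickled afterwards.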

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.2...v0.11.3

  • v0.11.2(Sep 28, 2022)

  • v0.11.1-1(Sep 26, 2022)

    What's Changed

    • Added Customization page in the docs for tweaking the backend's loss function and constraints.
    • Adding two entries to papers.yml by @JayWadekar in https://github.com/MilesCranmer/PySR/pull/192
    • Explicitly deprecate Julia <= 1.5 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/194
    • Allow custom shared projects for julia_project by @MilesCranmer @mkitti in https://github.com/MilesCranmer/PySR/pull/197
      • e.g., this would allow you to run with @my-project and it will set up a shared Julia project under my-project (in the environments dir)

    New Contributors

    • @JayWadekar made their first contribution in https://github.com/MilesCranmer/PySR/pull/192

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.0...v0.11.1-1

  • v0.11.0(Sep 11, 2022)

    What's Changed

    • Update backend https://github.com/MilesCranmer/PySR/pull/191
      • Includes high-precision constants when precision=64
      • Enables datasets with zero variance (to allow fitting a constant)
      • Changes, e.g., abs(x)^y to x^y, with expressions avoided altogether for invalid input. This is because the former would sometimes give weird functional forms by exploiting the cusp at x=0. Thanks to @johanbluecreek.
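To make the last change concrete, here is a small Python sketch contrasting the two behaviours. It is not the backend's Julia code: the function names are hypothetical, and the choice of which inputs count as invalid (negative base with a non-integer exponent, zero base with a negative exponent) is an assumption of this sketch.

```python
import math


def pow_abs(x, y):
    """Old behaviour described above: force the base to be non-negative."""
    return abs(x) ** y


def pow_strict(x, y):
    """Plain x**y, returning NaN where the result is not a real number."""
    if x < 0 and y != math.floor(y):
        return math.nan  # negative base with a non-integer exponent
    if x == 0 and y < 0:
        return math.nan  # would divide by zero
    return x ** y
```

With pow_abs, an expression such as x^0.5 silently becomes |x|^0.5 and gains a cusp at x=0; with pow_strict, the same expression evaluates to NaN on negative inputs, so a search can discard it instead of exploiting the cusp.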

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.4...v0.11.0

  • v0.10.4-1(Sep 8, 2022)

    What's Changed

    • Fix install for Julia <=1.6 by @MilesCranmer @mkitti in https://github.com/MilesCranmer/PySR/pull/188
      • PyJulia will now launch directly into the shared pysr-{version} environment, rather than activating it later.

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.3...v0.10.4

  • v0.10.3(Sep 6, 2022)

    What's Changed

    • Displays a warning message when PyTorch is imported before PyJulia starts. See https://github.com/pytorch/pytorch/issues/78829. The only current solution is to start Julia beforehand.
    • New docs! Built using Material for MkDocs.
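The import-order warning from this release can be thought of as a check on which modules are already loaded. The sketch below is an assumption about how such a check might look, not PySR's actual implementation; the function name warn_if_torch_loaded and its modules parameter are hypothetical.

```python
import sys
import warnings


def warn_if_torch_loaded(modules=None):
    """Warn when `torch` is already imported; meant to run before Julia starts.

    `modules` defaults to `sys.modules`; it is a parameter only so the check
    can be tried without PyTorch installed.
    """
    modules = sys.modules if modules is None else modules
    if "torch" in modules:
        warnings.warn(
            "`torch` was loaded before the Julia instance started; "
            "start Julia first, then import torch."
        )
        return True
    return False
```

In user code, the workaround quoted in the warning text itself is to call pysr.julia_helpers.init_julia() before importing torch.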
  • v0.10.2(Sep 6, 2022)

    What's Changed

    • Set JULIA_PROJECT, use Pkg.add once by @mkitti in https://github.com/MilesCranmer/PySR/pull/186

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.1...v0.10.2

  • v0.10.1(Sep 6, 2022)

  • v0.10.0(Aug 14, 2022)

    What's Changed

    • Easy loading from auto-generated checkpoint files by @MilesCranmer w/ review @tttc3 @Pablo-Lemos in https://github.com/MilesCranmer/PySR/pull/167
      • Use .from_file to load from the auto-generated .pkl file.
    • LaTeX table generator by @MilesCranmer w/ review @tttc3 @kazewong in https://github.com/MilesCranmer/PySR/pull/156
      • Generate a LaTeX table of discovered equations with .latex_table()
    • Improved default model selection strategy by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/177
      • Old strategy is available as model_selection="score"
    • Add opencontainers image-spec to Dockerfile by @SauravMaheshkar w/ review @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/166
    • Switch to comma-based csv format by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/176

    Bug fixes

    • Fixed conversions to torch and JAX when a rational number appears in the sympy expression (https://github.com/MilesCranmer/PySR/commit/17c9b1a1762efbd8e021d275491f75cc6dcea8f1, https://github.com/MilesCranmer/PySR/commit/f119733698e4517e34cc902c78dcb95d450c0c80)
    • Fixed pickle saving when trained with multi-output (https://github.com/MilesCranmer/PySR/commit/3da0df512ee295f446ceb0ae6e2c39fb0e380618)
    • Fixed pickle saving when using custom operators with defined sympy -> jax/torch/numpy mappings
    • Backend fix avoids use of Julia's cp which is buggy for some file systems (e.g., EOS)

    New Contributors

    • @SauravMaheshkar made their first contribution in https://github.com/MilesCranmer/PySR/pull/166

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.9.0...v0.10.0

  • v0.9.0(Jun 4, 2022)

    What's Changed

    • Refactor of PySRRegressor by @tttc3 in https://github.com/MilesCranmer/PySR/pull/146
      • PySRRegressor is now completely compatible with scikit-learn.
      • PySRRegressor can be stored in a pickle file, even after fitting, and then be reloaded and used with .predict()
      • PySRRegressor.equations -> PySRRegressor.equations_

    New Contributors

    • @tttc3 made their first contribution in https://github.com/MilesCranmer/PySR/pull/146

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.8.7...v0.9.0

  • v0.8.5(May 20, 2022)

    What's Changed

    • Custom complexities for operators, constants, and variables (https://github.com/MilesCranmer/PySR/pull/138)
    • Early stopping conditions (https://github.com/MilesCranmer/PySR/pull/134)
      • Based on a certain loss value being achieved
      • Max number of evaluations (for theoretical studies of genetic algorithms, rather than anything practical).
    • Work with a specified expression rather than the one given by model_selection, by passing index to the function you wish to use (e.g., model.predict(X, index=5) would use the equation at index 5).

    Full Changelog since v0.8.1: https://github.com/MilesCranmer/PySR/compare/v0.8.1...v0.8.5

  • v0.8.1(May 8, 2022)

    What's Changed

    • Enable distributed processing with ClusterManagers.jl from https://github.com/MilesCranmer/PySR/pull/133

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.8.0...v0.8.1

  • v0.8.0(May 8, 2022)

    This new release updates the entire set of default PySR parameters according to the ones presented in https://github.com/MilesCranmer/PySR/discussions/115. These parameters have been tuned over nearly 71,000 trials. See the discussion for further info.

    Additional changes:

    • Nested constraints implemented. For example, you can now prevent sin and cos from being repeatedly nested, by using the argument: nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}. This argument states that within a sin operator, you can only have a max depth of 0 for other sin or cos. The same is done for cos. The argument nested_constraints={"^": {"+": 2, "*": 1, "^": 0}} states that within a pow operator, you can only have 2 things added, or 1 use of multiplication (i.e., no double products), and zero other pow operators. This helps a lot with finding interpretable expressions!
    • New parsimony algorithm (backend change). This seems to help searches quite a bit, especially when one is searching for more complex expressions. This is turned on by use_frequency_in_tournament which is now the default.
    • Many backend improvements: speed, bug fixes, etc.
    • Improved stability of multi-processing (backend change). Thanks to @CharFox1.
    • Auto-differentiation implemented (backend change). This isn't used by default in any instances right now, but could be used by optimization later. Thanks to @kazewong.
    • Improved testing coverage of weird edge cases.
    • All parameters to PySRRegressor have been cleaned up to be in snake_case rather than CamelCase. The backend is also now almost entirely snake_case for internal functions, plus other readability improvements. Thanks to @bstollnitz and @patrick-kidger for the suggestions.
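The nested_constraints dictionaries quoted above can be read as limits on how deeply one operator may be nested inside another. The sketch below checks toy expression trees against such a dictionary. It is an illustration under that reading, not the backend's implementation: the tuple-based tree format and the helper names max_nesting and satisfies are assumptions.

```python
# The two constraint dictionaries quoted in the release note.
trig_constraints = {"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}
pow_constraints = {"^": {"+": 2, "*": 1, "^": 0}}


def max_nesting(tree, op):
    """Largest number of `op` nodes on any root-to-leaf path of `tree`.

    Trees are tuples like ("sin", child) or ("+", left, right); leaves are
    plain strings or numbers.
    """
    if not isinstance(tree, tuple):
        return 0
    head, *children = tree
    deepest = max((max_nesting(c, op) for c in children), default=0)
    return deepest + (1 if head == op else 0)


def satisfies(tree, constraints):
    """Check every node of `tree` against the nested-constraint limits."""
    if not isinstance(tree, tuple):
        return True
    head, *children = tree
    for inner, limit in constraints.get(head, {}).items():
        if any(max_nesting(c, inner) > limit for c in children):
            return False
    return all(satisfies(c, constraints) for c in children)
```

For example, sin(cos(x)) is rejected by trig_constraints while sin(x) + cos(x) passes, and x^(y*z) passes pow_constraints while x^(y*(z*w)) does not, because the product is nested twice inside the power.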
  • v0.6.0(Jun 1, 2021)

    PySR Version 0.6.0

    Large changes:

    • Exports to JAX, PyTorch, NumPy. All exports have a similar interface. JAX and PyTorch allow the equation parameters to be trained (e.g., as part of some differentiable model). Read https://pysr.readthedocs.io/en/latest/docs/options/#callable-exports-numpy-pytorch-jax for details. Thanks Patrick Kidger for the PyTorch export.
    • Multi-output y input is allowed, and the backend will efficiently batch over each output. A list of dataframes is returned by pysr for these cases. All best_* functions return a list as well.
    • BFGS optimizer introduced, plus a more stable parameter search thanks to backtracking line search.

    Smaller changes since 0.5.16:

    • Expanded tests, coverage calculation for PySR
    • Improved (pre-processing) feature selection with random forest
    • New default parameters for search:
      • annealing=False (no annealing works better with the new code. This is equivalent to alpha=infinity)
      • useFrequency=True (deals with complexity in a smarter way)
      • npopulations=20 (previously procs*4)
      • progress=True (show a progress bar)
      • optimizer_algorithm="BFGS"
      • optimizer_iterations=10
      • optimize_probability=1
      • binary_operators default = ["+", "-", "/", "*"]
      • unary_operators default = []
    • Warnings:
      • Using maxsize > 40 will trigger a warning that the search will be slow and use a lot of memory. The warning suggests turning off useFrequency, and perhaps also using warmupMaxsizeBy.
    • Deprecated nrestarts -> optimizer_nrestarts
    • Printing fixed in Jupyter
  • v0.4.0(Feb 1, 2021)

    With versions v0.4.0/v0.4.0, SymbolicRegression.jl and PySR have now been completely disentangled: PySR is 100% Python code (with some Julia meta-programming), and SymbolicRegression.jl is 100% Julia code.

    PySR now works by activating a Julia env that has SymbolicRegression.jl as a dependency, and making calls to it! By default it will set up a Julia project inside the pip install location, and install requirements at the user's confirmation, though you can pass an arbitrary project directory as well (e.g., if you want to use PySR but also tweak the backend). The nice thing about this is that Python users only need to install a Julia binary somewhere, and they should be good to go. And Julia users never need to touch the Python side.

    The SymbolicRegression.jl backend also sets up workers automatically & internally now, so one never needs to call @everywhere when setting things up. The same is true even with locally-defined functions - these get passed to workers!

    With PySR importing the latest Julia code, this also means it gets new simplification routines powered by SymbolicUtils.jl, which seem to help improve the equations discovered.

  • v0.3.8(Sep 27, 2020)

    Populations no longer block each other, which gives a large speedup, especially for large numbers of populations. This was fixed by using RemoteChannel() in Julia.

    Some populations happen to take longer than others - perhaps they have very complex equations - and can therefore block others that have finished early. This lets the processor work on the next population to be finished.
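The idea of handling whichever population finishes first can be illustrated with a Python analogy. The backend itself uses Julia's RemoteChannel(); the sketch below swaps in concurrent.futures purely for illustration, and the names evolve and run_cycle are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def evolve(population_id, seconds):
    """Stand-in for one population's evolution cycle (it just sleeps)."""
    time.sleep(seconds)
    return population_id


def run_cycle(durations):
    """Handle each population as soon as it finishes, not in submission order."""
    finished = []
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        futures = [pool.submit(evolve, i, d) for i, d in enumerate(durations)]
        for future in as_completed(futures):
            # A slow population no longer holds up the ones that are done.
            finished.append(future.result())
    return finished
```

Calling run_cycle([0.2, 0.0, 0.0]) would typically return the slow population 0 last, even though it was submitted first.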

  • v0.3.5(Sep 27, 2020)

    Uses equation from Cranmer et al. (2020) https://arxiv.org/abs/2006.11287 to score equations, and prints this alongside MSE. This makes symbolic regression more robust to noise.
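As a rough sketch of this scoring, and assuming the criterion is the negative change in log-loss per unit of added complexity between consecutive equations, the computation could look as follows. The function name scores and the convention of giving the simplest equation a score of 0.0 are choices of this sketch, not taken from the release note.

```python
import math


def scores(complexities, losses):
    """Score each equation by the drop in log-loss per unit of added complexity.

    Inputs are ordered by increasing complexity; the simplest equation has
    no predecessor and gets a score of 0.0 here (a choice of this sketch).
    """
    if not losses:
        return []
    result = [0.0]
    for i in range(1, len(losses)):
        d_log_loss = math.log(losses[i]) - math.log(losses[i - 1])
        d_complexity = complexities[i] - complexities[i - 1]
        result.append(-d_log_loss / d_complexity)
    return result
```

A high score marks an equation where a small increase in complexity bought a large drop in error, whereas later equations that only shave off a little error (often by fitting noise) score close to zero.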

Owner
Miles Cranmer
Astro PhD candidate @princeton trying to accelerate astrophysics with AI. I build interpretable ML algorithms.