This repository lets you interact with Lean through a REPL.

Overview

lean-gym

This repository lets you interact with Lean through a REPL. See Formal Mathematics Statement Curriculum Learning for a presentation of lean-gym.

Setup

# Download pre-built binaries and build the project (targeting mathlib).
bash ./scripts/setup.sh

Usage

lean --run src/repl.lean

Starts a fresh REPL. Once started, the REPL accepts the following commands:

  • init_search: takes a declaration name as well as a list of open namespaces to initialize a search at the given declaration opening the provided namespaces, and returning the initial tactic state (along with a fresh search_id and tactic_state_id).
  • run_tac: takes a search_id, a tactic_state_id and a tactic to apply at the tactic state denoted by the provided ids.
  • clear_search: takes a search_id to clear all state related to a search.

The commands can be interleaved freely enabling the parallelization of multiple proof searches against the same REPL.

$ lean --run src/repl.lean

["init_search", ["intermediate_field.adjoin.range_algebra_map_subset", "intermediate_field finite_dimensional polynomial"]]
{"error":null,"search_id":"0","tactic_state":"⊢ ∀ (F : Type u_1) [_inst_1 : field F] {E : Type u_2} [_inst_2 : field E] [_inst_3 : algebra F E] (S : set E),\tset.range ⇑(algebra_map F E) ⊆ ↑(intermediate_field.adjoin F S)","tactic_state_id":"0"}

["init_search", ["int.prime.dvd_mul", ""]]
{"error":null,"search_id":"1","tactic_state":"⊢ ∀ {m n : ℤ} {p : ℕ}, nat.prime p → ↑p ∣ m * n → p ∣ m.nat_abs ∨ p ∣ n.nat_abs","tactic_state_id":"0"}

["run_tac",["0","0","intros"]]
{"error":null,"search_id":"0","tactic_state":"F : Type u_1,\t_inst_1 : field F,\tE : Type u_2,\t_inst_2 : field E,\t_inst_3 : algebra F E,\tS : set E\t⊢ set.range ⇑(algebra_map F E) ⊆ ↑(intermediate_field.adjoin F S)","tactic_state_id":"1"}

["run_tac",["1","0","intros"]]
{"error":null,"search_id":"1","tactic_state":"m n : ℤ,\tp : ℕ,\thp : nat.prime p,\th : ↑p ∣ m * n\t⊢ p ∣ m.nat_abs ∨ p ∣ n.nat_abs","tactic_state_id":"1"}

["run_tac",["1","1","apply (nat.prime.dvd_mul hp).mp"]]
{"error":null,"search_id":"1","tactic_state":"m n : ℤ,\tp : ℕ,\thp : nat.prime p,\th : ↑p ∣ m * n\t⊢ p ∣ m.nat_abs * n.nat_abs","tactic_state_id":"2"}

["run_tac",["1","2","rw ← int.nat_abs_mul"]]
{"error":null,"search_id":"1","tactic_state":"m n : ℤ,\tp : ℕ,\thp : nat.prime p,\th : ↑p ∣ m * n\t⊢ p ∣ (m * n).nat_abs","tactic_state_id":"3"}

["run_tac",["1","3","simp"]]
{"error":"run_tac_failed: pos=(some ⟨1, 2⟩) msg=simplify tactic failed to simplify","search_id":null,"tactic_state":null,"tactic_state_id":null}

["run_tac",["1","5","exact int.coe_nat_dvd_left.mp h"]]
{"error":"unknown_id: search_id=1 tactic_state_id=5","search_id":null,"tactic_state":null,"tactic_state_id":null}

["run_tac",["1","3","exact int.coe_nat_dvd_left.mp h"]]
{"error":null,"search_id":"1","tactic_state":"no goals","tactic_state_id":"4"}

["clear_search",["1"]]
{"error":null,"search_id":"1","tactic_state":null,"tactic_state_id":null}

["run_tac",["0","1","intros x hx,"]]
{"error":null,"search_id":"0","tactic_state":"F : Type u_1,\t_inst_1 : field F,\tE : Type u_2,\t_inst_2 : field E,\t_inst_3 : algebra F E,\tS : set E,\tx : E,\thx : x ∈ set.range ⇑(algebra_map F E)\t⊢ x ∈ ↑(intermediate_field.adjoin F S)","tactic_state_id":"2"}

["run_tac",["0","2","cases hx with f hf"]]
{"error":null,"search_id":"0","tactic_state":"F : Type u_1,\t_inst_1 : field F,\tE : Type u_2,\t_inst_2 : field E,\t_inst_3 : algebra F E,\tS : set E,\tx : E,\tf : F,\thf : ⇑(algebra_map F E) f = x\t⊢ x ∈ ↑(intermediate_field.adjoin F S)","tactic_state_id":"3"}

["run_tac",["0","3","rw ← hf"]]
{"error":null,"search_id":"0","tactic_state":"F : Type u_1,\t_inst_1 : field F,\tE : Type u_2,\t_inst_2 : field E,\t_inst_3 : algebra F E,\tS : set E,\tx : E,\tf : F,\thf : ⇑(algebra_map F E) f = x\t⊢ ⇑(algebra_map F E) f ∈ ↑(intermediate_field.adjoin F S)","tactic_state_id":"4"}

["run_tac",["0","4","exact adjoin.algebra_map_mem F S f"]]
{"error":null,"search_id":"0","tactic_state":"no goals","tactic_state_id":"5"}

["clear_search",["0"]]
{"error":null,"search_id":"0","tactic_state":null,"tactic_state_id":null}

Declaration names

Declaration names and open namespaces as recorded by lean_proof_recording are available in the data/ directory to be used with the init_search command.

Notes

The REPL is subject to crashes in rare cases. Empirically such crash happens no more than ~0.01% of the time.

When a tactic state is reached with no left goals, some custom logic is run to check that the resulting proof's type matches the top level goal type and does not rely on sorry. We also check for the presence of undefined in the proof term. As an example, the following MiniF2F proofs will safely fail with error proof_validation_failed.

["init_search", ["mathd_algebra_35", ""]]
["run_tac", ["0", "0", "intros"]]
["run_tac", ["0", "1", "sorry"]]
["init_search", ["induction_divisibility_3divnto3m2n", ""]]
["run_tac", ["0", "0", "intros"]]
["run_tac", ["0", "1", "rw [add_comm]"]]
["run_tac", ["0", "2", "have h3 : 1 * (n + 1) ≤ (n + 1)"]]
["run_tac", ["0", "3", "rw one_mul"]]
["run_tac", ["0", "4", "apply dvd_trans"]]
["run_tac", ["0", "5", "swap"]]
["run_tac", ["0", "6", "simp []"]]
["init_search", ["mathd_numbertheory_13", ""]]
["run_tac", ["0", "0", "intros u v hu hv hsum"]]
["run_tac", ["0", "1", "intro h"]]
["run_tac", ["0", "2", "contrapose h"]]
["run_tac", ["0", "3", "intro hn"]]
["run_tac", ["0", "4", "exact not_lt_of_lt hn undefined"]]
Comments
  • Return error when parse failed without exiting.

    Return error when parse failed without exiting.

    This is a simple modification of the response logic when parse failed occurs.

    Lean-gym currently returns a fatal error when parse fails and exits the repl process, which loses all search history. This can occur so easily, that simply inputting an empty line will can cause this issue.

    I change the parse_request function that makes it return a fake LeanPEPLRequest containing the parse_failed command and error message. An additional handle_parse_failed function is also added and returns a normal error message without quitting the repl process.

    I tested the change in Linux, it works fine.

    p.s. great work, looking forward to future updates.

    opened by wiio12 3
  • Question about PACT tactic dataset

    Question about PACT tactic dataset

    Hi, great work here! But I got some questions when reproducing your paper "Formal Mathematics Statement Curriculum Learning". In this paper, you use the tactic dataset in PACT. However, when I follow the instructions from PACT and create the tactic dataset, most of the tactics and proofs can not reach the "no goals" state when directly applying the tactics in the dataset step by step. Only 30% of them are actually applicable and successfully proven. I wonder if I am doing something wrong here or missing some important steps?

    opened by wiio12 2
  • How to search a customed theorem with lean-gym?

    How to search a customed theorem with lean-gym?

    Hi, I wonder how to search a customed theorem with lean-gym? For example, ∀ (a b : ℝ), ∃ x, (a + b) ^ 3 = a ^ 3 + x * a ^ 2 * b + 3 * a * b ^ 2 + b ^ 3. Thanks!

    opened by liebenxj 0
  • How to integrate lean-gym with python?

    How to integrate lean-gym with python?

    Great job! I want to use lean-gym as an engine to train my AI just like RL gym, but I don't know how to integrate lean-gym with python. Could you please provide an example code? Thanks!

    opened by venom12138 0
  • Accessing the Global Environment

    Accessing the Global Environment

    Hi,

    Thanks for providing a great tool! I wonder whether it is possible to access the global environment (e.g., existing definitions and theorems either in the same file or imported from other files) using lean-gym. Thanks!

    opened by yangky11 0
  • Decreasing Performance

    Decreasing Performance

    Hi, I have some problems with decreasing performance. Each usage of "run_tac" seems to slow down the usage of further "run_tac"s. At the end is a python file to show the behavior: it repeats (init_search, 2000x run_tac, clear_search) and takes 15s, 17s, 19s, 21s, ... Is there a way around it or should I just reopen lean-gym periodically?

    import time
    import subprocess
    import os
    import json
    
    lean_gym_path = os.path.dirname(os.path.realpath(__file__))
    lean_gym = subprocess.Popen(['lean', '--run', 'src/repl.lean'],
                                cwd=lean_gym_path,
                                stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE)
    
    def run_lean_cmd(cmd: str):
        lean_gym.stdin.write(cmd.encode('utf-8'))
        lean_gym.stdin.flush()
        return json.loads(lean_gym.stdout.readline().decode('utf-8'))
    
    def init_search(decl: str) -> dict:
        cmd = f'["init_search", ["{decl}", ""]]\n'
        return run_lean_cmd(cmd)
    
    def run_tac(search_id: int, tactic_state_id: int, tactic: str) -> dict:
        cmd = f'["run_tac",["{search_id}","{tactic_state_id}","{tactic}"]]\n'
        return run_lean_cmd(cmd)
    
    def clear_search(search_id: int) -> dict:
        cmd = f'["clear_search",["{search_id}"]]\n'
        return run_lean_cmd(cmd)
    
    t1 = time.time()
    for i in range(10000):
        t2 = time.time()
        if i > 1:
            print(t2 - t1)
        t1 = t2
    
        result = init_search('nat.mul_left_eq_self_iff')
        search_id = int(result['search_id'])
        state_id = int(result['tactic_state_id'])
    
        result = run_tac(search_id, state_id, 'intros')
        search_id = int(result['search_id'])
        state_id = int(result['tactic_state_id'])
    
        for state_id in range(1000):
            search_id = int(result['search_id'])
            state_id = int(result['tactic_state_id'])
            result = run_tac(search_id, state_id, 'rw mul_comm')
    
        clear_search(search_id)
    
    opened by todbeibrot 0
  • Proof checks should happen after every tactic

    Proof checks should happen after every tactic

    Right now it seems that all proof checks are at the end. This violates the Markovian assumption of Lean gym since a bad tactic upstream can lead to a failure of the final tactic step. Most of the current checks for sorry, undefined, and maybe some of the others are valid checks to do after every proof step. The only one I know for sure won't work is the metavariable check. That isn't supposed to hold until a proof is complete.

    opened by jasonrute 6
Owner
OpenAI
OpenAI
Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification We provide the codes for repr

null 12 Dec 12, 2022
A library of scripts that interact with the PythonTurtle module to create games, drawings, and more

TurtleLib TurtleLib is a library of scripts that interact with the PythonTurtle module to create games, drawings, and more! Using the Scripts Copy or

null 1 Jan 15, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 8, 2023
Spam your friends and famly and when you do your famly will disown you and you will have no friends.

SpamBot9000 Spam your friends and family and when you do your family will disown you and you will have no friends. Terms of Use Disclaimer: Please onl

DJ15 0 Jun 9, 2022
BMW TechOffice MUNICH 148 Dec 21, 2022
Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks"

Easy-To-Hard The official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks". Gett

Avi Schwarzschild 52 Sep 8, 2022
A repository that finds a person who looks like you by using face recognition technology.

Find Your Twin Hello everyone, I've always wondered how casting agencies do the casting for a scene where a certain actor is young or old for a movie

Cengizhan Yurdakul 3 Jan 29, 2022
Game Agent Framework. Helping you create AIs / Bots that learn to play any game you own!

Serpent.AI - Game Agent Framework (Python) Update: Revival (May 2020) Development work has resumed on the framework with the aim of bringing it into 2

Serpent.AI 6.4k Jan 5, 2023
Official code for Score-Based Generative Modeling through Stochastic Differential Equations

Score-Based Generative Modeling through Stochastic Differential Equations This repo contains the official implementation for the paper Score-Based Gen

Yang Song 818 Jan 6, 2023
[CVPR 2021] Unsupervised 3D Shape Completion through GAN Inversion

ShapeInversion Paper Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy "Unsupervised 3D

null 100 Dec 22, 2022
An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Optex An implementation of Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport for TU Delft CS4240. You c

Hans Brouwer 33 Jan 5, 2023
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

null 19 Oct 27, 2022
Framework for joint representation learning, evaluation through multimodal registration and comparison with image translation based approaches

CoMIR: Contrastive Multimodal Image Representation for Registration Framework ?? Registration of images in different modalities with Deep Learning ??

Methods for Image Data Analysis - MIDA 55 Dec 9, 2022
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World [ACL 2021]

piglet PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World [ACL 2021] This repo contains code and data for PIGLeT. If you like

Rowan Zellers 51 Oct 8, 2022
code for our paper "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer"

SHOT++ Code for our TPAMI submission "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer" that is ext

null 75 Dec 16, 2022
Detection of drones using their thermal signatures from thermal camera through YOLO-V3 based CNN with modifications to encapsulate drone motion

Drone Detection using Thermal Signature This repository highlights the work for night-time drone detection using a using an Optris PI Lightweight ther

Chong Yu Quan 6 Dec 31, 2022