Reinforcement Learning for Guiding the E Theorem Prover

Abstract

Automated Theorem Proving (ATP) systems search for a proof in a rapidly growing space of possibilities. Heuristics have a profound impact on search, and ATP systems make heavy use of heuristics. This work uses reinforcement learning to learn a metaheuristic that decides which heuristic to use at each step of a proof search in the E ATP system. Proximal policy optimization is used to dynamically select a heuristic from a fixed set, based on the current state of E. The approach is evaluated on its ability to reduce the number of inference steps used in successful proof searches, as an indicator of intelligent search.
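
As a rough illustration of the idea in the abstract, the sketch below shows a policy that picks one heuristic from a fixed set given a state vector, together with the standard clipped surrogate term that proximal policy optimization maximizes. It is a minimal sketch under stated assumptions, not the authors' implementation: the heuristic labels, the two-feature state, the linear policy, and every function name are hypothetical placeholders, and nothing here calls E itself.

```python
import math
import random

# Hypothetical labels: placeholders, not E's real heuristics.
HEURISTICS = ["heuristic_a", "heuristic_b", "heuristic_c"]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Assumed linear policy: one weight row per heuristic, scored against the state."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def select_heuristic(weights, state, rng):
    """Sample the index of the heuristic to apply at the next proof-search step."""
    probs = policy_probs(weights, state)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def ppo_clipped_term(new_prob, old_prob, advantage, epsilon=0.2):
    """PPO clipped surrogate for one decision: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    ratio = new_prob / old_prob
    clipped = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0.0, 0.0] for _ in HEURISTICS]  # untrained policy: uniform choice
    state = [1.0, 0.5]  # hypothetical state features
    print(HEURISTICS[select_heuristic(weights, state, rng)])
```

The clipping keeps the probability ratio between the updated and the previous policy within a band around 1, so a single update cannot move the heuristic-selection policy too far from the one that collected the data.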

Related resources

Guiding a Theorem Prover with Soft Constraints

Attempts to use finite models to guide the search for proofs by resolution and the like in first order logic all suffer from the need to trade off the expense of generating and maintaining models against the improvement in quality of guidance as investment in the semantic aspect of the reasoning is increased. Previous attempts to resolve this tradeoff have resulted either in poor selection of m...

E - a brainiac theorem prover

We describe the superposition-based theorem prover E. E is a sound and complete prover for clausal first order logic with equality. Important properties of the prover include strong redundancy elimination criteria, the DISCOUNT loop proof procedure, a very flexible interface for specifying search control heuristics, and an efficient inference engine. We also discuss strength and weaknesses of t...

Experiments with Strategy Learning for E Prover

Automated theorem provers (ATPs) consist of a number of complicated algorithms that can be parameterized and combined in different ways. Examples of such parameterizations are clause weighting and selection schemes, term orderings, sets of inference and reduction rules used, etc. E [8] (as some other ATPs) has a language for packaging such useful combinations of parameterizations into...

The Heuristic Theorem Prover: Yet Another SMT Modulo Theorem Prover

HTP is an SMT Modulo theorem prover similar to many others.[2–6, 9, 11] As input, HTP accepts problems using the SMT-LIB format[8]. As output, HTP will answer either SAT, UNSAT or UNKNOWN. Alternatively, HTP can be run in a preprocessing mode in which the output is the simplified problem in SMTLIB format. An evidence file showing the derivation in a human readable form can be produced. There is...

Guiding Inference Through Relational Reinforcement Learning

Reasoning plays a central role in intelligent systems that operate in complex situations that involve time constraints. In this paper, we present the Adaptive Logic Interpreter, a reasoning system that acquires a controlled inference strategy adapted to the scenario at hand, using a variation on relational reinforcement learning. Employing this inference mechanism in a reactive agent architectu...

Journal

Journal title: Proceedings of the ... International Florida Artificial Intelligence Research Society Conference

Year: 2023

ISSN: 2334-0762, 2334-0754

DOI: https://doi.org/10.32473/flairs.36.133334