Autoformalization: Comparative Overview

Side-by-side comparison of autoformalization papers across agents, tools, datasets, and evaluation methods.

Input & Output

Paper	Input	Output
Minimal_Agent	Formal Statement	Formal Proof
Agentic Proof Automation	Formal Statement	Formal Proof
ATF	NLP Statement	Formal Statement
Textbook Formalizer	NLP Document	Formal Statement + Formal Proof
Ax-Prover	Formal Statement	Formal Proof
Designing Agentic Tool (proposal)	NLP Statement	Formal Statement + Formal Proof
Lifelong Learner	Formal Statement	Formal Proof
M2F	NLP Document	Formal Statement + Formal Proof
MathCode	NLP Statement	Formal Statement + Formal Proof
MerLean	NLP Document	Formal Statement + Formal Proof
Numina-Lean-Agent	Formal Statement	Formal Proof
Process-Driven Autoformalization	NLP Statement + NLP Proof	Formal Statement + Formal Proof
ProofBridge	NLP Statement + NLP Proof	Formal Statement + Formal Proof

Agents Used

Paper	Orchestrator	Formal Prover	Formalize Statement	Formalizer (proof+statement)	NLP=FL Semantic Check	Break to Lemma	Statement Extractor	Nlp Prover	Nlp Proof Reviewer	Code Generation	Rank Formalizations	Autoinformalizer	Memory Manager	Engineering Reviewer	Maintainer	Progress	Scanner	Triage	Planner
Minimal_Agent	·	✓	·	·	·	·	·	·	·	·	·	·	✓	·	·	·	·	·	·
Agentic Proof Automation	·	✓	·	·	·	·	·	·	·	·	·	·	·	·	·	·	·	·	·
ATF	·	·	·	✓	✓	·	·	·	·	·	·	·	·	·	·	·	·	·	·
Textbook Formalizer	·	✓	✓	·	✓	·	·	·	·	·	·	·	·	✓	✓	✓	✓	✓	·
Ax-Prover	✓	✓	·	·	·	·	·	·	·	·	·	·	·	·	·	·	·	·	·
Designing Agentic Tool (proposal)	✓	·	·	✓	·	·	·	✓	✓	✓	·	·	·	·	·	·	·	·	·
M2F	·	✓	✓	·	·	✓	✓	✓	·	·	·	·	·	·	·	·	·	·	·
MathCode	·	✓	✓	·	·	·	·	✓	·	·	·	·	·	·	·	·	·	·	✓
MerLean	·	·	·	✓	✓	·	✓	·	·	·	·	✓	·	·	·	·	·	·	·
Numina-Lean-Agent	✓	✓	·	·	·	✓	·	✓	✓	·	·	·	·	·	·	·	·	·	·
Process-Driven Autoformalization	·	✓	·	·	·	·	·	·	·	·	✓	·	·	·	·	·	·	·	·
ProofBridge	·	✓	·	·	✓	·	·	·	·	·	·	·	·	·	·	·	·	·	·

Tools & MCPs Available

Paper	Lean 4 MCP	Theorem Search	Document Search	Code Execution	File System	Blueprint Tool	Bash Tool	Git Tools	Issue Tracker	Anti Trivialization Guard
Minimal_Agent	✓	✓	✓	·	·	·	·	·	·	·
Agentic Proof Automation	✓	·	·	·	✓	·	·	·	·	·
ATF	✓	·	·	·	·	·	·	·	·	·
Textbook Formalizer	✓	✓	·	·	✓	·	✓	✓	✓	·
Ax-Prover	✓	✓	·	·	✓	·	·	·	·	·
Designing Agentic Tool (proposal)	✓	·	✓	✓	·	·	·	·	·	·
M2F	✓	·	·	·	·	·	·	·	·	·
MathCode	✓	·	✓	·	✓	·	·	·	·	✓
MerLean	✓	✓	·	·	✓	·	·	·	·	·
Numina-Lean-Agent	✓	✓	·	·	·	✓	·	·	·	·
Process-Driven Autoformalization	✓	·	·	·	·	·	·	·	·	·
ProofBridge	✓	✓	·	·	·	·	·	·	·	·

Datasets Used

Paper	MiniF2F	PutnamBench	FATE	LeanCat	Mathlib Theorems	MATH	GSM8K	Research Papers	Textbook	Combibench	Formalmath Lite	Proverbench
Minimal_Agent	✓	✓	✓	✓	·	·	·	·	·	·	·	·
ATF	·	·	·	·	·	·	·	·	·	✓	✓	✓
Textbook Formalizer	·	·	·	·	·	·	·	·	✓	·	·	·
M2F	·	·	✓	·	·	·	·	✓	✓	·	·	·
MerLean	·	·	·	·	·	·	·	✓	·	·	·	·
Process-Driven Autoformalization	·	·	·	·	✓	✓	✓	·	·	·	·	·
ProofBridge	✓	·	·	·	·	·	·	·	·	·	·	·

Statement Formalization Evaluation

Paper	Evaluation Method
ATF	For the human evaluation, three domain experts independently review 100 randomly sampled instances from each benchmark, with the majority opinion serving as the final performance judgment.
Textbook Formalizer	Post-run, an interactive blueprint website enables human community verification side-by-side. No quantitative metric for statement accuracy is reported.
M2F	Human domain experts compare the generated Lean declarations directly against the original, source-linked text spans.
MathCode	Two-layer check: (1) hardcoded anti-trivialization guard rejects statements reduced to True/False, (2) LLM-based semantic evaluator grades fidelity A-D against original NLP problem on 5-point checklist (objects, quantifiers, hypotheses, conclusions, multi-part coverage). Optional strict A+ recheck and double-check mode with cross-model agreement.
MerLean	Manual review to ensure all new definitions and axioms are mathematically accurate and rigorously constructed, using Autoinformalizer.
Process-Driven Autoformalization	Human Lean 4 experts manually evaluated the outputs.
ProofBridge	Prove equivilance in Lean4 restricted tactic set with Gemini-2.5-Pro.