## Autoformalization: Comparative Overview

Side-by-side comparison of autoformalization papers across agents, tools, datasets, and evaluation methods.

- **Automatic Textbook Formalization**: A multi-agent framework utilizing code-capable language models to translate and prove graduate-level mathematical textbooks in the Lean theorem prover.
- **MathCode / AUTOLEAN**: A terminal-based AI coding assistant with a built-in math formalization engine that converts natural language math problems into Lean 4 formal theorem statements and proofs.
- **APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning**: A modular framework that coordinates large language models, the Lean compiler, and automated solvers to repair and verify formal mathematical proofs.
- **A Minimal Agent for Automated Theorem Proving**: An evaluation of a modular agent architecture for automated theorem proving in Lean 4 featuring iterative refinement, context management, and search tools.
- **Agentic Proof Automation: A Case Study**: A case study evaluating the use of large language model agents for mechanizing the semantic type soundness of a formal system in Lean 4.
- **Aristotle** (2510.01346v2.pdf): An automated theorem proving system that integrates a Lean proof search algorithm, an informal reasoning pipeline, and a dedicated geometry solver.
- **Autoformalizer with Tool Feedback**: A framework that improves autoformalization by integrating syntactic compiler feedback and semantic consistency checks during the generation process.
- **Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics**: A multi-agent framework utilizing the Model Context Protocol to equip general-purpose LLMs with Lean tools for automated theorem proving.
- **CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization**: A reinforcement learning framework that explicitly trains a critic model to evaluate and improve the semantic fidelity of natural language to Lean 4 mathematical formalizations.
- **Designing an Agentic Tool for AI-Assisted Mathematical Proofs**: A five-layer orchestration architecture for AI-assisted mathematical proof development incorporating computational experimentation, adversarial review, and formal verification.
- **Hilbert: Recursively Building Formal Proofs with Informal Reasoning**: A multi-agent framework that orchestrates general-purpose reasoning models and specialized prover models to recursively decompose and solve formal mathematical theorems.
- **Learning to Repair Lean Proofs from Compiler Feedback**: A supervised learning approach and dataset for repairing erroneous Lean proofs using compiler feedback and structured natural-language diagnoses.
- **M2F: Automated Formalization of Mathematical Literature at Scale**: A two-stage framework for automated, project-scale formalization of mathematical literature into buildable Lean projects using verifier-certified refinement.
- **MerLean: An Agentic Framework for Autoformalization in Quantum Computation**: A bidirectional framework that extracts mathematical statements from LaTeX, formalizes them into Lean 4 code, and translates the verified code back into natural language.
- **Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics**: A formal mathematical reasoning system that combines a general coding agent with specialized tools via the Model Context Protocol to interact with the Lean theorem prover.
- **Process-Driven Autoformalization in Lean 4**: Introduces the Process-Driven Autoformalization (PDA) framework, which leverages process-level feedback from the Lean 4 compiler to iteratively improve the autoformalization of natural language mathematics.
- **ProofBridge: Auto-Formalization of Natural Language Proofs in Lean via Joint Embeddings**: A framework that auto-formalizes natural language proofs into Lean 4 using cross-modal retrieval via joint embeddings, retrieval-augmented fine-tuning, and iterative verifier-guided repair.
- **Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience**: Documentation of the training methodologies and experimental scaling results of the Seed-Prover 1.5 formal theorem proving model.
- **Solving Formal Math Problems by Decomposition and Iterative Reflection**: An agent-based framework that orchestrates interaction between a general-purpose LLM and the Lean 4 environment via reflective decomposition and iterative proof repair.
- **Construction-Verification: A Benchmark for Applied Mathematics in Lean 4**: A Lean 4 evaluation framework and benchmark requiring explicit solution construction prior to logical verification.
- **LeanAgent: Lifelong Learning for Formal Theorem Proving**: A lifelong learning framework for formal theorem proving that combines curriculum learning and progressive training to maintain and extend performance across multiple Lean repositories.
- **TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts**: An end-to-end framework and dataset-generation pipeline to train large language models for formal mathematical theorem proving in Lean 4.
- **Towards a Common Framework for Autoformalization**: A unified conceptual framework defining autoformalization across interactive theorem proving, logic programming, planning, and knowledge representation.

## Reasoning

- **Chain of Images for Intuitively Reasoning**: Proposes a Chain-of-Images (CoI) method for multimodal models to solve reasoning problems by generating a series of images as intermediate representations, using a Symbolic Multimodal Large Language Model (SyMLLM).
- **Dream 7B: Diffusion Large Language Models**: Dream 7B is a 7-billion-parameter diffusion language model that refines text in parallel, initialized from an autoregressive model and trained using context-adaptive noise scheduling.
- **Imagine while Reasoning in Space: Multimodal Visualization-of-Thought**: Multimodal Visualization-of-Thought (MVoT) enables Multimodal Large Language Models (MLLMs) to generate interleaved verbal and visual reasoning traces for spatial reasoning tasks.
- **LLM Latent Reasoning as Chain of Superposition**: Trains an encoder that summarizes reasoning chunks, then trains a latent reasoning model on the summaries that encoder produces from CoT data.
- **LLMs are Single-threaded Reasoners: Demystifying the Working Mechanism of Soft Thinking**: Vanilla soft thinking internally collapses to greedy token sampling: the model usually continues to work only with the most probable next token. To mitigate this, they add noise to the logits and obtain better performance.
- **Latent Chain-of-Thought? Decoding the Depth-Recurrent Transformer**: Investigates the internal mechanics of the Huginn depth-recurrent Transformer on arithmetic tasks to find evidence for latent chain-of-thought reasoning.
- **Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs**: A framework that equips Multimodal Large Language Models with an internal visual scratchpad to generate visual latents during autoregressive reasoning, which can be translated into interpretable sketches.
- **Multiplex Thinking: Reasoning via Token-wise Branch and Merge**: Makes soft thinking slightly random, then trains with GRPO.
- **Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space**: Training-free latent reasoning: optimizes latent reasoning tokens to maximize model confidence, which correlates with accuracy.
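Several of the soft-thinking entries here share one primitive: instead of committing to the argmax token, the model is fed a probability-weighted mixture of token embeddings, optionally perturbed with noise so it does not collapse onto the single most probable token. A minimal numpy sketch of that primitive, not taken from any of the papers (all names, sizes, and the noise placement are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def soft_token(logits, embedding_table, noise_std=0.0):
    """Probability-weighted mixture of token embeddings ("soft thinking").

    Adding Gaussian noise to the logits (as suggested in the summaries
    above) keeps the mixture from collapsing onto the argmax token.
    """
    noisy = logits + rng.normal(0.0, noise_std, size=logits.shape)
    probs = softmax(noisy)
    return probs @ embedding_table  # (d_model,) mixture embedding

# Toy vocabulary of 5 tokens with 3-d embeddings (invented sizes).
E = rng.normal(size=(5, 3))
logits = np.array([4.0, 1.0, 0.5, 0.1, -2.0])

hard = E[np.argmax(logits)]                       # greedy: one token's embedding
soft = soft_token(logits, E)                      # deterministic mixture
perturbed = soft_token(logits, E, noise_std=1.0)  # noisy mixture
```

The mixture is dominated by the highest-logit token here, which is exactly the collapse the noise variants are meant to counteract.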
- **Reinforcement Learning via Self-Distillation**: Self-Distillation Policy Optimization (SDPO) leverages a model's in-context learning capabilities to convert tokenized environment feedback into dense, logit-level reinforcement learning signals.
- **Scaling Latent Reasoning via Looped Language Models**: Trains a recurrent reasoning model by looping the same model over and over again.
- **Soft Tokens, Hard Truths**: Adds Gaussian noise to the soft-thinking embeddings, then trains with RL using RLOO.
- **Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning**: A large-scale dataset with 182,384 interleaved text-image reasoning traces across 18 domains for training multimodal models.

## Hallucination

- **DASH: Detection and Assessment of Systematic Hallucinations of VLMs**: Builds a dataset of images on which VLMs systematically hallucinate, wrongly believing things exist in the images.

## Interpretability

- **A Mathematical Framework for Transformer Circuits**: In Transformers, the residual stream is the central object; layers read from and write to it.
- **An Overview of Early Vision in InceptionV1**: Catalogs the feature maps of InceptionV1's early layers.
- **CLIP-Dissect: Automatic Description of Neuron Representations**: Finds the concepts that activate a neuron using an image dataset.
- **Deep Learning is Not So Mysterious or Different**: ??
- **Physics of Language Models**: Understanding LLMs by training smaller LMs in controlled environments.
- **Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet**: Scales sparse autoencoders (SAEs) to Claude 3 Sonnet.
- **Towards Monosemanticity: Decomposing Language Models With Dictionary Learning**: How SAEs work.
- **Can Large Language Models Explain Their Internal Mechanisms?**: Summary of the paper.
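The monosemanticity entries above decompose model activations with sparse autoencoders (SAEs): an activation vector is encoded into a larger, sparse, non-negative set of feature activations and reconstructed from a learned dictionary of feature directions. A toy numpy sketch of that architecture (no training loop; all names and sizes are invented, not the papers' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

class SparseAutoencoder:
    """Dictionary-learning-style SAE sketch:
    x ≈ b_dec + W_dec @ f(x), where the feature activations
    f(x) = ReLU(W_enc @ (x - b_dec) + b_enc) would be kept sparse
    during training by an L1 penalty on f(x).
    """

    def __init__(self, d_model, d_features):
        # d_features > d_model: an overcomplete dictionary of directions.
        self.W_enc = rng.normal(0, 0.1, (d_features, d_model))
        self.b_enc = np.zeros(d_features)
        self.W_dec = rng.normal(0, 0.1, (d_model, d_features))
        self.b_dec = np.zeros(d_model)

    def encode(self, x):
        return np.maximum(0.0, self.W_enc @ (x - self.b_dec) + self.b_enc)

    def decode(self, f):
        return self.W_dec @ f + self.b_dec

    def loss(self, x, l1_coeff=1e-3):
        f = self.encode(x)
        x_hat = self.decode(f)
        # Reconstruction error plus sparsity penalty on the features.
        return np.sum((x - x_hat) ** 2) + l1_coeff * np.sum(np.abs(f))

sae = SparseAutoencoder(d_model=8, d_features=32)
x = rng.normal(size=8)
features = sae.encode(x)  # non-negative feature activations
```

Interpretability work then inspects which inputs make each coordinate of `features` fire, treating each dictionary column as a candidate "feature".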
- **Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task**: Summary of the paper.
- **Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)**: Summary of the paper.

## Active Learning

- **Every decision tree has an influential variable**: The title is self-explanatory.
- **Learning decision trees from random examples**: Decision-tree learning by finding consistent decision trees.
- **Leveraged volume sampling for linear regression**: Active learning in linear regression with multiplicative error-rate bounds.
- **Properly learning decision trees in almost polynomial time**: Learns a decision tree under the uniform data distribution in time O(s^(log log s)).
- **Top-down induction of decision trees: rigorous guarantees and inherent limitations**: Greedily learns a decision tree by splitting each leaf on its most influential variable.
- **Active Learning Survey**
- **Active Learning for Agnostic Classification**
- **The True Sample Complexity of Active Learning**: A different definition of active-learning label complexity.

## Other Summaries

- **Labeling Neural Representations with Inverse Recognition**: Summary of the paper.
- **Progress measures for grokking via mechanistic interpretability**: Summary of the paper.
- **What do we learn from inverting CLIP models?**: Summary of the paper.
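The top-down induction entry above greedily splits each node on its most influential variable. A toy numpy sketch of that loop on Boolean data, using empirical correlation with the label as a crude stand-in for Fourier influence (all names and the target function are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

def influence(X, y):
    """Per-variable |correlation| with the label over the sample,
    a stand-in for the influence measure used by top-down heuristics.
    """
    signs = 2 * y - 1  # labels {0,1} -> {-1,+1}
    return np.abs((2 * X - 1).T @ signs) / len(y)

def build_tree(X, y, depth):
    """Greedy top-down induction: split every node on its most
    influential variable; leaves predict the majority label.
    """
    if depth == 0 or len(set(y)) == 1:
        return int(np.round(y.mean()))  # leaf
    i = int(np.argmax(influence(X, y)))
    left, right = X[:, i] == 0, X[:, i] == 1
    if left.all() or right.all():
        return int(np.round(y.mean()))
    return (i,
            build_tree(X[left], y[left], depth - 1),
            build_tree(X[right], y[right], depth - 1))

def predict(tree, x):
    while isinstance(tree, tuple):
        i, lo, hi = tree
        tree = hi if x[i] else lo
    return tree

# Toy target on uniform Boolean inputs: y = x0 AND x1 (two relevant variables).
X = rng.integers(0, 2, size=(512, 4))
y = X[:, 0] & X[:, 1]
tree = build_tree(X, y, depth=3)
acc = np.mean([predict(tree, x) == t for x, t in zip(X, y)])
```

On this target the two relevant variables have the highest empirical influence, so the greedy splits recover the AND structure.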