OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling
Overview
OptMATH is a scalable framework for synthesizing high-quality optimization modeling datasets. The framework consists of a bidirectional pipeline that:
- Generates problem data (PD) with controllable complexity from seed mathematical formulations (MF)
- Creates natural language (NL) descriptions through backtranslation
- Validates the correspondence between NL and PD through forward modeling and rejection sampling (a minimal sketch of this loop is shown below)
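To make the loop concrete, here is a minimal Python sketch of the idea, using a toy knapsack seed generator and hypothetical stand-ins (`backtranslate_llm`, `forward_model_llm`) for the LLM calls. It only illustrates the pipeline's structure; it is not the authors' implementation, and the acceptance check (matching the known optimum) is one possible instantiation of rejection sampling.

```python
# Toy sketch of the bidirectional pipeline (not the authors' code):
# a seed generator samples problem data (PD), an LLM backtranslates it
# into natural language (NL), and a forward-modeling pass plus rejection
# sampling keeps only (NL, PD) pairs whose recovered optimum matches.
import random


def generate_problem_data(n_items: int, seed: int) -> dict:
    """Seed generator: sample knapsack data with controllable size."""
    rng = random.Random(seed)
    return {
        "values": [rng.randint(1, 20) for _ in range(n_items)],
        "weights": [rng.randint(1, 10) for _ in range(n_items)],
        "capacity": 5 * n_items,
    }


def solve_knapsack(pd: dict) -> int:
    """Ground-truth optimum via dynamic programming, used for validation."""
    best = [0] * (pd["capacity"] + 1)
    for v, w in zip(pd["values"], pd["weights"]):
        for c in range(pd["capacity"], w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[pd["capacity"]]


def synthesize_instance(seed, backtranslate_llm, forward_model_llm):
    """Generate one training instance, or None if it is rejected."""
    pd = generate_problem_data(n_items=8, seed=seed)
    nl = backtranslate_llm(pd)            # hypothetical LLM call: PD -> NL description
    recovered = forward_model_llm(nl)     # hypothetical LLM call: NL -> optimal objective
    if recovered == solve_knapsack(pd):   # rejection sampling: keep only consistent pairs
        return {"nl": nl, "problem_data": pd, "objective": recovered}
    return None
```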
Key Features
- Scalable data synthesis framework for optimization modeling
- Coverage of 10+ real-world applications through 53 seed generators
- Release of **OptMATH-Train**, with over 200K high-quality training instances, and **OptMATH-Bench**, a challenging benchmark that pushes the boundaries of LLM capabilities
- State-of-the-art performance on multiple benchmarks
Dataset
OptMATH consists of two main components:
OptMATH-Train
- Over 200K high-quality and diverse optimization problems
- Covers a wide range of optimization scenarios, including logistics, supply chain, and manufacturing
OptMATH-Bench
A challenging benchmark comprising “hard instances” characterized by:
- Extended natural language contexts (2.9× longer than MAMO EasyLP)
- Complex constraints
- Coverage of various problem types (LP, MILP, IP, NLP, SOCP)
Results
We use the LLaMA-Factory framework for fine-tuning. For more details, please refer to https://github.com/hiyouga/LLaMA-Factory.
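For illustration, the sketch below shows one way OptMATH-Train-style records could be written in the Alpaca-style `instruction`/`input`/`output` JSON format that LLaMA-Factory accepts for supervised fine-tuning. The field contents, prompt wording, and file name are placeholders, not the released data schema.

```python
# Sketch: write OptMATH-Train-style records as Alpaca-format JSON for
# LLaMA-Factory SFT. Contents and file names are illustrative placeholders.
import json

records = [
    {
        "instruction": "Formulate the following optimization problem as a "
                       "mathematical model and provide solver code.",
        "input": "<natural-language problem description from OptMATH-Train>",
        "output": "<mathematical formulation and solution program>",
    },
]

with open("optmath_train.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# Place the file in LLaMA-Factory's data/ directory, register it in
# data/dataset_info.json, e.g.
#   "optmath_train": {"file_name": "optmath_train.json"}
# and reference it via the `dataset` field of the training config.
```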
Main Results
The primary results are presented in Table 1. First, our best-performing model, OptMATH-Qwen2.5-32B, achieves superior performance across all benchmarks, surpassing much larger models such as GPT-3.5-Turbo, GPT-4, and DeepSeek-V3, even though these models have tens of times more parameters. Furthermore, OptMATH-Qwen2.5-7B outperforms ORLM-LLaMA-3-8B, a model of comparable size, on all benchmarks, and is only marginally behind DeepSeek-V3. Collectively, these results demonstrate that **training on OptMATH-Train significantly enhances the model's optimization modeling capabilities.**
Ablation Study
Ablation Study on Model Size
Ablation Study on Data Size
As shown in the figure below, the performance of Qwen2.5-1.5B on the different benchmarks varies with the amount of training data. The model demonstrates **notable improvements in optimization modeling capabilities even with a small portion of the OptMATH-Train dataset.** The gains gradually level off as more training data is added, following a typical diminishing-returns pattern.
Citation
@misc{lu2025optmathscalablebidirectionaldata,
  title={OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling},
  author={Hongliang Lu and Zhonglin Xie and Yaoyu Wu and Can Ren and Yuxuan Chen and Zaiwen Wen},
  year={2025},
  eprint={2502.11102},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2502.11102},
}
Contact
We hope that OptMATH is useful for your application. If you have any bug reports or comments, please feel free to email one of the authors:
- **Hongliang Lu**, lhl@pku.edu.cn
- **Zhonglin Xie**, zlxie@pku.edu.cn
- **Zaiwen Wen**, wenzw@pku.edu.cn