OptMATH

OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling

Overview

OptMATH is a scalable framework for synthesizing high-quality optimization modeling datasets. The framework consists of a bidirectional pipeline that:

  1. Generates problem data (PD) with controllable complexity from seed mathematical formulations (MF)
  2. Creates natural language (NL) descriptions through backtranslation
  3. Validates the correspondence between NL and PD through forward modeling and rejection sampling

[Figure: Framework Overview]
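
The three steps above can be sketched as a generate, backtranslate, and verify loop. This is a minimal illustration, not the OptMATH implementation: the function names (`synthesize`, `generate_data`, `backtranslate`, `forward_model`, `solve`) are hypothetical, and the acceptance test (the re-modeled problem must reach the same optimal value as the original) is an assumption about how the NL/PD correspondence could be checked. In a real pipeline, `backtranslate` and `forward_model` would be LLM calls and `solve` an optimization solver; here they are toy stand-ins.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Instance:
    problem_data: dict   # PD produced by a seed generator
    description: str     # NL description accepted by the check


def synthesize(
    generate_data: Callable[[random.Random], dict],
    backtranslate: Callable[[dict], str],
    forward_model: Callable[[str], dict],
    solve: Callable[[dict], float],
    rng: random.Random,
    max_attempts: int = 3,
    tol: float = 1e-6,
) -> Optional[Instance]:
    """Return an accepted (PD, NL) pair, or None if every attempt is rejected."""
    data = generate_data(rng)            # step 1: problem data from a seed generator
    reference = solve(data)              # optimal value of the original instance
    for _ in range(max_attempts):
        text = backtranslate(data)       # step 2: PD -> NL
        recovered = forward_model(text)  # step 3a: NL -> PD (forward modeling)
        # step 3b: rejection sampling, assumed here to compare optimal values
        if abs(solve(recovered) - reference) <= tol:
            return Instance(data, text)
    return None


# Toy stand-ins (hypothetical): buy `demand` units from the cheapest supplier.
def toy_generate(rng: random.Random) -> dict:
    return {"costs": [rng.randint(1, 9) for _ in range(3)],
            "demand": rng.randint(1, 5)}


def toy_solve(data: dict) -> float:
    return float(min(data["costs"]) * data["demand"])


def toy_backtranslate(data: dict) -> str:
    costs = ", ".join(str(c) for c in data["costs"])
    return f"Suppliers charge {costs} per unit; buy {data['demand']} units at minimum cost."


def toy_forward_model(text: str) -> dict:
    head, tail = text.split(" per unit; buy ")
    costs = [int(c) for c in head[len("Suppliers charge "):].split(", ")]
    return {"costs": costs, "demand": int(tail.split(" ")[0])}


instance = synthesize(toy_generate, toy_backtranslate, toy_forward_model,
                      toy_solve, random.Random(0))
```

Because the toy forward model inverts the toy backtranslation exactly, the first attempt is accepted. A forward model that recovers a different problem would exhaust `max_attempts` and the pair would be discarded.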

Key Features

  • Scalable data synthesis framework for optimization modeling
  • Coverage of 10+ real-world applications through 53 seed generators
  • Released **OptMATH-Train**, with over 200K high-quality training instances, and **OptMATH-Bench**, a challenging benchmark that pushes the boundaries of LLM capabilities
  • State-of-the-art performance on multiple benchmarks

Dataset

OptMATH consists of two main components:

OptMATH-Train

  • Over 200K high-quality and diverse optimization problems.
  • Covers diverse optimization scenarios, including logistics, supply chain, and manufacturing.

[Figure: Application Scenarios Distribution]

OptMATH-Bench

A challenging benchmark comprising “hard instances” characterized by:

  • Extended natural language contexts (2.9× longer than MAMO EasyLP)
  • Complex constraints
  • Coverage of various problem types (LP, MILP, IP, NLP, SOCP)
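
For reference, the problem classes listed above differ in which variables must be integer and in the form of the objective and constraints. A standard way to write a mixed-integer linear program (MILP) is sketched below; the symbols $c$, $A$, $b$, and the index set $I$ are generic textbook notation, not taken from OptMATH-Bench.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & c^\top x \\
\text{s.t.} \quad & A x \le b, \\
& x_i \in \mathbb{Z} \quad \text{for all } i \in I \subseteq \{1, \dots, n\}.
\end{aligned}
```

With $I = \emptyset$ this is an LP, and with $I = \{1, \dots, n\}$ it is an IP. An NLP allows a nonlinear objective or nonlinear constraints, and an SOCP adds second-order cone constraints of the form $\|A_i x + b_i\|_2 \le c_i^\top x + d_i$.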

Results

We use the LLaMA-Factory framework for fine-tuning. For more details, see **https://github.com/hiyouga/LLaMA-Factory**.

Main Results

The primary results are presented in Table 1. First, our best-performing model, OptMATH-Qwen2.5-32B, achieves superior performance across all benchmarks, surpassing proprietary large language models such as GPT-3.5-Turbo, GPT-4, and DeepSeek-V3, even though these models have tens of times more parameters. Furthermore, our OptMATH-Qwen2.5-7B outperforms ORLM-LLaMA-3-8B, a model of comparable size, on all benchmarks and is only marginally behind DeepSeek-V3. Collectively, these results demonstrate that **training with OptMATH-Train significantly enhances the model’s optimization modeling capabilities.**

[Table 1: Performance Comparison]

Ablation Study

Ablation study on Model Size

[Figure: Model Scaling]

Ablation study on Data Size

As shown in the figure below, the performance of Qwen2.5-1.5B on each benchmark varies with the amount of training data. The model shows **notable improvements in optimization modeling capabilities even with a small portion of the OptMATH-Train dataset.** The gains gradually level off as more training data is added, a typical pattern of diminishing returns.

[Figure: Data Scaling (1.5B)]

Citation

 @misc{lu2025optmathscalablebidirectionaldata,
       title={OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling}, 
       author={Hongliang Lu and Zhonglin Xie and Yaoyu Wu and Can Ren and 
               Yuxuan Chen and Zaiwen Wen},
       year={2025},
       eprint={2502.11102},
       archivePrefix={arXiv},
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2502.11102}, 
 }

Contact

We hope that the package is useful for your application. If you have any bug reports or comments, please feel free to email one of the toolbox authors: