Generative Universal Verifier as Multimodal Meta-Reasoner

Introduction

We introduce Generative Universal Verifier, a novel concept and plugin designed for next-generation multimodal reasoning in vision-language models and unified multimodal models, providing the fundamental capability of reflection and refinement on visual outcomes during the reasoning and generation process.

ViVerBench: A comprehensive benchmark spanning 16 categories of critical tasks for evaluating visual outcomes in multimodal reasoning.
OmniVerifier: The first omni-capable generative verifier, trained on large-scale visual verification data for universal visual verification. It achieves notable gains on ViVerBench (+8.3).
OmniVerifier-TTS: A sequential test-time scaling paradigm that leverages the universal verifier to bridge image generation and editing within unified models, enhancing the upper bound of generative ability through iterative fine-grained optimization.
OmniVerifier-M1: A generalist multimodal meta-verifier that leverages symbolic meta-verification and decoupled RL training to achieve robust visual verification, fine-grained error localization, and state-of-the-art performance on ViVerBench.

OmniVerifier advances both reliable reflection during generation and scalable test-time refinement, marking a step toward more trustworthy and controllable next-generation reasoning systems.

New Updates

[2026.05] OmniVerifier-M1 has been accepted to ICML 2026.

[2026.05] We release OmniVerifier-M1, advancing multimodal verifiers through symbolic meta-verification.

[2026.02] OmniVerifier has been accepted to ICLR 2026 (Oral Paper, Top 1%).

[2025.11] Inference code of two automated pipelines for visual verifier data construction is released.

[2025.10] Inference code of Sequential OmniVerifier-TTS (based on Qwen-Image) is released.

[2025.10] Evaluation code of ViVerBench is released.

[2025.10] Training code of OmniVerifier is released.

Installation

git clone https://github.com/Cominclip/OmniVerifier.git
cd OmniVerifier
pip install -e .

Quick Start: Generated Image Verification

Use the following command to test OmniVerifier-7B on a generated image:

python inference.py

Please modify image_path and prompt to your own settings.

The model will output both an answer and an explanation, indicating whether the image is strictly aligned with the given prompt.

Part1: ViVerBench Evaluation

We provide two evaluation approaches: rule-based and model-based. As a first step, store the model outputs in a JSON file such as your_model.json.

For rule-based evaluation:

python viverbench_eval_rule_based.py --model_response your_model.json
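To give a feel for what rule-based scoring involves, here is a minimal sketch, not the logic of viverbench_eval_rule_based.py itself. It assumes each record carries a hypothetical `response` string and a boolean `answer` label, and that the verdict is the last standalone "true" or "false" in the response. The actual schema of your_model.json and the script's matching rules may differ.

```python
import re


def extract_verdict(response: str):
    """Return True/False if the response contains a verdict, else None.

    Assumption (hypothetical convention): the verdict is the last
    standalone "true" or "false" appearing in the response text.
    """
    matches = re.findall(r"\b(true|false)\b", response.lower())
    if not matches:
        return None
    return matches[-1] == "true"


def rule_based_accuracy(records):
    """Fraction of records whose extracted verdict equals the label."""
    if not records:
        return 0.0
    correct = 0
    for rec in records:
        verdict = extract_verdict(rec["response"])
        # A response with no parseable verdict is counted as wrong.
        if verdict is not None and verdict == rec["answer"]:
            correct += 1
    return correct / len(records)


# Hypothetical records for illustration; field names are assumptions.
records = [
    {"response": "The cat is on the left as requested. Answer: true", "answer": True},
    {"response": "Answer: false. The prompt asks for three apples.", "answer": False},
    {"response": "I cannot tell.", "answer": True},
]
print(rule_based_accuracy(records))
```

Counting unparseable responses as wrong keeps the score conservative. A model-based judge is the usual fallback when free-form responses do not follow a fixed answer format.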

For model-based evaluation, we use GPT-4.1 as the judge model:

python viverbench_eval_model_based.py --model_response your_model.json

Part2: OmniVerifier RL Training

We apply DAPO to train Qwen2.5-VL-7B directly, without a cold-start stage:

bash examples/qwen2_5_vl_7b_dapo.sh
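For readers unfamiliar with DAPO, the following is a rough, framework-free sketch of three ideas commonly associated with it: dropping prompt groups whose sampled rewards are all identical (dynamic sampling), normalizing rewards within a group, and clipping the importance ratio with a wider upper bound. It is not the training code invoked by the script above. All function names are hypothetical, and the clip values are illustrative defaults rather than this repository's settings.

```python
from statistics import mean, pstdev


def dynamic_sampling_filter(groups):
    """Keep only prompt groups whose sampled rewards are not all identical.

    A group where every response gets the same reward yields zero
    group-relative advantage, so it contributes no learning signal.
    """
    return [g for g in groups if len(set(g)) > 1]


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_term(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """Per-token clipped surrogate with an asymmetric clip range.

    eps_low and eps_high are illustrative values, not repo settings.
    """
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)


# Three prompts, four sampled responses each, with binary rewards.
groups = [[1, 1, 1, 1], [0, 1, 0, 1], [0, 0, 0, 0]]
kept = dynamic_sampling_filter(groups)
print(kept)
print(group_advantages(kept[0]))
print(clipped_term(1.5, 1.0))
```

With binary verification rewards, the filter matters because prompts that the policy always gets right or always gets wrong would otherwise dilute each batch with zero-gradient samples.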

After training, merge the checkpoint into Hugging Face format:

python3 scripts/model_merger.py --local_dir checkpoints/omniverifier/exp_name/global_step_1/actor

Part3: OmniVerifier-TTS

We provide the code for sequential OmniVerifier-TTS using Qwen-Image. First generate the step0 image, then use this script to iteratively self-refine it:

python sequential_omniverifier_tts.py
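The sequential refinement idea can be sketched as a verify-then-edit loop. This is a simplified illustration under stated assumptions, not the contents of sequential_omniverifier_tts.py: `verify` and `edit` are hypothetical callables standing in for the verifier and the image editing model, and the toy versions below operate on sets of words instead of images.

```python
from typing import Callable, List, Tuple


def sequential_refine(
    image,
    prompt: str,
    verify: Callable[[object, str], Tuple[bool, str]],
    edit: Callable[[object, str], object],
    max_steps: int = 3,
):
    """Verify the image; while it fails, edit it using the verifier's feedback."""
    history: List[object] = [image]  # history[0] is the step0 image
    for _ in range(max_steps):
        aligned, explanation = verify(image, prompt)
        if aligned:
            break  # the verifier accepts the current image
        image = edit(image, explanation)
        history.append(image)
    return image, history


# Toy stand-ins (hypothetical): an "image" is the set of words it depicts.
def toy_verify(image, prompt):
    missing = [w for w in prompt.split() if w not in image]
    if missing:
        return False, missing[0]  # report one missing element per round
    return True, ""


def toy_edit(image, explanation):
    return image | {explanation}  # add the element the verifier asked for


final, history = sequential_refine({"a"}, "a red cube", toy_verify, toy_edit)
print(len(history))
```

Each round fixes one reported error, so the step budget `max_steps` bounds how many fine-grained corrections are attempted before the loop returns the latest image.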

Part4: OmniVerifier-M1 RL Training

Decoupled Training

bash examples/M1_decoupled_training.sh

Joint Training

bash examples/M1_joint_training.sh

Citation

@article{zhang2025generative,
  title={Generative Universal Verifier as Multimodal Meta-Reasoner},
  author={Zhang, Xinchen and Zhang, Xiaoying and Wu, Youbin and Cao, Yanbin and Zhang, Renrui and Chu, Ruihang and Yang, Ling and Yang, Yujiu},
  journal={arXiv preprint arXiv:2510.13804},
  year={2025}
}
@article{zhang2026omniverifier,
  title={OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration},
  author={Zhang, Xinchen and Liu, Bowei and Liu, Jiale and Shi, Chufan and Zhang, Yizhen and Liu, Junhong and Zhang, Youliang and Li, Zhiheng and Yang, Yujiu and Yang, Ling},
  journal={arXiv preprint arXiv:2605.28805},
  year={2026}
}

Acknowledgements

OmniVerifier is built upon several solid works. Thanks to EasyR1 and veRL for their wonderful work and codebase!
