Bowen Qin

I am a first-year Ph.D. student at the National University of Singapore (NUS), advised by Prof. Yao Lu.

Previously, I was a researcher at the Beijing Academy of Artificial Intelligence (BAAI), specializing in the evaluation, alignment, and code intelligence of large language models (LLMs). Before that, I obtained my master’s degree with top honors in 2023 from the Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS), under the guidance of Prof. Min Yang. I was also a research intern at Alibaba DAMO Academy, mentored by Binyuan Hui.

I am a member of the BIRD team, which drives the development of text-to-SQL for real-world database applications.

Throughout my academic journey, I have collaborated with many talented researchers, including Jinyang Li, Duanyu Feng, Binyuan Hui, and Yequan Wang.

news

Jul 01, 2024	Our team secured 7th place out of over 100 global competitors in the AI Safety and Security Challenge hosted by AI Singapore (AISG) and the National University of Singapore (NUS), and was invited to attend the Singapore International Cyber Week (SICW) 2024.

selected publications

arXiv

BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions

Nan Huo, Xiaohan Xu, Jinyang Li, and 8 more authors

arXiv preprint arXiv:2510.05318, Oct 2025

PDF

arXiv

FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions

Bowen Qin, C Yue, F Yin, and 7 more authors

arXiv preprint arXiv:2509.17177, Sep 2025

PDF

ACL

FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation

Z He, Y Liu, J Zheng, and 5 more authors

In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Aug 2025

ACL

FlagEval-Arena: A Side-by-Side Comparative Evaluation Platform for Large Language Models and Text-Driven AIGC

JS Zheng, R Xuan, Bowen Qin, and 4 more authors

In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Aug 2025

arXiv

SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications

Jinyang Li, Xiaolong Li, Reynold Cheng, and 2 more authors

arXiv preprint arXiv:2506.18951, Jun 2025

PDF

arXiv

Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning

Nan Huo, Jinyang Li, Bowen Qin, and 5 more authors

arXiv preprint arXiv:2506.05278, Jun 2025

PDF

arXiv

The Price of a Second Thought: On the Evaluation of Reasoning Efficiency in Large Language Models

Siqi Fan, Bowen Qin, Peng Han, and 3 more authors

arXiv preprint arXiv:2505.22017, May 2025

COLING

SUN: Exploring intrinsic uncertainties in text-to-SQL parsers

Bowen Qin, Lihan Wang, Binyuan Hui, and 7 more authors

In Proceedings of the 29th International Conference on Computational Linguistics, 2022

Best Paper PDF

Recommended for best paper; reviewers’ scores: 5 / 5 / 4

ACL

SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL

Ge Qu, Jinyang Li, Bowen Qin, and 4 more authors

In Association for Computational Linguistics (ACL Findings), 2024

PDF

SIGKDD

Proton: Probing schema linking information from pre-trained language models for text-to-SQL parsing

Lihan Wang, Bowen Qin, Binyuan Hui, and 8 more authors

In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022

PDF

AAAI

Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

Duanyu Feng, Bowen Qin, Chen Huang, and 3 more authors

In Proceedings of the AAAI Conference on Artificial Intelligence, 2025

PDF

arXiv

HanFei-1.0: China’s First Large-Scale Legal Model

Wanwei He, Jiabao Wen, Lei Zhang, and 7 more authors

arXiv preprint, 2023

PDF

NeurIPS

Can LLM already serve as a database interface? A big bench for large-scale database grounded text-to-SQLs

Jinyang Li, Binyuan Hui, Ge Qu, and 8 more authors

In Advances in Neural Information Processing Systems, 2023

PDF