Bowen Qin
National University of Singapore (NUS), Singapore. eyuansu71@gmail.com
I am a 1st-year Ph.D. student at the National University of Singapore (NUS), advised by Prof. Yao Lu.
Previously, I was a researcher at the Beijing Academy of Artificial Intelligence (BAAI), specializing in the evaluation, alignment, and code intelligence of large language models (LLMs). Before that, I obtained my master’s degree with top honors in 2023 from the Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS), under the guidance of Prof. Min Yang. I was also a research intern at Alibaba DAMO Academy, mentored by Binyuan Hui.
I am a member of the BIRD team, which drives the development of text-to-SQL for real-world database applications.
Throughout my academic journey, I have collaborated with many talented researchers, including Jinyang Li, Duanyu Feng, Binyuan Hui, and Yequan Wang.
news
| Jul 01, 2024 | Our team secured 7th place out of over 100 global competitors in the AI Safety and Security Challenge hosted by AI Singapore (AISG) and the National University of Singapore (NUS), and was invited to attend the Singapore International Cyber Week (SICW) 2024. |
|---|---|
selected publications
- arXiv BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions. arXiv preprint arXiv:2510.05318, Oct 2025
- arXiv FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions. arXiv preprint arXiv:2509.17177, Sep 2025
- ACL FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Aug 2025
- ACL FlagEval-Arena: A Side-by-Side Comparative Evaluation Platform for Large Language Models and Text-Driven AIGC. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Aug 2025
- arXiv SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications. arXiv preprint arXiv:2506.18951, Jun 2025
- arXiv Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning. arXiv preprint arXiv:2506.05278, Jun 2025
- arXiv The Price of a Second Thought: On the Evaluation of Reasoning Efficiency in Large Language Models. arXiv preprint arXiv:2505.22017, May 2025
- COLING SUN: Exploring intrinsic uncertainties in text-to-SQL parsers. In Proceedings of the 29th International Conference on Computational Linguistics, 2022
- ACL
- SIGKDD Proton: Probing schema linking information from pre-trained language models for text-to-sql parsing. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022
- AAAI Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets. In Proceedings of the AAAI Conference on Artificial Intelligence, 2025
- arXiv
- NeurIPS Can LLM already serve as a database interface? A big bench for large-scale database grounded text-to-sqls. In Advances in Neural Information Processing Systems, 2023