Bowen Qin 秦博文

Researcher
Beijing Academy of Artificial Intelligence

eyuansu71 AT gmail DOT com

Short Bio

Bowen Qin is a researcher at the Beijing Academy of Artificial Intelligence (BAAI), specializing in the evaluation, alignment, and code intelligence of large language models (LLMs). At BAAI, he contributes to the FlagEval platform, where he leads the design of its core subjective evaluation system, enhancing the accuracy and scalability of LLM evaluation. He earned his master’s degree with top honors in 2023 from the Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS), under the guidance of Prof. Min Yang. He was a research intern at Alibaba DAMO Academy, mentored by Binyuan Hui, a very insightful and patient mentor. He is a member of the BIRD team, which drives the development of text-to-SQL for real-world database applications. Throughout his academic journey, he has collaborated with many talented researchers, including Jinyang Li, Duanyu Feng, Binyuan Hui, and Yequan Wang.

News

2023.12 We developed [FlagJudge], a GenRM model that predates [DeepMind’s similar work] by one year.

2024.07 Our team secured 7th place out of over 100 global competitors in the AI Safety and Security Challenge hosted by AI Singapore (AISG) and the National University of Singapore (NUS), and was invited to attend the Singapore International Cyber Week (SICW) 2024.

2024.09 FlagEval released its latest evaluation leaderboard, covering almost 300 models. It spans subjective evaluation, objective evaluation, arena battle evaluation, debate evaluation, multimodal evaluation, text-to-image evaluation, text-to-video evaluation, and more. [FlagEval Leaderboard]

Pre-print Draft [Google Scholar]

(Interns or Students, *Equal Contribution)

  1. Towards analyzing and understanding the limitations of DPO: A theoretical perspective
    Duanyu Feng, Bowen Qin, Chen Huang, Zheng Zhang, Wenqiang Lei
    arXiv preprint arXiv:2404.04626, 2024, [pdf] (Although this paper is simple and has not been officially published, I think it is the most interesting work I have done this year.)

  2. Towards understanding the influence of reward margin on preference model performance
    Bowen Qin, Duanyu Feng, Xi Yang
    arXiv preprint arXiv:2404.04932, 2024, [pdf]

  3. Conversational Few-Shot Prompting: Rethinking Few-Shot Prompting for Chat Language Model
    Bowen Qin, Duanyu Feng, Xi Yang
    2024

Selected Publications [Google Scholar]

(Interns or Students, *Equal Contribution)

Publication List

  1. Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets
    Duanyu Feng, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, Wenqiang Lei
    Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2025, [pdf]

  2. Before generation, align it! A novel and effective strategy for mitigating hallucinations in text-to-sql generation
    Ge Qu, Jinyang Li, Bowen Li, Bowen Qin, Nan Huo, Chenhao Ma, Reynold Cheng
    Association for Computational Linguistics (ACL Findings), 2024, [pdf]

  3. Can LLM already serve as a database interface? A big bench for large-scale database grounded text-to-sqls
    Jinyang Li, Binyuan Hui, Ge Qu, Jiaxi Yang, Binhua Li, Bowen Li, Bailin Wang, Bowen Qin, Ruiying Geng, Nan Huo, et al.
    Advances in Neural Information Processing Systems (NeurIPS), 2023, [pdf]

  4. Graphix-t5: Mixing pre-trained transformers with graph-aware layers for text-to-sql parsing
    Jinyang Li, Binyuan Hui, Reynold Cheng, Bowen Qin, Chenhao Ma, Nan Huo, Fei Huang, Wenyu Du, Luo Si, Yongbin Li
    Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023, [pdf]

  5. FLM-101B: An open LLM and how to train it with $100k budget
    Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, et al.
    arXiv preprint arXiv:2309.03852, 2023, [pdf]

  6. SUN: Exploring intrinsic uncertainties in text-to-SQL parsers
    Bowen Qin, Lihan Wang, Binyuan Hui, Bowen Li, Xiangpeng Wei, Binhua Li, Fei Huang, Luo Si, Min Yang, Yongbin Li
    Proceedings of the 29th International Conference on Computational Linguistics (COLING), 2022
    [pdf] (Recommended for best paper, reviewers’ scores: 5 / 5 / 4)

  7. Sdcup: Schema dependency-enhanced curriculum pre-training for table semantic parsing
    Bowen Qin, Lihan Wang, Binyuan Hui, Ruiying Geng, Zheng Cao, Min Yang, Jian Sun, Yongbin Li
    Knowledge-Based Systems, 2022, [pdf]

  8. Proton: Probing schema linking information from pre-trained language models for text-to-sql parsing
    Lihan Wang, Bowen Qin*, Binyuan Hui, Bowen Li, Min Yang, Bailin Wang, Binhua Li, Jian Sun, Fei Huang, Luo Si, et al.
    Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), 2022, [pdf]

  9. A survey on text-to-sql parsing: Concepts, methods, and future directions
    Bowen Qin, Binyuan Hui, Lihan Wang, Min Yang, Jinyang Li, Binhua Li, Ruiying Geng, Rongyu Cao, Jian Sun, Luo Si, et al.
    arXiv preprint arXiv:2208.13629, 2022, [pdf]

  10. Exploring auxiliary reasoning tasks for task-oriented dialog systems with meta cooperative learning
    Bowen Qin, Min Yang, Lidong Bing, Qingshan Jiang, Chengming Li, Ruifeng Xu
    Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021, [pdf]

  11. S^2 SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers
    Binyuan Hui, Ruiying Geng, Lihan Wang, Bowen Qin, Bowen Li, Jian Sun, Yongbin Li
    Association for Computational Linguistics (ACL Findings), 2022, [pdf]

  12. FR–KDE: a hybrid fuzzy rule-based information fusion method with its application in biomedical classification
    Xingjian Song, Bowen Qin*, Fuyuan Xiao
    International Journal of Fuzzy Systems, 2021, [pdf]

  13. A fuzzy preference-based Dempster-Shafer evidence theory for decision fusion
    Chaosheng Zhu, Bowen Qin*, Fuyuan Xiao, Zehong Cao, Hari Mohan Pandey
    Information Sciences, 2021, [pdf]

Honors and Awards

Professional Services

Reviewer

Co-workers
