董骞

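I am currently a Ph.D. student in the Information Retrieval Lab (THUIR), Department of Computer Science and Technology, Tsinghua University, and expect to graduate in June 2026. I am fortunate to be advised by Prof. Shaoping Ma (马少平), Prof. Yiqun Liu (刘奕群), and Prof. Qingyao Ai (艾清遥).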

My research interest is in model architectures co-designed across algorithms and infrastructure for scalable parameters, context, and, especially, intelligence.

"Do not go gentle into that good night."

— Dylan Thomas

News

2026.02 🚀 GLM-5 technical report released! I am one of the core contributors to the model architecture.
2025.12 🚀 GLM-4.7 released! Blog
2025.11 📄 SelfRACG accepted to EMNLP 2025: Letting LLMs self-express retrieval queries for better code generation within a single model architecture.
2025.09 🚀 GLM-4.6 released! Blog
2025.08 🚀 GLM-4.5 technical report released! I am one of the contributors to sparse attention adaptation in the post-training stage.
2025.07 📄 Qilin accepted to SIGIR 2025: A multimodal IR dataset capturing real APP-level user sessions.
2025.04 📄 DecoupledRAG accepted to WWW 2025: Decoupling context and knowledge via cross-attention for efficient RAG.
2024.07 📄 RLCF accepted to SIGIR 2024: Aligning LLMs for IR through unsupervised contrastive feedback.
2023.10 📄 I³Retriever accepted to CIKM 2023: Incorporating implicit query-document interaction into retrievers via a generative module.
2023.07 📄 T²Ranking accepted to SIGIR 2023: A large-scale Chinese passage ranking benchmark.
2022.07 📄 KERM accepted to SIGIR 2022: Incorporating explicit knowledge into PLMs for passage re-ranking.
2022.02 📄 DGRe published in Data Science and Engineering: Disentangled Causal Intervention for BERT-based Ad Hoc Document Ranking.
2021.07 📄 R-FORMER accepted to SIGIR 2021: Modeling Global Consistency Graphs for Entangled Multi-Task Legal Judgment Prediction.
2021.04 📄 LGRe accepted to DASFAA 2021: Refining BERT-based Document Ranking via Latent Graph Recurrent Networks.

Selected Publications

Full publication list (Google Scholar)

Papers as Primary Author

  • SelfRACG: Enabling LLMs to Self-Express and Retrieve for Code Generation
    EMNLP 2025
    TH-A, CCF-B
  • Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions
    SIGIR 2025
    TH-A, CCF-A
  • DecoupledRAG: An Efficient and Effective RAG Framework via Cross Attention
    WWW 2025
    TH-A, CCF-A
  • Unsupervised LLM Alignment for IR via Contrastive Feedback
    SIGIR 2024
    TH-A, CCF-A
  • T²Ranking: A Large-scale Chinese Benchmark for Passage Ranking
    SIGIR 2023
    TH-A, CCF-A
  • I³Retriever: Incorporating Implicit Interaction in PLMs for Passage Retrieval
    CIKM 2023
    TH-B, CCF-B
  • Incorporating Explicit Knowledge in PLMs for Passage Re-ranking
    SIGIR 2022
    TH-A, CCF-A
  • Legal Judgment Prediction via Relational Learning
    SIGIR 2021
    TH-A, CCF-A

Co-authored Papers

  • CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
    ACL 2025
    CCF-A
  • DELTA: Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment
    AAAI 2025
    CCF-A
  • BLADE: Enhancing Black-box LLMs with Small Domain-Specific Models
    AAAI 2025
    CCF-A
  • SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval
    SIGIR 2023
    CCF-A

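Education

2022 – present  Ph.D. student, Department of Computer Science and Technology, Tsinghua University
2019 – 2022  Master of Engineering, Institute of Software, Chinese Academy of Sciences
2015 – 2019  Bachelor of Engineering, School of Software Engineering, South China University of Technology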

Honors and Awards

2021  National Scholarship (Top 1%)

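About Me

I am a craft beer enthusiast. From crisp wheat beers to bold IPAs, from Belgian witbiers to saisons, I love tasting great craft brews from around the world 🍻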