Qian Dong (董骞) - Tsinghua University

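I am a Ph.D. candidate in the Information Retrieval Lab (THUIR) at the Department of Computer Science and Technology, Tsinghua University, and I expect to graduate in June 2026. I am fortunate to be advised by Prof. Shaoping Ma, Prof. Yiqun Liu, and Prof. Qingyao Ai.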

My research interest is in model architectures co-designed across algorithms and infrastructure, aimed at scaling parameters, context, and especially intelligence.

"Do not go gentle into that good night."

— Dylan Thomas

News

2026.02	🚀 GLM-5 technical report released! I am one of the core contributors to the model architecture.
2025.12	🚀 GLM-4.7 released! Blog
2025.11	📄 SelfRACG accepted to EMNLP 2025 — Letting LLMs self-express retrieval queries for better code generation in one model arch.
2025.09	🚀 GLM-4.6 released! Blog
2025.08	🚀 GLM-4.5 technical report released! I am one of the contributors to sparse attention adaptation in the post-training stage.
2025.07	📄 Qilin accepted to SIGIR 2025 — A multimodal IR dataset capturing real APP-level user sessions.
2025.04	📄 DecoupledRAG accepted to WWW 2025 — Decoupling context and knowledge via cross-attention for efficient RAG.
2024.07	📄 RLCF accepted to SIGIR 2024 — Aligning LLMs for IR through unsupervised contrastive feedback.
2023.10	📄 I³Retriever accepted to CIKM 2023 — Incorporating implicit query-document interaction into retrievers via a generative module.
2023.07	📄 T²Ranking accepted to SIGIR 2023 — A large-scale Chinese passage ranking benchmark.
2022.07	📄 KERM accepted to SIGIR 2022 — Incorporating explicit knowledge into PLMs for passage re-ranking.
2022.02	📄 DGRe published in Data Science and Engineering — Disentangled Causal Intervention for BERT-based Ad Hoc Document Ranking.
2021.07	📄 R-FORMER accepted to SIGIR 2021 — Modeling Global Consistency Graphs for Entangled Multi-Task Legal Judgment Prediction.
2021.04	📄 LGRe accepted to DASFAA 2021 — Refining BERT-based Document Ranking via Latent Graph Recurrent Networks.

Selected Publications

Full publication list (Google Scholar)

Papers as Primary Author

SelfRACG: Enabling LLMs to Self-Express and Retrieve for Code Generation
EMNLP 2025
TH-A, CCF-B
Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions
SIGIR 2025
TH-A, CCF-A
DecoupledRAG: An Efficient and Effective RAG Framework via Cross Attention
WWW 2025
TH-A, CCF-A
Unsupervised LLM Alignment for IR via Contrastive Feedback
SIGIR 2024
TH-A, CCF-A
T²Ranking: A Large-scale Chinese Benchmark for Passage Ranking
SIGIR 2023
TH-A, CCF-A
I³Retriever: Incorporating Implicit Interaction in PLMs for Passage Retrieval
CIKM 2023
TH-B, CCF-B
Incorporating Explicit Knowledge in PLMs for Passage Re-ranking
SIGIR 2022
TH-A, CCF-A
Legal Judgment Prediction via Relational Learning
SIGIR 2021
TH-A, CCF-A

Co-authored Papers

CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
ACL 2025
CCF-A
DELTA: Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment
AAAI 2025
CCF-A
BLADE: Enhancing Black-box LLMs with Small Domain-Specific Models
AAAI 2025
CCF-A
SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval
SIGIR 2023
CCF-A

Education

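2022 –	Ph.D. student, Department of Computer Science and Technology, Tsinghua University
2019 – 2022	Master of Engineering, Institute of Software, Chinese Academy of Sciences
2015 – 2019	Bachelor of Engineering, School of Software Engineering, South China University of Technology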

Honors and Awards

2021	National Scholarship (Top 1%)

About Me

I am a craft beer enthusiast. From crisp wheat beers to rich IPAs, from Belgian witbiers to saisons, I love tasting craft brews from around the world 🍻