RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation Paper • 2505.03275 • Published May 6 • 7
DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively Paper • 2509.26603 • Published 11 days ago • 16
Middo Collection Dataset & Models for paper "Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning" • 10 items • Updated 19 days ago • 3
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published Aug 22 • 149
RExBench: Can coding agents autonomously implement AI research extensions? Paper • 2506.22598 • Published Jun 27 • 11
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges Paper • 2504.19093 • Published Apr 27 • 17
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer Paper • 2503.14891 • Published Mar 19 • 22
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion Paper • 2503.16212 • Published Mar 20 • 25
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs Paper • 2503.17439 • Published Mar 21 • 15