- ExGRPO: Learning to Reason from Experience
  Paper • 2510.02245 • Published • 70
- A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning
  Paper • 2510.01132 • Published • 5
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
  Paper • 2510.04618 • Published • 41
- MixReasoning: Switching Modes to Think
  Paper • 2510.06052 • Published • 19
Collections including paper arxiv:2510.02245
- HalluGuard: Evidence-Grounded Small Reasoning Models to Mitigate Hallucinations in Retrieval-Augmented Generation
  Paper • 2510.00880 • Published
- Position: Privacy Is Not Just Memorization!
  Paper • 2510.01645 • Published • 1
- Less LLM, More Documents: Searching for Improved RAG
  Paper • 2510.02657 • Published • 2
- ExGRPO: Learning to Reason from Experience
  Paper • 2510.02245 • Published • 70

- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
  Paper • 2410.23743 • Published • 63
- Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
  Paper • 2411.03562 • Published • 67
- Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
  Paper • 2411.03884 • Published • 28
- MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
  Paper • 2502.00698 • Published • 24

- ExGRPO: Learning to Reason from Experience
  Paper • 2510.02245 • Published • 70
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 96
- rStar2-Agent: Agentic Reasoning Technical Report
  Paper • 2508.20722 • Published • 111
- Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
  Paper • 2508.19828 • Published • 6

- LLM Pruning and Distillation in Practice: The Minitron Approach
  Paper • 2408.11796 • Published • 57
- TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
  Paper • 2408.09174 • Published • 52
- To Code, or Not To Code? Exploring Impact of Code in Pre-training
  Paper • 2408.10914 • Published • 43
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
  Paper • 2408.11878 • Published • 63