- Snowflake/Arctic-Text2SQL-R1-7B
  8B • Updated • 5.81k • 42
- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 271
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 260
- Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
  Paper • 2506.16406 • Published • 126
Collections
Collections including paper arxiv:2508.16072
- Training a Foundation Model for Materials on a Budget
  Paper • 2508.16067 • Published • 2
- PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
  Paper • 2508.17188 • Published • 15
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 131
- AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
  Paper • 2508.16279 • Published • 29

- GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
  Paper • 2311.12631 • Published • 15
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 56
- VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step
  Paper • 2504.01956 • Published • 41
- UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
  Paper • 2506.23219 • Published • 7

- rStar2-Agent: Agentic Reasoning Technical Report
  Paper • 2508.20722 • Published • 97
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 131
- Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
  Paper • 2508.14029 • Published • 117
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 179

- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 130
- Magistral
  Paper • 2506.10910 • Published • 64
- Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
  Paper • 2506.07240 • Published • 7
- Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
  Paper • 2506.09991 • Published • 56

- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 55
- Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
  Paper • 2411.04282 • Published • 38
- Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
  Paper • 2411.14432 • Published • 26