There is no such thing as a tokenizer-free lunch
ModernVBERT: Towards Smaller Visual Document Retrievers
Model Quality: Hugging Face Is All You Need
CU-1 for Autonomous UI Agent Systems: An Open Alternative to Proprietary Solutions
Code a simple RAG from scratch
When Does Reasoning Matter? Unpacking the Contribution of Reasoning to LLM Performance
How I Trained Action Chunking Transformer (ACT) on SO-101: My Journey, Gotchas, and Lessons
Preserving Agency: Why AI Safety Needs Community, Not Corporate Control
Uncensor any LLM with abliteration
Small Language Models (SLM): A Comprehensive Overview
Gaia2 Leaderboard Update: New Models and New Observations
From GRPO to DAPO and GSPO: What, Why, and How
RexBERT: Encoders for a brave new world of E-Commerce
Nemotron-Personas-Japan: Synthesized Data for Sovereign AI
Nemotron-Personas-Japan: A Synthetic Dataset for Sovereign AI (Japanese-language edition)
Cactus: High-Performance AI Inference on Any Smartphone
Introduction to State Space Models (SSM)
Practical arXiv Tips: How Can You Get Your Paper More Attention?
Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face
How to Train an Antibody Developability Model