LIMI: Less is More for Agency

To learn more about LIMI, feel free to explore our documentation and resources. Our release consists of the following sections:

Model Zoo and Quick Start: Basic usage and demonstrations with Transformers, vLLM, and SGLang for LIMI and LIMI-Air;
Evaluation: Comprehensive evaluation suite with metrics for agentic capabilities assessment;
Prompting: Usage of LIMI with frameworks for agentic applications, tool use, and reasoning tasks.

Overview

LIMI is an agentic model fine‑tuned from GLM‑4.5 using compact, high‑quality data to emphasize:

Targeted capabilities: tool use, multi‑turn correction, spec compliance
Long‑context trajectory with tokenizer‑filtered samples
OpenAI‑style messages with optional function/tool calls

Model Details

Base model: zai-org/GLM-4.5
Training framework: slime
Training data: curated conversations from GAIR/LIMI

Performance on AgencyBench

LIMI achieves the highest score on every AgencyBench metric reported below, and LIMI-Air improves over its GLM-4.5-Air base on all three:

Model	FTFC (↑)	RC@3 (↑)	SR@3 (↑)	Avg.
GLM-4.5-Air	15.0	16.1	20.0	17.0
GLM-4.5	37.8	50.0	47.4	45.1
GLM-4.5-CodeAgent	48.0	48.0	47.5	47.8
LIMI-Air	35.4	34.3	33.1	34.3
LIMI	71.7	74.2	74.6	73.5
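
As a quick sanity check on the table, the Avg. column appears to be the unweighted mean of the three metric columns, rounded to one decimal place. The sketch below recomputes it from the values above; the names `scores` and `mean_of_metrics` are illustrative only and not part of the release.

```python
# Scores copied from the table above: (FTFC, RC@3, SR@3, reported Avg.)
scores = {
    "GLM-4.5-Air": (15.0, 16.1, 20.0, 17.0),
    "GLM-4.5": (37.8, 50.0, 47.4, 45.1),
    "GLM-4.5-CodeAgent": (48.0, 48.0, 47.5, 47.8),
    "LIMI-Air": (35.4, 34.3, 33.1, 34.3),
    "LIMI": (71.7, 74.2, 74.6, 73.5),
}


def mean_of_metrics(ftfc, rc, sr):
    """Unweighted mean of the three metrics, rounded to one decimal place."""
    return round((ftfc + rc + sr) / 3, 1)


# Recompute the average for every model and print it next to the reported one.
recomputed = {name: mean_of_metrics(*row[:3]) for name, row in scores.items()}
for name, row in scores.items():
    print(f"{name}: reported {row[3]}, recomputed {recomputed[name]}")
```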

For detailed benchmark results, experimental setup, and comprehensive comparisons, please refer to our paper.

Model Zoo

Our LIMI models are available on Hugging Face 🤗:

Model	Backbone	Size	Link
LIMI	GLM‑4.5	353B	https://huggingface.co/GAIR/LIMI
LIMI‑Air	GLM‑4.5‑Air	107B	https://huggingface.co/GAIR/LIMI-Air

Datasets

We release our datasets through Hugging Face 🤗:

Name: GAIR/LIMI
Summary: curated agentic SFT data (OpenAI messages, optional tools, normalized tool‑call arguments); current release contains ~78 high‑quality samples.
Link: https://huggingface.co/datasets/GAIR/LIMI

Quick Start

Start with HF Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; device_map="auto" spreads weights across available devices.
model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMI", torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

# Conversation in OpenAI chat format.
messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]

# Render the chat template to a prompt string and tokenize it.
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True))

Start with vLLM

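from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

llm = LLM(model="GAIR/LIMI", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

# Same example conversation as in the Transformers snippet above.
messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]

# Render the chat template to a prompt string, then sample a completion.
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = llm.generate(text, SamplingParams(temperature=0.6, max_tokens=4096, top_p=0.95))
print(out[0].outputs[0].text)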

Prompting

Messages follow the OpenAI chat format; include a grounding system message when helpful.
Example:

[
  {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
  {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]
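
Since the data format allows optional function/tool calls, a conversation that uses a tool might look roughly like the sketch below. The tool name `read_file`, its schema, the call id, and the tool output are hypothetical and only illustrate the OpenAI-style layout; check the chat template shipped with the model for the exact fields it expects.

```python
import json

# Hypothetical tool definition in OpenAI function-calling format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "read_file",  # hypothetical tool name
            "description": "Return the contents of a text file.",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
    # Assistant turn requesting a tool call; arguments are carried as a JSON string.
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": "call_0",  # hypothetical call id
                "type": "function",
                "function": {
                    "name": "read_file",
                    "arguments": json.dumps({"path": "equation.py"}),
                },
            }
        ],
    },
    # Tool result returned to the model, linked to the request by tool_call_id.
    {"role": "tool", "tool_call_id": "call_0", "content": "def equation(x, v): ..."},
]

# Basic consistency check: each call names a declared tool and has parseable arguments.
declared = {t["function"]["name"] for t in tools}
for msg in messages:
    for call in msg.get("tool_calls", []):
        assert call["function"]["name"] in declared
        json.loads(call["function"]["arguments"])
```

Recent Transformers releases accept a `tools=` argument in `apply_chat_template`; whether and how the LIMI template renders it is worth verifying before relying on it.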

Evaluation

We report FTFC (First‑Turn Functional Completeness), SR@R (Success Rate at R), and RC@R (Remaining Chances at R) with R=3.
See the paper for experimental protocol and scores.
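
To make the SR@R notation concrete, here is a minimal sketch under one assumed reading: a task counts as a success if it is completed within R rounds, and SR@R is the percentage of such tasks. The paper's protocol is authoritative; the function name and the outcome data below are made up for illustration, and FTFC and RC@R are not covered here.

```python
def success_rate_at_r(rounds_to_success, r=3):
    """Percentage of tasks completed within r rounds.

    rounds_to_success holds, per task, the round in which it first
    succeeded, or None if it never succeeded. This is an assumed reading
    of SR@R, not the official scoring code.
    """
    solved = sum(1 for n in rounds_to_success if n is not None and n <= r)
    return 100.0 * solved / len(rounds_to_success)


# Made-up outcomes for five tasks: three succeed within 3 rounds.
outcomes = [1, 3, None, 2, 4]
print(success_rate_at_r(outcomes, r=3))  # prints 60.0
```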

Limitations

May produce incorrect tool arguments or overfit to frequent schemas
Not safety‑filtered for sensitive domains; use with guardrails and oversight
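
One simple guardrail against incorrect tool arguments is to validate each proposed call before executing it. The sketch below is a deliberately small check using only the standard library, not full JSON Schema validation; `validate_tool_call`, the `schemas` layout, and the `read_file` tool are hypothetical names introduced for illustration.

```python
import json


def validate_tool_call(name, arguments_json, schemas):
    """Return (ok, reason) for a proposed tool call.

    schemas maps tool name -> {"required": [...], "types": {arg: python_type}}.
    """
    if name not in schemas:
        return False, f"unknown tool: {name}"
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return False, f"arguments are not valid JSON: {exc.msg}"
    if not isinstance(args, dict):
        return False, "arguments must be a JSON object"
    schema = schemas[name]
    # Every required argument must be present.
    for key in schema["required"]:
        if key not in args:
            return False, f"missing required argument: {key}"
    # Every supplied argument must be known and have the expected type.
    for key, value in args.items():
        expected = schema["types"].get(key)
        if expected is None:
            return False, f"unexpected argument: {key}"
        if not isinstance(value, expected):
            return False, f"argument {key} should be {expected.__name__}"
    return True, "ok"


# Hypothetical schema for a single tool.
schemas = {"read_file": {"required": ["path"], "types": {"path": str}}}
print(validate_tool_call("read_file", '{"path": "equation.py"}', schemas))
print(validate_tool_call("read_file", '{"file": "equation.py"}', schemas))
```

Rejected calls can be returned to the model as a tool error message so it has a chance to correct the arguments on the next turn.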

License

Inherits base model (GLM‑4.5) terms; verify upstream license before deployment

Citation

@misc{xiao2025limiagency,
      title={LIMI: Less is More for Agency}, 
      author={Yang Xiao and Mohan Jiang and Jie Sun and Keyu Li and Jifan Lin and Yumin Zhuang and Ji Zeng and Shijie Xia and Qishuo Hua and Xuefeng Li and Xiaojie Cai and Tongyu Wang and Yue Zhang and Liming Liu and Xia Wu and Jinlong Hou and Yuan Cheng and Wenjie Li and Xiang Wang and Dequan Wang and Pengfei Liu},
      year={2025},
      eprint={2509.17567},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2509.17567}, 
}
