arxiv:2509.06631

Guided Decoding and Its Critical Role in Retrieval-Augmented Generation

Published on Sep 8

Submitted by MElHuseyni on Sep 9

Authors: Özay Ezerceli, Mahmut El Huseyni, Reyhan Bayraktar

Abstract

Guided decoding methods in Retrieval-Augmented Generation (RAG) systems are evaluated for structured output generation, revealing performance variations across different prompting setups.

AI-generated summary

The integration of Large Language Models (LLMs) into various applications has driven the need for structured and reliable responses. A key challenge in Retrieval-Augmented Generation (RAG) systems is ensuring that outputs align with expected formats while minimizing hallucinations. This study examines the role of guided decoding in RAG systems, comparing three methods, Outlines, XGrammar, and LM Format Enforcer, across different multi-turn prompting setups (0-turn, 1-turn, and 2-turn). By evaluating success rates, hallucination rates, and output quality, we provide insights into their performance and applicability. Our findings reveal how multi-turn interactions influence guided decoding, uncovering unexpected performance variations that can inform method selection for specific use cases. This work advances the understanding of structured output generation in RAG systems, offering both theoretical insights and practical guidance for LLM deployment.
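To make the idea of guided decoding concrete, here is a minimal sketch of the general technique: at each step, tokens that the target format does not allow are masked out before the next token is chosen. Everything in it is an assumption made for illustration (the toy vocabulary, the hand-written state-machine grammar, and the `fake_logits` stand-in for a model). It is not the implementation used by Outlines, XGrammar, or LM Format Enforcer, and it is not taken from the paper.

```python
import json
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"answer"', ":", '"yes"', '"no"', "maybe", "<eos>"]

# Hand-written grammar as a state machine: state -> {allowed token: next state}.
# It accepts exactly {"answer":"yes"} or {"answer":"no"}, followed by <eos>.
GRAMMAR = {
    0: {"{": 1},
    1: {'"answer"': 2},
    2: {":": 3},
    3: {'"yes"': 4, '"no"': 4},
    4: {"}": 5},
    5: {"<eos>": 6},
}


def fake_logits(step):
    """Stand-in for a model forward pass; it deliberately favours the invalid token 'maybe'."""
    return [
        0.1 * ((i + step) % 3) + (2.0 if tok == "maybe" else 0.0)
        for i, tok in enumerate(VOCAB)
    ]


def guided_decode(max_steps=10):
    """Greedy decoding in which tokens disallowed by GRAMMAR are masked to -inf."""
    state, out = 0, []
    for step in range(max_steps):
        if state not in GRAMMAR:  # no outgoing transitions: the grammar is satisfied
            break
        allowed = GRAMMAR[state]
        logits = fake_logits(step)
        masked = [
            score if tok in allowed else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        tok = VOCAB[max(range(len(VOCAB)), key=lambda i: masked[i])]
        state = allowed[tok]
        if tok != "<eos>":
            out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    # The unmasked logits always prefer "maybe", yet the guided output parses as JSON.
    print(json.loads(guided_decode()))
```

Without the mask, greedy decoding over `fake_logits` would emit `maybe` at every step. With it, the output is forced into the grammar regardless of what the model prefers, which is why format compliance and answer quality have to be measured separately.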

Community

MElHuseyni (paper author, paper submitter)

This study investigates how guided decoding enhances Retrieval-Augmented Generation (RAG) by enforcing structured outputs and reducing hallucinations. It compares three methods (Outlines, XGrammar, and LM Format Enforcer) across zero-, one-, and two-turn prompting setups. Evaluating success rates, hallucination rates, and output quality, the authors find that multi-turn prompting significantly affects performance, with different decoding methods excelling in different scenarios. These insights help practitioners select appropriate methods for structured, reliable RAG applications.
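As a rough illustration of how a success rate for structured output could be computed, the sketch below counts an output as successful when it parses as a JSON object and contains every required key. This criterion, the function names `is_success` and `success_rate`, and the key names in the usage example are assumptions made for illustration. The page does not state the paper's exact metric definitions.

```python
import json


def is_success(raw, required_keys):
    """Assumed criterion: the output is a JSON object containing every required key."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and all(key in obj for key in required_keys)


def success_rate(outputs, required_keys):
    """Fraction of outputs that satisfy is_success; 0.0 for an empty list."""
    if not outputs:
        return 0.0
    return sum(is_success(o, required_keys) for o in outputs) / len(outputs)
```

Calling `success_rate(outputs, ["answer", "source"])` once per method and per prompting setup (0-turn, 1-turn, 2-turn) would give a grid of rates that can be compared directly. The key names here are hypothetical.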
