Requested tokens (537) exceed context window of 512

#4
by ankushrastogi04 - opened

Hi,

I am trying to generate a summary with gemma-3-27b-it-Q4_K_M.gguf, loading the model through llama_cpp, but I keep getting the error
"Requested tokens (537) exceed context window of 512". Can you please help me?

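from llama_cpp import Llama

# Load the quantized GGUF model; no n_ctx is passed here.
llm = Llama(
    model_path="gemma-3-27b-it-Q4_K_M.gguf",
    n_gpu_layers=100)

# prompt holds the text to summarize (537 tokens according to the error).
response = llm(prompt)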
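
For context on the error: llama-cpp-python's Llama constructor takes an n_ctx argument that defaults to 512 when it is not passed, which matches the "context window of 512" in the message. A minimal sketch of one way around it, assuming the prompt is about 537 tokens and that 256 tokens of output are wanted, could look like the following. The helper names pick_n_ctx and load_model, and the rounding to a multiple of 256, are illustrative choices and not part of llama_cpp.

```python
def pick_n_ctx(prompt_tokens: int, max_new_tokens: int, multiple: int = 256) -> int:
    """Return the smallest multiple of `multiple` that fits prompt plus reply.

    Hypothetical helper for illustration; llama_cpp does not provide it.
    """
    needed = prompt_tokens + max_new_tokens
    # Ceiling division, then scale back up to a multiple of `multiple`.
    return -(-needed // multiple) * multiple


def load_model(model_path: str, n_ctx: int):
    """Load a GGUF model with an explicit context window (hypothetical wrapper)."""
    # Imported lazily so pick_n_ctx can be used without llama-cpp-python installed.
    from llama_cpp import Llama

    return Llama(model_path=model_path, n_ctx=n_ctx, n_gpu_layers=100)


if __name__ == "__main__":
    # 537 comes from the error message; 256 is an assumed output budget.
    n_ctx = pick_n_ctx(prompt_tokens=537, max_new_tokens=256)
    print(n_ctx)
    # llm = load_model("gemma-3-27b-it-Q4_K_M.gguf", n_ctx)
    # response = llm(prompt, max_tokens=256)
```

A larger n_ctx uses more memory for the KV cache, so it is probably worth sizing it to the longest prompt plus the expected output instead of picking the largest value the model supports.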
