Requested tokens (537) exceed context window of 512
#4 opened by ankushrastogi04
Hi,
I am trying to generate a summary with gemma-3-27b-it-Q4_K_M.gguf, loading the model through llama_cpp, but I keep getting the error "Requested tokens (537) exceed context window of 512". Can you please help me?
```python
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-3-27b-it-Q4_K_M.gguf",
    n_gpu_layers=100)  # n_ctx is not set, so the default context window is used

response = llm(prompt)
```
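For reference, here is a minimal sketch of the same call with an explicit context window. In llama-cpp-python the `Llama` constructor's `n_ctx` defaults to 512 tokens, which matches the number in the error message; the value 8192 below is an arbitrary example, not a tuned recommendation for this model, and the prompt text is a placeholder.

```python
from llama_cpp import Llama

# Sketch only: raise n_ctx so the prompt plus generated tokens fit in the context.
llm = Llama(
    model_path="gemma-3-27b-it-Q4_K_M.gguf",
    n_gpu_layers=100,
    n_ctx=8192,  # example value; the library default of 512 triggers the error above
)

prompt = "Summarize the following text: ..."  # placeholder prompt
response = llm(prompt, max_tokens=512)  # max_tokens caps only the generated tokens
print(response["choices"][0]["text"])
```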