4096 maximum context? Bug?

#1
by Shanesan - opened

Can someone verify this for me? This build reports a 4096-token context window, and memory use really balloons compared to the GGUF equivalents. Is this working right?

MLX Community org

> Can someone verify this for me? This build reports a 4096-token context window, and memory use really balloons compared to the GGUF equivalents. Is this working right?

I have the same problem. It seems like all the MLX Gemma 3 models are broken this way.

🥲 Failed to load model

Error when loading model: ValueError: Model type gemma3 not supported.

Same problem.

MLX Community org
edited Mar 14
> 🥲 Failed to load model
>
> Error when loading model: ValueError: Model type gemma3 not supported.
>
> Same problem.

That is not the same problem. Update your MLX engine and LM Studio (if you are using it) to the latest versions to load the model, or start a new thread.
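If you are on the Python tooling instead of LM Studio, a quick sanity check after upgrading looks roughly like this (the repo name is just an example; any mlx-lm release recent enough to know the gemma3 model type should load it without the ValueError):

```python
# Sketch, assuming the standard mlx-lm package (`pip install -U mlx-lm`):
# a release that supports the "gemma3" model type loads the repo cleanly,
# while an older one raises ValueError: Model type gemma3 not supported.
from mlx_lm import load

# Example repo name; substitute whichever Gemma 3 MLX conversion you use.
model, tokenizer = load("mlx-community/gemma-3-27b-it-4bit")
print("Loaded:", type(model).__name__)
```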

MLX Community org

Can confirm the context size config is wrong in the 27B MLX build (it shows 4096 when it should report 128k).

The config.json file for the Gemma 3 MLX models does not contain a "max_position_embeddings" value.
I suspect 4096 is the default that the software you are using falls back to.
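If that is the cause, one possible workaround is to patch the missing key into your local copy of config.json. A minimal sketch, assuming a hypothetical local snapshot path and the 128K (131072-token) window documented for Gemma 3 27B:

```python
# Sketch: add the missing "max_position_embeddings" to a local config.json
# so loaders that otherwise fall back to 4096 see the full context window.
import json
from pathlib import Path

# Hypothetical path; point this at your downloaded model directory.
config_path = Path("~/models/gemma-3-27b-it-mlx/config.json").expanduser()

config = json.loads(config_path.read_text())
# 131072 tokens = the 128K context Google documents for Gemma 3 27B.
config.setdefault("max_position_embeddings", 131072)
config_path.write_text(json.dumps(config, indent=2))

print(config["max_position_embeddings"])  # expect 131072
```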
