4096 maximum context? Bug?
Can someone verify this for me? This build reports a 4096 context window, and memory requirements balloon compared to the GGUF equivalents. Is this working right?
I have the same problem. It seems all MLX Gemma 3 models are broken this way.
🥲 Failed to load model
Error when loading model: ValueError: Model type gemma3 not supported.
same problem
same problem
That is not the same problem. Update your MLX engine and LM Studio (if you're using it) to the latest version to load the model, or start a new thread.
Can confirm the context size config is wrong in the 27B MLX build (it shows 4096 when it should report 128k).
The config.json file for the Gemma-3 models does not contain a "max_position_embeddings" value. I suspect 4096 is the default your software falls back to when the key is missing.
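If you want to try a local workaround, you could add the missing key to config.json yourself. A minimal sketch in Python; the model path is hypothetical (adjust it to wherever your MLX model is downloaded), 131072 (128k) is the context length documented for Gemma 3 27B, and whether the loader actually honors the key depends on the software:

```python
import json
from pathlib import Path

# Hypothetical path to the locally downloaded MLX model -- adjust for your setup.
config_path = Path("~/models/gemma-3-27b-it-mlx/config.json").expanduser()

config = json.loads(config_path.read_text())

# If the key is missing, the loading software may fall back to a 4096 default.
if "max_position_embeddings" not in config:
    # 131072 (128k) is the documented context length for Gemma 3 27B.
    config["max_position_embeddings"] = 131072
    config_path.write_text(json.dumps(config, indent=2))
    print("Patched max_position_embeddings to 131072")
else:
    print(f"max_position_embeddings already set: {config['max_position_embeddings']}")
```

Back up the original config.json first, and reload the model afterwards so the software re-reads the value.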