4096 maximum context? Bug?
Can someone verify this for me? This build reports a 4096 context window, and memory requirements balloon compared to the GGUF equivalents. Is this working right?
I have the same problem. It seems all MLX Gemma 3 models are broken this way.
🥲 Failed to load model
Error when loading model: ValueError: Model type gemma3 not supported.
same problem
same problem
That is not the same problem. Update your MLX engine and LM Studio (if you're using it) to the latest version to load the model, or start a new thread.
Can confirm the context size config is wrong in the 27B MLX build (it shows 4096 when it should report 128k).
The config.json file for the Gemma-3 models does not contain a "max_position_embeddings" value. I suspect 4096 is the default your software falls back to when the key is missing.
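If you want to try a local workaround, you could add the missing key to config.json yourself. A minimal sketch in Python; the model path is hypothetical (adjust it to wherever your MLX model is downloaded), 131072 (128k) is the context length documented for Gemma 3 27B, and whether the loader actually honors the key depends on the software:

```python
import json
from pathlib import Path

# Hypothetical path to the locally downloaded MLX model -- adjust for your setup.
config_path = Path("~/models/gemma-3-27b-it-mlx/config.json").expanduser()

config = json.loads(config_path.read_text())

# If the key is missing, the loading software may fall back to a 4096 default.
if "max_position_embeddings" not in config:
    # 131072 (128k) is the documented context length for Gemma 3 27B.
    config["max_position_embeddings"] = 131072
    config_path.write_text(json.dumps(config, indent=2))
    print("Patched max_position_embeddings to 131072")
else:
    print(f"max_position_embeddings already set: {config['max_position_embeddings']}")
```

Back up the original config.json first, and reload the model afterwards so the software re-reads the value.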