Llama.cpp Quantizations of Qwen3-235B-A22B-Instruct-2507

Original model: Qwen/Qwen3-235B-A22B-Instruct-2507.

All quants were made with bartowski1182's llama.cpp build.

All quants use the BF16 conversion from unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF/BF16.

Q2_K : 77.60 GiB (2.84 BPW)

Q4_K_M : 133.27 GiB (4.87 BPW)
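The sizes and bits-per-weight figures above are consistent with each other: BPW is simply the file size in bits divided by the parameter count. A quick sanity check, assuming the 235B total parameter count stated for this model:

```python
# Sanity-check the listed size/BPW figures: BPW = file size in bits / parameters.
PARAMS = 235e9  # approximate total parameter count (assumption from the model card)

def bpw(size_gib: float) -> float:
    """Bits per weight for a quant file of the given size in GiB."""
    size_bits = size_gib * 2**30 * 8
    return size_bits / PARAMS

print(f"Q2_K:   {bpw(77.60):.2f} BPW")   # matches the listed 2.84
print(f"Q4_K_M: {bpw(133.27):.2f} BPW")  # matches the listed 4.87
```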


Download (Example)

```python
# !pip install huggingface_hub hf_transfer
import os

# Enable the faster hf_transfer backend (must be set before download calls)
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bobchenyx/Qwen3-235B-A22B-Instruct-2507-GGUF",
    local_dir="bobchenyx/Qwen3-235B-A22B-Instruct-2507-GGUF",
    allow_patterns=["*Q2_K*"],  # fetch only the Q2_K files
)
```
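The `allow_patterns` argument filters repository files with fnmatch-style globs, so only filenames containing `Q2_K` are downloaded. A minimal illustration of that matching, using hypothetical shard names (the actual filenames in the repo may differ):

```python
# Demonstrates the glob matching that snapshot_download's allow_patterns
# performs. The shard names below are hypothetical examples.
from fnmatch import fnmatch

repo_files = [
    "Qwen3-235B-A22B-Instruct-2507-Q2_K-00001-of-00002.gguf",
    "Qwen3-235B-A22B-Instruct-2507-Q2_K-00002-of-00002.gguf",
    "Qwen3-235B-A22B-Instruct-2507-Q4_K_M-00001-of-00003.gguf",
    "README.md",
]

matched = [f for f in repo_files if fnmatch(f, "*Q2_K*")]
print(matched)  # only the two Q2_K shards would be downloaded
```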
Format: GGUF · Parameters: 235B · Architecture: qwen3moe