Llamacpp Quantizations of Kimi-K2-Instruct

Original model: moonshotai/Kimi-K2-Instruct.

All quants were made with bartowski1182's llama.cpp build.

All quants use an imatrix and the BF16 conversion from unsloth/Kimi-K2-Instruct-GGUF/BF16.

IQ1_S : 197.39 GiB (1.65 BPW)

IQ1_M : 206.03 GiB (1.72 BPW)

IQ2_S : 265.71 GiB (2.22 BPW)

Q2_K : 335.39 GiB (2.81 BPW)
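The BPW (bits per weight) figures above can be sanity-checked against the file sizes and the model's 1,026B parameter count. A minimal sketch (the helper function is illustrative, not part of any tool):

```python
# Sanity-check bits-per-weight from quant file size and parameter count.
def bpw(size_gib: float, n_params: float = 1.026e12) -> float:
    bits = size_gib * 2**30 * 8  # GiB -> bytes -> bits
    return bits / n_params

for name, size in [("IQ1_S", 197.39), ("IQ1_M", 206.03),
                   ("IQ2_S", 265.71), ("Q2_K", 335.39)]:
    print(f"{name}: {bpw(size):.2f} BPW")
```

Each computed value matches the listed figure (e.g. 197.39 GiB over 1.026e12 weights gives 1.65 BPW).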


Download (Example)

# !pip install huggingface_hub hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"  # enable accelerated downloads

from huggingface_hub import snapshot_download

# Download only the IQ1_M files from the repo.
snapshot_download(
    repo_id="bobchenyx/Kimi-K2-Instruct-GGUF",
    local_dir="bobchenyx/Kimi-K2-Instruct-GGUF",
    allow_patterns=["*IQ1_M*"],
)
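The `allow_patterns` argument filters repo files with shell-style glob patterns, so only the shards whose names contain `IQ1_M` are fetched. A minimal sketch of that matching using Python's `fnmatch` (the shard filenames below are hypothetical, for illustration only):

```python
from fnmatch import fnmatch

# Hypothetical shard names illustrating how the glob filter applies.
files = [
    "Kimi-K2-Instruct-IQ1_M-00001-of-00005.gguf",
    "Kimi-K2-Instruct-IQ1_S-00001-of-00004.gguf",
    "Kimi-K2-Instruct-Q2_K-00001-of-00008.gguf",
]
selected = [f for f in files if fnmatch(f, "*IQ1_M*")]
print(selected)  # only the IQ1_M shard remains
```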
Format: GGUF
Model size: 1,026B params
Architecture: deepseek2
