Llamacpp Quantizations of Kimi-K2-Instruct

Original model: moonshotai/Kimi-K2-Instruct.

All quants were made with bartowski1182's llama.cpp build.

All quants use an imatrix and the BF16 conversion from unsloth/Kimi-K2-Instruct-GGUF/BF16.

IQ1_S : 197.39 GiB (1.65 BPW)

IQ1_M : 206.03 GiB (1.72 BPW)

IQ2_S : 265.71 GiB (2.22 BPW)

Q2_K : 335.39 GiB (2.81 BPW)
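The BPW (bits per weight) figures above can be sanity-checked against the file sizes and the model's 1,026B parameter count. A minimal sketch (the helper function is illustrative, not part of any tool):

```python
# Sanity-check bits-per-weight from quant file size and parameter count.
def bpw(size_gib: float, n_params: float = 1.026e12) -> float:
    bits = size_gib * 2**30 * 8  # GiB -> bytes -> bits
    return bits / n_params

for name, size in [("IQ1_S", 197.39), ("IQ1_M", 206.03),
                   ("IQ2_S", 265.71), ("Q2_K", 335.39)]:
    print(f"{name}: {bpw(size):.2f} BPW")
```

Each computed value matches the listed figure (e.g. 197.39 GiB over 1.026e12 weights gives 1.65 BPW).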


Download (Example)

# !pip install huggingface_hub hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"  # enable accelerated downloads

from huggingface_hub import snapshot_download

# Download only the IQ1_M files from the repo.
snapshot_download(
    repo_id="bobchenyx/Kimi-K2-Instruct-GGUF",
    local_dir="bobchenyx/Kimi-K2-Instruct-GGUF",
    allow_patterns=["*IQ1_M*"],
)
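The `allow_patterns` argument filters repo files with shell-style glob patterns, so only the shards whose names contain `IQ1_M` are fetched. A minimal sketch of that matching using Python's `fnmatch` (the shard filenames below are hypothetical, for illustration only):

```python
from fnmatch import fnmatch

# Hypothetical shard names illustrating how the glob filter applies.
files = [
    "Kimi-K2-Instruct-IQ1_M-00001-of-00005.gguf",
    "Kimi-K2-Instruct-IQ1_S-00001-of-00004.gguf",
    "Kimi-K2-Instruct-Q2_K-00001-of-00008.gguf",
]
selected = [f for f in files if fnmatch(f, "*IQ1_M*")]
print(selected)  # only the IQ1_M shard remains
```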
Format: GGUF
Model size: 1,026B params
Architecture: deepseek2
