Apertus-8B-Instruct-2509-NVFP4

NVFP4-quantized version of swiss-ai/Apertus-8B-Instruct-2509 produced with llmcompressor.

Notes

  • Quantization scheme: NVFP4 (linear layers, lm_head excluded)
  • Calibration samples: 512
  • Max sequence length during calibration: 2048
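
The settings above could be reproduced roughly as follows. This is a minimal sketch modeled on llmcompressor's published NVFP4 examples, not the exact script used for this checkpoint. The calibration dataset and split, the shuffle seed, and the output directory are assumptions, since the card does not state them.

```python
# Values taken from the notes above.
BASE_MODEL_ID = "swiss-ai/Apertus-8B-Instruct-2509"
NUM_CALIBRATION_SAMPLES = 512
MAX_SEQUENCE_LENGTH = 2048

# Assumption: the card does not name the calibration data. This dataset is a
# placeholder borrowed from llmcompressor's own examples.
CALIBRATION_DATASET = "HuggingFaceH4/ultrachat_200k"
CALIBRATION_SPLIT = "train_sft"


def recipe_kwargs():
    """Quantize every Linear layer to NVFP4 and leave lm_head untouched."""
    return {"targets": "Linear", "scheme": "NVFP4", "ignore": ["lm_head"]}


def oneshot_kwargs():
    """Calibration limits passed to llmcompressor's oneshot()."""
    return {
        "max_seq_length": MAX_SEQUENCE_LENGTH,
        "num_calibration_samples": NUM_CALIBRATION_SAMPLES,
    }


def quantize(output_dir="Apertus-8B-Instruct-2509-NVFP4"):
    """Run one-shot NVFP4 quantization (output_dir is a hypothetical path)."""
    # Heavy dependencies are imported lazily so the settings above can be
    # inspected without a GPU environment installed.
    from datasets import load_dataset
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)

    # Take exactly NUM_CALIBRATION_SAMPLES conversations for calibration.
    ds = load_dataset(
        CALIBRATION_DATASET,
        split=f"{CALIBRATION_SPLIT}[:{NUM_CALIBRATION_SAMPLES}]",
    ).shuffle(seed=42)

    def to_text(example):
        # Render each conversation with the model's chat template.
        return {
            "text": tokenizer.apply_chat_template(example["messages"], tokenize=False)
        }

    def tokenize(sample):
        # Truncate to the calibration sequence length.
        return tokenizer(
            sample["text"],
            padding=False,
            truncation=True,
            max_length=MAX_SEQUENCE_LENGTH,
            add_special_tokens=False,
        )

    ds = ds.map(to_text)
    ds = ds.map(tokenize, remove_columns=ds.column_names)

    recipe = QuantizationModifier(**recipe_kwargs())
    oneshot(model=model, dataset=ds, recipe=recipe, **oneshot_kwargs())

    # Save in compressed-tensors format alongside the tokenizer.
    model.save_pretrained(output_dir, save_compressed=True)
    tokenizer.save_pretrained(output_dir)
```

Calling `quantize()` downloads the base model and the placeholder dataset, so it needs the `llmcompressor`, `transformers`, and `datasets` packages and enough memory to hold the BF16 weights.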

Safetensors

  • Model size: 5B params
  • Tensor types: BF16, F8_E4M3, F32, U8