---
library_name: transformers
license: bsd-3-clause
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
- DeepSeek
- DeepSeek-R1-Distill-Qwen-7B
- GPTQ
- Int4
---
# DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4
This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using w4a16 quantization.
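To give a rough feel for what w4a16 means (4-bit integer weights, 16-bit activations), here is a minimal sketch of per-group symmetric int4 weight quantization. The group size, symmetric scheme, and helper names are illustrative assumptions, not the actual GPTQ algorithm or the Axera toolchain's implementation.

```python
import numpy as np

# Illustrative w4a16 sketch (assumption: per-group symmetric int4 weights
# with fp16 scales; activations stay fp16). Not the exact GPTQ procedure.

def quantize_w4(w, group_size=128):
    """Quantize fp16 weights to int4 codes with one fp16 scale per group."""
    groups = w.reshape(-1, group_size)
    # Map the largest magnitude in each group to the int4 limit 7.
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales.astype(np.float16)

def dequantize_w4(q, scales):
    """Reconstruct approximate fp16 weights from int4 codes and scales."""
    return q.astype(np.float16) * scales

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 128)).astype(np.float16)
q, s = quantize_w4(w)
w_hat = dequantize_w4(q, s).reshape(w.shape)
# Round-to-nearest error is bounded by half a quantization step per group.
err = float(np.abs(w - w_hat).max())
print(err)
```

Only the weights are stored at 4 bits (plus a small per-group scale); matmuls run against 16-bit activations, which is why w4a16 roughly halves memory traffic versus w8a16 and raises tokens/sec.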
Compatible with Pulsar2 version: 3.4 (not yet released)
## Convert tools links
If you are interested in model conversion, you can try exporting the axmodel from the original repo: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
See also the Pulsar2 documentation: How to Convert LLM from Huggingface to axmodel
## Support Platform
- AX650
- AX650N DEMO Board
- M4N-Dock (AXera-Pi Pro)
- M.2 Accelerator card
| Chip | w8a16 | w4a16 |
|---|---|---|
| AX650 | 2.7 tokens/sec | 5 tokens/sec |