|
--- |
|
library_name: transformers |
|
license: bsd-3-clause |
|
base_model: |
|
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B |
|
tags: |
|
- DeepSeek |
|
- DeepSeek-R1-Distill-Qwen-7B |
|
- GPTQ |
|
- Int4 |
|
--- |
|
|
|
# DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4 |
|
|
|
This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using **w4a16** quantization. |
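Here **w4a16** means 4-bit quantized weights with 16-bit activations. As a rough illustration of why this matters on an edge NPU, the back-of-the-envelope weight-memory estimate below uses an approximate ~7.6B parameter count and ignores GPTQ metadata (per-group scales and zero-points), so the real footprint will differ slightly:

```python
# Rough weight-memory estimate for a ~7.6B-parameter model.
# The parameter count is approximate, and quantization metadata
# (per-group scales/zero-points) is ignored for simplicity.
PARAMS = 7.6e9

fp16_gib = PARAMS * 2 / 2**30    # 16-bit weights: 2 bytes per parameter
int4_gib = PARAMS * 0.5 / 2**30  # 4-bit weights: 0.5 bytes per parameter

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")  # ~14.2 GiB
print(f"int4 weights: ~{int4_gib:.1f} GiB")  # ~3.5 GiB
```

The ~4x reduction in weight storage is what makes a 7B model fit comfortably in the memory budget of a device like the AX650.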
|
|
|
|
Compatible with Pulsar2 version: 3.4 (not yet released)
|
|
|
## Conversion Tool Links
|
|
|
If you are interested in model conversion, you can try exporting the axmodel yourself from the original GPTQ repository: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
|
|
|
[Pulsar2 Link, How to Convert LLM from Huggingface to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html) |
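The Pulsar2 guide linked above walks through the export step by step; the invocation looks roughly like the sketch below. This is a hedged example, not a verified command line: the flags shown (`--input_path`, `--output_path`, `--kv_cache_len`, `--hidden_state_type`, `--chip`) follow the LLM build guide, but the exact option set depends on your Pulsar2 release, so consult the documentation before running it.

```shell
# Sketch of an axmodel export with Pulsar2's llm_build tool.
# Flags are illustrative; check the Pulsar2 LLM build guide for the
# exact options supported by your version.
pulsar2 llm_build \
  --input_path DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4/ \
  --output_path DeepSeek-R1-Distill-Qwen-7B-ax650/ \
  --kv_cache_len 1023 \
  --hidden_state_type bf16 \
  --chip AX650
```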
|
|
|
[AXera NPU LLM Runtime](https://github.com/AXERA-TECH/ax-llm) |
|
|
|
## Supported Platforms
|
|
|
- AX650
  - AX650N DEMO Board
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
|
|
|
Decode speed:

|Chip|w8a16|w4a16|
|--|--|--|
|AX650|2.7 tokens/sec|5.0 tokens/sec|
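To put those numbers in perspective, the time to generate a reply of a given length scales inversely with decode speed (prefill time excluded):

```python
# Estimated decode time for a fixed-length reply, using the AX650
# throughput figures from the table above (prefill not included).
def decode_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to decode `num_tokens` at a given throughput."""
    return num_tokens / tokens_per_sec

for scheme, tps in [("w8a16", 2.7), ("w4a16", 5.0)]:
    print(f"{scheme}: ~{decode_seconds(256, tps):.0f} s for a 256-token reply")
```

At 5.0 tokens/sec, a 256-token reply takes roughly 51 seconds of decode time, versus about 95 seconds at 2.7 tokens/sec, which is the practical payoff of the w4a16 quantization.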
|
|