---
library_name: transformers
license: bsd-3-clause
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
- DeepSeek
- DeepSeek-R1-Distill-Qwen-7B
- GPTQ
- Int4
---
# DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4
This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using **w4a16** quantization.
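To illustrate what **w4a16** means in practice (4-bit integer weights, 16-bit floating-point activations), here is a minimal NumPy sketch that unpacks GPTQ-style int4 weights with per-group scales and zero-points and runs the matmul in floating point. The group size of 128 is an assumption (a common GPTQ default); this is a conceptual model, not the actual Axera NPU kernel.

```python
import numpy as np

GROUP = 128  # assumed GPTQ group size; check the checkpoint's quantization_config

def dequantize_w4(packed: np.ndarray, scales: np.ndarray, zeros: np.ndarray) -> np.ndarray:
    """Unpack int4 weights (two nibbles per uint8) and apply per-group scale/zero-point."""
    lo = (packed & 0x0F).astype(np.int8)
    hi = (packed >> 4).astype(np.int8)
    q = np.empty(packed.shape[0] * 2, dtype=np.int8)
    q[0::2], q[1::2] = lo, hi
    # broadcast one scale/zero-point per GROUP consecutive weights
    s = np.repeat(scales, GROUP)[: q.size]
    z = np.repeat(zeros, GROUP)[: q.size]
    return ((q - z) * s).astype(np.float16)

# toy example: one output row of 256 weights
rng = np.random.default_rng(0)
packed = rng.integers(0, 256, size=128, dtype=np.uint8)    # 256 packed int4 values
scales = (rng.random(256 // GROUP) * 0.01).astype(np.float16)
zeros = np.full(256 // GROUP, 8, dtype=np.float16)         # symmetric zero-point of 8

w = dequantize_w4(packed, scales, zeros)                   # fp16 weights
x = rng.standard_normal(256).astype(np.float16)            # fp16 activations ("a16")
y = np.dot(x.astype(np.float32), w.astype(np.float32))     # accumulate in fp32
print(y)
```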
Compatible with Pulsar2 version: 3.4 (not yet released).
## Conversion tool links
If you are interested in model conversion, you can try exporting the axmodel yourself, starting from the original GPTQ repo: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
- [Pulsar2: how to convert an LLM from Hugging Face to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)
- [AXera NPU LLM Runtime](https://github.com/AXERA-TECH/ax-llm)
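As a starting point before conversion, the sketch below fetches the original GPTQ-int4 checkpoint with `huggingface_hub` and prints its quantization settings. GPTQ checkpoints normally record these under `quantization_config` in `config.json`; verify the field names against the actual file.

```python
import json
from pathlib import Path

from huggingface_hub import snapshot_download

# Download the original GPTQ-int4 checkpoint referenced above.
local_dir = snapshot_download(repo_id="jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4")

# Inspect the recorded quantization parameters (bits, group_size, etc.).
config = json.loads((Path(local_dir) / "config.json").read_text())
print(json.dumps(config.get("quantization_config", {}), indent=2))
```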
## Supported platforms
- AX650
  - AX650N DEMO Board
  - [M4N-Dock (爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
Token generation speed by quantization scheme:

|Chip|w8a16|w4a16|
|--|--|--|
|AX650|2.7 tokens/sec|5.0 tokens/sec|
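As a back-of-the-envelope check on why w4a16 roughly doubles generation speed: decoding one token streams every weight once, so on a bandwidth-bound accelerator tokens/sec scales inversely with weight bytes. The arithmetic below treats the model as ~7B weight parameters and ignores scales, embeddings, and the KV cache; the numbers are rough estimates, not measurements.

```python
# Rough, bandwidth-oriented estimate, not a benchmark.
params = 7e9                  # ~7B parameters
bytes_w8 = params * 1.0       # 8-bit weights -> 1 byte each  (~7.0 GB)
bytes_w4 = params * 0.5       # 4-bit weights -> 0.5 byte each (~3.5 GB)
print(f"w8a16 weights: {bytes_w8 / 1e9:.1f} GB")
print(f"w4a16 weights: {bytes_w4 / 1e9:.1f} GB")
# Halving the bytes read per token is consistent with the measured
# 2.7 -> 5.0 tokens/sec jump in the table above.
```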