---
library_name: transformers
license: bsd-3-clause
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
- DeepSeek
- DeepSeek-R1-Distill-Qwen-7B
- GPTQ
- Int4
---
# DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4
This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using w4a16 quantization.
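To give a rough feel for what w4a16 means (4-bit integer weights, 16-bit activations), here is a minimal sketch of per-group symmetric int4 weight quantization. The group size, symmetric scheme, and helper names are illustrative assumptions, not the actual GPTQ algorithm or the Axera toolchain's implementation.

```python
import numpy as np

# Illustrative w4a16 sketch (assumption: per-group symmetric int4 weights
# with fp16 scales; activations stay fp16). Not the exact GPTQ procedure.

def quantize_w4(w, group_size=128):
    """Quantize fp16 weights to int4 codes with one fp16 scale per group."""
    groups = w.reshape(-1, group_size)
    # Map the largest magnitude in each group to the int4 limit 7.
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales.astype(np.float16)

def dequantize_w4(q, scales):
    """Reconstruct approximate fp16 weights from int4 codes and scales."""
    return q.astype(np.float16) * scales

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 128)).astype(np.float16)
q, s = quantize_w4(w)
w_hat = dequantize_w4(q, s).reshape(w.shape)
# Round-to-nearest error is bounded by half a quantization step per group.
err = float(np.abs(w - w_hat).max())
print(err)
```

Only the weights are stored at 4 bits (plus a small per-group scale); matmuls run against 16-bit activations, which is why w4a16 roughly halves memory traffic versus w8a16 and raises tokens/sec.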
Compatible with Pulsar2 version: 3.4 (not yet released)
## Convert tools links
If you are interested in model conversion, you can try exporting the axmodel from the original repo: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
See also the Pulsar2 documentation: How to Convert LLM from Huggingface to axmodel
## Support Platform
- AX650
- AX650N DEMO Board
- M4N-Dock (AXera-Pi Pro)
- M.2 Accelerator card
| Chip | w8a16 | w4a16 |
|---|---|---|
| AX650 | 2.7 tokens/sec | 5 tokens/sec |