---
library_name: transformers
license: bsd-3-clause
base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
  - DeepSeek
  - DeepSeek-R1-Distill-Qwen-7B
  - GPTQ
  - Int4
---

# DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4

This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using w4a16 quantization (4-bit weights, 16-bit activations).
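For intuition, w4a16 stores weights as 4-bit integers with per-group scales and runs the matmul against 16-bit activations. Below is a minimal NumPy sketch of that idea; the group size, scales, and zero points are made-up GPTQ-style values for illustration, not the NPU's actual kernel:

```python
# Illustrative w4a16 matmul: 4-bit weight codes, fp16 activations.
# Conceptual sketch only -- group_size, scales, and zero points are
# hypothetical GPTQ-style parameters, not the Axera NPU implementation.
import numpy as np

group_size = 128                       # typical GPTQ grouping (assumption)
in_dim, out_dim = 256, 64

rng = np.random.default_rng(0)
q = rng.integers(0, 16, size=(in_dim, out_dim), dtype=np.uint8)  # 4-bit codes
scales = rng.random((in_dim // group_size, out_dim)).astype(np.float16)
zeros = np.full((in_dim // group_size, out_dim), 8, dtype=np.uint8)

# Dequantize per group: w = scale * (q - zero), broadcast over each group
w = (q.astype(np.float16) - zeros.repeat(group_size, axis=0)) \
    * scales.repeat(group_size, axis=0)

x = rng.random((1, in_dim)).astype(np.float16)  # 16-bit activations
y = x @ w                                       # compute stays in fp16
print(y.shape)  # (1, 64)
```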


Compatible with Pulsar2 version: 3.4 (not yet released)

Conversion tool links:

If you are interested in model conversion, you can try exporting an axmodel from the original repo: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
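As a rough sketch of driving the export from Python, assuming Pulsar2 is installed and on PATH: the flag names below follow the public Pulsar2 LLM-build guide and are assumptions here (they may differ in the unreleased 3.4, so check `pulsar2 llm_build --help`):

```python
# Hypothetical wrapper around the Pulsar2 `llm_build` subcommand.
# Flag names are assumptions based on the public Pulsar2 LLM guide;
# verify them against `pulsar2 llm_build --help` in your install.
import subprocess

subprocess.run(
    [
        "pulsar2", "llm_build",
        "--input_path", "jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4",
        "--output_path", "DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4-ax650",
        "--chip", "AX650",
        "--kv_cache_len", "1023",
    ],
    check=True,  # raise if the conversion fails
)
```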

- Pulsar2 Link, How to Convert LLM from Huggingface to axmodel
- AXera NPU LLM Runtime

## Supported Platforms

| Chips | w8a16          | w4a16        |
|-------|----------------|--------------|
| AX650 | 2.7 tokens/sec | 5 tokens/sec |
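The w8a16 to w4a16 speedup is close to what a memory-bandwidth-bound decode predicts: each generated token streams all weight bytes, so halving weight precision should roughly double tokens/sec (the measured 1.85x gain suggests some fixed overhead). A back-of-envelope sketch, where the bandwidth figure is a placeholder chosen to land near the table above, not an AX650 spec:

```python
# Back-of-envelope decode throughput for a bandwidth-bound 7B model.
# eff_bandwidth is a placeholder assumption, NOT an official AX650 figure.
params = 7e9                 # weight count
eff_bandwidth = 17e9         # effective bytes/sec streamed (placeholder)
for name, bytes_per_weight in [("w8a16", 1.0), ("w4a16", 0.5)]:
    bytes_per_token = params * bytes_per_weight  # weights read per token
    print(f"{name}: ~{eff_bandwidth / bytes_per_token:.1f} tokens/sec")
# prints ~2.4 and ~4.9 tokens/sec -- the same order as the measured table
```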