Update README.md
README.md
---
===================================================================================
This model is a quantized version of the official Alibaba ModelScope model https://huggingface.co/modelscope/Nexus-GenV2. The Qwen-VL component uses NF4 quantization, while the fine-tuned generation_decoder and edit_decoder components use float8_e4m3fn quantization. By slightly adjusting the model-loading code in the official repository, users can load this model directly for inference without quantizing it themselves, which significantly reduces download traffic and disk space usage.
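
For reference, the sketch below shows one way the adjusted loading step could look. It is a minimal, hypothetical example, not the official loader: the local path, the safetensors file name, the use of AutoModelForCausalLM for the Qwen-VL part, and the choice to upcast the float8_e4m3fn weights to bfloat16 for compute are all assumptions; adapt the names to the loading code in the official repository.

```python
# Minimal loading sketch -- NOT the official Nexus-GenV2 loader. The repo
# path, file name, and model class below are assumptions; adapt them to
# the official code.
import torch
from safetensors.torch import load_file
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Qwen-VL part, stored in NF4 (bitsandbytes). AutoModelForCausalLM stands
# in for whatever class the official code instantiates for this component.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
qwen_vl = AutoModelForCausalLM.from_pretrained(
    "./Nexus-GenV2-quantized",  # hypothetical local path to this repo
    quantization_config=bnb_config,
    device_map="auto",
)

# Decoder parts, stored as float8_e4m3fn safetensors. Upcasting to bfloat16
# before load_state_dict is one simple option; native fp8 compute is another.
decoder_weights = load_file("generation_decoder.safetensors")  # hypothetical file name
decoder_weights = {k: v.to(torch.bfloat16) for k, v in decoder_weights.items()}
# generation_decoder.load_state_dict(decoder_weights)  # official decoder module
```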