wikeeyang committed · verified
Commit 8c75e73 · Parent(s): f0e6dc0

Update README.md

Files changed (1):
  README.md +1 -1
README.md CHANGED
@@ -9,7 +9,7 @@ pipeline_tag: any-to-any
 ---
 ===================================================================================
 
-This model is a quantized version of the official Alibaba ModelScope model https://huggingface.co/modelscope/Nexus-GenV2: the Qwen-VL part uses NF4 quantization, while the fine-tuned generation_decoder and edit_decoder parts use float8_e4m3fn quantization. Using the official code with a few code adjustments to the model loading method, users can load this model directly for inference without quantization; download traffic and disk space usage are greatly reduced.
+This model is a quantized version of the official Alibaba ModelScope model https://huggingface.co/modelscope/Nexus-GenV2: the Qwen-VL part uses NF4 quantization, while the fine-tuned generation_decoder and edit_decoder parts use float8_e4m3fn quantization. Using the official code with only a slight adjustment to the model loading code, users can load this model directly for inference without quantization, greatly reducing model download traffic and disk space usage.
 
 This model is a quantized version of the official Alibaba ModelScope model https://huggingface.co/modelscope/Nexus-GenV2. The Qwen-VL part uses NF4 quantization, and the fine-tuned generation_decoder and edit_decoder parts use float8_e4m3fn quantization. Through the official code, with a small adjustment to the model loading method, users can load this model directly for inference without quantizing it themselves, which significantly reduces download traffic and disk space usage.
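As a rough illustration of what that loading adjustment might look like, here is a minimal Python sketch. It assumes only the standard transformers / bitsandbytes / safetensors APIs; the repository path, the `generation_decoder.safetensors` file name, and the commented `load_state_dict` hookup are hypothetical placeholders, since the actual model construction and loading code lives in the official Nexus-Gen repository.

```python
# A minimal, hypothetical sketch of the loading adjustment described above.
# The repo path, file name, and decoder wiring are placeholders, not the
# official Nexus-Gen API.
import torch
from safetensors.torch import load_file
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

REPO = "path/to/this-quantized-checkpoint"  # placeholder path

# Qwen-VL part: the weights in this repo are already NF4, so a matching
# BitsAndBytesConfig lets transformers load them directly instead of
# quantizing full-precision weights at load time.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
qwen_vl = AutoModelForCausalLM.from_pretrained(
    REPO,
    quantization_config=bnb_config,
    device_map="auto",
)

# Decoder parts: stored as float8_e4m3fn tensors. Upcast each tensor to the
# compute dtype while reading the state dict (file name is assumed).
decoder_state = load_file("generation_decoder.safetensors")
decoder_state = {k: v.to(torch.bfloat16) for k, v in decoder_state.items()}
# generation_decoder.load_state_dict(decoder_state)  # decoder built by the official code
```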