---
license: mit
language:
- multilingual
base_model:
- microsoft/Phi-3.5-mini-instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- nlp
- code
- onnx
- amd
---
# microsoft/Phi-3.5-mini-instruct
- ## Introduction
  - Quantization Tool: Quark 0.6.0
  - OGA Model Builder: v0.5.1
  - Postprocessing: generates the hybrid model
- ## Quantization Strategy
  - AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
  - Excluded Layers: None
```
python3 quantize_quark.py \
--model_dir "$model" \
--output_dir "$output_dir" \
--quant_scheme w_uint4_per_group_asym \
--num_calib_data 128 \
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
--model_export quark_safetensors \
--data_type float16 \
--exclude_layers [] \
--custom_mode awq
```
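To make the "Group 128 / Asymmetric / UINT4" part of the strategy concrete, the sketch below quantizes a weight row in groups of 128, giving each group its own scale and zero point. It is an illustration only, not Quark's implementation: the AWQ activation-aware scaling step is omitted, and the function names (`quantize_group`, `dequantize_group`, `quantize_row`) are hypothetical.

```python
# Illustrative sketch of per-group asymmetric UINT4 weight quantization.
# Assumption: this is NOT Quark's code; AWQ's activation-aware scaling is omitted.

GROUP_SIZE = 128  # "Group 128": each run of 128 weights shares one scale/zero point
QMAX = 15         # UINT4 stores integers in the range 0..15


def quantize_group(weights):
    """Quantize one group of floats to UINT4 with its own scale and zero point."""
    # Extend the range to include 0.0 so the zero point always fits in 0..15.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # An all-zero group would give scale 0; fall back to 1.0 in that case.
    scale = (w_max - w_min) / QMAX or 1.0
    zero_point = round(-w_min / scale)
    q = [max(0, min(QMAX, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize_group(q, scale, zero_point):
    """Map UINT4 codes back to approximate float weights."""
    return [(v - zero_point) * scale for v in q]


def quantize_row(row, group_size=GROUP_SIZE):
    """Split a weight row into groups and quantize each group independently."""
    return [quantize_group(row[i:i + group_size])
            for i in range(0, len(row), group_size)]
```

"Asymmetric" here means the zero point is free to move within 0..15, so a group whose weights are not centered on zero can still use the full 4-bit range.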
- ## OGA Model Builder
```
python builder.py \
-i <quantized safetensor model dir> \
-o <oga model output dir> \
-p int4 \
-e dml
```
- Post-processed to generate the hybrid model