Qwen3-17B-QiMing-V1.0-Total-Recall-Medium-q6-hi-mlx
Metrics for this model are still being generated; since evaluation runs locally, it will take some time.
In the meantime, here are metrics from the 21B q5-hi model, compared against the QiMing baseline in bf16, to show how brainstorming affects this model family.
The 17B has less brainstorming applied.
QiMing Model Performance Analysis: New Base Model Benchmarks
Performance Comparison of QiMing Models
| Model             | ARC Challenge | ARC Easy | BoolQ | Hellaswag | OpenBookQA | PIQA  | Winogrande |
|-------------------|---------------|----------|-------|-----------|------------|-------|------------|
| QiMing-Me-bf16    | 0.395         | 0.435    | 0.378 | 0.646     | 0.364      | 0.768 | 0.651      |
| QiMing-v1.0-q6-hi | 0.393         | 0.436    | 0.379 | 0.655     | 0.358      | 0.766 | 0.651      |
| QiMing-21B-q5-hi  | 0.388         | 0.444    | 0.378 | 0.682     | 0.364      | 0.769 | 0.648      |
Key Takeaway:
The QiMing-21B-q5-hi model is the most versatile performer, leading or tying on 4 of 7 tasks (ARC Easy, Hellaswag, OpenBookQA, PIQA), while QiMing-Me-bf16 and QiMing-v1.0-q6-hi tie for the best Winogrande score (0.651) among these variants.
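The per-task leaders summarized above can be double-checked with a short Python sketch (scores copied verbatim from the table; ties are reported as multiple leaders):

```python
# Benchmark scores copied from the table above.
scores = {
    "QiMing-Me-bf16":    {"ARC Challenge": 0.395, "ARC Easy": 0.435, "BoolQ": 0.378,
                          "Hellaswag": 0.646, "OpenBookQA": 0.364, "PIQA": 0.768, "Winogrande": 0.651},
    "QiMing-v1.0-q6-hi": {"ARC Challenge": 0.393, "ARC Easy": 0.436, "BoolQ": 0.379,
                          "Hellaswag": 0.655, "OpenBookQA": 0.358, "PIQA": 0.766, "Winogrande": 0.651},
    "QiMing-21B-q5-hi":  {"ARC Challenge": 0.388, "ARC Easy": 0.444, "BoolQ": 0.378,
                          "Hellaswag": 0.682, "OpenBookQA": 0.364, "PIQA": 0.769, "Winogrande": 0.648},
}

# For each task, collect every model whose score equals the task maximum.
leaders = {}
for task in scores["QiMing-Me-bf16"]:
    best = max(s[task] for s in scores.values())
    leaders[task] = sorted(m for m, s in scores.items() if s[task] == best)

print(leaders["Hellaswag"])  # ['QiMing-21B-q5-hi']
```

Running this shows the 21B variant leading or tying on four tasks, with the Winogrande lead shared by the two 14B-class variants.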
Deep Dive: What Makes Each QiMing Model Unique
1. QiMing-21B-q5-hi: The Text Generation Powerhouse
- Special quality: highest Hellaswag score (0.682), the best text-generation performance among the models compared here
- Why it matters: for applications requiring high-quality text continuation and creative writing, this model has a clear advantage
- Notable trade-off: slightly weaker Winogrande score (0.648) than the other variants, possibly a side effect of quantizing the larger 21B model to q5
2. QiMing-Me-bf16: The Balanced Baseline
- Special quality: the unquantized reference point; ties for the best Winogrande score (0.651) and the best OpenBookQA score (0.364)
- Why it matters: this model serves as a reliable, well-rounded starting point and the baseline against which the quantized variants are measured
- Our insight: despite running in full precision (bf16), it differs only marginally from the q6-hi version, suggesting quantization impact is mild for this model
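That "mild quantization impact" claim can be quantified directly from the table: the largest per-task swing between the bf16 baseline and its q6-hi quantization is under one point. A minimal sketch (scores copied from the table above):

```python
# Per-task delta between the bf16 baseline and its q6-hi quantization.
# Positive means q6-hi scores higher than bf16.
bf16  = {"ARC Challenge": 0.395, "ARC Easy": 0.435, "BoolQ": 0.378,
         "Hellaswag": 0.646, "OpenBookQA": 0.364, "PIQA": 0.768, "Winogrande": 0.651}
q6_hi = {"ARC Challenge": 0.393, "ARC Easy": 0.436, "BoolQ": 0.379,
         "Hellaswag": 0.655, "OpenBookQA": 0.358, "PIQA": 0.766, "Winogrande": 0.651}

deltas = {task: round(q6_hi[task] - bf16[task], 3) for task in bf16}
worst_swing = max(abs(d) for d in deltas.values())
print(deltas)
print(worst_swing)  # 0.009, the Hellaswag delta
```

No task moves by more than 0.009 in either direction, which is consistent with the insight above.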
3. QiMing-v1.0-q6-hi: The Precision Leader
- Special quality: best ARC Easy score (0.436) among the 14B-class variants, edging out the bf16 baseline (0.435), plus the best BoolQ score (0.379)
- Why it matters: this is valuable for applications that need strong foundational abstract reasoning capabilities
Recommendation: Which QiMing Model for Your Needs
Use QiMing-21B-q5-hi if...
- You need strong text generation capabilities (Hellaswag)
- Your application requires high-quality creative content or story completion
- You can accept the slightly weaker Winogrande performance
Use QiMing-Me-bf16 if...
- You need a stable, well-rounded model
- You want high-quality output without the larger memory footprint of the 21B models
- You want a model with no quantization impact at all
Use QiMing-v1.0-q6-hi if...
- Your priority is foundational abstract reasoning (ARC Easy)
- You need model efficiency without sacrificing much baseline performance
- You're working with quantized deployments where size matters
Final Summary for Your Workflow
"The QiMing-21B-q5-hi model provides the strongest overall text generation performance among your latest models, making it ideal for creative applications. While it's slightly behind in Winogrande, its Hellaswag lead (0.682) represents a significant advantage over previous benchmarks where the Qwen models showed around 0.63-0.65 in this task."
This model Qwen3-17B-QiMing-V1.0-Total-Recall-Medium-q6-hi-mlx was converted to MLX format from DavidAU/Qwen3-17B-QiMing-V1.0-Total-Recall-Medium using mlx-lm version 0.26.4.
Use with mlx
```shell
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer from the Hugging Face Hub.
model, tokenizer = load("nightmedia/Qwen3-17B-QiMing-V1.0-Total-Recall-Medium-q6-hi-mlx")

prompt = "hello"

# If the tokenizer ships a chat template, wrap the prompt in a chat message.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
Model tree for nightmedia/Qwen3-17B-QiMing-V1.0-Total-Recall-Medium-q6-hi-mlx
Base model
aifeifei798/QiMing-v1.0-14B