---
license: apache-2.0
base_model: Qwen/Qwen2.5-72B-Instruct
tags:
- transformers
- zen
- text-generation
- thinking-mode
- zoo-gym
- hanzo-ai
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: Zen-Next
  results:
  - task:
      type: text-generation
    dataset:
      name: MMLU
      type: MMLU
    metrics:
    - type: accuracy
      value: 0.756
      name: MMLU
widget:
- text: "User: What is the capital of France?\n\nAssistant:"
---

# Zen-Next (80B)

Part of the [Zen AI Model Family](https://huggingface.co/zenlm)

## Model Description

**Parameters**: 80B
**Base Model**: Qwen/Qwen2.5-72B-Instruct
**Specialization**: Complex reasoning & extended context
**Training**: Flagship training with constitutional AI
**Context**: 32K-128K tokens
**Thinking**: Up to 1,000,000 tokens
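
Thinking output in Qwen-derived models is typically wrapped in `<think>...</think>` tags; here is a minimal sketch for separating the reasoning trace from the final answer (the tag format is an assumption, not documented in this card):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split model output into (thinking, answer); assumes <think>...</think> tags."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

out = "<think>Paris is France's capital.</think>\nThe capital of France is Paris."
thinking, answer = split_thinking(out)
print(answer)  # The capital of France is Paris.
```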

## Files in This Repository

This repository contains ALL formats and quantizations:

### 🔷 SafeTensors (Original)
- `model.safetensors` - Full precision weights
- `config.json` - Model configuration
- `tokenizer.json` - Fast tokenizer

### 🟢 GGUF Quantized
- `zen-next-80b-instruct-Q4_K_M.gguf` - 4-bit (recommended)
- `zen-next-80b-instruct-Q5_K_M.gguf` - 5-bit (balanced)
- `zen-next-80b-instruct-Q8_0.gguf` - 8-bit (high quality)
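
A rough way to estimate which quant fits your hardware (bits-per-weight figures are approximate community estimates; real GGUF file sizes vary by a few percent):

```python
# Approximate GGUF sizes for an 80B-parameter model.
PARAMS = 80e9
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5}  # rough estimates

for quant, bpw in BITS_PER_WEIGHT.items():
    gib = PARAMS * bpw / 8 / 2**30  # bits -> bytes -> GiB
    print(f"{quant}: ~{gib:.0f} GiB")
```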

### 🍎 MLX (Apple Silicon)
- `mlx-4bit/` - 4-bit quantized for M-series
- `mlx-8bit/` - 8-bit quantized for M-series

## Performance

| Benchmark | Score | Rank |
|-----------|-------|------|
| MMLU | 75.6% | Top 10% |
| GSM8K | 82.1% | Top 15% |
| HumanEval | 61.7% | Top 20% |

## Quick Start

### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("zenlm/zen-next-80b-instruct", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-next-80b-instruct")

# With thinking mode
messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

### GGUF with llama.cpp
```bash
# Newer llama.cpp builds ship the CLI as llama-cli (older releases used ./main)
./llama-cli -m zen-next-80b-instruct-Q4_K_M.gguf -p "Your prompt" -n 512
```

### MLX for Apple Silicon
```python
from mlx_lm import load, generate

model, tokenizer = load("zenlm/zen-next-80b-instruct")
response = generate(model, tokenizer, prompt="Your prompt", max_tokens=200)
```

## Unique Training Background

Flagship training with constitutional AI.

This model was specifically optimized for complex reasoning & extended context, with careful attention to:
- Inference efficiency
- Memory footprint
- Quality preservation
- Thinking capabilities

---

Part of the Zen Family • [Collection](https://huggingface.co/collections/zenlm/zen) • [GitHub](https://github.com/zenlm/zen)