update readme
README.md
CHANGED
@@ -20,7 +20,7 @@ Trained on a corpus of 4 trillion tokens, this model demonstrates that native 1-
 
 ➡️ **Technical Report:** [BitNet b1.58 2B4T Technical Report](https://arxiv.org)
 
-➡️ **Official Code:** [microsoft/BitNet (bitnet.cpp)](https://github.com/microsoft/BitNet)
+➡️ **Official Inference Code:** [microsoft/BitNet (bitnet.cpp)](https://github.com/microsoft/BitNet)
 
 ## Model Variants
 
@@ -112,7 +112,6 @@ BitNet b1.58 2B4T was evaluated against leading open-weight full-precision LLMs
 | **Latency (CPU Decoding)** | 48ms | 41ms | 65ms | 67ms | 124ms | **29ms** |
 | **Energy (Estimated)** | 0.258J | 0.186J | 0.347J | 0.425J | 0.649J | **0.028J** |
 | **Training Tokens (Pre-train)**| 9T* | 2T** | 18T | 11T | 1.1T | 4T |
-|--------------------------------|--------------|------------|--------------|--------------|------------|---------------------|
 | ARC-Challenge | 37.80 | 38.40 | 46.67 | 43.52 | 44.80 | **49.91** |
 | ARC-Easy | 63.17 | 63.13 | **76.01** | 62.92 | 72.14 | 74.79 |
 | OpenbookQA | 34.80 | 38.80 | 40.80 | **46.00** | 40.20 | 41.60 |