Commit 6072bc1 (parent: b0635b5): update model card v1

README.md (changed)
---
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
---

# Model Overview

### Description:
Nemotron-Research-Reasoning-Qwen-1.5B is the world's leading 1.5B open-weight model for complex reasoning tasks such as mathematical problems, coding challenges, and scientific questions.
It is trained with the Group Relative Policy Optimization (GRPO) algorithm on a diverse and comprehensive set of datasets.
The model achieves strong results, outperforming its base model, DeepSeek-R1-Distill-Qwen-1.5B, by a large margin on a broad range of tasks including math, coding, and GPQA.

This model is for research and development only.
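GRPO optimizes a policy using group-relative advantages: the rewards for a group of completions sampled from the same prompt are normalized against the group mean and standard deviation. A minimal sketch of that normalization step (illustrative only, not the actual training code):

```python
import statistics

def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Group-relative advantages: z-score each reward against the group of
    completions sampled for the same prompt (the core idea of GRPO)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled completions for one prompt, scored 1.0 (correct) or 0.0 (wrong)
# by a verifiable reward, e.g. an exact-match checker on math answers.
advs = grpo_advantages([1.0, 0.0, 1.0, 0.0])
```

Completions that beat their group's mean reward get positive advantages and are reinforced; below-average completions are penalized, with no learned value network required.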

### License/Terms of Use
TBD

### Deployment Geography:
Global <br>

### Use Case: <br>
Researchers and developers can use this model to solve math, coding, and STEM questions.

### Release Date: <br>
Huggingface x/xx/2025 via [URL] <br>

## Reference(s):
A paper will be released at the same time as the model. <br>

[Qwen2.5 Technical Report](https://arxiv.org/abs/2412.15115)

## Model Architecture:
**Architecture Type:** Dense decoder-only Transformer model <br>

**Network Architecture:** [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) <br>

This model was developed based on DeepSeek-R1-Distill-Qwen-1.5B. <br>

## Input: <br>
**Input Type(s):** Text <br>
**Input Format:** String <br>
**Input Parameters:** 1D <br>
**Other Properties Related to Input:** Context length up to 32,000 tokens <br>

## Output: <br>
**Output Type(s):** Text <br>
**Output Format:** String <br>
**Output Parameters:** 1D <br>
**Other Properties Related to Output:** Context length up to 32,000 tokens <br>

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions. <br>

## Software Integration:
**Runtime Engine(s):** Transformers

**Supported Hardware Microarchitecture Compatibility:** <br>
* NVIDIA Ampere <br>
* NVIDIA Hopper <br>

**Preferred/Supported Operating System(s):**
* Linux <br>
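Since the runtime engine is Transformers, a minimal inference sketch could look like the following. This is illustrative only: the repository ID `nvidia/Nemotron-Research-Reasoning-Qwen-1.5B` and the `build_messages` helper are assumptions, as the release URL above is still a placeholder.

```python
# Illustrative inference sketch for the Transformers runtime engine.
# ASSUMPTION: this repository ID is hypothetical; substitute the real
# Hugging Face repo once the model is released.
MODEL_ID = "nvidia/Nemotron-Research-Reasoning-Qwen-1.5B"

def build_messages(question: str) -> list[dict]:
    """Wrap a reasoning question in the chat format used by apply_chat_template."""
    return [{"role": "user", "content": question}]

def generate_answer(question: str, max_new_tokens: int = 4096) -> str:
    # Heavy imports live here so the helper above stays usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

Calling `generate_answer(...)` on an Ampere- or Hopper-class GPU returns the model's reasoning trace and final answer as a single string; a generous `max_new_tokens` budget matters for long chain-of-thought outputs.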

## Model Version(s):
1.0

## Training, Testing, and Evaluation Datasets:

**The total size (in number of data points):** 479K <br>
**Total number of datasets:** 5 <br>

**Dataset partition:** Training [90%], testing [5%], validation [5%] <br>
**Time period for training data collection:** [1984-01-01 to 2023-01-01] <br>
**Time period for testing data collection:** [2024-01-01 to 2025-04-01] <br>
**Time period for validation data collection:** [2024-01-01 to 2025-04-01] <br>
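The 90/5/5 partition of the 479K data points works out as below (a sanity-check sketch; the function name is illustrative, not from the training code):

```python
def partition_sizes(total: int, fractions: dict[str, float]) -> dict[str, int]:
    """Split a dataset size by fractions, giving any rounding remainder
    to the first (largest) split."""
    sizes = {name: int(total * frac) for name, frac in fractions.items()}
    first = next(iter(sizes))
    sizes[first] += total - sum(sizes.values())  # absorb rounding remainder
    return sizes

# 479K data points split 90/5/5 as stated in the card:
# 431,100 train / 23,950 test / 23,950 validation.
splits = partition_sizes(479_000, {"train": 0.90, "test": 0.05, "validation": 0.05})
```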

### Training Dataset:
**Link:**
| Dataset | Link |
|---------------------------|-------------------------------------------------------------------------------------------|
| DeepScaleR-Preview-Dataset | [Link](https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset) |
| Eurus-2-RL-Data | [Link](https://huggingface.co/datasets/PRIME-RL/Eurus-2-RL-Data) |
| Reasoning-gym | [Link](https://github.com/open-thought/reasoning-gym) |
| IFEval | [Link](https://jirasw.nvidia.com/browse/DGPTT-1916) |
| SCP-116K | [Link](https://huggingface.co/datasets/EricLu/SCP-116K) |

Data Collection Method by dataset: <br>
* Hybrid: Automated, Human, Synthetic <br>

Labeling Method by dataset: <br>
* Hybrid: Automated, Human, Synthetic <br>

**Properties (Quantity, Dataset Descriptions, Sensor(s)):** 479K question and answer pairs <br>

### Testing Dataset:
**Link:**
| Dataset | Link |
|---------------------------|-------------------------------------------------------------------------------------------|
| DeepScaleR-Preview-Dataset | [Link](https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset) |
| Eurus-2-RL-Data | [Link](https://huggingface.co/datasets/PRIME-RL/Eurus-2-RL-Data) |
| Reasoning-gym | [Link](https://github.com/open-thought/reasoning-gym) |
| IFEval | [Link](https://jirasw.nvidia.com/browse/DGPTT-1916) |
| SCP-116K | [Link](https://huggingface.co/datasets/EricLu/SCP-116K) |

Data Collection Method by dataset: <br>
* Hybrid: Automated, Human, Synthetic <br>

Labeling Method by dataset: <br>
* Hybrid: Automated, Human, Synthetic <br>

**Properties (Quantity, Dataset Descriptions, Sensor(s)):** 22K question and answer pairs <br>

### Evaluation Dataset:

**Link:**
AIME: https://huggingface.co/datasets/opencompass/AIME2025

AMC: https://huggingface.co/datasets/AI-MO/aimo-validation-amc

**Benchmark Scores:** <br>

|Dataset|Score|
|---|---|
|AIME|48.1|
|AMC|79.3|

Data Collection Method by dataset: <br>
* Automated <br>

Labeling Method by dataset: <br>
* Human <br>

**Properties (Quantity, Dataset Descriptions, Sensor(s)):** 100 math question and answer pairs <br>

# Inference:
**Acceleration Engine:** Transformers <br>
**Test Hardware:** <br>
- 1x H100-80GB GPU

## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).