StepLaw-N_119M-D_3.0B-LR2.762e-03-BS131072

@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_119M-D_3.0B](https://huggingface.co/collect
 ### Training Parameters
 - **Learning rate (lr)**: 2.762e-03
-- **Batch size (bs)**: 64
+- **Batch size (bs)**: 131072
 - **Training iterations**: 30517
 - **Training tokens (D)**: 4.0B
 ## Model Description
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 2.762e-03 and batch size 64 for 30517 iterations, using a total of 4.0B training tokens.
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 2.762e-03 and batch size 131072 for 30517 iterations, using a total of 4.0B training tokens.
 ## Usage Example
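
The change from 64 to 131072 can be sanity-checked against the other numbers in the card. The sketch below assumes (the diff does not say so) that the new batch size is counted in tokens and the old one in sequences. Under that assumption, batch size times iterations should land near the stated 4.0B training tokens, and the ratio of the two batch sizes gives an implied sequence length. The function names are hypothetical helpers for illustration, not part of any StepLaw tooling.

```python
# Values copied from the model card diff.
ITERATIONS = 30517
BATCH_SIZE_NEW = 131072  # assumption: counted in tokens per step
BATCH_SIZE_OLD = 64      # assumption: counted in sequences per step


def total_tokens(tokens_per_step: int, iterations: int) -> int:
    """Total training tokens if every step consumes tokens_per_step tokens."""
    return tokens_per_step * iterations


def implied_sequence_length(tokens_per_step: int, sequences_per_step: int) -> int:
    """Tokens per sequence implied by the two batch-size figures."""
    quotient, remainder = divmod(tokens_per_step, sequences_per_step)
    if remainder:
        raise ValueError("batch sizes are not an integer multiple of each other")
    return quotient


tokens = total_tokens(BATCH_SIZE_NEW, ITERATIONS)
print(f"total tokens: {tokens:,} (~{tokens / 1e9:.1f}B)")
print(f"implied sequence length: {implied_sequence_length(BATCH_SIZE_NEW, BATCH_SIZE_OLD)}")
```

If the assumption holds, the product agrees with the 4.0B figure in the card, which suggests the edit changes the unit of the batch size rather than the training setup itself.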