calculator_model_test

This model is a fine-tuned version of a base model that the card does not name, trained on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 0.6968

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
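
The hyperparameters above, together with the 6 optimizer steps per epoch reported in the results table, allow a rough reconstruction of the learning-rate schedule and of the training-set size. The following is a minimal sketch, not the training script itself. It assumes zero warmup steps, a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in the card. The function and variable names are illustrative.

```python
import math

# Values taken from the hyperparameter list and the results table.
BASE_LR = 1e-3
TRAIN_BATCH_SIZE = 512
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 6                          # step column advances by 6 per epoch
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH   # 240, matching the last table row


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear decay to zero, with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Bounds on the number of training examples, assuming
# steps_per_epoch == ceil(n_examples / batch_size) (assumption: no drop_last,
# one device, no gradient accumulation).
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
assert math.ceil(min_examples / TRAIN_BATCH_SIZE) == STEPS_PER_EPOCH
assert math.ceil(max_examples / TRAIN_BATCH_SIZE) == STEPS_PER_EPOCH

if __name__ == "__main__":
    for step in (0, 60, 120, 180, 240):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
    print(f"training examples: between {min_examples} and {max_examples}")
```

Under these assumptions the learning rate falls from 0.001 at step 0 to 0 at step 240, and the training set holds between 2561 and 3072 examples.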

Training Loss	Epoch	Step	Validation Loss
3.4016	1.0	6	2.7548
2.4113	2.0	12	1.9763
1.8198	3.0	18	1.7136
1.6501	4.0	24	1.5981
1.5935	5.0	30	1.8221
1.6315	6.0	36	1.5616
1.517	7.0	42	1.5486
1.5428	8.0	48	1.5514
1.5141	9.0	54	1.5408
1.4794	10.0	60	1.4949
1.4543	11.0	66	1.4572
1.3969	12.0	72	1.4083
1.3618	13.0	78	1.4682
1.3821	14.0	84	1.3403
1.3074	15.0	90	1.2534
1.2315	16.0	96	1.2563
1.1914	17.0	102	1.2468
1.1783	18.0	108	1.1124
1.1323	19.0	114	1.0756
1.0616	20.0	120	1.0507
1.0337	21.0	126	0.9989
0.9947	22.0	132	0.9760
0.9878	23.0	138	0.9351
0.942	24.0	144	0.9184
0.928	25.0	150	0.9415
0.9594	26.0	156	0.8797
0.9115	27.0	162	0.8550
0.8768	28.0	168	0.8376
0.8587	29.0	174	0.8375
0.8481	30.0	180	0.8013
0.8344	31.0	186	0.8112
0.8215	32.0	192	0.7831
0.8095	33.0	198	0.7643
0.7946	34.0	204	0.7568
0.7808	35.0	210	0.7311
0.7696	36.0	216	0.7247
0.75	37.0	222	0.7109
0.7464	38.0	228	0.7044
0.7408	39.0	234	0.6994
0.7476	40.0	240	0.6968
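
To make the trend in the table easier to inspect, the validation-loss column can be checked programmatically. The sketch below copies the 40 validation-loss values from the table and reports the best epoch and the epochs at which validation loss rose relative to the previous epoch. The variable names are illustrative and not part of the original training code.

```python
# Validation loss per epoch (epochs 1..40), copied from the results table.
val_loss = [
    2.7548, 1.9763, 1.7136, 1.5981, 1.8221, 1.5616, 1.5486, 1.5514,
    1.5408, 1.4949, 1.4572, 1.4083, 1.4682, 1.3403, 1.2534, 1.2563,
    1.2468, 1.1124, 1.0756, 1.0507, 0.9989, 0.9760, 0.9351, 0.9184,
    0.9415, 0.8797, 0.8550, 0.8376, 0.8375, 0.8013, 0.8112, 0.7831,
    0.7643, 0.7568, 0.7311, 0.7247, 0.7109, 0.7044, 0.6994, 0.6968,
]

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_loss = val_loss[best_epoch - 1]

# Epochs (1-based) where validation loss increased over the previous epoch.
rising_epochs = [
    i + 1 for i in range(1, len(val_loss)) if val_loss[i] > val_loss[i - 1]
]

if __name__ == "__main__":
    print(f"best epoch: {best_epoch} (validation loss {best_loss})")
    print(f"epochs with a rise in validation loss: {rising_epochs}")
```

The lowest validation loss occurs at the final epoch, and the per-epoch improvement over the last few epochs is small (0.6994 to 0.6968 between epochs 39 and 40), so the table alone does not show whether further training would help.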