# my-lora-hf-sum
This model is a LoRA (PEFT) fine-tuned version of google/mt5-small on an unknown dataset (a usage sketch follows the results list). It achieves the following results on the evaluation set:
- Loss: 8.8834
- Rouge1: 0.7694
- Rouge2: 0.2147
- Rougel: 0.6887
- Rougelsum: 0.7056
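
The `lora` in the name and the PEFT entry under Framework versions indicate this is a LoRA adapter rather than a full checkpoint, so it is loaded on top of the `google/mt5-small` base model. A minimal inference sketch, assuming the adapter is published under the repository id `benitoals/my-lora-hf-sum` shown on this page; the task prefix and generation settings are illustrative, not documented here:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the LoRA adapter weights.
base = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
model = PeftModel.from_pretrained(base, "benitoals/my-lora-hf-sum")
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")

# The "summarize:" prefix and the generation settings below are
# assumptions; the card does not record how the model was prompted.
text = "summarize: <your document here>"
inputs = tokenizer(text, return_tensors="pt")
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```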
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged mapping onto `Seq2SeqTrainingArguments` is sketched after the list):
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: AdamW (torch implementation, `OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
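
These values map onto the Hugging Face `Seq2SeqTrainingArguments` roughly as below. The actual training script is not part of this card, so `output_dir` is an assumption, and the LoRA adapter configuration (rank, alpha, target modules) is not recorded here:

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the configuration from the list above;
# the original training script is not included in this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="my-lora-hf-sum",     # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```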
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 26.8 | 0.0251 | 5 | 17.7235 | 0.4479 | 0.0229 | 0.4381 | 0.4408 |
| 22.7614 | 0.0503 | 10 | 17.5699 | 0.5096 | 0.0229 | 0.4851 | 0.4837 |
| 25.4291 | 0.0754 | 15 | 17.4004 | 0.4895 | 0.0229 | 0.4760 | 0.4770 |
| 24.2329 | 0.1005 | 20 | 17.3355 | 0.4011 | 0.0229 | 0.4091 | 0.4097 |
| 27.2941 | 0.1256 | 25 | 17.3291 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 26.2486 | 0.1508 | 30 | 17.2167 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 22.6219 | 0.1759 | 35 | 17.0234 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 22.8832 | 0.2010 | 40 | 16.9310 | 0.3492 | 0.0229 | 0.3543 | 0.3580 |
| 24.3708 | 0.2261 | 45 | 16.8126 | 0.3722 | 0.0229 | 0.3798 | 0.3821 |
| 24.1446 | 0.2513 | 50 | 16.7162 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 24.4794 | 0.2764 | 55 | 16.5429 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 24.2086 | 0.3015 | 60 | 16.3896 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 23.4245 | 0.3266 | 65 | 16.2550 | 0.3218 | 0.0229 | 0.3278 | 0.3254 |
| 21.0837 | 0.3518 | 70 | 16.0502 | 0.3223 | 0.0229 | 0.3304 | 0.3275 |
| 23.4064 | 0.3769 | 75 | 15.9576 | 0.3223 | 0.0229 | 0.3304 | 0.3275 |
| 20.5418 | 0.4020 | 80 | 15.7345 | 0.3086 | 0.0229 | 0.3166 | 0.3160 |
| 22.4424 | 0.4271 | 85 | 15.5302 | 0.2892 | 0.0229 | 0.2941 | 0.2983 |
| 20.0642 | 0.4523 | 90 | 15.3498 | 0.3296 | 0.0493 | 0.3358 | 0.3386 |
| 21.9881 | 0.4774 | 95 | 15.1718 | 0.3165 | 0.0493 | 0.3215 | 0.3253 |
| 22.4813 | 0.5025 | 100 | 15.0168 | 0.3169 | 0.0493 | 0.3216 | 0.3257 |
| 24.2963 | 0.5276 | 105 | 14.7464 | 0.3309 | 0.0493 | 0.3337 | 0.3370 |
| 23.2294 | 0.5528 | 110 | 14.4756 | 0.3597 | 0.0493 | 0.3608 | 0.3642 |
| 20.3992 | 0.5779 | 115 | 14.2512 | 0.3714 | 0.0493 | 0.3584 | 0.3635 |
| 20.2597 | 0.6030 | 120 | 13.9527 | 0.3771 | 0.0493 | 0.3697 | 0.3743 |
| 20.6987 | 0.6281 | 125 | 13.7137 | 0.4282 | 0.0743 | 0.4172 | 0.4245 |
| 19.3361 | 0.6533 | 130 | 13.5206 | 0.4030 | 0.0743 | 0.4052 | 0.4107 |
| 21.4137 | 0.6784 | 135 | 13.3524 | 0.4030 | 0.0743 | 0.4052 | 0.4107 |
| 19.3918 | 0.7035 | 140 | 13.2160 | 0.3961 | 0.0852 | 0.3963 | 0.4077 |
| 18.2876 | 0.7286 | 145 | 13.0702 | 0.3970 | 0.0852 | 0.3976 | 0.4095 |
| 22.2212 | 0.7538 | 150 | 12.8667 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 19.2252 | 0.7789 | 155 | 12.6627 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 17.9289 | 0.8040 | 160 | 12.5312 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 19.3069 | 0.8291 | 165 | 12.3408 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 20.2723 | 0.8543 | 170 | 12.1137 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 17.534 | 0.8794 | 175 | 11.8986 | 0.3458 | 0.0852 | 0.3442 | 0.3524 |
| 19.06 | 0.9045 | 180 | 11.6703 | 0.3458 | 0.0852 | 0.3442 | 0.3524 |
| 21.1059 | 0.9296 | 185 | 11.4621 | 0.3613 | 0.0852 | 0.3553 | 0.3673 |
| 18.3575 | 0.9548 | 190 | 11.2626 | 0.3611 | 0.0852 | 0.3551 | 0.3670 |
| 18.8256 | 0.9799 | 195 | 11.0890 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 16.6283 | 1.0050 | 200 | 10.9119 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 17.0705 | 1.0302 | 205 | 10.7746 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 16.583 | 1.0553 | 210 | 10.6290 | 0.3596 | 0.0852 | 0.3531 | 0.3684 |
| 17.4136 | 1.0804 | 215 | 10.4930 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 16.53 | 1.1055 | 220 | 10.3394 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 16.3147 | 1.1307 | 225 | 10.2160 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 17.313 | 1.1558 | 230 | 10.0941 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 14.9139 | 1.1809 | 235 | 9.9897 | 0.3956 | 0.0852 | 0.3896 | 0.4050 |
| 15.4727 | 1.2060 | 240 | 9.9119 | 0.4320 | 0.0852 | 0.4258 | 0.4437 |
| 16.2121 | 1.2312 | 245 | 9.8533 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 15.7117 | 1.2563 | 250 | 9.8013 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 14.3507 | 1.2814 | 255 | 9.7481 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 14.6657 | 1.3065 | 260 | 9.6891 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 15.2625 | 1.3317 | 265 | 9.6280 | 0.4463 | 0.0852 | 0.4420 | 0.4580 |
| 16.0463 | 1.3568 | 270 | 9.5713 | 0.4465 | 0.0852 | 0.4429 | 0.4584 |
| 15.1612 | 1.3819 | 275 | 9.5196 | 0.4465 | 0.0852 | 0.4429 | 0.4584 |
| 14.8572 | 1.4070 | 280 | 9.4743 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 14.3652 | 1.4322 | 285 | 9.4267 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 16.1322 | 1.4573 | 290 | 9.3751 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 14.4116 | 1.4824 | 295 | 9.3180 | 0.4975 | 0.0979 | 0.4864 | 0.5034 |
| 14.166 | 1.5075 | 300 | 9.2584 | 0.5281 | 0.0979 | 0.5003 | 0.5185 |
| 14.4904 | 1.5327 | 305 | 9.2126 | 0.5281 | 0.0979 | 0.5002 | 0.5185 |
| 15.3989 | 1.5578 | 310 | 9.1765 | 0.5281 | 0.0979 | 0.5002 | 0.5185 |
| 14.7175 | 1.5829 | 315 | 9.1373 | 0.5280 | 0.0979 | 0.5001 | 0.5184 |
| 15.1778 | 1.6080 | 320 | 9.1008 | 0.5710 | 0.1109 | 0.5275 | 0.5473 |
| 15.717 | 1.6332 | 325 | 9.0625 | 0.6370 | 0.1429 | 0.5776 | 0.5899 |
| 15.786 | 1.6583 | 330 | 9.0358 | 0.6370 | 0.1429 | 0.5776 | 0.5899 |
| 13.8109 | 1.6834 | 335 | 9.0115 | 0.6781 | 0.1772 | 0.6099 | 0.6218 |
| 13.7277 | 1.7085 | 340 | 8.9898 | 0.6781 | 0.1772 | 0.6099 | 0.6218 |
| 14.6217 | 1.7337 | 345 | 8.9701 | 0.7030 | 0.1891 | 0.6333 | 0.6474 |
| 13.9167 | 1.7588 | 350 | 8.9508 | 0.7236 | 0.1891 | 0.6544 | 0.6707 |
| 14.0584 | 1.7839 | 355 | 8.9358 | 0.7696 | 0.2147 | 0.6896 | 0.7060 |
| 15.5297 | 1.8090 | 360 | 8.9224 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 15.3239 | 1.8342 | 365 | 8.9141 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 14.1664 | 1.8593 | 370 | 8.9070 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 13.7965 | 1.8844 | 375 | 8.9005 | 0.7695 | 0.2147 | 0.6887 | 0.7056 |
| 13.9088 | 1.9095 | 380 | 8.8938 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 14.4487 | 1.9347 | 385 | 8.8886 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 16.2926 | 1.9598 | 390 | 8.8851 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 13.9456 | 1.9849 | 395 | 8.8834 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
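
The Rouge1/Rouge2/RougeL/RougeLsum columns are the standard ROUGE variants. A sketch of how such scores are typically computed with the `evaluate` library; the actual metric code is not included in this card, so treat the setup as an assumption:

```python
import evaluate

# Standard ROUGE computation; the inputs here are toy strings
# for illustration only.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```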
### Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0