my-lora-hf-sum

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 8.8834
  • Rouge1: 0.7694
  • Rouge2: 0.2147
  • Rougel: 0.6887
  • Rougelsum: 0.7056
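The ROUGE scores above are unigram/bigram and longest-common-subsequence overlap metrics. As an illustration only, here is a toy ROUGE-1 F1 computation; the numbers in this card were presumably produced by the standard `evaluate`/`rouge_score` packages, which additionally apply stemming and other normalization, so this sketch will not reproduce them exactly.

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Toy ROUGE-1 F1: F-measure over unigram overlap (no stemming)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap between prediction and reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat", "the cat sat on mat")` gives precision 1.0 and recall 0.4, hence F1 ≈ 0.571.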

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
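With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over the course of training (after any warmup steps), in the manner of `get_linear_schedule_with_warmup` from `transformers`. A minimal sketch of that schedule, using this run's learning rate of 1e-4 (the total step count below is illustrative, not taken from this run):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-4,
              warmup_steps: int = 0) -> float:
    """Linear schedule: ramp up over warmup, then decay linearly to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# With no warmup, the rate starts at base_lr and reaches zero at the end:
# linear_lr(0, 100) -> 1e-4, linear_lr(50, 100) -> 5e-5, linear_lr(100, 100) -> 0.0
```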

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 26.8 | 0.0251 | 5 | 17.7235 | 0.4479 | 0.0229 | 0.4381 | 0.4408 |
| 22.7614 | 0.0503 | 10 | 17.5699 | 0.5096 | 0.0229 | 0.4851 | 0.4837 |
| 25.4291 | 0.0754 | 15 | 17.4004 | 0.4895 | 0.0229 | 0.4760 | 0.4770 |
| 24.2329 | 0.1005 | 20 | 17.3355 | 0.4011 | 0.0229 | 0.4091 | 0.4097 |
| 27.2941 | 0.1256 | 25 | 17.3291 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 26.2486 | 0.1508 | 30 | 17.2167 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 22.6219 | 0.1759 | 35 | 17.0234 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 22.8832 | 0.2010 | 40 | 16.9310 | 0.3492 | 0.0229 | 0.3543 | 0.3580 |
| 24.3708 | 0.2261 | 45 | 16.8126 | 0.3722 | 0.0229 | 0.3798 | 0.3821 |
| 24.1446 | 0.2513 | 50 | 16.7162 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 24.4794 | 0.2764 | 55 | 16.5429 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 24.2086 | 0.3015 | 60 | 16.3896 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 23.4245 | 0.3266 | 65 | 16.2550 | 0.3218 | 0.0229 | 0.3278 | 0.3254 |
| 21.0837 | 0.3518 | 70 | 16.0502 | 0.3223 | 0.0229 | 0.3304 | 0.3275 |
| 23.4064 | 0.3769 | 75 | 15.9576 | 0.3223 | 0.0229 | 0.3304 | 0.3275 |
| 20.5418 | 0.4020 | 80 | 15.7345 | 0.3086 | 0.0229 | 0.3166 | 0.3160 |
| 22.4424 | 0.4271 | 85 | 15.5302 | 0.2892 | 0.0229 | 0.2941 | 0.2983 |
| 20.0642 | 0.4523 | 90 | 15.3498 | 0.3296 | 0.0493 | 0.3358 | 0.3386 |
| 21.9881 | 0.4774 | 95 | 15.1718 | 0.3165 | 0.0493 | 0.3215 | 0.3253 |
| 22.4813 | 0.5025 | 100 | 15.0168 | 0.3169 | 0.0493 | 0.3216 | 0.3257 |
| 24.2963 | 0.5276 | 105 | 14.7464 | 0.3309 | 0.0493 | 0.3337 | 0.3370 |
| 23.2294 | 0.5528 | 110 | 14.4756 | 0.3597 | 0.0493 | 0.3608 | 0.3642 |
| 20.3992 | 0.5779 | 115 | 14.2512 | 0.3714 | 0.0493 | 0.3584 | 0.3635 |
| 20.2597 | 0.6030 | 120 | 13.9527 | 0.3771 | 0.0493 | 0.3697 | 0.3743 |
| 20.6987 | 0.6281 | 125 | 13.7137 | 0.4282 | 0.0743 | 0.4172 | 0.4245 |
| 19.3361 | 0.6533 | 130 | 13.5206 | 0.4030 | 0.0743 | 0.4052 | 0.4107 |
| 21.4137 | 0.6784 | 135 | 13.3524 | 0.4030 | 0.0743 | 0.4052 | 0.4107 |
| 19.3918 | 0.7035 | 140 | 13.2160 | 0.3961 | 0.0852 | 0.3963 | 0.4077 |
| 18.2876 | 0.7286 | 145 | 13.0702 | 0.3970 | 0.0852 | 0.3976 | 0.4095 |
| 22.2212 | 0.7538 | 150 | 12.8667 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 19.2252 | 0.7789 | 155 | 12.6627 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 17.9289 | 0.8040 | 160 | 12.5312 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 19.3069 | 0.8291 | 165 | 12.3408 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 20.2723 | 0.8543 | 170 | 12.1137 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 17.534 | 0.8794 | 175 | 11.8986 | 0.3458 | 0.0852 | 0.3442 | 0.3524 |
| 19.06 | 0.9045 | 180 | 11.6703 | 0.3458 | 0.0852 | 0.3442 | 0.3524 |
| 21.1059 | 0.9296 | 185 | 11.4621 | 0.3613 | 0.0852 | 0.3553 | 0.3673 |
| 18.3575 | 0.9548 | 190 | 11.2626 | 0.3611 | 0.0852 | 0.3551 | 0.3670 |
| 18.8256 | 0.9799 | 195 | 11.0890 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 16.6283 | 1.0050 | 200 | 10.9119 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 17.0705 | 1.0302 | 205 | 10.7746 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 16.583 | 1.0553 | 210 | 10.6290 | 0.3596 | 0.0852 | 0.3531 | 0.3684 |
| 17.4136 | 1.0804 | 215 | 10.4930 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 16.53 | 1.1055 | 220 | 10.3394 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 16.3147 | 1.1307 | 225 | 10.2160 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 17.313 | 1.1558 | 230 | 10.0941 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 14.9139 | 1.1809 | 235 | 9.9897 | 0.3956 | 0.0852 | 0.3896 | 0.4050 |
| 15.4727 | 1.2060 | 240 | 9.9119 | 0.4320 | 0.0852 | 0.4258 | 0.4437 |
| 16.2121 | 1.2312 | 245 | 9.8533 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 15.7117 | 1.2563 | 250 | 9.8013 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 14.3507 | 1.2814 | 255 | 9.7481 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 14.6657 | 1.3065 | 260 | 9.6891 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 15.2625 | 1.3317 | 265 | 9.6280 | 0.4463 | 0.0852 | 0.4420 | 0.4580 |
| 16.0463 | 1.3568 | 270 | 9.5713 | 0.4465 | 0.0852 | 0.4429 | 0.4584 |
| 15.1612 | 1.3819 | 275 | 9.5196 | 0.4465 | 0.0852 | 0.4429 | 0.4584 |
| 14.8572 | 1.4070 | 280 | 9.4743 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 14.3652 | 1.4322 | 285 | 9.4267 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 16.1322 | 1.4573 | 290 | 9.3751 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 14.4116 | 1.4824 | 295 | 9.3180 | 0.4975 | 0.0979 | 0.4864 | 0.5034 |
| 14.166 | 1.5075 | 300 | 9.2584 | 0.5281 | 0.0979 | 0.5003 | 0.5185 |
| 14.4904 | 1.5327 | 305 | 9.2126 | 0.5281 | 0.0979 | 0.5002 | 0.5185 |
| 15.3989 | 1.5578 | 310 | 9.1765 | 0.5281 | 0.0979 | 0.5002 | 0.5185 |
| 14.7175 | 1.5829 | 315 | 9.1373 | 0.5280 | 0.0979 | 0.5001 | 0.5184 |
| 15.1778 | 1.6080 | 320 | 9.1008 | 0.5710 | 0.1109 | 0.5275 | 0.5473 |
| 15.717 | 1.6332 | 325 | 9.0625 | 0.6370 | 0.1429 | 0.5776 | 0.5899 |
| 15.786 | 1.6583 | 330 | 9.0358 | 0.6370 | 0.1429 | 0.5776 | 0.5899 |
| 13.8109 | 1.6834 | 335 | 9.0115 | 0.6781 | 0.1772 | 0.6099 | 0.6218 |
| 13.7277 | 1.7085 | 340 | 8.9898 | 0.6781 | 0.1772 | 0.6099 | 0.6218 |
| 14.6217 | 1.7337 | 345 | 8.9701 | 0.7030 | 0.1891 | 0.6333 | 0.6474 |
| 13.9167 | 1.7588 | 350 | 8.9508 | 0.7236 | 0.1891 | 0.6544 | 0.6707 |
| 14.0584 | 1.7839 | 355 | 8.9358 | 0.7696 | 0.2147 | 0.6896 | 0.7060 |
| 15.5297 | 1.8090 | 360 | 8.9224 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 15.3239 | 1.8342 | 365 | 8.9141 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 14.1664 | 1.8593 | 370 | 8.9070 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 13.7965 | 1.8844 | 375 | 8.9005 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 13.9088 | 1.9095 | 380 | 8.8938 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 14.4487 | 1.9347 | 385 | 8.8886 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 16.2926 | 1.9598 | 390 | 8.8851 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 13.9456 | 1.9849 | 395 | 8.8834 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
Model tree for benitoals/my-lora-hf-sum

  • Base model: google/mt5-small (this model is a LoRA adapter for it)