my-lora-sum

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.3452
  • Rouge1: 2.9750
  • Rouge2: 0.0640
  • RougeL: 2.8027
  • RougeLsum: 2.8035
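
Since the card does not yet include usage instructions, below is a minimal inference sketch. It assumes the adapter is available on the Hub as benitoals/my-lora-sum (the repository this card belongs to) on top of the google/mt5-small base model; the input text and generation settings are placeholders.

```python
# Minimal inference sketch for this LoRA adapter (assumption: the adapter repo id
# is "benitoals/my-lora-sum"). Versions roughly follow the "Framework versions"
# list below (peft 0.14.0, transformers 4.49.0, torch 2.6.0).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_id = "google/mt5-small"
adapter_id = "benitoals/my-lora-sum"  # assumption: Hub repo id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA weights
model.eval()

text = "Your document to summarize goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```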

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
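
The hyperparameters above correspond roughly to the Seq2SeqTrainingArguments sketch below. This is a reconstruction, not the original training script; the output directory is a placeholder, and the LoRA settings (rank, alpha, target modules) are not reported in this card, so they are omitted.

```python
# Reconstruction sketch of the training arguments listed above (not the original script).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my-lora-sum",       # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",            # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```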

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum
23.6472 0.0160 5 11.7512 0.4881 0.0353 0.4380 0.4362
21.3789 0.0319 10 11.7109 0.4912 0.0353 0.4526 0.4486
19.719 0.0479 15 11.6279 0.5005 0.0432 0.4531 0.4477
22.2901 0.0639 20 11.6000 0.4946 0.0319 0.4410 0.4381
19.6455 0.0799 25 11.5404 0.5263 0.0530 0.4856 0.4833
19.892 0.0958 30 11.5011 0.5202 0.0396 0.4798 0.4716
19.8908 0.1118 35 11.4428 0.5431 0.0429 0.5013 0.4933
21.5179 0.1278 40 11.4066 0.5156 0.0405 0.4717 0.4682
21.7581 0.1438 45 11.3629 0.5416 0.0325 0.4834 0.4787
20.9988 0.1597 50 11.2869 0.5288 0.0270 0.4752 0.4743
21.3311 0.1757 55 11.2290 0.5260 0.0223 0.4642 0.4629
23.6669 0.1917 60 11.1785 0.4932 0.0188 0.4345 0.4307
18.2309 0.2077 65 11.1129 0.4645 0.0193 0.4156 0.4170
20.9785 0.2236 70 11.0331 0.5031 0.0238 0.4532 0.4530
16.4643 0.2396 75 10.9403 0.4414 0.0177 0.3945 0.3899
16.8242 0.2556 80 10.8374 0.4631 0.0177 0.4165 0.4110
23.1133 0.2716 85 10.7508 0.5054 0.0253 0.4434 0.4380
21.1076 0.2875 90 10.6718 0.4766 0.0178 0.4129 0.4101
17.6879 0.3035 95 10.5684 0.5166 0.0212 0.4634 0.4573
19.3182 0.3195 100 10.4315 0.5774 0.0217 0.5121 0.5070
18.6413 0.3355 105 10.2988 0.5325 0.0257 0.4803 0.4803
19.9276 0.3514 110 10.1745 0.5529 0.0348 0.4877 0.4841
17.9226 0.3674 115 10.0297 0.5852 0.0432 0.5075 0.5039
15.9707 0.3834 120 9.8977 0.5779 0.0403 0.5015 0.4992
16.5054 0.3994 125 9.7266 0.5685 0.0416 0.5011 0.4963
14.6994 0.4153 130 9.5595 0.6368 0.0506 0.5492 0.5447
15.9555 0.4313 135 9.3894 0.6401 0.0438 0.5520 0.5465
14.876 0.4473 140 9.2403 0.6910 0.0467 0.5961 0.5929
17.3615 0.4633 145 9.0976 0.7497 0.0499 0.6499 0.6448
15.8582 0.4792 150 8.9385 0.7409 0.0383 0.6427 0.6398
14.4111 0.4952 155 8.7924 0.7075 0.0457 0.6404 0.6366
13.9827 0.5112 160 8.6373 0.6750 0.0446 0.6211 0.6173
17.1081 0.5272 165 8.5070 0.7013 0.0547 0.6479 0.6420
14.2012 0.5431 170 8.3725 0.7220 0.0544 0.6553 0.6525
13.5024 0.5591 175 8.2316 0.7125 0.0324 0.6540 0.6495
13.2137 0.5751 180 8.1178 0.7591 0.0626 0.6776 0.6720
12.6791 0.5911 185 8.0139 0.7564 0.0302 0.6845 0.6840
13.4419 0.6070 190 7.9119 0.7047 0.0277 0.6559 0.6484
13.469 0.6230 195 7.8019 0.7849 0.0268 0.7215 0.7189
12.8491 0.6390 200 7.6875 0.8839 0.0264 0.8145 0.8153
12.8009 0.6550 205 7.5978 0.9797 0.0375 0.8922 0.8930
13.0495 0.6709 210 7.5245 1.0712 0.0504 0.9587 0.9579
12.3307 0.6869 215 7.4464 1.0918 0.0523 1.0047 0.9994
11.1893 0.7029 220 7.3770 1.1632 0.0695 1.0571 1.0527
12.0019 0.7188 225 7.3174 1.1872 0.0522 1.0827 1.0821
10.5739 0.7348 230 7.2509 1.1771 0.0665 1.0618 1.0630
10.7484 0.7508 235 7.1847 1.3403 0.0782 1.1967 1.1933
11.0539 0.7668 240 7.1230 1.4099 0.1009 1.2615 1.2612
10.6808 0.7827 245 7.0601 1.5268 0.0882 1.3528 1.3506
10.0456 0.7987 250 7.0148 1.4664 0.0789 1.3207 1.3142
9.7895 0.8147 255 6.9850 1.6529 0.0969 1.4578 1.4445
9.3146 0.8307 260 6.9547 1.7461 0.0943 1.5750 1.5566
9.8935 0.8466 265 6.9348 1.9384 0.1090 1.7219 1.7093
9.3272 0.8626 270 6.9348 1.8664 0.1143 1.6837 1.6758
9.4551 0.8786 275 6.9497 2.1360 0.1578 1.9404 1.9297
8.7474 0.8946 280 6.9095 2.1561 0.1494 1.9453 1.9398
8.8361 0.9105 285 6.8680 2.1476 0.1304 1.9360 1.9314
10.3327 0.9265 290 6.8129 2.2233 0.1330 1.9752 1.9732
9.6949 0.9425 295 6.7423 2.2698 0.1491 2.0114 2.0057
9.2422 0.9585 300 6.6728 2.2544 0.1332 2.0282 2.0281
9.1457 0.9744 305 6.5924 2.1699 0.1038 1.9641 1.9592
8.4371 0.9904 310 6.5061 2.2267 0.1423 2.0247 2.0174
8.8187 1.0064 315 6.4224 2.3070 0.1251 2.1206 2.1144
8.6228 1.0224 320 6.3316 2.3591 0.1070 2.1583 2.1714
7.9446 1.0383 325 6.2488 2.3552 0.1209 2.1538 2.1659
7.9126 1.0543 330 6.1856 2.3870 0.1292 2.1680 2.1668
8.0568 1.0703 335 6.1166 2.5029 0.1254 2.2559 2.2636
7.5392 1.0863 340 6.0326 2.4650 0.1265 2.2809 2.2768
8.2361 1.1022 345 5.9569 2.4594 0.1041 2.2155 2.2160
7.7452 1.1182 350 5.8964 2.4663 0.0998 2.2324 2.2333
7.6389 1.1342 355 5.8526 2.4997 0.1063 2.2902 2.2822
7.6098 1.1502 360 5.8106 2.6162 0.1157 2.3610 2.3596
7.8916 1.1661 365 5.7841 2.5536 0.0968 2.3276 2.3290
7.5782 1.1821 370 5.7585 2.5692 0.1032 2.3456 2.3406
7.3974 1.1981 375 5.7402 2.5963 0.1286 2.3806 2.3711
7.0614 1.2141 380 5.7159 2.6104 0.0990 2.3693 2.3758
7.2043 1.2300 385 5.6968 2.6200 0.0934 2.3816 2.3891
6.996 1.2460 390 5.6787 2.5952 0.0641 2.4221 2.4286
8.2705 1.2620 395 5.6632 2.6169 0.0403 2.4257 2.4329
7.1852 1.2780 400 5.6461 2.5815 0.0522 2.4163 2.4257
7.1047 1.2939 405 5.6273 2.5892 0.0486 2.4551 2.4584
7.0063 1.3099 410 5.6064 2.5683 0.0327 2.4328 2.4334
6.8297 1.3259 415 5.5840 2.5684 0.0425 2.4205 2.4136
6.7474 1.3419 420 5.5610 2.5945 0.0279 2.4552 2.4478
7.0426 1.3578 425 5.5415 2.5908 0.0120 2.4275 2.4241
6.6945 1.3738 430 5.5259 2.5609 0.0086 2.4019 2.4040
6.8208 1.3898 435 5.5160 2.6132 0.0086 2.4673 2.4634
6.6174 1.4058 440 5.5070 2.6443 0.0187 2.4737 2.4660
6.7205 1.4217 445 5.4988 2.6701 0.0236 2.5099 2.4985
6.6941 1.4377 450 5.4915 2.6858 0.0236 2.5133 2.5047
6.5988 1.4537 455 5.4851 2.6766 0.0349 2.5185 2.5128
6.4683 1.4696 460 5.4767 2.6661 0.0347 2.4922 2.4918
6.7409 1.4856 465 5.4675 2.6519 0.0347 2.5065 2.5077
6.9734 1.5016 470 5.4599 2.6663 0.0431 2.5285 2.5313
6.7049 1.5176 475 5.4536 2.6913 0.0428 2.5617 2.5574
7.1744 1.5335 480 5.4483 2.7426 0.0469 2.5896 2.5873
6.6163 1.5495 485 5.4434 2.7223 0.0428 2.5761 2.5782
6.6025 1.5655 490 5.4381 2.6981 0.0379 2.5405 2.5406
6.4328 1.5815 495 5.4326 2.6555 0.0321 2.5019 2.4957
6.5182 1.5974 500 5.4278 2.7413 0.0445 2.5816 2.5765
6.6248 1.6134 505 5.4226 2.7559 0.0384 2.5907 2.5833
6.4422 1.6294 510 5.4180 2.8097 0.0384 2.6591 2.6516
6.3315 1.6454 515 5.4139 2.8195 0.0345 2.6772 2.6692
6.3602 1.6613 520 5.4077 2.8024 0.0435 2.6610 2.6544
6.3096 1.6773 525 5.4016 2.8611 0.0558 2.7244 2.7161
6.4627 1.6933 530 5.3956 2.8914 0.0558 2.7576 2.7547
7.0027 1.7093 535 5.3912 2.9207 0.0601 2.7619 2.7592
6.6364 1.7252 540 5.3880 2.9260 0.0601 2.7612 2.7558
6.3329 1.7412 545 5.3835 2.9258 0.0631 2.7616 2.7580
6.3528 1.7572 550 5.3783 2.9378 0.0631 2.7630 2.7573
6.3083 1.7732 555 5.3731 2.9500 0.0631 2.7680 2.7632
6.3824 1.7891 560 5.3686 2.9599 0.0681 2.7896 2.7882
6.2145 1.8051 565 5.3646 2.9371 0.0644 2.7612 2.7619
6.5271 1.8211 570 5.3605 2.9350 0.0647 2.7695 2.7717
6.5508 1.8371 575 5.3568 2.9277 0.0524 2.7676 2.7682
6.4093 1.8530 580 5.3551 2.9401 0.0524 2.7711 2.7727
6.5111 1.8690 585 5.3532 2.9682 0.0641 2.7947 2.7934
6.459 1.8850 590 5.3513 2.9682 0.0641 2.7946 2.7933
6.3278 1.9010 595 5.3497 2.9685 0.0641 2.7994 2.7970
6.7423 1.9169 600 5.3486 2.9733 0.0641 2.7993 2.7969
6.5692 1.9329 605 5.3476 2.9732 0.0641 2.8092 2.8055
6.3758 1.9489 610 5.3467 2.9784 0.0640 2.8074 2.8060
6.4735 1.9649 615 5.3460 2.9749 0.0640 2.8028 2.8034
6.2938 1.9808 620 5.3455 2.9749 0.0640 2.8027 2.8033
6.2542 1.9968 625 5.3452 2.9750 0.0640 2.8027 2.8035
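
For reference, the ROUGE columns in the table above are the kind of scores typically produced with the `evaluate` library during seq2seq evaluation. The exact metric configuration used for this run is not reported; the snippet below is a hedged sketch with placeholder predictions and references.

```python
# Hedged sketch of how ROUGE scores like those in the table are commonly computed.
# The exact setup used for this card is not reported; inputs here are placeholders.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the model generated summary"]
references = ["the reference summary"]
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# scores contains rouge1, rouge2, rougeL, rougeLsum as F-measures in [0, 1];
# multiply by 100 if reporting on a 0-100 scale as in the table above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```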

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0