my-lora-hf

This model is a LoRA (PEFT) adapter fine-tuned from google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 6.2377
  • Rouge1: 2.4724
  • Rouge2: 0.2498
  • Rougel: 2.3079
  • Rougelsum: 2.3216
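
The snippet below is a minimal usage sketch, assuming the adapter is loaded from the benitoals/my-lora-hf repository on top of google/mt5-small with PEFT and Transformers; since the training dataset and prompt format are not documented, the input string is only a placeholder.

```python
# Minimal sketch: load the LoRA adapter on top of google/mt5-small.
# The placeholder input is an assumption; the expected input format
# is not documented in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
model = PeftModel.from_pretrained(base_model, "benitoals/my-lora-hf")

inputs = tokenizer("<your input text here>", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```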

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4
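
As a rough illustration, the listed values would map onto a Seq2SeqTrainingArguments configuration like the sketch below. Only the values named in the list above are grounded in this card; output_dir and predict_with_generate are illustrative assumptions.

```python
# Hypothetical Seq2SeqTrainingArguments mirroring the hyperparameters above.
# Values not listed in the card (output_dir, predict_with_generate) are assumed.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my-lora-hf",            # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=4,
    predict_with_generate=True,         # assumed, needed to compute ROUGE during eval
)
```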

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum
26.282 0.0251 5 17.7306 0.3903 0.0229 0.3967 0.3971
22.8549 0.0503 10 17.6062 0.3903 0.0229 0.3965 0.3968
25.1857 0.0754 15 17.4623 0.3903 0.0229 0.3965 0.3968
25.0655 0.1005 20 17.5023 0.4048 0.0229 0.4088 0.4102
27.1595 0.1256 25 17.4741 0.4151 0.0229 0.4180 0.4215
27.138 0.1508 30 17.4851 0.4378 0.0229 0.4294 0.4323
23.4906 0.1759 35 17.4142 0.3833 0.0229 0.3912 0.3909
22.624 0.2010 40 17.2692 0.3833 0.0229 0.3912 0.3909
24.1221 0.2261 45 17.0949 0.4187 0.0229 0.4134 0.4156
23.4493 0.2513 50 16.9610 0.4187 0.0229 0.4134 0.4156
25.0229 0.2764 55 16.7691 0.4166 0.0229 0.4113 0.4134
24.346 0.3015 60 16.6668 0.4791 0.0229 0.4539 0.4536
23.1975 0.3266 65 16.4952 0.4151 0.0229 0.4078 0.4102
21.6027 0.3518 70 16.3300 0.4178 0.0229 0.4109 0.4122
24.1149 0.3769 75 16.2625 0.4294 0.0229 0.4270 0.4268
21.8442 0.4020 80 16.1807 0.4169 0.0229 0.4153 0.4119
23.3859 0.4271 85 16.1351 0.4169 0.0229 0.4153 0.4119
20.8352 0.4523 90 16.0392 0.3939 0.0229 0.3906 0.3921
23.6936 0.4774 95 16.0173 0.3721 0.0229 0.3826 0.3810
24.6121 0.5025 100 15.9600 0.3989 0.0229 0.4062 0.4070
24.6205 0.5276 105 15.8700 0.3832 0.0229 0.3946 0.3926
22.8886 0.5528 110 15.7120 0.3752 0.0229 0.3802 0.3868
21.4168 0.5779 115 15.5049 0.4073 0.0229 0.4110 0.4150
20.5502 0.6030 120 15.2778 0.4786 0.0493 0.4807 0.4818
22.1162 0.6281 125 15.0846 0.4644 0.0493 0.4629 0.4681
19.9664 0.6533 130 14.8895 0.4644 0.0493 0.4629 0.4681
21.8179 0.6784 135 14.6839 0.4532 0.0493 0.4506 0.4568
19.7418 0.7035 140 14.5265 0.4401 0.0619 0.4375 0.4440
18.8154 0.7286 145 14.3868 0.4546 0.0619 0.4497 0.4562
23.85 0.7538 150 14.2192 0.4277 0.0619 0.4229 0.4321
19.3083 0.7789 155 14.0616 0.3760 0.0619 0.3745 0.3790
17.8149 0.8040 160 13.9289 0.3604 0.0619 0.3605 0.3670
19.6341 0.8291 165 13.7362 0.3604 0.0619 0.3605 0.3670
20.2189 0.8543 170 13.5559 0.3448 0.0619 0.3452 0.3500
17.6897 0.8794 175 13.3765 0.3328 0.0619 0.3343 0.3396
19.1985 0.9045 180 13.1860 0.3192 0.0619 0.3187 0.3232
22.0973 0.9296 185 12.9943 0.3484 0.0756 0.3435 0.3480
18.2808 0.9548 190 12.8423 0.3484 0.0756 0.3435 0.3480
19.4271 0.9799 195 12.6962 0.3621 0.0756 0.3565 0.3605
16.4328 1.0050 200 12.5391 0.3621 0.0756 0.3565 0.3605
17.2057 1.0302 205 12.3865 0.3621 0.0756 0.3565 0.3605
17.0921 1.0553 210 12.2590 0.3348 0.0489 0.3283 0.3347
18.4743 1.0804 215 12.1398 0.3348 0.0489 0.3283 0.3347
17.084 1.1055 220 11.9490 0.3348 0.0489 0.3283 0.3347
15.3983 1.1307 225 11.7623 0.3201 0.0489 0.3152 0.3216
17.3434 1.1558 230 11.6081 0.3462 0.0489 0.3405 0.3469
14.5055 1.1809 235 11.4089 0.3462 0.0489 0.3405 0.3469
14.5421 1.2060 240 11.1806 0.3455 0.0489 0.3393 0.3465
15.4771 1.2312 245 10.9563 0.3697 0.0489 0.3515 0.3609
14.7773 1.2563 250 10.7472 0.3535 0.0359 0.3390 0.3463
12.5897 1.2814 255 10.5722 0.3758 0.0359 0.3607 0.3718
13.4934 1.3065 260 10.3998 0.4165 0.0359 0.3984 0.4062
14.3636 1.3317 265 10.2015 0.5002 0.0789 0.4602 0.4769
14.8955 1.3568 270 9.9741 0.5282 0.0679 0.4753 0.4912
13.1539 1.3819 275 9.7512 0.5364 0.0506 0.4924 0.5048
13.2489 1.4070 280 9.5085 0.6261 0.0835 0.5834 0.5925
13.4348 1.4322 285 9.2728 0.6255 0.0708 0.5829 0.5920
14.5854 1.4573 290 9.0892 0.6479 0.0815 0.5951 0.6041
12.7121 1.4824 295 8.9407 0.6731 0.0794 0.6209 0.6339
11.9989 1.5075 300 8.7771 0.6901 0.0824 0.6349 0.6466
12.7228 1.5327 305 8.6534 0.8476 0.1285 0.7741 0.7915
13.1025 1.5578 310 8.5266 0.8674 0.1285 0.8033 0.8169
12.4569 1.5829 315 8.4097 0.8674 0.1285 0.8033 0.8169
12.8088 1.6080 320 8.3092 0.8588 0.1285 0.7927 0.8091
12.5782 1.6332 325 8.2020 1.0401 0.1704 0.9733 0.9710
12.2901 1.6583 330 8.1193 1.0807 0.1834 1.0074 1.0059
11.1978 1.6834 335 8.0178 1.2109 0.2315 1.1353 1.1408
11.2324 1.7085 340 7.9492 1.1993 0.2314 1.1099 1.1150
11.2734 1.7337 345 7.8789 1.2551 0.2314 1.1703 1.1775
11.3287 1.7588 350 7.8109 1.3333 0.1776 1.2429 1.2507
10.8443 1.7839 355 7.7369 1.3289 0.1893 1.2214 1.2134
11.3227 1.8090 360 7.6704 1.4954 0.2192 1.3839 1.3687
11.3922 1.8342 365 7.6183 1.5980 0.2288 1.4809 1.4772
10.3799 1.8593 370 7.5631 1.5997 0.2280 1.5053 1.5004
9.838 1.8844 375 7.4964 1.5374 0.1744 1.4412 1.4385
10.5613 1.9095 380 7.4164 1.6077 0.1635 1.4999 1.5129
10.054 1.9347 385 7.3279 1.5019 0.1376 1.3888 1.3910
11.1505 1.9598 390 7.2568 1.5497 0.1376 1.4428 1.4585
9.6994 1.9849 395 7.1809 1.7434 0.2045 1.6128 1.6446
9.8363 2.0101 400 7.0994 1.8268 0.2398 1.6494 1.6631
9.8672 2.0352 405 7.0202 1.9286 0.2395 1.7153 1.7304
9.9883 2.0603 410 6.9526 1.9111 0.2628 1.6911 1.7111
9.4994 2.0854 415 6.8914 1.8796 0.2374 1.6792 1.6873
9.7394 2.1106 420 6.8379 1.8351 0.2244 1.6821 1.6861
9.8364 2.1357 425 6.8016 1.9244 0.2245 1.7330 1.7383
9.1858 2.1608 430 6.7747 1.9020 0.1622 1.7451 1.7457
9.9149 2.1859 435 6.7477 1.9496 0.1622 1.7809 1.7830
10.0384 2.2111 440 6.7207 1.9289 0.1622 1.7660 1.7701
9.3528 2.2362 445 6.6923 1.9978 0.1707 1.8218 1.8346
9.4208 2.2613 450 6.6668 2.0390 0.1871 1.8828 1.8851
9.1942 2.2864 455 6.6424 2.0593 0.2087 1.8986 1.9001
9.3805 2.3116 460 6.6150 2.0838 0.2085 1.9257 1.9294
9.0846 2.3367 465 6.5857 2.0278 0.2085 1.8741 1.8760
8.9866 2.3618 470 6.5632 2.0961 0.2085 1.9329 1.9317
9.0288 2.3869 475 6.5425 2.2513 0.2827 2.0651 2.0584
9.1447 2.4121 480 6.5246 2.3172 0.2967 2.1624 2.1467
8.549 2.4372 485 6.5078 2.2586 0.2967 2.1040 2.1027
9.0221 2.4623 490 6.4919 2.2060 0.2873 2.0706 2.0761
9.0646 2.4874 495 6.4775 2.1466 0.2834 2.0164 2.0235
9.3397 2.5126 500 6.4652 2.1198 0.2834 1.9919 1.9934
8.8204 2.5377 505 6.4537 2.0995 0.2834 1.9680 1.9717
9.139 2.5628 510 6.4415 2.2589 0.2843 2.0866 2.0803
9.0461 2.5879 515 6.4270 2.2693 0.2842 2.0854 2.0796
8.8862 2.6131 520 6.4161 2.2462 0.2714 2.0903 2.0910
8.8184 2.6382 525 6.4065 2.2489 0.2473 2.1277 2.1201
8.3526 2.6633 530 6.3943 2.2481 0.2473 2.1141 2.1084
8.8545 2.6884 535 6.3833 2.3059 0.2610 2.1646 2.1505
8.3531 2.7136 540 6.3753 2.2896 0.2471 2.1653 2.1504
9.0157 2.7387 545 6.3682 2.4845 0.2828 2.3504 2.3363
9.3887 2.7638 550 6.3641 2.4634 0.2833 2.2818 2.2731
8.2536 2.7889 555 6.3593 2.4877 0.2835 2.3238 2.3163
8.073 2.8141 560 6.3517 2.4887 0.2835 2.3240 2.3169
8.2218 2.8392 565 6.3438 2.4546 0.2835 2.3161 2.3041
8.6336 2.8643 570 6.3358 2.4307 0.2841 2.2969 2.2978
8.3906 2.8894 575 6.3305 2.4145 0.2740 2.2496 2.2409
9.0623 2.9146 580 6.3268 2.4416 0.2740 2.2672 2.2637
8.5729 2.9397 585 6.3220 2.4337 0.2745 2.2573 2.2504
8.7356 2.9648 590 6.3184 2.4469 0.2744 2.2564 2.2499
8.1848 2.9899 595 6.3135 2.4153 0.2745 2.2543 2.2510
8.5797 3.0151 600 6.3090 2.3851 0.2745 2.2282 2.2187
8.1148 3.0402 605 6.3054 2.3860 0.2745 2.2292 2.2193
8.5009 3.0653 610 6.3016 2.2895 0.2356 2.1353 2.1412
8.3814 3.0905 615 6.2981 2.3022 0.2353 2.1486 2.1576
8.4252 3.1156 620 6.2963 2.2869 0.2240 2.1604 2.1672
8.4299 3.1407 625 6.2938 2.3894 0.2375 2.2476 2.2677
8.4384 3.1658 630 6.2906 2.3887 0.2375 2.2394 2.2525
8.3078 3.1910 635 6.2874 2.3602 0.2364 2.2151 2.2316
8.2374 3.2161 640 6.2837 2.3519 0.2479 2.2049 2.2205
8.1229 3.2412 645 6.2812 2.3440 0.2479 2.1831 2.2014
7.971 3.2663 650 6.2783 2.3851 0.2479 2.2170 2.2366
8.2733 3.2915 655 6.2748 2.4371 0.2479 2.2770 2.2937
8.2847 3.3166 660 6.2723 2.4377 0.2479 2.2774 2.2940
8.3704 3.3417 665 6.2696 2.4388 0.2488 2.2772 2.2954
8.4976 3.3668 670 6.2681 2.4493 0.2487 2.2942 2.3085
8.4326 3.3920 675 6.2672 2.4643 0.2487 2.3066 2.3156
8.3527 3.4171 680 6.2659 2.4643 0.2487 2.3066 2.3156
8.1029 3.4422 685 6.2630 2.4478 0.2487 2.2890 2.3013
8.3205 3.4673 690 6.2603 2.4514 0.2495 2.2752 2.2896
8.1628 3.4925 695 6.2583 2.4654 0.2495 2.2752 2.2896
7.862 3.5176 700 6.2558 2.4546 0.2497 2.2637 2.2739
8.1908 3.5427 705 6.2533 2.4643 0.2498 2.2769 2.2865
8.084 3.5678 710 6.2516 2.4639 0.2498 2.2766 2.2865
8.0681 3.5930 715 6.2501 2.4996 0.2498 2.3104 2.3203
8.3121 3.6181 720 6.2483 2.4996 0.2498 2.3104 2.3203
8.3079 3.6432 725 6.2469 2.4996 0.2498 2.3097 2.3203
8.0919 3.6683 730 6.2457 2.5122 0.2498 2.3267 2.3343
7.9226 3.6935 735 6.2447 2.5122 0.2498 2.3267 2.3343
8.2032 3.7186 740 6.2440 2.5122 0.2498 2.3267 2.3343
9.1495 3.7437 745 6.2433 2.5122 0.2498 2.3267 2.3343
8.1283 3.7688 750 6.2424 2.5122 0.2498 2.3267 2.3343
8.0652 3.7940 755 6.2416 2.4724 0.2498 2.3079 2.3216
8.5275 3.8191 760 6.2409 2.4724 0.2498 2.3079 2.3216
7.9583 3.8442 765 6.2402 2.4724 0.2498 2.3079 2.3216
7.9768 3.8693 770 6.2394 2.4724 0.2498 2.3079 2.3216
7.96 3.8945 775 6.2390 2.4724 0.2498 2.3079 2.3216
7.7373 3.9196 780 6.2385 2.4724 0.2498 2.3079 2.3216
8.3979 3.9447 785 6.2381 2.4724 0.2498 2.3079 2.3216
7.7723 3.9698 790 6.2378 2.4724 0.2498 2.3079 2.3216
8.2497 3.9950 795 6.2377 2.4724 0.2498 2.3079 2.3216
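
The Rouge1/Rouge2/Rougel/Rougelsum columns are ROUGE scores computed at each evaluation step. The exact evaluation script is not included in this card; the sketch below shows how such scores are typically computed with the Hugging Face evaluate library, using placeholder predictions and references.

```python
# Sketch of a typical ROUGE computation with the `evaluate` library.
# Predictions and references here are placeholders, not data from this model.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["a generated summary"]      # placeholder model outputs
references = ["the reference summary"]     # placeholder ground-truth targets
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```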

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0