# my-lora-hf

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 6.2377
- Rouge1: 2.4724
- Rouge2: 0.2498
- Rougel: 2.3079
- Rougelsum: 2.3216
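For intuition about the metrics above: ROUGE-1 is the unigram-overlap F-measure between generated and reference text (the values here appear to be on a 0–100 scale, so 2.47 means about 2.5% overlap). A simplified sketch of the computation, which ignores the stemming and tokenization that the real `rouge_score` package applies:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram-overlap F1 on whitespace tokens."""
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Two of three unigrams match -> F1 = 2/3
print(rouge1_f1("the cat sat", "the cat ran"))
```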
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 4
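The linear scheduler decays the learning rate from its initial value to zero over the total number of training steps (about 796 here: roughly 199 steps per epoch times 4 epochs, per the log below). A minimal sketch of that schedule, assuming no warmup since the card does not state a warmup step count:

```python
def linear_lr(step: int, base_lr: float = 1e-4, total_steps: int = 796) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at total_steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

# Halfway through training the learning rate has halved.
print(linear_lr(398))  # 5e-05
```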
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
26.282 | 0.0251 | 5 | 17.7306 | 0.3903 | 0.0229 | 0.3967 | 0.3971 |
22.8549 | 0.0503 | 10 | 17.6062 | 0.3903 | 0.0229 | 0.3965 | 0.3968 |
25.1857 | 0.0754 | 15 | 17.4623 | 0.3903 | 0.0229 | 0.3965 | 0.3968 |
25.0655 | 0.1005 | 20 | 17.5023 | 0.4048 | 0.0229 | 0.4088 | 0.4102 |
27.1595 | 0.1256 | 25 | 17.4741 | 0.4151 | 0.0229 | 0.4180 | 0.4215 |
27.138 | 0.1508 | 30 | 17.4851 | 0.4378 | 0.0229 | 0.4294 | 0.4323 |
23.4906 | 0.1759 | 35 | 17.4142 | 0.3833 | 0.0229 | 0.3912 | 0.3909 |
22.624 | 0.2010 | 40 | 17.2692 | 0.3833 | 0.0229 | 0.3912 | 0.3909 |
24.1221 | 0.2261 | 45 | 17.0949 | 0.4187 | 0.0229 | 0.4134 | 0.4156 |
23.4493 | 0.2513 | 50 | 16.9610 | 0.4187 | 0.0229 | 0.4134 | 0.4156 |
25.0229 | 0.2764 | 55 | 16.7691 | 0.4166 | 0.0229 | 0.4113 | 0.4134 |
24.346 | 0.3015 | 60 | 16.6668 | 0.4791 | 0.0229 | 0.4539 | 0.4536 |
23.1975 | 0.3266 | 65 | 16.4952 | 0.4151 | 0.0229 | 0.4078 | 0.4102 |
21.6027 | 0.3518 | 70 | 16.3300 | 0.4178 | 0.0229 | 0.4109 | 0.4122 |
24.1149 | 0.3769 | 75 | 16.2625 | 0.4294 | 0.0229 | 0.4270 | 0.4268 |
21.8442 | 0.4020 | 80 | 16.1807 | 0.4169 | 0.0229 | 0.4153 | 0.4119 |
23.3859 | 0.4271 | 85 | 16.1351 | 0.4169 | 0.0229 | 0.4153 | 0.4119 |
20.8352 | 0.4523 | 90 | 16.0392 | 0.3939 | 0.0229 | 0.3906 | 0.3921 |
23.6936 | 0.4774 | 95 | 16.0173 | 0.3721 | 0.0229 | 0.3826 | 0.3810 |
24.6121 | 0.5025 | 100 | 15.9600 | 0.3989 | 0.0229 | 0.4062 | 0.4070 |
24.6205 | 0.5276 | 105 | 15.8700 | 0.3832 | 0.0229 | 0.3946 | 0.3926 |
22.8886 | 0.5528 | 110 | 15.7120 | 0.3752 | 0.0229 | 0.3802 | 0.3868 |
21.4168 | 0.5779 | 115 | 15.5049 | 0.4073 | 0.0229 | 0.4110 | 0.4150 |
20.5502 | 0.6030 | 120 | 15.2778 | 0.4786 | 0.0493 | 0.4807 | 0.4818 |
22.1162 | 0.6281 | 125 | 15.0846 | 0.4644 | 0.0493 | 0.4629 | 0.4681 |
19.9664 | 0.6533 | 130 | 14.8895 | 0.4644 | 0.0493 | 0.4629 | 0.4681 |
21.8179 | 0.6784 | 135 | 14.6839 | 0.4532 | 0.0493 | 0.4506 | 0.4568 |
19.7418 | 0.7035 | 140 | 14.5265 | 0.4401 | 0.0619 | 0.4375 | 0.4440 |
18.8154 | 0.7286 | 145 | 14.3868 | 0.4546 | 0.0619 | 0.4497 | 0.4562 |
23.85 | 0.7538 | 150 | 14.2192 | 0.4277 | 0.0619 | 0.4229 | 0.4321 |
19.3083 | 0.7789 | 155 | 14.0616 | 0.3760 | 0.0619 | 0.3745 | 0.3790 |
17.8149 | 0.8040 | 160 | 13.9289 | 0.3604 | 0.0619 | 0.3605 | 0.3670 |
19.6341 | 0.8291 | 165 | 13.7362 | 0.3604 | 0.0619 | 0.3605 | 0.3670 |
20.2189 | 0.8543 | 170 | 13.5559 | 0.3448 | 0.0619 | 0.3452 | 0.3500 |
17.6897 | 0.8794 | 175 | 13.3765 | 0.3328 | 0.0619 | 0.3343 | 0.3396 |
19.1985 | 0.9045 | 180 | 13.1860 | 0.3192 | 0.0619 | 0.3187 | 0.3232 |
22.0973 | 0.9296 | 185 | 12.9943 | 0.3484 | 0.0756 | 0.3435 | 0.3480 |
18.2808 | 0.9548 | 190 | 12.8423 | 0.3484 | 0.0756 | 0.3435 | 0.3480 |
19.4271 | 0.9799 | 195 | 12.6962 | 0.3621 | 0.0756 | 0.3565 | 0.3605 |
16.4328 | 1.0050 | 200 | 12.5391 | 0.3621 | 0.0756 | 0.3565 | 0.3605 |
17.2057 | 1.0302 | 205 | 12.3865 | 0.3621 | 0.0756 | 0.3565 | 0.3605 |
17.0921 | 1.0553 | 210 | 12.2590 | 0.3348 | 0.0489 | 0.3283 | 0.3347 |
18.4743 | 1.0804 | 215 | 12.1398 | 0.3348 | 0.0489 | 0.3283 | 0.3347 |
17.084 | 1.1055 | 220 | 11.9490 | 0.3348 | 0.0489 | 0.3283 | 0.3347 |
15.3983 | 1.1307 | 225 | 11.7623 | 0.3201 | 0.0489 | 0.3152 | 0.3216 |
17.3434 | 1.1558 | 230 | 11.6081 | 0.3462 | 0.0489 | 0.3405 | 0.3469 |
14.5055 | 1.1809 | 235 | 11.4089 | 0.3462 | 0.0489 | 0.3405 | 0.3469 |
14.5421 | 1.2060 | 240 | 11.1806 | 0.3455 | 0.0489 | 0.3393 | 0.3465 |
15.4771 | 1.2312 | 245 | 10.9563 | 0.3697 | 0.0489 | 0.3515 | 0.3609 |
14.7773 | 1.2563 | 250 | 10.7472 | 0.3535 | 0.0359 | 0.3390 | 0.3463 |
12.5897 | 1.2814 | 255 | 10.5722 | 0.3758 | 0.0359 | 0.3607 | 0.3718 |
13.4934 | 1.3065 | 260 | 10.3998 | 0.4165 | 0.0359 | 0.3984 | 0.4062 |
14.3636 | 1.3317 | 265 | 10.2015 | 0.5002 | 0.0789 | 0.4602 | 0.4769 |
14.8955 | 1.3568 | 270 | 9.9741 | 0.5282 | 0.0679 | 0.4753 | 0.4912 |
13.1539 | 1.3819 | 275 | 9.7512 | 0.5364 | 0.0506 | 0.4924 | 0.5048 |
13.2489 | 1.4070 | 280 | 9.5085 | 0.6261 | 0.0835 | 0.5834 | 0.5925 |
13.4348 | 1.4322 | 285 | 9.2728 | 0.6255 | 0.0708 | 0.5829 | 0.5920 |
14.5854 | 1.4573 | 290 | 9.0892 | 0.6479 | 0.0815 | 0.5951 | 0.6041 |
12.7121 | 1.4824 | 295 | 8.9407 | 0.6731 | 0.0794 | 0.6209 | 0.6339 |
11.9989 | 1.5075 | 300 | 8.7771 | 0.6901 | 0.0824 | 0.6349 | 0.6466 |
12.7228 | 1.5327 | 305 | 8.6534 | 0.8476 | 0.1285 | 0.7741 | 0.7915 |
13.1025 | 1.5578 | 310 | 8.5266 | 0.8674 | 0.1285 | 0.8033 | 0.8169 |
12.4569 | 1.5829 | 315 | 8.4097 | 0.8674 | 0.1285 | 0.8033 | 0.8169 |
12.8088 | 1.6080 | 320 | 8.3092 | 0.8588 | 0.1285 | 0.7927 | 0.8091 |
12.5782 | 1.6332 | 325 | 8.2020 | 1.0401 | 0.1704 | 0.9733 | 0.9710 |
12.2901 | 1.6583 | 330 | 8.1193 | 1.0807 | 0.1834 | 1.0074 | 1.0059 |
11.1978 | 1.6834 | 335 | 8.0178 | 1.2109 | 0.2315 | 1.1353 | 1.1408 |
11.2324 | 1.7085 | 340 | 7.9492 | 1.1993 | 0.2314 | 1.1099 | 1.1150 |
11.2734 | 1.7337 | 345 | 7.8789 | 1.2551 | 0.2314 | 1.1703 | 1.1775 |
11.3287 | 1.7588 | 350 | 7.8109 | 1.3333 | 0.1776 | 1.2429 | 1.2507 |
10.8443 | 1.7839 | 355 | 7.7369 | 1.3289 | 0.1893 | 1.2214 | 1.2134 |
11.3227 | 1.8090 | 360 | 7.6704 | 1.4954 | 0.2192 | 1.3839 | 1.3687 |
11.3922 | 1.8342 | 365 | 7.6183 | 1.5980 | 0.2288 | 1.4809 | 1.4772 |
10.3799 | 1.8593 | 370 | 7.5631 | 1.5997 | 0.2280 | 1.5053 | 1.5004 |
9.838 | 1.8844 | 375 | 7.4964 | 1.5374 | 0.1744 | 1.4412 | 1.4385 |
10.5613 | 1.9095 | 380 | 7.4164 | 1.6077 | 0.1635 | 1.4999 | 1.5129 |
10.054 | 1.9347 | 385 | 7.3279 | 1.5019 | 0.1376 | 1.3888 | 1.3910 |
11.1505 | 1.9598 | 390 | 7.2568 | 1.5497 | 0.1376 | 1.4428 | 1.4585 |
9.6994 | 1.9849 | 395 | 7.1809 | 1.7434 | 0.2045 | 1.6128 | 1.6446 |
9.8363 | 2.0101 | 400 | 7.0994 | 1.8268 | 0.2398 | 1.6494 | 1.6631 |
9.8672 | 2.0352 | 405 | 7.0202 | 1.9286 | 0.2395 | 1.7153 | 1.7304 |
9.9883 | 2.0603 | 410 | 6.9526 | 1.9111 | 0.2628 | 1.6911 | 1.7111 |
9.4994 | 2.0854 | 415 | 6.8914 | 1.8796 | 0.2374 | 1.6792 | 1.6873 |
9.7394 | 2.1106 | 420 | 6.8379 | 1.8351 | 0.2244 | 1.6821 | 1.6861 |
9.8364 | 2.1357 | 425 | 6.8016 | 1.9244 | 0.2245 | 1.7330 | 1.7383 |
9.1858 | 2.1608 | 430 | 6.7747 | 1.9020 | 0.1622 | 1.7451 | 1.7457 |
9.9149 | 2.1859 | 435 | 6.7477 | 1.9496 | 0.1622 | 1.7809 | 1.7830 |
10.0384 | 2.2111 | 440 | 6.7207 | 1.9289 | 0.1622 | 1.7660 | 1.7701 |
9.3528 | 2.2362 | 445 | 6.6923 | 1.9978 | 0.1707 | 1.8218 | 1.8346 |
9.4208 | 2.2613 | 450 | 6.6668 | 2.0390 | 0.1871 | 1.8828 | 1.8851 |
9.1942 | 2.2864 | 455 | 6.6424 | 2.0593 | 0.2087 | 1.8986 | 1.9001 |
9.3805 | 2.3116 | 460 | 6.6150 | 2.0838 | 0.2085 | 1.9257 | 1.9294 |
9.0846 | 2.3367 | 465 | 6.5857 | 2.0278 | 0.2085 | 1.8741 | 1.8760 |
8.9866 | 2.3618 | 470 | 6.5632 | 2.0961 | 0.2085 | 1.9329 | 1.9317 |
9.0288 | 2.3869 | 475 | 6.5425 | 2.2513 | 0.2827 | 2.0651 | 2.0584 |
9.1447 | 2.4121 | 480 | 6.5246 | 2.3172 | 0.2967 | 2.1624 | 2.1467 |
8.549 | 2.4372 | 485 | 6.5078 | 2.2586 | 0.2967 | 2.1040 | 2.1027 |
9.0221 | 2.4623 | 490 | 6.4919 | 2.2060 | 0.2873 | 2.0706 | 2.0761 |
9.0646 | 2.4874 | 495 | 6.4775 | 2.1466 | 0.2834 | 2.0164 | 2.0235 |
9.3397 | 2.5126 | 500 | 6.4652 | 2.1198 | 0.2834 | 1.9919 | 1.9934 |
8.8204 | 2.5377 | 505 | 6.4537 | 2.0995 | 0.2834 | 1.9680 | 1.9717 |
9.139 | 2.5628 | 510 | 6.4415 | 2.2589 | 0.2843 | 2.0866 | 2.0803 |
9.0461 | 2.5879 | 515 | 6.4270 | 2.2693 | 0.2842 | 2.0854 | 2.0796 |
8.8862 | 2.6131 | 520 | 6.4161 | 2.2462 | 0.2714 | 2.0903 | 2.0910 |
8.8184 | 2.6382 | 525 | 6.4065 | 2.2489 | 0.2473 | 2.1277 | 2.1201 |
8.3526 | 2.6633 | 530 | 6.3943 | 2.2481 | 0.2473 | 2.1141 | 2.1084 |
8.8545 | 2.6884 | 535 | 6.3833 | 2.3059 | 0.2610 | 2.1646 | 2.1505 |
8.3531 | 2.7136 | 540 | 6.3753 | 2.2896 | 0.2471 | 2.1653 | 2.1504 |
9.0157 | 2.7387 | 545 | 6.3682 | 2.4845 | 0.2828 | 2.3504 | 2.3363 |
9.3887 | 2.7638 | 550 | 6.3641 | 2.4634 | 0.2833 | 2.2818 | 2.2731 |
8.2536 | 2.7889 | 555 | 6.3593 | 2.4877 | 0.2835 | 2.3238 | 2.3163 |
8.073 | 2.8141 | 560 | 6.3517 | 2.4887 | 0.2835 | 2.3240 | 2.3169 |
8.2218 | 2.8392 | 565 | 6.3438 | 2.4546 | 0.2835 | 2.3161 | 2.3041 |
8.6336 | 2.8643 | 570 | 6.3358 | 2.4307 | 0.2841 | 2.2969 | 2.2978 |
8.3906 | 2.8894 | 575 | 6.3305 | 2.4145 | 0.2740 | 2.2496 | 2.2409 |
9.0623 | 2.9146 | 580 | 6.3268 | 2.4416 | 0.2740 | 2.2672 | 2.2637 |
8.5729 | 2.9397 | 585 | 6.3220 | 2.4337 | 0.2745 | 2.2573 | 2.2504 |
8.7356 | 2.9648 | 590 | 6.3184 | 2.4469 | 0.2744 | 2.2564 | 2.2499 |
8.1848 | 2.9899 | 595 | 6.3135 | 2.4153 | 0.2745 | 2.2543 | 2.2510 |
8.5797 | 3.0151 | 600 | 6.3090 | 2.3851 | 0.2745 | 2.2282 | 2.2187 |
8.1148 | 3.0402 | 605 | 6.3054 | 2.3860 | 0.2745 | 2.2292 | 2.2193 |
8.5009 | 3.0653 | 610 | 6.3016 | 2.2895 | 0.2356 | 2.1353 | 2.1412 |
8.3814 | 3.0905 | 615 | 6.2981 | 2.3022 | 0.2353 | 2.1486 | 2.1576 |
8.4252 | 3.1156 | 620 | 6.2963 | 2.2869 | 0.2240 | 2.1604 | 2.1672 |
8.4299 | 3.1407 | 625 | 6.2938 | 2.3894 | 0.2375 | 2.2476 | 2.2677 |
8.4384 | 3.1658 | 630 | 6.2906 | 2.3887 | 0.2375 | 2.2394 | 2.2525 |
8.3078 | 3.1910 | 635 | 6.2874 | 2.3602 | 0.2364 | 2.2151 | 2.2316 |
8.2374 | 3.2161 | 640 | 6.2837 | 2.3519 | 0.2479 | 2.2049 | 2.2205 |
8.1229 | 3.2412 | 645 | 6.2812 | 2.3440 | 0.2479 | 2.1831 | 2.2014 |
7.971 | 3.2663 | 650 | 6.2783 | 2.3851 | 0.2479 | 2.2170 | 2.2366 |
8.2733 | 3.2915 | 655 | 6.2748 | 2.4371 | 0.2479 | 2.2770 | 2.2937 |
8.2847 | 3.3166 | 660 | 6.2723 | 2.4377 | 0.2479 | 2.2774 | 2.2940 |
8.3704 | 3.3417 | 665 | 6.2696 | 2.4388 | 0.2488 | 2.2772 | 2.2954 |
8.4976 | 3.3668 | 670 | 6.2681 | 2.4493 | 0.2487 | 2.2942 | 2.3085 |
8.4326 | 3.3920 | 675 | 6.2672 | 2.4643 | 0.2487 | 2.3066 | 2.3156 |
8.3527 | 3.4171 | 680 | 6.2659 | 2.4643 | 0.2487 | 2.3066 | 2.3156 |
8.1029 | 3.4422 | 685 | 6.2630 | 2.4478 | 0.2487 | 2.2890 | 2.3013 |
8.3205 | 3.4673 | 690 | 6.2603 | 2.4514 | 0.2495 | 2.2752 | 2.2896 |
8.1628 | 3.4925 | 695 | 6.2583 | 2.4654 | 0.2495 | 2.2752 | 2.2896 |
7.862 | 3.5176 | 700 | 6.2558 | 2.4546 | 0.2497 | 2.2637 | 2.2739 |
8.1908 | 3.5427 | 705 | 6.2533 | 2.4643 | 0.2498 | 2.2769 | 2.2865 |
8.084 | 3.5678 | 710 | 6.2516 | 2.4639 | 0.2498 | 2.2766 | 2.2865 |
8.0681 | 3.5930 | 715 | 6.2501 | 2.4996 | 0.2498 | 2.3104 | 2.3203 |
8.3121 | 3.6181 | 720 | 6.2483 | 2.4996 | 0.2498 | 2.3104 | 2.3203 |
8.3079 | 3.6432 | 725 | 6.2469 | 2.4996 | 0.2498 | 2.3097 | 2.3203 |
8.0919 | 3.6683 | 730 | 6.2457 | 2.5122 | 0.2498 | 2.3267 | 2.3343 |
7.9226 | 3.6935 | 735 | 6.2447 | 2.5122 | 0.2498 | 2.3267 | 2.3343 |
8.2032 | 3.7186 | 740 | 6.2440 | 2.5122 | 0.2498 | 2.3267 | 2.3343 |
9.1495 | 3.7437 | 745 | 6.2433 | 2.5122 | 0.2498 | 2.3267 | 2.3343 |
8.1283 | 3.7688 | 750 | 6.2424 | 2.5122 | 0.2498 | 2.3267 | 2.3343 |
8.0652 | 3.7940 | 755 | 6.2416 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
8.5275 | 3.8191 | 760 | 6.2409 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
7.9583 | 3.8442 | 765 | 6.2402 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
7.9768 | 3.8693 | 770 | 6.2394 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
7.96 | 3.8945 | 775 | 6.2390 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
7.7373 | 3.9196 | 780 | 6.2385 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
8.3979 | 3.9447 | 785 | 6.2381 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
7.7723 | 3.9698 | 790 | 6.2378 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
8.2497 | 3.9950 | 795 | 6.2377 | 2.4724 | 0.2498 | 2.3079 | 2.3216 |
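As a sanity check on the log above: epoch 0.0251 is reached at step 5, which implies roughly 199 optimizer steps per epoch; with train_batch_size = 4 that corresponds to about 796 training examples, assuming no gradient accumulation (the card does not mention any):

```python
# Epoch 0.0251 is logged at step 5, so steps_per_epoch ~= 5 / 0.0251 ~= 199.
steps_per_epoch = round(5 / 0.0251)
# train_batch_size = 4 and no gradient accumulation assumed -> ~796 examples.
approx_train_examples = steps_per_epoch * 4
print(steps_per_epoch, approx_train_examples)
```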
### Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
## Model tree for benitoals/my-lora-hf

- Base model: [google/mt5-small](https://huggingface.co/google/mt5-small)