mt5-lora-hf
This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 4.8417
- Rouge1: 4.6911
- Rouge2: 0.0143
- Rougel: 4.5972
- Rougelsum: 4.5916
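To read the ROUGE numbers above, the following minimal sketch shows what a ROUGE-1 style score measures: the F1 of unigram overlap between a generated text and a reference. It is an illustration only. The function name `rouge1_f1` and the whitespace tokenization are assumptions, and the reported values were presumably produced by a library implementation with its own tokenization and stemming. The values above also appear to be on a 0 to 100 scale, which is an assumption as well.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on lowercased, whitespace-split tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair, not taken from the evaluation set.
score = rouge1_f1("the cat sat", "the cat sat on the mat")
print(round(100 * score, 2))  # scaled to 0-100 for comparison with the list above
```

Under this reading, a Rouge1 of about 4.7 means that generated and reference texts share very few words on average.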
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 6
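As a rough illustration of how the `linear` scheduler interacts with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above) and a decay to zero at the final step. The steps-per-epoch value is inferred from the results table, which logs epoch 5.0 at step 995. The names `STEPS_PER_EPOCH`, `TOTAL_STEPS`, and `linear_lr` are hypothetical and not part of the training code.

```python
BASE_LR = 1e-4          # learning_rate from the list above
NUM_EPOCHS = 6          # num_epochs from the list above
STEPS_PER_EPOCH = 199   # inferred: the results table logs epoch 5.0 at step 995

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 1194 optimizer steps


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate after `step` steps: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Start of training, midpoint, and the last logged evaluation step.
for step in (0, 597, 1190):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the learning rate is close to zero by the last logged steps.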
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
---|---|---|---|---|---|---|---|
26.7014 | 0.0251 | 5 | 17.9690 | 0.4959 | 0.0229 | 0.4821 | 0.4821 |
22.8828 | 0.0503 | 10 | 17.8948 | 0.4330 | 0.0229 | 0.4394 | 0.4427 |
25.3054 | 0.0754 | 15 | 17.7656 | 0.4335 | 0.0229 | 0.4399 | 0.4431 |
24.2626 | 0.1005 | 20 | 17.6523 | 0.4614 | 0.0229 | 0.4498 | 0.4541 |
26.9164 | 0.1256 | 25 | 17.4744 | 0.4291 | 0.0229 | 0.4298 | 0.4340 |
26.7442 | 0.1508 | 30 | 17.3540 | 0.4528 | 0.0229 | 0.4507 | 0.4560 |
22.8846 | 0.1759 | 35 | 17.1714 | 0.4395 | 0.0229 | 0.4407 | 0.4423 |
23.0382 | 0.2010 | 40 | 17.0225 | 0.4176 | 0.0229 | 0.4200 | 0.4272 |
24.5139 | 0.2261 | 45 | 16.9602 | 0.3881 | 0.0229 | 0.3958 | 0.4018 |
23.5225 | 0.2513 | 50 | 16.8832 | 0.4163 | 0.0229 | 0.4241 | 0.4257 |
24.7283 | 0.2764 | 55 | 16.7457 | 0.4156 | 0.0229 | 0.4235 | 0.4254 |
24.5206 | 0.3015 | 60 | 16.4804 | 0.4285 | 0.0229 | 0.4337 | 0.4356 |
22.8146 | 0.3266 | 65 | 16.2983 | 0.4012 | 0.0229 | 0.4064 | 0.4090 |
21.381 | 0.3518 | 70 | 16.0329 | 0.3890 | 0.0229 | 0.3922 | 0.3988 |
23.4543 | 0.3769 | 75 | 15.8419 | 0.3890 | 0.0229 | 0.3922 | 0.3988 |
20.3948 | 0.4020 | 80 | 15.6935 | 0.4034 | 0.0229 | 0.4058 | 0.4152 |
23.315 | 0.4271 | 85 | 15.5347 | 0.3758 | 0.0229 | 0.3724 | 0.3848 |
20.4828 | 0.4523 | 90 | 15.4258 | 0.3758 | 0.0229 | 0.3724 | 0.3848 |
22.6177 | 0.4774 | 95 | 15.2736 | 0.3821 | 0.0229 | 0.3830 | 0.3916 |
22.7118 | 0.5025 | 100 | 15.0719 | 0.3408 | 0.0229 | 0.3414 | 0.3506 |
24.145 | 0.5276 | 105 | 14.8205 | 0.3314 | 0.0229 | 0.3350 | 0.3426 |
21.6796 | 0.5528 | 110 | 14.5375 | 0.3310 | 0.0229 | 0.3348 | 0.3413 |
21.0313 | 0.5779 | 115 | 14.2323 | 0.3313 | 0.0229 | 0.3349 | 0.3425 |
20.1509 | 0.6030 | 120 | 13.9040 | 0.3306 | 0.0229 | 0.3340 | 0.3412 |
20.8036 | 0.6281 | 125 | 13.5365 | 0.3903 | 0.0493 | 0.3953 | 0.3977 |
18.9977 | 0.6533 | 130 | 13.2251 | 0.3972 | 0.0493 | 0.3996 | 0.4052 |
21.1749 | 0.6784 | 135 | 13.0306 | 0.3848 | 0.0493 | 0.3826 | 0.3906 |
18.1424 | 0.7035 | 140 | 12.7417 | 0.3568 | 0.0493 | 0.3589 | 0.3579 |
17.3758 | 0.7286 | 145 | 12.5348 | 0.3567 | 0.0493 | 0.3590 | 0.3577 |
21.9186 | 0.7538 | 150 | 12.2754 | 0.3735 | 0.0493 | 0.3773 | 0.3764 |
18.0612 | 0.7789 | 155 | 11.9966 | 0.3858 | 0.0493 | 0.3916 | 0.3914 |
16.8307 | 0.8040 | 160 | 11.8330 | 0.3154 | 0.0493 | 0.3172 | 0.3202 |
17.5778 | 0.8291 | 165 | 11.6714 | 0.2914 | 0.0493 | 0.2913 | 0.2967 |
18.1477 | 0.8543 | 170 | 11.4841 | 0.2913 | 0.0493 | 0.2913 | 0.2964 |
15.9704 | 0.8794 | 175 | 11.3850 | 0.3049 | 0.0493 | 0.3017 | 0.3097 |
17.1034 | 0.9045 | 180 | 11.2041 | 0.2991 | 0.0619 | 0.2978 | 0.3027 |
19.7897 | 0.9296 | 185 | 11.0869 | 0.3345 | 0.0662 | 0.3265 | 0.3309 |
16.608 | 0.9548 | 190 | 10.9445 | 0.3345 | 0.0662 | 0.3265 | 0.3309 |
17.1781 | 0.9799 | 195 | 10.8493 | 0.3345 | 0.0662 | 0.3264 | 0.3308 |
14.5506 | 1.0050 | 200 | 10.7443 | 0.3557 | 0.0662 | 0.3475 | 0.3524 |
15.1794 | 1.0302 | 205 | 10.6058 | 0.3699 | 0.0662 | 0.3608 | 0.3684 |
14.1433 | 1.0553 | 210 | 10.4693 | 0.3699 | 0.0662 | 0.3608 | 0.3684 |
15.3501 | 1.0804 | 215 | 10.2548 | 0.3697 | 0.0662 | 0.3600 | 0.3678 |
14.2343 | 1.1055 | 220 | 10.0423 | 0.3992 | 0.0662 | 0.3840 | 0.3902 |
13.6561 | 1.1307 | 225 | 9.8362 | 0.3637 | 0.0594 | 0.3538 | 0.3593 |
14.1522 | 1.1558 | 230 | 9.6526 | 0.4294 | 0.0662 | 0.4139 | 0.4260 |
12.2793 | 1.1809 | 235 | 9.4753 | 0.4294 | 0.0662 | 0.4139 | 0.4260 |
12.999 | 1.2060 | 240 | 9.3080 | 0.4818 | 0.0662 | 0.4527 | 0.4640 |
12.6114 | 1.2312 | 245 | 9.1514 | 0.5574 | 0.0843 | 0.5411 | 0.5470 |
12.6979 | 1.2563 | 250 | 9.0131 | 0.5651 | 0.0843 | 0.5186 | 0.5269 |
11.6085 | 1.2814 | 255 | 8.8960 | 0.5875 | 0.0761 | 0.5317 | 0.5412 |
11.9352 | 1.3065 | 260 | 8.7639 | 0.6295 | 0.1163 | 0.5637 | 0.5745 |
12.1973 | 1.3317 | 265 | 8.6219 | 0.6309 | 0.1163 | 0.5660 | 0.5754 |
11.8386 | 1.3568 | 270 | 8.5034 | 0.6541 | 0.1163 | 0.5877 | 0.5997 |
11.1427 | 1.3819 | 275 | 8.3913 | 0.7308 | 0.1302 | 0.6666 | 0.6784 |
11.6074 | 1.4070 | 280 | 8.2927 | 0.7541 | 0.1298 | 0.6892 | 0.7015 |
11.879 | 1.4322 | 285 | 8.1927 | 0.7048 | 0.0662 | 0.6537 | 0.6663 |
12.5362 | 1.4573 | 290 | 8.0887 | 0.7321 | 0.0662 | 0.6832 | 0.6939 |
11.1034 | 1.4824 | 295 | 7.9519 | 0.7805 | 0.0742 | 0.7167 | 0.7279 |
10.742 | 1.5075 | 300 | 7.7980 | 0.9069 | 0.1048 | 0.8357 | 0.8377 |
10.3591 | 1.5327 | 305 | 7.6581 | 1.1354 | 0.2100 | 0.9999 | 1.0172 |
11.0729 | 1.5578 | 310 | 7.5439 | 1.2665 | 0.2698 | 1.1212 | 1.1410 |
11.0979 | 1.5829 | 315 | 7.4250 | 1.2760 | 0.2269 | 1.1590 | 1.1749 |
10.7504 | 1.6080 | 320 | 7.3066 | 1.0842 | 0.1685 | 1.0249 | 1.0238 |
11.2598 | 1.6332 | 325 | 7.2042 | 1.1594 | 0.1687 | 1.0796 | 1.0846 |
10.4366 | 1.6583 | 330 | 7.1160 | 1.1594 | 0.1687 | 1.0796 | 1.0846 |
10.0824 | 1.6834 | 335 | 7.0343 | 1.1730 | 0.1684 | 1.0898 | 1.0982 |
9.9589 | 1.7085 | 340 | 6.9468 | 1.1306 | 0.1686 | 1.0485 | 1.0545 |
10.3309 | 1.7337 | 345 | 6.8646 | 1.1736 | 0.1824 | 1.0809 | 1.0876 |
9.6166 | 1.7588 | 350 | 6.7984 | 1.1165 | 0.1690 | 1.0478 | 1.0482 |
9.3742 | 1.7839 | 355 | 6.7329 | 1.2620 | 0.2014 | 1.1354 | 1.1476 |
9.853 | 1.8090 | 360 | 6.6669 | 1.2627 | 0.2014 | 1.1455 | 1.1582 |
10.1404 | 1.8342 | 365 | 6.6068 | 1.3100 | 0.2156 | 1.2083 | 1.2155 |
9.3509 | 1.8593 | 370 | 6.5692 | 1.2194 | 0.2020 | 1.1225 | 1.1152 |
8.8801 | 1.8844 | 375 | 6.5346 | 1.1335 | 0.1431 | 1.0131 | 1.0107 |
9.3656 | 1.9095 | 380 | 6.5026 | 1.1119 | 0.1291 | 1.0170 | 1.0141 |
9.0491 | 1.9347 | 385 | 6.4711 | 1.2375 | 0.1293 | 1.1102 | 1.1153 |
9.6425 | 1.9598 | 390 | 6.4447 | 1.2243 | 0.1409 | 1.1258 | 1.1284 |
8.8074 | 1.9849 | 395 | 6.4136 | 1.3684 | 0.2034 | 1.2323 | 1.2465 |
8.6168 | 2.0101 | 400 | 6.3833 | 1.4884 | 0.1787 | 1.3723 | 1.3762 |
8.9557 | 2.0352 | 405 | 6.3572 | 1.4520 | 0.1638 | 1.2990 | 1.2947 |
9.101 | 2.0603 | 410 | 6.3413 | 1.6343 | 0.1641 | 1.4365 | 1.4336 |
8.438 | 2.0854 | 415 | 6.3290 | 1.6232 | 0.1505 | 1.4573 | 1.4618 |
8.6262 | 2.1106 | 420 | 6.3048 | 1.6377 | 0.1245 | 1.5029 | 1.5195 |
8.9535 | 2.1357 | 425 | 6.2767 | 1.6318 | 0.1749 | 1.5081 | 1.5353 |
8.3392 | 2.1608 | 430 | 6.2523 | 1.6524 | 0.1743 | 1.5073 | 1.5367 |
8.6226 | 2.1859 | 435 | 6.2420 | 1.6597 | 0.1743 | 1.5340 | 1.5374 |
8.5399 | 2.2111 | 440 | 6.2366 | 1.6552 | 0.1743 | 1.5178 | 1.5262 |
8.4814 | 2.2362 | 445 | 6.2269 | 1.6997 | 0.1599 | 1.5507 | 1.5492 |
8.402 | 2.2613 | 450 | 6.2206 | 1.7743 | 0.1729 | 1.6229 | 1.6207 |
8.1715 | 2.2864 | 455 | 6.2157 | 1.7005 | 0.1731 | 1.5580 | 1.5519 |
8.3982 | 2.3116 | 460 | 6.2149 | 1.7912 | 0.1731 | 1.6301 | 1.6303 |
8.2935 | 2.3367 | 465 | 6.2051 | 1.8151 | 0.1813 | 1.6772 | 1.6708 |
8.1023 | 2.3618 | 470 | 6.1992 | 1.7187 | 0.1813 | 1.6023 | 1.5988 |
8.4083 | 2.3869 | 475 | 6.1925 | 1.7018 | 0.1816 | 1.5594 | 1.5567 |
8.3179 | 2.4121 | 480 | 6.1839 | 1.7161 | 0.1941 | 1.5709 | 1.5685 |
7.8477 | 2.4372 | 485 | 6.1699 | 1.6932 | 0.1816 | 1.5782 | 1.5846 |
7.9573 | 2.4623 | 490 | 6.1573 | 1.7737 | 0.1941 | 1.6399 | 1.6485 |
8.3412 | 2.4874 | 495 | 6.1501 | 1.6200 | 0.1553 | 1.5142 | 1.5160 |
8.2275 | 2.5126 | 500 | 6.1420 | 1.5973 | 0.1553 | 1.4930 | 1.4994 |
7.7802 | 2.5377 | 505 | 6.1371 | 1.5843 | 0.1553 | 1.4519 | 1.4570 |
8.2208 | 2.5628 | 510 | 6.1301 | 1.5734 | 0.1552 | 1.4672 | 1.4746 |
7.988 | 2.5879 | 515 | 6.1250 | 1.6141 | 0.1552 | 1.5196 | 1.5243 |
8.0406 | 2.6131 | 520 | 6.1216 | 1.5612 | 0.1317 | 1.4781 | 1.4902 |
7.6537 | 2.6382 | 525 | 6.1177 | 1.5042 | 0.1174 | 1.4341 | 1.4512 |
7.7706 | 2.6633 | 530 | 6.1124 | 1.5480 | 0.1110 | 1.4766 | 1.4870 |
7.7587 | 2.6884 | 535 | 6.1041 | 1.6054 | 0.0975 | 1.5301 | 1.5384 |
7.5912 | 2.7136 | 540 | 6.0947 | 1.6413 | 0.0975 | 1.5722 | 1.5747 |
7.6195 | 2.7387 | 545 | 6.0872 | 1.6897 | 0.0975 | 1.6322 | 1.6231 |
7.9719 | 2.7638 | 550 | 6.0840 | 1.6390 | 0.0980 | 1.5458 | 1.5456 |
7.5861 | 2.7889 | 555 | 6.0818 | 1.7055 | 0.0984 | 1.6106 | 1.6002 |
7.3751 | 2.8141 | 560 | 6.0693 | 1.7887 | 0.0984 | 1.7099 | 1.6969 |
7.4287 | 2.8392 | 565 | 6.0521 | 1.9438 | 0.0984 | 1.8707 | 1.8458 |
7.8715 | 2.8643 | 570 | 6.0418 | 1.9864 | 0.0742 | 1.9169 | 1.9069 |
7.5668 | 2.8894 | 575 | 6.0371 | 2.0494 | 0.0742 | 1.9751 | 1.9491 |
7.5644 | 2.9146 | 580 | 6.0284 | 2.0795 | 0.0742 | 2.0033 | 1.9955 |
7.5837 | 2.9397 | 585 | 6.0187 | 2.0435 | 0.0593 | 1.9903 | 1.9873 |
7.8794 | 2.9648 | 590 | 6.0076 | 2.1343 | 0.0593 | 2.0976 | 2.0879 |
7.4229 | 2.9899 | 595 | 5.9940 | 2.1421 | 0.0593 | 2.1090 | 2.0987 |
7.3116 | 3.0151 | 600 | 5.9697 | 2.2915 | 0.0593 | 2.2485 | 2.2434 |
7.237 | 3.0402 | 605 | 5.9432 | 2.2761 | 0.0593 | 2.2335 | 2.2297 |
7.5251 | 3.0653 | 610 | 5.9066 | 2.3241 | 0.0593 | 2.2857 | 2.2795 |
7.5311 | 3.0905 | 615 | 5.8749 | 2.3968 | 0.0593 | 2.3618 | 2.3526 |
7.3948 | 3.1156 | 620 | 5.8503 | 2.4292 | 0.0722 | 2.3934 | 2.3859 |
7.4102 | 3.1407 | 625 | 5.8441 | 2.5045 | 0.0722 | 2.4562 | 2.4443 |
7.3152 | 3.1658 | 630 | 5.8373 | 2.5838 | 0.0722 | 2.5309 | 2.5271 |
7.2793 | 3.1910 | 635 | 5.8287 | 2.5969 | 0.0722 | 2.5425 | 2.5405 |
7.2854 | 3.2161 | 640 | 5.8204 | 2.6641 | 0.0722 | 2.6240 | 2.6098 |
7.2151 | 3.2412 | 645 | 5.8081 | 2.7296 | 0.0722 | 2.6823 | 2.6686 |
7.1616 | 3.2663 | 650 | 5.7995 | 2.8340 | 0.0721 | 2.7816 | 2.7651 |
7.2671 | 3.2915 | 655 | 5.7911 | 2.9706 | 0.0721 | 2.9038 | 2.8953 |
7.3364 | 3.3166 | 660 | 5.7806 | 3.0656 | 0.0721 | 2.9978 | 2.9924 |
7.345 | 3.3417 | 665 | 5.7695 | 3.1378 | 0.0721 | 3.0744 | 3.0664 |
7.3118 | 3.3668 | 670 | 5.7532 | 3.1238 | 0.0722 | 3.0706 | 3.0618 |
7.4469 | 3.3920 | 675 | 5.7453 | 3.1653 | 0.0722 | 3.1293 | 3.1149 |
7.2567 | 3.4171 | 680 | 5.7376 | 3.1821 | 0.0722 | 3.1314 | 3.1141 |
7.1828 | 3.4422 | 685 | 5.7268 | 3.2855 | 0.0848 | 3.2291 | 3.1953 |
7.3317 | 3.4673 | 690 | 5.7070 | 3.3113 | 0.0848 | 3.2480 | 3.2158 |
7.1762 | 3.4925 | 695 | 5.6925 | 3.3213 | 0.0848 | 3.2595 | 3.2338 |
7.0286 | 3.5176 | 700 | 5.6794 | 3.3345 | 0.0848 | 3.2713 | 3.2443 |
7.1958 | 3.5427 | 705 | 5.6638 | 3.3834 | 0.0848 | 3.3169 | 3.2835 |
7.2112 | 3.5678 | 710 | 5.6573 | 3.4198 | 0.0744 | 3.3372 | 3.3030 |
7.0299 | 3.5930 | 715 | 5.6404 | 3.5031 | 0.0744 | 3.4199 | 3.3898 |
7.4005 | 3.6181 | 720 | 5.6231 | 3.5545 | 0.0744 | 3.4659 | 3.4403 |
7.2407 | 3.6432 | 725 | 5.6160 | 3.6875 | 0.0744 | 3.6277 | 3.6033 |
7.1189 | 3.6683 | 730 | 5.6075 | 3.7917 | 0.0744 | 3.7315 | 3.7144 |
7.0044 | 3.6935 | 735 | 5.5928 | 3.9431 | 0.0744 | 3.8972 | 3.8828 |
7.0864 | 3.7186 | 740 | 5.5823 | 3.9375 | 0.0593 | 3.8940 | 3.8878 |
7.3772 | 3.7437 | 745 | 5.5713 | 3.9630 | 0.0593 | 3.9203 | 3.9155 |
7.0098 | 3.7688 | 750 | 5.5583 | 4.1243 | 0.0744 | 4.0620 | 4.0602 |
6.8234 | 3.7940 | 755 | 5.5445 | 4.1046 | 0.0593 | 4.0478 | 4.0421 |
7.1442 | 3.8191 | 760 | 5.5222 | 4.0768 | 0.0593 | 4.0170 | 4.0034 |
6.9834 | 3.8442 | 765 | 5.5042 | 4.0013 | 0.0 | 3.9340 | 3.9286 |
7.0864 | 3.8693 | 770 | 5.4785 | 4.0012 | 0.0 | 3.9335 | 3.9288 |
6.863 | 3.8945 | 775 | 5.4519 | 4.0494 | 0.0 | 3.9829 | 3.9836 |
6.8511 | 3.9196 | 780 | 5.4235 | 4.1077 | 0.0 | 4.0382 | 4.0420 |
6.8788 | 3.9447 | 785 | 5.3975 | 4.2017 | 0.0 | 4.1333 | 4.1334 |
6.6429 | 3.9698 | 790 | 5.3782 | 4.2539 | 0.0 | 4.2007 | 4.1935 |
6.8546 | 3.9950 | 795 | 5.3626 | 4.3106 | 0.0 | 4.2576 | 4.2563 |
6.8145 | 4.0201 | 800 | 5.3410 | 4.3560 | 0.0 | 4.3242 | 4.3170 |
6.7826 | 4.0452 | 805 | 5.3263 | 4.3417 | 0.0 | 4.3236 | 4.3164 |
6.9502 | 4.0704 | 810 | 5.3144 | 4.3921 | 0.0 | 4.3602 | 4.3603 |
6.6682 | 4.0955 | 815 | 5.3021 | 4.3912 | 0.0 | 4.3575 | 4.3582 |
6.7195 | 4.1206 | 820 | 5.2895 | 4.3898 | 0.0 | 4.3564 | 4.3573 |
6.8389 | 4.1457 | 825 | 5.2787 | 4.3896 | 0.0 | 4.3561 | 4.3573 |
6.9199 | 4.1709 | 830 | 5.2683 | 4.4240 | 0.0 | 4.3939 | 4.3847 |
6.8859 | 4.1960 | 835 | 5.2588 | 4.4237 | 0.0 | 4.3934 | 4.3838 |
6.7521 | 4.2211 | 840 | 5.2438 | 4.5064 | 0.0 | 4.4407 | 4.4426 |
6.7168 | 4.2462 | 845 | 5.2284 | 4.5129 | 0.0 | 4.4405 | 4.4425 |
7.2573 | 4.2714 | 850 | 5.2204 | 4.4857 | 0.0 | 4.4131 | 4.4064 |
6.5104 | 4.2965 | 855 | 5.2126 | 4.5147 | 0.0 | 4.4269 | 4.4225 |
6.6178 | 4.3216 | 860 | 5.2034 | 4.5454 | 0.0 | 4.4468 | 4.4405 |
6.5719 | 4.3467 | 865 | 5.1942 | 4.5156 | 0.0 | 4.4389 | 4.4297 |
6.7698 | 4.3719 | 870 | 5.1824 | 4.5155 | 0.0 | 4.4383 | 4.4294 |
6.5936 | 4.3970 | 875 | 5.1708 | 4.5155 | 0.0 | 4.4383 | 4.4294 |
6.6705 | 4.4221 | 880 | 5.1564 | 4.5968 | 0.0 | 4.4961 | 4.5006 |
6.8366 | 4.4472 | 885 | 5.1465 | 4.5968 | 0.0 | 4.4961 | 4.5006 |
6.8101 | 4.4724 | 890 | 5.1397 | 4.5968 | 0.0 | 4.4960 | 4.5006 |
6.6961 | 4.4975 | 895 | 5.1324 | 4.5962 | 0.0 | 4.4958 | 4.5003 |
6.8763 | 4.5226 | 900 | 5.1243 | 4.6352 | 0.0 | 4.5400 | 4.5474 |
6.7891 | 4.5477 | 905 | 5.1166 | 4.6352 | 0.0 | 4.5400 | 4.5474 |
6.6563 | 4.5729 | 910 | 5.1063 | 4.6352 | 0.0 | 4.5400 | 4.5474 |
6.7201 | 4.5980 | 915 | 5.0973 | 4.6352 | 0.0 | 4.5395 | 4.5464 |
6.6108 | 4.6231 | 920 | 5.0880 | 4.6472 | 0.0 | 4.5387 | 4.5463 |
6.5714 | 4.6482 | 925 | 5.0786 | 4.6379 | 0.0 | 4.5388 | 4.5456 |
6.5445 | 4.6734 | 930 | 5.0695 | 4.6765 | 0.0 | 4.5586 | 4.5678 |
6.6799 | 4.6985 | 935 | 5.0614 | 4.6765 | 0.0 | 4.5586 | 4.5678 |
6.568 | 4.7236 | 940 | 5.0531 | 4.6765 | 0.0 | 4.5579 | 4.5675 |
6.2814 | 4.7487 | 945 | 5.0416 | 4.6892 | 0.0 | 4.5728 | 4.5799 |
6.8206 | 4.7739 | 950 | 5.0313 | 4.6892 | 0.0 | 4.5728 | 4.5799 |
6.5936 | 4.7990 | 955 | 5.0224 | 4.6957 | 0.0 | 4.5869 | 4.5870 |
6.662 | 4.8241 | 960 | 5.0125 | 4.6957 | 0.0 | 4.5869 | 4.5870 |
6.6761 | 4.8492 | 965 | 5.0031 | 4.6957 | 0.0 | 4.5869 | 4.5870 |
6.8252 | 4.8744 | 970 | 4.9960 | 4.6957 | 0.0 | 4.5868 | 4.5869 |
6.4136 | 4.8995 | 975 | 4.9888 | 4.6957 | 0.0 | 4.5868 | 4.5869 |
6.6854 | 4.9246 | 980 | 4.9813 | 4.6957 | 0.0 | 4.5868 | 4.5869 |
6.3622 | 4.9497 | 985 | 4.9766 | 4.6957 | 0.0 | 4.5868 | 4.5869 |
6.6554 | 4.9749 | 990 | 4.9696 | 4.6960 | 0.0 | 4.5868 | 4.5870 |
6.4508 | 5.0 | 995 | 4.9637 | 4.7116 | 0.0 | 4.6024 | 4.6028 |
6.701 | 5.0251 | 1000 | 4.9589 | 4.7116 | 0.0 | 4.6024 | 4.6028 |
6.2751 | 5.0503 | 1005 | 4.9519 | 4.7116 | 0.0 | 4.6024 | 4.6029 |
6.5376 | 5.0754 | 1010 | 4.9458 | 4.7116 | 0.0 | 4.6024 | 4.6029 |
6.4412 | 5.1005 | 1015 | 4.9417 | 4.7115 | 0.0 | 4.6020 | 4.6028 |
6.5644 | 5.1256 | 1020 | 4.9374 | 4.6979 | 0.0 | 4.6030 | 4.6028 |
6.1549 | 5.1508 | 1025 | 4.9295 | 4.6909 | 0.0 | 4.5909 | 4.5974 |
6.4149 | 5.1759 | 1030 | 4.9197 | 4.6675 | 0.0 | 4.5708 | 4.5716 |
6.5379 | 5.2010 | 1035 | 4.9123 | 4.6675 | 0.0 | 4.5585 | 4.5552 |
6.3613 | 5.2261 | 1040 | 4.9056 | 4.6791 | 0.0 | 4.5692 | 4.5656 |
6.5305 | 5.2513 | 1045 | 4.9001 | 4.6791 | 0.0 | 4.5692 | 4.5656 |
6.5593 | 5.2764 | 1050 | 4.8975 | 4.6661 | 0.0143 | 4.5695 | 4.5663 |
6.529 | 5.3015 | 1055 | 4.8906 | 4.6550 | 0.0143 | 4.5524 | 4.5508 |
6.4264 | 5.3266 | 1060 | 4.8837 | 4.6550 | 0.0143 | 4.5524 | 4.5508 |
6.679 | 5.3518 | 1065 | 4.8797 | 4.6444 | 0.0143 | 4.5417 | 4.5448 |
6.4163 | 5.3769 | 1070 | 4.8762 | 4.6444 | 0.0143 | 4.5417 | 4.5448 |
6.5349 | 5.4020 | 1075 | 4.8723 | 4.6444 | 0.0143 | 4.5417 | 4.5448 |
6.469 | 5.4271 | 1080 | 4.8693 | 4.6444 | 0.0143 | 4.5417 | 4.5448 |
6.3743 | 5.4523 | 1085 | 4.8668 | 4.6444 | 0.0143 | 4.5417 | 4.5448 |
6.3293 | 5.4774 | 1090 | 4.8648 | 4.6442 | 0.0143 | 4.5415 | 4.5447 |
6.3905 | 5.5025 | 1095 | 4.8615 | 4.6442 | 0.0143 | 4.5415 | 4.5447 |
6.6543 | 5.5276 | 1100 | 4.8589 | 4.6442 | 0.0143 | 4.5415 | 4.5447 |
6.2526 | 5.5528 | 1105 | 4.8557 | 4.6442 | 0.0143 | 4.5415 | 4.5447 |
6.5861 | 5.5779 | 1110 | 4.8521 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.6042 | 5.6030 | 1115 | 4.8498 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.5273 | 5.6281 | 1120 | 4.8485 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.3963 | 5.6533 | 1125 | 4.8477 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.3541 | 5.6784 | 1130 | 4.8468 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.2128 | 5.7035 | 1135 | 4.8460 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.6066 | 5.7286 | 1140 | 4.8454 | 4.6586 | 0.0143 | 4.5602 | 4.5559 |
6.366 | 5.7538 | 1145 | 4.8448 | 4.6590 | 0.0143 | 4.5612 | 4.5562 |
6.4843 | 5.7789 | 1150 | 4.8440 | 4.6590 | 0.0143 | 4.5612 | 4.5562 |
6.8107 | 5.8040 | 1155 | 4.8434 | 4.6590 | 0.0143 | 4.5612 | 4.5562 |
6.2873 | 5.8291 | 1160 | 4.8429 | 4.6590 | 0.0143 | 4.5612 | 4.5562 |
6.5391 | 5.8543 | 1165 | 4.8426 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
6.7077 | 5.8794 | 1170 | 4.8425 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
6.5323 | 5.9045 | 1175 | 4.8424 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
6.1429 | 5.9296 | 1180 | 4.8422 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
6.457 | 5.9548 | 1185 | 4.8419 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
6.1296 | 5.9799 | 1190 | 4.8417 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
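The table above is easier to analyze once parsed. The sketch below reads pipe-separated rows in the same layout and picks the evaluation step with the lowest validation loss. Only three rows from the table are embedded, to keep the example short. The column keys in `COLUMNS` and the helper `parse_rows` are hypothetical names chosen for this illustration.

```python
# Three rows copied from the table above (first, one from the middle, last).
ROWS = """\
26.7014 | 0.0251 | 5 | 17.9690 | 0.4959 | 0.0229 | 0.4821 | 0.4821 |
7.3116 | 3.0151 | 600 | 5.9697 | 2.2915 | 0.0593 | 2.2485 | 2.2434 |
6.1296 | 5.9799 | 1190 | 4.8417 | 4.6911 | 0.0143 | 4.5972 | 4.5916 |
"""

COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougel", "rougelsum"]


def parse_rows(text):
    """Parse pipe-separated rows into dicts, skipping header and separator lines."""
    records = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != len(COLUMNS):
            continue
        try:
            values = [float(c) for c in cells]
        except ValueError:
            continue  # non-numeric line such as the header or the --- separator
        record = dict(zip(COLUMNS, values))
        record["step"] = int(record["step"])
        records.append(record)
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"], best["rouge1"])
```

Pasting the full table into `ROWS` would allow the same selection over every logged step.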
Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
Model tree for benitoals/mt5-lora-hf
Base model
google/mt5-small