my-lora-hf-sum

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 8.8834
  • Rouge1: 0.7694
  • Rouge2: 0.2147
  • Rougel: 0.6887
  • Rougelsum: 0.7056
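The ROUGE scores above are unigram/bigram and longest-common-subsequence overlap metrics. As an illustration only, here is a toy ROUGE-1 F1 computation; the numbers in this card were presumably produced by the standard `evaluate`/`rouge_score` packages, which additionally apply stemming and other normalization, so this sketch will not reproduce them exactly.

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Toy ROUGE-1 F1: F-measure over unigram overlap (no stemming)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap between prediction and reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat", "the cat sat on mat")` gives precision 1.0 and recall 0.4, hence F1 ≈ 0.571.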

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
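With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over the course of training (after any warmup steps), in the manner of `get_linear_schedule_with_warmup` from `transformers`. A minimal sketch of that schedule, using this run's learning rate of 1e-4 (the total step count below is illustrative, not taken from this run):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-4,
              warmup_steps: int = 0) -> float:
    """Linear schedule: ramp up over warmup, then decay linearly to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# With no warmup, the rate starts at base_lr and reaches zero at the end:
# linear_lr(0, 100) -> 1e-4, linear_lr(50, 100) -> 5e-5, linear_lr(100, 100) -> 0.0
```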

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 26.8 | 0.0251 | 5 | 17.7235 | 0.4479 | 0.0229 | 0.4381 | 0.4408 |
| 22.7614 | 0.0503 | 10 | 17.5699 | 0.5096 | 0.0229 | 0.4851 | 0.4837 |
| 25.4291 | 0.0754 | 15 | 17.4004 | 0.4895 | 0.0229 | 0.4760 | 0.4770 |
| 24.2329 | 0.1005 | 20 | 17.3355 | 0.4011 | 0.0229 | 0.4091 | 0.4097 |
| 27.2941 | 0.1256 | 25 | 17.3291 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 26.2486 | 0.1508 | 30 | 17.2167 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 22.6219 | 0.1759 | 35 | 17.0234 | 0.3714 | 0.0229 | 0.3764 | 0.3806 |
| 22.8832 | 0.2010 | 40 | 16.9310 | 0.3492 | 0.0229 | 0.3543 | 0.3580 |
| 24.3708 | 0.2261 | 45 | 16.8126 | 0.3722 | 0.0229 | 0.3798 | 0.3821 |
| 24.1446 | 0.2513 | 50 | 16.7162 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 24.4794 | 0.2764 | 55 | 16.5429 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 24.2086 | 0.3015 | 60 | 16.3896 | 0.3442 | 0.0229 | 0.3526 | 0.3520 |
| 23.4245 | 0.3266 | 65 | 16.2550 | 0.3218 | 0.0229 | 0.3278 | 0.3254 |
| 21.0837 | 0.3518 | 70 | 16.0502 | 0.3223 | 0.0229 | 0.3304 | 0.3275 |
| 23.4064 | 0.3769 | 75 | 15.9576 | 0.3223 | 0.0229 | 0.3304 | 0.3275 |
| 20.5418 | 0.4020 | 80 | 15.7345 | 0.3086 | 0.0229 | 0.3166 | 0.3160 |
| 22.4424 | 0.4271 | 85 | 15.5302 | 0.2892 | 0.0229 | 0.2941 | 0.2983 |
| 20.0642 | 0.4523 | 90 | 15.3498 | 0.3296 | 0.0493 | 0.3358 | 0.3386 |
| 21.9881 | 0.4774 | 95 | 15.1718 | 0.3165 | 0.0493 | 0.3215 | 0.3253 |
| 22.4813 | 0.5025 | 100 | 15.0168 | 0.3169 | 0.0493 | 0.3216 | 0.3257 |
| 24.2963 | 0.5276 | 105 | 14.7464 | 0.3309 | 0.0493 | 0.3337 | 0.3370 |
| 23.2294 | 0.5528 | 110 | 14.4756 | 0.3597 | 0.0493 | 0.3608 | 0.3642 |
| 20.3992 | 0.5779 | 115 | 14.2512 | 0.3714 | 0.0493 | 0.3584 | 0.3635 |
| 20.2597 | 0.6030 | 120 | 13.9527 | 0.3771 | 0.0493 | 0.3697 | 0.3743 |
| 20.6987 | 0.6281 | 125 | 13.7137 | 0.4282 | 0.0743 | 0.4172 | 0.4245 |
| 19.3361 | 0.6533 | 130 | 13.5206 | 0.4030 | 0.0743 | 0.4052 | 0.4107 |
| 21.4137 | 0.6784 | 135 | 13.3524 | 0.4030 | 0.0743 | 0.4052 | 0.4107 |
| 19.3918 | 0.7035 | 140 | 13.2160 | 0.3961 | 0.0852 | 0.3963 | 0.4077 |
| 18.2876 | 0.7286 | 145 | 13.0702 | 0.3970 | 0.0852 | 0.3976 | 0.4095 |
| 22.2212 | 0.7538 | 150 | 12.8667 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 19.2252 | 0.7789 | 155 | 12.6627 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 17.9289 | 0.8040 | 160 | 12.5312 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 19.3069 | 0.8291 | 165 | 12.3408 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 20.2723 | 0.8543 | 170 | 12.1137 | 0.3457 | 0.0852 | 0.3437 | 0.3524 |
| 17.534 | 0.8794 | 175 | 11.8986 | 0.3458 | 0.0852 | 0.3442 | 0.3524 |
| 19.06 | 0.9045 | 180 | 11.6703 | 0.3458 | 0.0852 | 0.3442 | 0.3524 |
| 21.1059 | 0.9296 | 185 | 11.4621 | 0.3613 | 0.0852 | 0.3553 | 0.3673 |
| 18.3575 | 0.9548 | 190 | 11.2626 | 0.3611 | 0.0852 | 0.3551 | 0.3670 |
| 18.8256 | 0.9799 | 195 | 11.0890 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 16.6283 | 1.0050 | 200 | 10.9119 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 17.0705 | 1.0302 | 205 | 10.7746 | 0.3705 | 0.0852 | 0.3655 | 0.3783 |
| 16.583 | 1.0553 | 210 | 10.6290 | 0.3596 | 0.0852 | 0.3531 | 0.3684 |
| 17.4136 | 1.0804 | 215 | 10.4930 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 16.53 | 1.1055 | 220 | 10.3394 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 16.3147 | 1.1307 | 225 | 10.2160 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 17.313 | 1.1558 | 230 | 10.0941 | 0.3811 | 0.0852 | 0.3752 | 0.3898 |
| 14.9139 | 1.1809 | 235 | 9.9897 | 0.3956 | 0.0852 | 0.3896 | 0.4050 |
| 15.4727 | 1.2060 | 240 | 9.9119 | 0.4320 | 0.0852 | 0.4258 | 0.4437 |
| 16.2121 | 1.2312 | 245 | 9.8533 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 15.7117 | 1.2563 | 250 | 9.8013 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 14.3507 | 1.2814 | 255 | 9.7481 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 14.6657 | 1.3065 | 260 | 9.6891 | 0.4467 | 0.0852 | 0.4426 | 0.4585 |
| 15.2625 | 1.3317 | 265 | 9.6280 | 0.4463 | 0.0852 | 0.4420 | 0.4580 |
| 16.0463 | 1.3568 | 270 | 9.5713 | 0.4465 | 0.0852 | 0.4429 | 0.4584 |
| 15.1612 | 1.3819 | 275 | 9.5196 | 0.4465 | 0.0852 | 0.4429 | 0.4584 |
| 14.8572 | 1.4070 | 280 | 9.4743 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 14.3652 | 1.4322 | 285 | 9.4267 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 16.1322 | 1.4573 | 290 | 9.3751 | 0.4971 | 0.0979 | 0.4862 | 0.5034 |
| 14.4116 | 1.4824 | 295 | 9.3180 | 0.4975 | 0.0979 | 0.4864 | 0.5034 |
| 14.166 | 1.5075 | 300 | 9.2584 | 0.5281 | 0.0979 | 0.5003 | 0.5185 |
| 14.4904 | 1.5327 | 305 | 9.2126 | 0.5281 | 0.0979 | 0.5002 | 0.5185 |
| 15.3989 | 1.5578 | 310 | 9.1765 | 0.5281 | 0.0979 | 0.5002 | 0.5185 |
| 14.7175 | 1.5829 | 315 | 9.1373 | 0.5280 | 0.0979 | 0.5001 | 0.5184 |
| 15.1778 | 1.6080 | 320 | 9.1008 | 0.5710 | 0.1109 | 0.5275 | 0.5473 |
| 15.717 | 1.6332 | 325 | 9.0625 | 0.6370 | 0.1429 | 0.5776 | 0.5899 |
| 15.786 | 1.6583 | 330 | 9.0358 | 0.6370 | 0.1429 | 0.5776 | 0.5899 |
| 13.8109 | 1.6834 | 335 | 9.0115 | 0.6781 | 0.1772 | 0.6099 | 0.6218 |
| 13.7277 | 1.7085 | 340 | 8.9898 | 0.6781 | 0.1772 | 0.6099 | 0.6218 |
| 14.6217 | 1.7337 | 345 | 8.9701 | 0.7030 | 0.1891 | 0.6333 | 0.6474 |
| 13.9167 | 1.7588 | 350 | 8.9508 | 0.7236 | 0.1891 | 0.6544 | 0.6707 |
| 14.0584 | 1.7839 | 355 | 8.9358 | 0.7696 | 0.2147 | 0.6896 | 0.7060 |
| 15.5297 | 1.8090 | 360 | 8.9224 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 15.3239 | 1.8342 | 365 | 8.9141 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 14.1664 | 1.8593 | 370 | 8.9070 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 13.7965 | 1.8844 | 375 | 8.9005 | 0.7695 | 0.2147 | 0.6887 | 0.7057 |
| 13.9088 | 1.9095 | 380 | 8.8938 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 14.4487 | 1.9347 | 385 | 8.8886 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 16.2926 | 1.9598 | 390 | 8.8851 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |
| 13.9456 | 1.9849 | 395 | 8.8834 | 0.7694 | 0.2147 | 0.6887 | 0.7056 |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
Model tree for benitoals/my-lora-hf-sum

  • Base model: google/mt5-small (this model is a LoRA adapter for it)