====== Perplexity statistics ======
Mean PPL(Q) : 8.459379 ± 0.053550
Mean PPL(base) : 7.237090 ± 0.045539
Cor(ln(PPL(Q)), ln(PPL(base))): 97.26%
Mean ln(PPL(Q)/PPL(base)) : 0.156057 ± 0.001477
Mean PPL(Q)/PPL(base) : 1.168892 ± 0.001727
Mean PPL(Q)-PPL(base) : 1.222289 ± 0.014061
====== KL divergence statistics ======
Mean KLD: 0.131196 ± 0.000539
Maximum KLD: 7.898368
99.9% KLD: 2.475934
99.0% KLD: 0.894390
Median KLD: 0.089346
10.0% KLD: 0.006670
5.0% KLD: 0.002250
1.0% KLD: 0.000392
Minimum KLD: 0.000001
====== Token probability statistics ======
Mean Δp: -3.913 ± 0.027 %
Maximum Δp: 64.023%
99.9% Δp: 33.075%
99.0% Δp: 17.214%
95.0% Δp: 7.245%
90.0% Δp: 3.301%
75.0% Δp: 0.096%
Median Δp: -0.875%
25.0% Δp: -6.342%
10.0% Δp: -15.578%
5.0% Δp: -22.649%
1.0% Δp: -41.943%
0.1% Δp: -75.852%
Minimum Δp: -98.926%
RMS Δp : 10.892 ± 0.050 %
Same top p: 82.438 ± 0.100 %
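
The perplexity lines above can be cross-checked against each other with a little arithmetic. The sketch below is not part of the tool's output. It only reuses three numbers copied from the log, and the variable names are made up for illustration. It assumes, based on how closely the figures agree, that the reported ratio corresponds to the exponential of the mean log-ratio.

```python
import math

# Values copied from the perplexity statistics above.
ppl_q = 8.459379           # Mean PPL(Q)
ppl_base = 7.237090        # Mean PPL(base)
mean_log_ratio = 0.156057  # Mean ln(PPL(Q)/PPL(base))

# Exponentiating the mean log-ratio should land near the reported ratio line.
ratio_from_log = math.exp(mean_log_ratio)

# Dividing and subtracting the two means gives the other two derived lines.
ratio_from_means = ppl_q / ppl_base
difference = ppl_q - ppl_base

print(f"exp(mean ln ratio)  = {ratio_from_log:.4f}")
print(f"PPL(Q) / PPL(base)  = {ratio_from_means:.4f}")
print(f"PPL(Q) - PPL(base)  = {difference:.6f}")
```

Both ratio estimates come out at roughly 1.169, in line with the reported 1.168892, and the difference matches the reported 1.222289. Read this way, the quantized model's perplexity is about 17% higher than the base model's on this data.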