2023-10-13 09:04:20,676 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,677 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 09:04:20,677 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,677 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:04:20,677 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,677 Train: 1214 sentences
2023-10-13 09:04:20,677 (train_with_dev=False, train_with_test=False)
2023-10-13 09:04:20,677 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,677 Training Params:
2023-10-13 09:04:20,678 - learning_rate: "5e-05"
2023-10-13 09:04:20,678 - mini_batch_size: "4"
2023-10-13 09:04:20,678 - max_epochs: "10"
2023-10-13 09:04:20,678 - shuffle: "True"
2023-10-13 09:04:20,678 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,678 Plugins:
2023-10-13 09:04:20,678 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:04:20,678 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,678 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:04:20,678 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:04:20,678 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,678 Computation:
2023-10-13 09:04:20,678 - compute on device: cuda:0
2023-10-13 09:04:20,678 - embedding storage: none
2023-10-13 09:04:20,678 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,678 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 09:04:20,678 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:20,678 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:22,176 epoch 1 - iter 30/304 - loss 3.31425647 - time (sec): 1.50 - samples/sec: 2161.50 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:04:23,708 epoch 1 - iter 60/304 - loss 2.53179623 - time (sec): 3.03 - samples/sec: 2106.27 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:04:25,220 epoch 1 - iter 90/304 - loss 1.89558294 - time (sec): 4.54 - samples/sec: 2141.58 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:04:26,744 epoch 1 - iter 120/304 - loss 1.58723990 - time (sec): 6.06 - samples/sec: 2119.98 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:04:28,296 epoch 1 - iter 150/304 - loss 1.36936470 - time (sec): 7.62 - samples/sec: 2087.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:04:29,830 epoch 1 - iter 180/304 - loss 1.21901354 - time (sec): 9.15 - samples/sec: 2038.01 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:04:31,359 epoch 1 - iter 210/304 - loss 1.07903038 - time (sec): 10.68 - samples/sec: 2051.23 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:04:32,814 epoch 1 - iter 240/304 - loss 0.98215090 - time (sec): 12.13 - samples/sec: 2041.84 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:04:34,195 epoch 1 - iter 270/304 - loss 0.90752162 - time (sec): 13.52 - samples/sec: 2040.06 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:04:35,739 epoch 1 - iter 300/304 - loss 0.84165624 - time (sec): 15.06 - samples/sec: 2033.56 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:04:35,946 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:35,946 EPOCH 1 done: loss 0.8341 - lr: 0.000049
2023-10-13 09:04:36,776 DEV : loss 0.202301487326622 - f1-score (micro avg) 0.5797
2023-10-13 09:04:36,782 saving best model
2023-10-13 09:04:37,219 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:38,808 epoch 2 - iter 30/304 - loss 0.19650663 - time (sec): 1.59 - samples/sec: 1945.46 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:04:40,388 epoch 2 - iter 60/304 - loss 0.20528328 - time (sec): 3.17 - samples/sec: 1890.30 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:04:41,946 epoch 2 - iter 90/304 - loss 0.17434151 - time (sec): 4.73 - samples/sec: 1951.81 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:04:43,541 epoch 2 - iter 120/304 - loss 0.16749938 - time (sec): 6.32 - samples/sec: 1910.32 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:04:45,159 epoch 2 - iter 150/304 - loss 0.15293309 - time (sec): 7.94 - samples/sec: 1938.35 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:04:46,685 epoch 2 - iter 180/304 - loss 0.15217789 - time (sec): 9.46 - samples/sec: 1955.49 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:04:48,205 epoch 2 - iter 210/304 - loss 0.15455635 - time (sec): 10.98 - samples/sec: 1985.53 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:04:49,776 epoch 2 - iter 240/304 - loss 0.15812685 - time (sec): 12.56 - samples/sec: 1968.72 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:04:51,276 epoch 2 - iter 270/304 - loss 0.15344707 - time (sec): 14.06 - samples/sec: 1958.21 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:04:52,734 epoch 2 - iter 300/304 - loss 0.14666209 - time (sec): 15.51 - samples/sec: 1981.15 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:04:52,904 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:52,904 EPOCH 2 done: loss 0.1462 - lr: 0.000045
2023-10-13 09:04:54,001 DEV : loss 0.161252960562706 - f1-score (micro avg) 0.7976
2023-10-13 09:04:54,007 saving best model
2023-10-13 09:04:54,486 ----------------------------------------------------------------------------------------------------
2023-10-13 09:04:55,859 epoch 3 - iter 30/304 - loss 0.06096573 - time (sec): 1.37 - samples/sec: 2198.32 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:04:57,252 epoch 3 - iter 60/304 - loss 0.06899069 - time (sec): 2.76 - samples/sec: 2198.66 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:04:58,630 epoch 3 - iter 90/304 - loss 0.08640571 - time (sec): 4.14 - samples/sec: 2179.58 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:04:59,930 epoch 3 - iter 120/304 - loss 0.08755042 - time (sec): 5.44 - samples/sec: 2222.55 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:05:01,227 epoch 3 - iter 150/304 - loss 0.10222509 - time (sec): 6.74 - samples/sec: 2224.09 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:05:02,524 epoch 3 - iter 180/304 - loss 0.09486876 - time (sec): 8.03 - samples/sec: 2247.32 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:05:03,856 epoch 3 - iter 210/304 - loss 0.09811310 - time (sec): 9.37 - samples/sec: 2299.44 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:05:05,167 epoch 3 - iter 240/304 - loss 0.09881164 - time (sec): 10.68 - samples/sec: 2308.84 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:05:06,463 epoch 3 - iter 270/304 - loss 0.09401562 - time (sec): 11.97 - samples/sec: 2297.62 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:05:07,787 epoch 3 - iter 300/304 - loss 0.09134469 - time (sec): 13.30 - samples/sec: 2307.52 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:05:07,957 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:07,957 EPOCH 3 done: loss 0.0929 - lr: 0.000039
2023-10-13 09:05:08,860 DEV : loss 0.2110685259103775 - f1-score (micro avg) 0.7938
2023-10-13 09:05:08,866 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:10,177 epoch 4 - iter 30/304 - loss 0.02405006 - time (sec): 1.31 - samples/sec: 2298.53 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:05:11,527 epoch 4 - iter 60/304 - loss 0.07462171 - time (sec): 2.66 - samples/sec: 2303.08 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:05:12,844 epoch 4 - iter 90/304 - loss 0.06704429 - time (sec): 3.98 - samples/sec: 2314.71 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:05:14,143 epoch 4 - iter 120/304 - loss 0.06888938 - time (sec): 5.28 - samples/sec: 2308.40 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:05:15,460 epoch 4 - iter 150/304 - loss 0.07104858 - time (sec): 6.59 - samples/sec: 2288.83 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:05:16,819 epoch 4 - iter 180/304 - loss 0.06984543 - time (sec): 7.95 - samples/sec: 2291.86 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:05:18,148 epoch 4 - iter 210/304 - loss 0.06550416 - time (sec): 9.28 - samples/sec: 2289.44 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:05:19,465 epoch 4 - iter 240/304 - loss 0.06294752 - time (sec): 10.60 - samples/sec: 2272.78 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:05:20,792 epoch 4 - iter 270/304 - loss 0.06091572 - time (sec): 11.93 - samples/sec: 2304.00 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:05:22,112 epoch 4 - iter 300/304 - loss 0.06786459 - time (sec): 13.25 - samples/sec: 2311.04 - lr: 0.000033 - momentum: 0.000000
2023-10-13 09:05:22,284 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:22,284 EPOCH 4 done: loss 0.0689 - lr: 0.000033
2023-10-13 09:05:23,200 DEV : loss 0.19346584379673004 - f1-score (micro avg) 0.8168
2023-10-13 09:05:23,206 saving best model
2023-10-13 09:05:23,706 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:25,132 epoch 5 - iter 30/304 - loss 0.05275074 - time (sec): 1.42 - samples/sec: 2346.92 - lr: 0.000033 - momentum: 0.000000
2023-10-13 09:05:26,496 epoch 5 - iter 60/304 - loss 0.04726932 - time (sec): 2.78 - samples/sec: 2216.65 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:05:27,823 epoch 5 - iter 90/304 - loss 0.04916465 - time (sec): 4.11 - samples/sec: 2303.28 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:05:29,201 epoch 5 - iter 120/304 - loss 0.04542877 - time (sec): 5.49 - samples/sec: 2289.47 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:05:30,568 epoch 5 - iter 150/304 - loss 0.04512704 - time (sec): 6.86 - samples/sec: 2265.53 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:05:31,949 epoch 5 - iter 180/304 - loss 0.04701365 - time (sec): 8.24 - samples/sec: 2265.97 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:05:33,285 epoch 5 - iter 210/304 - loss 0.04782925 - time (sec): 9.57 - samples/sec: 2257.83 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:05:34,626 epoch 5 - iter 240/304 - loss 0.04537712 - time (sec): 10.91 - samples/sec: 2288.50 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:05:35,973 epoch 5 - iter 270/304 - loss 0.04648129 - time (sec): 12.26 - samples/sec: 2271.88 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:05:37,299 epoch 5 - iter 300/304 - loss 0.05241658 - time (sec): 13.59 - samples/sec: 2259.52 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:05:37,468 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:37,469 EPOCH 5 done: loss 0.0519 - lr: 0.000028
2023-10-13 09:05:38,383 DEV : loss 0.18774794042110443 - f1-score (micro avg) 0.831
2023-10-13 09:05:38,389 saving best model
2023-10-13 09:05:38,883 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:40,202 epoch 6 - iter 30/304 - loss 0.02969662 - time (sec): 1.32 - samples/sec: 2536.21 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:05:41,491 epoch 6 - iter 60/304 - loss 0.02657460 - time (sec): 2.60 - samples/sec: 2336.38 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:05:42,788 epoch 6 - iter 90/304 - loss 0.02967375 - time (sec): 3.90 - samples/sec: 2354.17 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:05:44,139 epoch 6 - iter 120/304 - loss 0.03422691 - time (sec): 5.25 - samples/sec: 2414.54 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:05:45,525 epoch 6 - iter 150/304 - loss 0.03521797 - time (sec): 6.64 - samples/sec: 2344.38 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:05:46,921 epoch 6 - iter 180/304 - loss 0.03531895 - time (sec): 8.03 - samples/sec: 2308.09 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:05:48,220 epoch 6 - iter 210/304 - loss 0.03429962 - time (sec): 9.33 - samples/sec: 2272.86 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:05:49,545 epoch 6 - iter 240/304 - loss 0.03350496 - time (sec): 10.66 - samples/sec: 2286.37 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:05:50,900 epoch 6 - iter 270/304 - loss 0.03633404 - time (sec): 12.01 - samples/sec: 2295.78 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:05:52,254 epoch 6 - iter 300/304 - loss 0.03609254 - time (sec): 13.37 - samples/sec: 2288.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:05:52,424 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:52,424 EPOCH 6 done: loss 0.0360 - lr: 0.000022
2023-10-13 09:05:53,341 DEV : loss 0.20084916055202484 - f1-score (micro avg) 0.8353
2023-10-13 09:05:53,348 saving best model
2023-10-13 09:05:53,828 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:55,131 epoch 7 - iter 30/304 - loss 0.02132131 - time (sec): 1.30 - samples/sec: 2162.59 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:05:56,449 epoch 7 - iter 60/304 - loss 0.01941644 - time (sec): 2.62 - samples/sec: 2243.89 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:05:57,761 epoch 7 - iter 90/304 - loss 0.02483462 - time (sec): 3.93 - samples/sec: 2262.16 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:05:59,104 epoch 7 - iter 120/304 - loss 0.02458270 - time (sec): 5.27 - samples/sec: 2347.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:06:00,428 epoch 7 - iter 150/304 - loss 0.02301087 - time (sec): 6.60 - samples/sec: 2301.18 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:06:01,764 epoch 7 - iter 180/304 - loss 0.02219513 - time (sec): 7.93 - samples/sec: 2316.48 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:06:03,008 epoch 7 - iter 210/304 - loss 0.02073113 - time (sec): 9.18 - samples/sec: 2302.92 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:06:04,318 epoch 7 - iter 240/304 - loss 0.02757178 - time (sec): 10.49 - samples/sec: 2300.42 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:06:05,660 epoch 7 - iter 270/304 - loss 0.02714443 - time (sec): 11.83 - samples/sec: 2307.53 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:06:06,961 epoch 7 - iter 300/304 - loss 0.02645846 - time (sec): 13.13 - samples/sec: 2328.18 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:06:07,150 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:07,150 EPOCH 7 done: loss 0.0261 - lr: 0.000017
2023-10-13 09:06:08,090 DEV : loss 0.20895600318908691 - f1-score (micro avg) 0.8424
2023-10-13 09:06:08,096 saving best model
2023-10-13 09:06:08,606 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:09,966 epoch 8 - iter 30/304 - loss 0.01056526 - time (sec): 1.36 - samples/sec: 2142.94 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:06:11,404 epoch 8 - iter 60/304 - loss 0.00730266 - time (sec): 2.80 - samples/sec: 2092.65 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:06:12,738 epoch 8 - iter 90/304 - loss 0.00608566 - time (sec): 4.13 - samples/sec: 2171.41 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:06:14,066 epoch 8 - iter 120/304 - loss 0.01279583 - time (sec): 5.46 - samples/sec: 2218.58 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:06:15,385 epoch 8 - iter 150/304 - loss 0.01346908 - time (sec): 6.78 - samples/sec: 2257.75 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:06:16,696 epoch 8 - iter 180/304 - loss 0.01285972 - time (sec): 8.09 - samples/sec: 2234.68 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:06:18,013 epoch 8 - iter 210/304 - loss 0.01281828 - time (sec): 9.40 - samples/sec: 2254.16 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:06:19,337 epoch 8 - iter 240/304 - loss 0.01223233 - time (sec): 10.73 - samples/sec: 2270.97 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:06:20,662 epoch 8 - iter 270/304 - loss 0.01625321 - time (sec): 12.05 - samples/sec: 2296.37 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:06:21,972 epoch 8 - iter 300/304 - loss 0.01736944 - time (sec): 13.36 - samples/sec: 2294.13 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:06:22,147 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:22,148 EPOCH 8 done: loss 0.0172 - lr: 0.000011
2023-10-13 09:06:23,062 DEV : loss 0.22081097960472107 - f1-score (micro avg) 0.8281
2023-10-13 09:06:23,068 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:24,379 epoch 9 - iter 30/304 - loss 0.01196464 - time (sec): 1.31 - samples/sec: 2456.12 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:06:25,683 epoch 9 - iter 60/304 - loss 0.00801280 - time (sec): 2.61 - samples/sec: 2383.27 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:06:27,006 epoch 9 - iter 90/304 - loss 0.00586371 - time (sec): 3.94 - samples/sec: 2321.29 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:06:28,374 epoch 9 - iter 120/304 - loss 0.01255957 - time (sec): 5.31 - samples/sec: 2339.86 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:06:29,718 epoch 9 - iter 150/304 - loss 0.01240447 - time (sec): 6.65 - samples/sec: 2347.40 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:06:31,015 epoch 9 - iter 180/304 - loss 0.01195576 - time (sec): 7.95 - samples/sec: 2337.16 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:06:32,320 epoch 9 - iter 210/304 - loss 0.01295077 - time (sec): 9.25 - samples/sec: 2323.20 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:06:33,681 epoch 9 - iter 240/304 - loss 0.01168237 - time (sec): 10.61 - samples/sec: 2336.96 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:06:35,017 epoch 9 - iter 270/304 - loss 0.01202743 - time (sec): 11.95 - samples/sec: 2320.88 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:06:36,350 epoch 9 - iter 300/304 - loss 0.01178066 - time (sec): 13.28 - samples/sec: 2314.81 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:06:36,522 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:36,523 EPOCH 9 done: loss 0.0117 - lr: 0.000006
2023-10-13 09:06:37,425 DEV : loss 0.2193765640258789 - f1-score (micro avg) 0.8458
2023-10-13 09:06:37,431 saving best model
2023-10-13 09:06:37,926 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:39,259 epoch 10 - iter 30/304 - loss 0.00979948 - time (sec): 1.33 - samples/sec: 2350.60 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:06:40,577 epoch 10 - iter 60/304 - loss 0.00536257 - time (sec): 2.65 - samples/sec: 2333.35 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:06:41,871 epoch 10 - iter 90/304 - loss 0.00609703 - time (sec): 3.94 - samples/sec: 2317.60 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:06:43,182 epoch 10 - iter 120/304 - loss 0.00504606 - time (sec): 5.25 - samples/sec: 2292.91 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:06:44,555 epoch 10 - iter 150/304 - loss 0.00679310 - time (sec): 6.63 - samples/sec: 2261.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:06:45,866 epoch 10 - iter 180/304 - loss 0.00710765 - time (sec): 7.94 - samples/sec: 2290.19 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:06:47,144 epoch 10 - iter 210/304 - loss 0.00724926 - time (sec): 9.22 - samples/sec: 2318.40 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:06:48,455 epoch 10 - iter 240/304 - loss 0.00770948 - time (sec): 10.53 - samples/sec: 2323.53 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:06:49,867 epoch 10 - iter 270/304 - loss 0.00810010 - time (sec): 11.94 - samples/sec: 2302.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:06:51,172 epoch 10 - iter 300/304 - loss 0.00800972 - time (sec): 13.24 - samples/sec: 2308.48 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:06:51,342 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:51,343 EPOCH 10 done: loss 0.0080 - lr: 0.000000
2023-10-13 09:06:52,272 DEV : loss 0.21843381226062775 - f1-score (micro avg) 0.8498
2023-10-13 09:06:52,278 saving best model
2023-10-13 09:06:53,181 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:53,183 Loading model from best epoch ...
2023-10-13 09:06:55,102 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:06:55,791
Results:
- F-score (micro) 0.7826
- F-score (macro) 0.5753
- Accuracy 0.6513
By class:
              precision    recall  f1-score   support

       scope     0.7455    0.8146    0.7785       151
        pers     0.7063    0.9271    0.8018        96
        work     0.7241    0.8842    0.7962        95
        date     0.0000    0.0000    0.0000         3
         loc     1.0000    0.3333    0.5000         3

   micro avg     0.7226    0.8534    0.7826       348
   macro avg     0.6352    0.5918    0.5753       348
weighted avg     0.7246    0.8534    0.7806       348
2023-10-13 09:06:55,791 ----------------------------------------------------------------------------------------------------