2023-10-13 09:07:23,541 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,542 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 09:07:23,542 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 Train: 1214 sentences
2023-10-13 09:07:23,543 (train_with_dev=False, train_with_test=False)
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 Training Params:
2023-10-13 09:07:23,543 - learning_rate: "3e-05"
2023-10-13 09:07:23,543 - mini_batch_size: "8"
2023-10-13 09:07:23,543 - max_epochs: "10"
2023-10-13 09:07:23,543 - shuffle: "True"
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 Plugins:
2023-10-13 09:07:23,543 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:07:23,543 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 Computation:
2023-10-13 09:07:23,543 - compute on device: cuda:0
2023-10-13 09:07:23,543 - embedding storage: none
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:23,543 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:24,390 epoch 1 - iter 15/152 - loss 3.40566329 - time (sec): 0.85 - samples/sec: 3826.98 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:07:25,268 epoch 1 - iter 30/152 - loss 3.18432802 - time (sec): 1.72 - samples/sec: 3702.31 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:07:26,155 epoch 1 - iter 45/152 - loss 2.73456322 - time (sec): 2.61 - samples/sec: 3725.54 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:07:27,044 epoch 1 - iter 60/152 - loss 2.26722564 - time (sec): 3.50 - samples/sec: 3674.24 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:07:27,893 epoch 1 - iter 75/152 - loss 1.97343150 - time (sec): 4.35 - samples/sec: 3656.59 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:07:28,710 epoch 1 - iter 90/152 - loss 1.78199051 - time (sec): 5.17 - samples/sec: 3610.25 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:07:29,590 epoch 1 - iter 105/152 - loss 1.59061250 - time (sec): 6.05 - samples/sec: 3623.48 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:07:30,388 epoch 1 - iter 120/152 - loss 1.45635461 - time (sec): 6.84 - samples/sec: 3620.21 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:07:31,255 epoch 1 - iter 135/152 - loss 1.35049222 - time (sec): 7.71 - samples/sec: 3576.02 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:07:32,096 epoch 1 - iter 150/152 - loss 1.25214594 - time (sec): 8.55 - samples/sec: 3581.36 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:07:32,208 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:32,209 EPOCH 1 done: loss 1.2412 - lr: 0.000029
2023-10-13 09:07:33,067 DEV : loss 0.2970544397830963 - f1-score (micro avg) 0.4374
2023-10-13 09:07:33,073 saving best model
2023-10-13 09:07:33,417 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:34,256 epoch 2 - iter 15/152 - loss 0.27318260 - time (sec): 0.84 - samples/sec: 3685.80 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:07:35,080 epoch 2 - iter 30/152 - loss 0.29477857 - time (sec): 1.66 - samples/sec: 3603.42 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:07:35,984 epoch 2 - iter 45/152 - loss 0.25516486 - time (sec): 2.57 - samples/sec: 3594.95 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:07:36,774 epoch 2 - iter 60/152 - loss 0.24379369 - time (sec): 3.36 - samples/sec: 3598.22 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:07:37,646 epoch 2 - iter 75/152 - loss 0.22531096 - time (sec): 4.23 - samples/sec: 3639.84 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:07:38,484 epoch 2 - iter 90/152 - loss 0.21717476 - time (sec): 5.07 - samples/sec: 3653.41 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:07:39,376 epoch 2 - iter 105/152 - loss 0.21405748 - time (sec): 5.96 - samples/sec: 3661.24 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:07:40,223 epoch 2 - iter 120/152 - loss 0.21384996 - time (sec): 6.81 - samples/sec: 3632.40 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:07:41,065 epoch 2 - iter 135/152 - loss 0.20845651 - time (sec): 7.65 - samples/sec: 3599.52 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:07:41,928 epoch 2 - iter 150/152 - loss 0.19834878 - time (sec): 8.51 - samples/sec: 3611.72 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:07:42,023 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:42,024 EPOCH 2 done: loss 0.1978 - lr: 0.000027
2023-10-13 09:07:42,931 DEV : loss 0.17218922078609467 - f1-score (micro avg) 0.6974
2023-10-13 09:07:42,937 saving best model
2023-10-13 09:07:43,397 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:44,260 epoch 3 - iter 15/152 - loss 0.08307837 - time (sec): 0.86 - samples/sec: 3510.04 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:07:45,168 epoch 3 - iter 30/152 - loss 0.09240108 - time (sec): 1.76 - samples/sec: 3440.09 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:07:46,003 epoch 3 - iter 45/152 - loss 0.10337295 - time (sec): 2.60 - samples/sec: 3469.01 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:07:46,891 epoch 3 - iter 60/152 - loss 0.10160162 - time (sec): 3.49 - samples/sec: 3466.01 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:07:47,760 epoch 3 - iter 75/152 - loss 0.11399711 - time (sec): 4.36 - samples/sec: 3438.64 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:07:48,614 epoch 3 - iter 90/152 - loss 0.10915553 - time (sec): 5.21 - samples/sec: 3464.37 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:07:49,535 epoch 3 - iter 105/152 - loss 0.11030199 - time (sec): 6.13 - samples/sec: 3511.93 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:07:50,336 epoch 3 - iter 120/152 - loss 0.11136644 - time (sec): 6.93 - samples/sec: 3555.21 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:07:51,159 epoch 3 - iter 135/152 - loss 0.10674892 - time (sec): 7.76 - samples/sec: 3546.66 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:07:52,021 epoch 3 - iter 150/152 - loss 0.10428374 - time (sec): 8.62 - samples/sec: 3559.96 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:07:52,124 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:52,124 EPOCH 3 done: loss 0.1064 - lr: 0.000023
2023-10-13 09:07:53,038 DEV : loss 0.14199090003967285 - f1-score (micro avg) 0.8029
2023-10-13 09:07:53,044 saving best model
2023-10-13 09:07:53,478 ----------------------------------------------------------------------------------------------------
2023-10-13 09:07:54,287 epoch 4 - iter 15/152 - loss 0.02904280 - time (sec): 0.81 - samples/sec: 3730.54 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:07:55,131 epoch 4 - iter 30/152 - loss 0.06484752 - time (sec): 1.65 - samples/sec: 3707.39 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:07:55,975 epoch 4 - iter 45/152 - loss 0.06690710 - time (sec): 2.50 - samples/sec: 3687.69 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:07:56,769 epoch 4 - iter 60/152 - loss 0.06903870 - time (sec): 3.29 - samples/sec: 3702.20 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:07:57,583 epoch 4 - iter 75/152 - loss 0.07081261 - time (sec): 4.10 - samples/sec: 3677.20 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:07:58,442 epoch 4 - iter 90/152 - loss 0.06884582 - time (sec): 4.96 - samples/sec: 3672.33 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:07:59,255 epoch 4 - iter 105/152 - loss 0.06524950 - time (sec): 5.78 - samples/sec: 3678.79 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:08:00,090 epoch 4 - iter 120/152 - loss 0.06474971 - time (sec): 6.61 - samples/sec: 3643.35 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:08:00,948 epoch 4 - iter 135/152 - loss 0.06312834 - time (sec): 7.47 - samples/sec: 3678.86 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:08:01,791 epoch 4 - iter 150/152 - loss 0.06782112 - time (sec): 8.31 - samples/sec: 3682.72 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:08:01,897 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:01,897 EPOCH 4 done: loss 0.0680 - lr: 0.000020
2023-10-13 09:08:02,805 DEV : loss 0.1484554409980774 - f1-score (micro avg) 0.8135
2023-10-13 09:08:02,811 saving best model
2023-10-13 09:08:03,238 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:04,147 epoch 5 - iter 15/152 - loss 0.05958663 - time (sec): 0.91 - samples/sec: 3673.77 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:08:05,012 epoch 5 - iter 30/152 - loss 0.05337856 - time (sec): 1.77 - samples/sec: 3483.10 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:08:06,048 epoch 5 - iter 45/152 - loss 0.05133641 - time (sec): 2.81 - samples/sec: 3371.51 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:08:06,858 epoch 5 - iter 60/152 - loss 0.04925455 - time (sec): 3.62 - samples/sec: 3472.88 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:08:07,694 epoch 5 - iter 75/152 - loss 0.04764550 - time (sec): 4.45 - samples/sec: 3487.25 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:08:08,506 epoch 5 - iter 90/152 - loss 0.04909598 - time (sec): 5.27 - samples/sec: 3544.47 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:08:09,318 epoch 5 - iter 105/152 - loss 0.04778141 - time (sec): 6.08 - samples/sec: 3556.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:08:10,153 epoch 5 - iter 120/152 - loss 0.04581216 - time (sec): 6.91 - samples/sec: 3612.78 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:08:10,940 epoch 5 - iter 135/152 - loss 0.04684558 - time (sec): 7.70 - samples/sec: 3617.53 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:08:11,753 epoch 5 - iter 150/152 - loss 0.05287046 - time (sec): 8.51 - samples/sec: 3606.44 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:08:11,846 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:11,846 EPOCH 5 done: loss 0.0528 - lr: 0.000017
2023-10-13 09:08:12,777 DEV : loss 0.16037791967391968 - f1-score (micro avg) 0.822
2023-10-13 09:08:12,783 saving best model
2023-10-13 09:08:13,203 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:14,078 epoch 6 - iter 15/152 - loss 0.03099565 - time (sec): 0.87 - samples/sec: 3833.17 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:08:14,868 epoch 6 - iter 30/152 - loss 0.03155409 - time (sec): 1.66 - samples/sec: 3663.52 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:08:15,682 epoch 6 - iter 45/152 - loss 0.03410084 - time (sec): 2.47 - samples/sec: 3711.69 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:08:16,537 epoch 6 - iter 60/152 - loss 0.03370562 - time (sec): 3.33 - samples/sec: 3809.43 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:08:17,352 epoch 6 - iter 75/152 - loss 0.03623556 - time (sec): 4.14 - samples/sec: 3755.41 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:08:18,161 epoch 6 - iter 90/152 - loss 0.03728025 - time (sec): 4.95 - samples/sec: 3744.16 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:08:19,036 epoch 6 - iter 105/152 - loss 0.03447885 - time (sec): 5.83 - samples/sec: 3639.48 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:08:19,926 epoch 6 - iter 120/152 - loss 0.03432150 - time (sec): 6.72 - samples/sec: 3627.54 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:08:20,827 epoch 6 - iter 135/152 - loss 0.03771134 - time (sec): 7.62 - samples/sec: 3619.90 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:08:21,673 epoch 6 - iter 150/152 - loss 0.03806320 - time (sec): 8.47 - samples/sec: 3614.21 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:08:21,769 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:21,770 EPOCH 6 done: loss 0.0380 - lr: 0.000013
2023-10-13 09:08:22,697 DEV : loss 0.16604389250278473 - f1-score (micro avg) 0.8201
2023-10-13 09:08:22,703 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:23,505 epoch 7 - iter 15/152 - loss 0.02004090 - time (sec): 0.80 - samples/sec: 3508.73 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:08:24,369 epoch 7 - iter 30/152 - loss 0.02865142 - time (sec): 1.66 - samples/sec: 3528.96 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:08:25,204 epoch 7 - iter 45/152 - loss 0.04041764 - time (sec): 2.50 - samples/sec: 3556.35 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:08:26,082 epoch 7 - iter 60/152 - loss 0.03524078 - time (sec): 3.38 - samples/sec: 3665.53 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:08:26,904 epoch 7 - iter 75/152 - loss 0.03407011 - time (sec): 4.20 - samples/sec: 3614.62 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:08:27,825 epoch 7 - iter 90/152 - loss 0.03072402 - time (sec): 5.12 - samples/sec: 3588.49 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:08:28,656 epoch 7 - iter 105/152 - loss 0.02920528 - time (sec): 5.95 - samples/sec: 3551.07 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:08:29,500 epoch 7 - iter 120/152 - loss 0.03208956 - time (sec): 6.80 - samples/sec: 3550.35 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:08:30,390 epoch 7 - iter 135/152 - loss 0.03186570 - time (sec): 7.69 - samples/sec: 3551.82 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:08:31,202 epoch 7 - iter 150/152 - loss 0.03292206 - time (sec): 8.50 - samples/sec: 3597.57 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:08:31,327 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:31,327 EPOCH 7 done: loss 0.0325 - lr: 0.000010
2023-10-13 09:08:32,239 DEV : loss 0.17990894615650177 - f1-score (micro avg) 0.8473
2023-10-13 09:08:32,245 saving best model
2023-10-13 09:08:32,672 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:33,459 epoch 8 - iter 15/152 - loss 0.01792334 - time (sec): 0.78 - samples/sec: 3709.73 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:08:34,333 epoch 8 - iter 30/152 - loss 0.01411895 - time (sec): 1.66 - samples/sec: 3527.74 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:08:35,175 epoch 8 - iter 45/152 - loss 0.01501876 - time (sec): 2.50 - samples/sec: 3587.00 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:08:36,029 epoch 8 - iter 60/152 - loss 0.02067936 - time (sec): 3.35 - samples/sec: 3609.56 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:08:36,904 epoch 8 - iter 75/152 - loss 0.01995176 - time (sec): 4.23 - samples/sec: 3618.22 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:08:37,750 epoch 8 - iter 90/152 - loss 0.01867516 - time (sec): 5.08 - samples/sec: 3561.32 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:08:38,580 epoch 8 - iter 105/152 - loss 0.01915858 - time (sec): 5.91 - samples/sec: 3590.07 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:08:39,484 epoch 8 - iter 120/152 - loss 0.01759270 - time (sec): 6.81 - samples/sec: 3578.25 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:08:40,351 epoch 8 - iter 135/152 - loss 0.02335082 - time (sec): 7.68 - samples/sec: 3606.09 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:08:41,196 epoch 8 - iter 150/152 - loss 0.02625407 - time (sec): 8.52 - samples/sec: 3597.79 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:08:41,295 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:41,295 EPOCH 8 done: loss 0.0260 - lr: 0.000007
2023-10-13 09:08:42,188 DEV : loss 0.18495041131973267 - f1-score (micro avg) 0.8359
2023-10-13 09:08:42,194 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:42,993 epoch 9 - iter 15/152 - loss 0.02396427 - time (sec): 0.80 - samples/sec: 4028.98 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:08:43,824 epoch 9 - iter 30/152 - loss 0.01479938 - time (sec): 1.63 - samples/sec: 3823.56 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:08:44,667 epoch 9 - iter 45/152 - loss 0.01197354 - time (sec): 2.47 - samples/sec: 3697.08 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:08:45,538 epoch 9 - iter 60/152 - loss 0.02206406 - time (sec): 3.34 - samples/sec: 3713.68 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:08:46,409 epoch 9 - iter 75/152 - loss 0.02341684 - time (sec): 4.21 - samples/sec: 3703.94 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:08:47,217 epoch 9 - iter 90/152 - loss 0.02270359 - time (sec): 5.02 - samples/sec: 3697.99 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:08:48,041 epoch 9 - iter 105/152 - loss 0.02303371 - time (sec): 5.85 - samples/sec: 3676.32 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:08:48,950 epoch 9 - iter 120/152 - loss 0.02189268 - time (sec): 6.76 - samples/sec: 3671.08 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:08:49,760 epoch 9 - iter 135/152 - loss 0.02187071 - time (sec): 7.57 - samples/sec: 3665.70 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:08:50,610 epoch 9 - iter 150/152 - loss 0.02281646 - time (sec): 8.41 - samples/sec: 3653.29 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:08:50,701 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:50,701 EPOCH 9 done: loss 0.0226 - lr: 0.000004
2023-10-13 09:08:51,602 DEV : loss 0.19286410510540009 - f1-score (micro avg) 0.8343
2023-10-13 09:08:51,608 ----------------------------------------------------------------------------------------------------
2023-10-13 09:08:52,434 epoch 10 - iter 15/152 - loss 0.01360042 - time (sec): 0.82 - samples/sec: 3795.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:08:53,296 epoch 10 - iter 30/152 - loss 0.00893190 - time (sec): 1.69 - samples/sec: 3666.43 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:08:54,113 epoch 10 - iter 45/152 - loss 0.01103150 - time (sec): 2.50 - samples/sec: 3649.97 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:08:54,932 epoch 10 - iter 60/152 - loss 0.01025132 - time (sec): 3.32 - samples/sec: 3626.51 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:08:55,757 epoch 10 - iter 75/152 - loss 0.01395742 - time (sec): 4.15 - samples/sec: 3613.59 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:08:56,613 epoch 10 - iter 90/152 - loss 0.01418562 - time (sec): 5.00 - samples/sec: 3633.52 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:08:57,436 epoch 10 - iter 105/152 - loss 0.01456598 - time (sec): 5.83 - samples/sec: 3667.17 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:08:58,272 epoch 10 - iter 120/152 - loss 0.01501693 - time (sec): 6.66 - samples/sec: 3671.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:08:59,141 epoch 10 - iter 135/152 - loss 0.01754617 - time (sec): 7.53 - samples/sec: 3650.02 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:08:59,966 epoch 10 - iter 150/152 - loss 0.01803580 - time (sec): 8.36 - samples/sec: 3658.82 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:09:00,071 ----------------------------------------------------------------------------------------------------
2023-10-13 09:09:00,071 EPOCH 10 done: loss 0.0180 - lr: 0.000000
2023-10-13 09:09:01,010 DEV : loss 0.19247612357139587 - f1-score (micro avg) 0.837
2023-10-13 09:09:01,348 ----------------------------------------------------------------------------------------------------
2023-10-13 09:09:01,349 Loading model from best epoch ...
2023-10-13 09:09:02,755 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:09:03,606
Results:
- F-score (micro) 0.7871
- F-score (macro) 0.4757
- Accuracy 0.6622
By class:
              precision    recall  f1-score   support

       scope     0.7871    0.8079    0.7974       151
        pers     0.7236    0.9271    0.8128        96
        work     0.6860    0.8737    0.7685        95
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7368    0.8448    0.7871       348
   macro avg     0.4393    0.5217    0.4757       348
weighted avg     0.7284    0.8448    0.7800       348
2023-10-13 09:09:03,606 ----------------------------------------------------------------------------------------------------
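
The averaged rows in the "By class" table can be cross-checked against the per-class rows. The sketch below is not part of the training run. It assumes the usual definitions: macro F1 is the unweighted mean of per-class F1, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. All numbers are copied from the log; the variable and function names are illustrative.

```python
# Per-class (f1-score, support) pairs copied from the "By class" table in the log.
per_class = {
    "scope": (0.7974, 151),
    "pers": (0.8128, 96),
    "work": (0.7685, 95),
    "loc": (0.0000, 3),
    "date": (0.0000, 3),
}
# Micro-averaged precision and recall, also copied from the log.
micro_precision, micro_recall = 0.7368, 0.8448


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, support in per_class.values())

# Macro F1: every class counts equally, so the two empty classes pull it down.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted F1: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total_support

# Micro F1: computed from the pooled precision and recall.
micro_f1 = harmonic_f1(micro_precision, micro_recall)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between the micro (0.7871) and macro (0.4757) F-scores comes from the loc and date classes, which have 3 test entities each and score 0.0000. They account for two of the five terms in the macro mean but only 6 of the 348 entities.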