model: add fine-tuned model
training.log +384 -0 (ADDED)
@@ -0,0 +1,384 @@
+2024-09-03 22:08:14,383 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(31103, 1024, padding_idx=0)
+        (position_embeddings): Embedding(512, 1024)
+        (token_type_embeddings): Embedding(2, 1024)
+        (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-23): 24 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSdpaSelfAttention(
+                (query): Linear(in_features=1024, out_features=1024, bias=True)
+                (key): Linear(in_features=1024, out_features=1024, bias=True)
+                (value): Linear(in_features=1024, out_features=1024, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=1024, out_features=1024, bias=True)
+                (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=1024, out_features=4096, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=4096, out_features=1024, bias=True)
+              (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=1024, out_features=1024, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=1024, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Corpus: 2869 train + 338 dev + 370 test sentences
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Train: 2869 sentences
+2024-09-03 22:08:14,384 (train_with_dev=False, train_with_test=False)
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Training Params:
+2024-09-03 22:08:14,384 - learning_rate: "1e-05"
+2024-09-03 22:08:14,384 - mini_batch_size: "32"
+2024-09-03 22:08:14,384 - max_epochs: "20"
+2024-09-03 22:08:14,384 - shuffle: "True"
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Plugins:
+2024-09-03 22:08:14,384 - TensorboardLogger
+2024-09-03 22:08:14,384 - LinearScheduler | warmup_fraction: '0.1'
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Final evaluation on model from best epoch (best-model.pt)
+2024-09-03 22:08:14,384 - metric: "('micro avg', 'f1-score')"
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Computation:
+2024-09-03 22:08:14,384 - compute on device: cuda:0
+2024-09-03 22:08:14,384 - embedding storage: none
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,384 Model training base path: "flair-barner-coarse-grained-gbert_large-bs32-e20-lr1e-05-2"
+2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,385 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:14,385 Logging anything other than scalars to TensorBoard is currently not supported.
+2024-09-03 22:08:18,474 epoch 1 - iter 9/90 - loss 3.07625880 - time (sec): 4.09 - samples/sec: 1570.98 - lr: 0.000000 - momentum: 0.000000
+2024-09-03 22:08:22,848 epoch 1 - iter 18/90 - loss 3.03830863 - time (sec): 8.46 - samples/sec: 1486.84 - lr: 0.000001 - momentum: 0.000000
+2024-09-03 22:08:27,463 epoch 1 - iter 27/90 - loss 2.92345816 - time (sec): 13.08 - samples/sec: 1448.18 - lr: 0.000001 - momentum: 0.000000
+2024-09-03 22:08:31,615 epoch 1 - iter 36/90 - loss 2.76830934 - time (sec): 17.23 - samples/sec: 1445.16 - lr: 0.000002 - momentum: 0.000000
+2024-09-03 22:08:35,829 epoch 1 - iter 45/90 - loss 2.55828324 - time (sec): 21.44 - samples/sec: 1445.58 - lr: 0.000002 - momentum: 0.000000
+2024-09-03 22:08:40,281 epoch 1 - iter 54/90 - loss 2.27911363 - time (sec): 25.90 - samples/sec: 1443.03 - lr: 0.000003 - momentum: 0.000000
+2024-09-03 22:08:43,966 epoch 1 - iter 63/90 - loss 2.04649360 - time (sec): 29.58 - samples/sec: 1451.91 - lr: 0.000003 - momentum: 0.000000
+2024-09-03 22:08:48,754 epoch 1 - iter 72/90 - loss 1.85199452 - time (sec): 34.37 - samples/sec: 1430.44 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:08:52,698 epoch 1 - iter 81/90 - loss 1.70122297 - time (sec): 38.31 - samples/sec: 1439.88 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:08:56,719 epoch 1 - iter 90/90 - loss 1.57972824 - time (sec): 42.33 - samples/sec: 1450.22 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:08:56,720 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:08:56,720 EPOCH 1 done: loss 1.5797 - lr: 0.000005
+2024-09-03 22:08:58,183 DEV : loss 0.46851032972335815 - f1-score (micro avg) 0.0
+2024-09-03 22:08:58,187 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:09:02,729 epoch 2 - iter 9/90 - loss 0.39452797 - time (sec): 4.54 - samples/sec: 1372.01 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:09:07,298 epoch 2 - iter 18/90 - loss 0.34943089 - time (sec): 9.11 - samples/sec: 1363.54 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:09:11,191 epoch 2 - iter 27/90 - loss 0.33048569 - time (sec): 13.00 - samples/sec: 1440.78 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:09:14,919 epoch 2 - iter 36/90 - loss 0.33742609 - time (sec): 16.73 - samples/sec: 1473.34 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:09:19,119 epoch 2 - iter 45/90 - loss 0.32836169 - time (sec): 20.93 - samples/sec: 1473.38 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:09:23,366 epoch 2 - iter 54/90 - loss 0.32386979 - time (sec): 25.18 - samples/sec: 1468.88 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:09:27,760 epoch 2 - iter 63/90 - loss 0.31846694 - time (sec): 29.57 - samples/sec: 1451.24 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:09:32,410 epoch 2 - iter 72/90 - loss 0.30906101 - time (sec): 34.22 - samples/sec: 1428.99 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:09:37,250 epoch 2 - iter 81/90 - loss 0.30259718 - time (sec): 39.06 - samples/sec: 1417.97 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:09:41,763 epoch 2 - iter 90/90 - loss 0.29697888 - time (sec): 43.57 - samples/sec: 1408.93 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:09:41,763 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:09:41,763 EPOCH 2 done: loss 0.2970 - lr: 0.000010
+2024-09-03 22:09:43,313 DEV : loss 0.32844364643096924 - f1-score (micro avg) 0.4444
+2024-09-03 22:09:43,318 saving best model
+2024-09-03 22:09:44,704 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:09:48,810 epoch 3 - iter 9/90 - loss 0.18021120 - time (sec): 4.10 - samples/sec: 1451.01 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:09:52,605 epoch 3 - iter 18/90 - loss 0.18710233 - time (sec): 7.90 - samples/sec: 1551.86 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:09:57,165 epoch 3 - iter 27/90 - loss 0.19642900 - time (sec): 12.46 - samples/sec: 1444.72 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:01,325 epoch 3 - iter 36/90 - loss 0.19534522 - time (sec): 16.62 - samples/sec: 1462.08 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:05,793 epoch 3 - iter 45/90 - loss 0.19526541 - time (sec): 21.09 - samples/sec: 1449.57 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:10,319 epoch 3 - iter 54/90 - loss 0.18898441 - time (sec): 25.61 - samples/sec: 1435.75 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:15,005 epoch 3 - iter 63/90 - loss 0.18734274 - time (sec): 30.30 - samples/sec: 1425.75 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:19,484 epoch 3 - iter 72/90 - loss 0.19035033 - time (sec): 34.78 - samples/sec: 1425.46 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:23,866 epoch 3 - iter 81/90 - loss 0.18770810 - time (sec): 39.16 - samples/sec: 1422.17 - lr: 0.000010 - momentum: 0.000000
+2024-09-03 22:10:27,920 epoch 3 - iter 90/90 - loss 0.18703645 - time (sec): 43.22 - samples/sec: 1420.65 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:10:27,920 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:10:27,921 EPOCH 3 done: loss 0.1870 - lr: 0.000009
+2024-09-03 22:10:29,473 DEV : loss 0.23561017215251923 - f1-score (micro avg) 0.6408
+2024-09-03 22:10:29,478 saving best model
+2024-09-03 22:10:31,205 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:10:35,117 epoch 4 - iter 9/90 - loss 0.17246494 - time (sec): 3.91 - samples/sec: 1531.46 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:10:39,356 epoch 4 - iter 18/90 - loss 0.15134387 - time (sec): 8.15 - samples/sec: 1475.32 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:10:43,735 epoch 4 - iter 27/90 - loss 0.13746185 - time (sec): 12.53 - samples/sec: 1472.62 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:10:47,797 epoch 4 - iter 36/90 - loss 0.13417880 - time (sec): 16.59 - samples/sec: 1476.25 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:10:52,926 epoch 4 - iter 45/90 - loss 0.12898354 - time (sec): 21.72 - samples/sec: 1426.82 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:10:57,315 epoch 4 - iter 54/90 - loss 0.12655651 - time (sec): 26.11 - samples/sec: 1426.80 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:01,590 epoch 4 - iter 63/90 - loss 0.12590330 - time (sec): 30.38 - samples/sec: 1426.07 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:06,149 epoch 4 - iter 72/90 - loss 0.12273481 - time (sec): 34.94 - samples/sec: 1423.24 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:10,823 epoch 4 - iter 81/90 - loss 0.12123292 - time (sec): 39.62 - samples/sec: 1408.31 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:14,547 epoch 4 - iter 90/90 - loss 0.11969701 - time (sec): 43.34 - samples/sec: 1416.55 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:14,547 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:11:14,547 EPOCH 4 done: loss 0.1197 - lr: 0.000009
+2024-09-03 22:11:16,092 DEV : loss 0.19088450074195862 - f1-score (micro avg) 0.7163
+2024-09-03 22:11:16,096 saving best model
+2024-09-03 22:11:17,838 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:11:22,005 epoch 5 - iter 9/90 - loss 0.07796958 - time (sec): 4.17 - samples/sec: 1427.37 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:26,793 epoch 5 - iter 18/90 - loss 0.09505982 - time (sec): 8.95 - samples/sec: 1347.83 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:30,524 epoch 5 - iter 27/90 - loss 0.09038601 - time (sec): 12.69 - samples/sec: 1419.11 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:34,896 epoch 5 - iter 36/90 - loss 0.08880525 - time (sec): 17.06 - samples/sec: 1408.49 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:39,545 epoch 5 - iter 45/90 - loss 0.08665835 - time (sec): 21.71 - samples/sec: 1396.96 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:43,572 epoch 5 - iter 54/90 - loss 0.08590365 - time (sec): 25.73 - samples/sec: 1416.95 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:47,888 epoch 5 - iter 63/90 - loss 0.08303076 - time (sec): 30.05 - samples/sec: 1426.28 - lr: 0.000009 - momentum: 0.000000
+2024-09-03 22:11:52,321 epoch 5 - iter 72/90 - loss 0.08046962 - time (sec): 34.48 - samples/sec: 1433.50 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:11:56,489 epoch 5 - iter 81/90 - loss 0.07746895 - time (sec): 38.65 - samples/sec: 1432.76 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:00,972 epoch 5 - iter 90/90 - loss 0.07524486 - time (sec): 43.13 - samples/sec: 1423.36 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:00,972 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:12:00,972 EPOCH 5 done: loss 0.0752 - lr: 0.000008
+2024-09-03 22:12:02,521 DEV : loss 0.1980796456336975 - f1-score (micro avg) 0.7235
+2024-09-03 22:12:02,525 saving best model
+2024-09-03 22:12:04,280 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:12:09,200 epoch 6 - iter 9/90 - loss 0.05086435 - time (sec): 4.92 - samples/sec: 1307.95 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:13,315 epoch 6 - iter 18/90 - loss 0.05627511 - time (sec): 9.03 - samples/sec: 1388.10 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:17,684 epoch 6 - iter 27/90 - loss 0.05226291 - time (sec): 13.40 - samples/sec: 1401.62 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:21,743 epoch 6 - iter 36/90 - loss 0.04931367 - time (sec): 17.46 - samples/sec: 1410.58 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:26,185 epoch 6 - iter 45/90 - loss 0.04709793 - time (sec): 21.90 - samples/sec: 1403.82 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:30,399 epoch 6 - iter 54/90 - loss 0.05018191 - time (sec): 26.12 - samples/sec: 1416.40 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:34,633 epoch 6 - iter 63/90 - loss 0.04949260 - time (sec): 30.35 - samples/sec: 1410.00 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:39,250 epoch 6 - iter 72/90 - loss 0.05163362 - time (sec): 34.97 - samples/sec: 1394.74 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:43,749 epoch 6 - iter 81/90 - loss 0.04998041 - time (sec): 39.47 - samples/sec: 1401.93 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:47,550 epoch 6 - iter 90/90 - loss 0.04991602 - time (sec): 43.27 - samples/sec: 1418.90 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:47,550 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:12:47,550 EPOCH 6 done: loss 0.0499 - lr: 0.000008
+2024-09-03 22:12:49,108 DEV : loss 0.17487134039402008 - f1-score (micro avg) 0.7658
+2024-09-03 22:12:49,113 saving best model
+2024-09-03 22:12:50,858 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:12:54,868 epoch 7 - iter 9/90 - loss 0.02538037 - time (sec): 4.01 - samples/sec: 1477.46 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:12:59,226 epoch 7 - iter 18/90 - loss 0.03372476 - time (sec): 8.37 - samples/sec: 1442.49 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:13:03,761 epoch 7 - iter 27/90 - loss 0.03282378 - time (sec): 12.90 - samples/sec: 1432.08 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:13:07,554 epoch 7 - iter 36/90 - loss 0.03431105 - time (sec): 16.69 - samples/sec: 1446.04 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:13:12,006 epoch 7 - iter 45/90 - loss 0.03225077 - time (sec): 21.15 - samples/sec: 1452.28 - lr: 0.000008 - momentum: 0.000000
+2024-09-03 22:13:16,824 epoch 7 - iter 54/90 - loss 0.03304733 - time (sec): 25.96 - samples/sec: 1419.88 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:21,125 epoch 7 - iter 63/90 - loss 0.03408886 - time (sec): 30.27 - samples/sec: 1428.43 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:25,202 epoch 7 - iter 72/90 - loss 0.03776201 - time (sec): 34.34 - samples/sec: 1434.95 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:29,135 epoch 7 - iter 81/90 - loss 0.03710028 - time (sec): 38.28 - samples/sec: 1448.37 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:33,099 epoch 7 - iter 90/90 - loss 0.03726568 - time (sec): 42.24 - samples/sec: 1453.46 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:33,099 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:13:33,099 EPOCH 7 done: loss 0.0373 - lr: 0.000007
+2024-09-03 22:13:34,650 DEV : loss 0.18173354864120483 - f1-score (micro avg) 0.7627
+2024-09-03 22:13:34,654 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:13:38,876 epoch 8 - iter 9/90 - loss 0.02092667 - time (sec): 4.22 - samples/sec: 1404.86 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:42,977 epoch 8 - iter 18/90 - loss 0.02438277 - time (sec): 8.32 - samples/sec: 1452.84 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:47,147 epoch 8 - iter 27/90 - loss 0.02262059 - time (sec): 12.49 - samples/sec: 1462.61 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:51,474 epoch 8 - iter 36/90 - loss 0.02522541 - time (sec): 16.82 - samples/sec: 1484.42 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:13:55,997 epoch 8 - iter 45/90 - loss 0.02481957 - time (sec): 21.34 - samples/sec: 1457.11 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:00,761 epoch 8 - iter 54/90 - loss 0.02359039 - time (sec): 26.11 - samples/sec: 1438.57 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:05,146 epoch 8 - iter 63/90 - loss 0.02355433 - time (sec): 30.49 - samples/sec: 1435.74 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:09,274 epoch 8 - iter 72/90 - loss 0.02419780 - time (sec): 34.62 - samples/sec: 1421.47 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:13,543 epoch 8 - iter 81/90 - loss 0.02375625 - time (sec): 38.89 - samples/sec: 1422.79 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:17,829 epoch 8 - iter 90/90 - loss 0.02349011 - time (sec): 43.17 - samples/sec: 1422.01 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:17,829 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:14:17,829 EPOCH 8 done: loss 0.0235 - lr: 0.000007
+2024-09-03 22:14:19,380 DEV : loss 0.20718424022197723 - f1-score (micro avg) 0.7504
+2024-09-03 22:14:19,384 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:14:23,485 epoch 9 - iter 9/90 - loss 0.02255550 - time (sec): 4.10 - samples/sec: 1531.81 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:27,762 epoch 9 - iter 18/90 - loss 0.01867817 - time (sec): 8.38 - samples/sec: 1470.40 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:32,041 epoch 9 - iter 27/90 - loss 0.01821711 - time (sec): 12.66 - samples/sec: 1460.17 - lr: 0.000007 - momentum: 0.000000
+2024-09-03 22:14:36,587 epoch 9 - iter 36/90 - loss 0.01817603 - time (sec): 17.20 - samples/sec: 1439.66 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:14:40,717 epoch 9 - iter 45/90 - loss 0.01812448 - time (sec): 21.33 - samples/sec: 1437.53 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:14:44,676 epoch 9 - iter 54/90 - loss 0.01706299 - time (sec): 25.29 - samples/sec: 1452.34 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:14:49,248 epoch 9 - iter 63/90 - loss 0.01744359 - time (sec): 29.86 - samples/sec: 1440.48 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:14:53,395 epoch 9 - iter 72/90 - loss 0.01723729 - time (sec): 34.01 - samples/sec: 1439.52 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:14:58,196 epoch 9 - iter 81/90 - loss 0.01710673 - time (sec): 38.81 - samples/sec: 1422.73 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:02,430 epoch 9 - iter 90/90 - loss 0.01713626 - time (sec): 43.05 - samples/sec: 1426.26 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:02,431 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:15:02,431 EPOCH 9 done: loss 0.0171 - lr: 0.000006
+2024-09-03 22:15:03,983 DEV : loss 0.18645833432674408 - f1-score (micro avg) 0.7696
+2024-09-03 22:15:03,987 saving best model
+2024-09-03 22:15:05,710 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:15:10,114 epoch 10 - iter 9/90 - loss 0.01487332 - time (sec): 4.40 - samples/sec: 1371.66 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:14,752 epoch 10 - iter 18/90 - loss 0.01588230 - time (sec): 9.04 - samples/sec: 1351.00 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:18,960 epoch 10 - iter 27/90 - loss 0.01385364 - time (sec): 13.25 - samples/sec: 1368.66 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:23,340 epoch 10 - iter 36/90 - loss 0.01439257 - time (sec): 17.63 - samples/sec: 1377.16 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:28,062 epoch 10 - iter 45/90 - loss 0.01401395 - time (sec): 22.35 - samples/sec: 1361.45 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:32,318 epoch 10 - iter 54/90 - loss 0.01346382 - time (sec): 26.61 - samples/sec: 1377.73 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:36,703 epoch 10 - iter 63/90 - loss 0.01382917 - time (sec): 30.99 - samples/sec: 1382.62 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:41,034 epoch 10 - iter 72/90 - loss 0.01399135 - time (sec): 35.32 - samples/sec: 1397.26 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:45,489 epoch 10 - iter 81/90 - loss 0.01359549 - time (sec): 39.78 - samples/sec: 1400.47 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:49,317 epoch 10 - iter 90/90 - loss 0.01459325 - time (sec): 43.60 - samples/sec: 1407.96 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:15:49,317 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:15:49,317 EPOCH 10 done: loss 0.0146 - lr: 0.000006
+2024-09-03 22:15:50,868 DEV : loss 0.2126888781785965 - f1-score (micro avg) 0.7771
+2024-09-03 22:15:50,872 saving best model
+2024-09-03 22:15:52,597 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:15:56,519 epoch 11 - iter 9/90 - loss 0.01083532 - time (sec): 3.92 - samples/sec: 1537.40 - lr: 0.000006 - momentum: 0.000000
+2024-09-03 22:16:00,719 epoch 11 - iter 18/90 - loss 0.01369081 - time (sec): 8.12 - samples/sec: 1519.59 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:04,964 epoch 11 - iter 27/90 - loss 0.01182479 - time (sec): 12.37 - samples/sec: 1520.67 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:09,825 epoch 11 - iter 36/90 - loss 0.01111172 - time (sec): 17.23 - samples/sec: 1467.37 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:14,205 epoch 11 - iter 45/90 - loss 0.00999335 - time (sec): 21.61 - samples/sec: 1456.64 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:18,570 epoch 11 - iter 54/90 - loss 0.00959961 - time (sec): 25.97 - samples/sec: 1445.48 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:23,295 epoch 11 - iter 63/90 - loss 0.00971446 - time (sec): 30.70 - samples/sec: 1422.94 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:27,451 epoch 11 - iter 72/90 - loss 0.00968067 - time (sec): 34.85 - samples/sec: 1427.69 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:31,509 epoch 11 - iter 81/90 - loss 0.00976322 - time (sec): 38.91 - samples/sec: 1433.25 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:35,289 epoch 11 - iter 90/90 - loss 0.01012837 - time (sec): 42.69 - samples/sec: 1438.13 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:35,289 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:16:35,289 EPOCH 11 done: loss 0.0101 - lr: 0.000005
+2024-09-03 22:16:36,843 DEV : loss 0.23299568891525269 - f1-score (micro avg) 0.7622
+2024-09-03 22:16:36,848 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:16:41,085 epoch 12 - iter 9/90 - loss 0.00751245 - time (sec): 4.24 - samples/sec: 1389.69 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:45,480 epoch 12 - iter 18/90 - loss 0.01017883 - time (sec): 8.63 - samples/sec: 1372.66 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:49,650 epoch 12 - iter 27/90 - loss 0.01071932 - time (sec): 12.80 - samples/sec: 1398.85 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:53,924 epoch 12 - iter 36/90 - loss 0.01041932 - time (sec): 17.08 - samples/sec: 1426.95 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:16:58,901 epoch 12 - iter 45/90 - loss 0.01116460 - time (sec): 22.05 - samples/sec: 1390.56 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:17:03,101 epoch 12 - iter 54/90 - loss 0.01020009 - time (sec): 26.25 - samples/sec: 1409.48 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:17:07,242 epoch 12 - iter 63/90 - loss 0.01021957 - time (sec): 30.39 - samples/sec: 1421.30 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:17:11,517 epoch 12 - iter 72/90 - loss 0.01010492 - time (sec): 34.67 - samples/sec: 1430.57 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:17:16,084 epoch 12 - iter 81/90 - loss 0.00985138 - time (sec): 39.24 - samples/sec: 1415.36 - lr: 0.000005 - momentum: 0.000000
+2024-09-03 22:17:20,410 epoch 12 - iter 90/90 - loss 0.00982517 - time (sec): 43.56 - samples/sec: 1409.35 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:20,411 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:17:20,411 EPOCH 12 done: loss 0.0098 - lr: 0.000004
+2024-09-03 22:17:21,963 DEV : loss 0.24243620038032532 - f1-score (micro avg) 0.777
+2024-09-03 22:17:21,967 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:17:26,137 epoch 13 - iter 9/90 - loss 0.00363056 - time (sec): 4.17 - samples/sec: 1524.30 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:30,774 epoch 13 - iter 18/90 - loss 0.00653768 - time (sec): 8.81 - samples/sec: 1458.55 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:34,943 epoch 13 - iter 27/90 - loss 0.00636075 - time (sec): 12.98 - samples/sec: 1434.60 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:39,974 epoch 13 - iter 36/90 - loss 0.00665004 - time (sec): 18.01 - samples/sec: 1404.74 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:43,874 epoch 13 - iter 45/90 - loss 0.00650295 - time (sec): 21.91 - samples/sec: 1436.15 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:48,318 epoch 13 - iter 54/90 - loss 0.00639820 - time (sec): 26.35 - samples/sec: 1437.45 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:52,663 epoch 13 - iter 63/90 - loss 0.00598547 - time (sec): 30.70 - samples/sec: 1433.65 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:17:56,496 epoch 13 - iter 72/90 - loss 0.00643427 - time (sec): 34.53 - samples/sec: 1438.07 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:01,189 epoch 13 - iter 81/90 - loss 0.00685379 - time (sec): 39.22 - samples/sec: 1418.92 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:05,134 epoch 13 - iter 90/90 - loss 0.00733702 - time (sec): 43.17 - samples/sec: 1422.30 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:05,134 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:18:05,134 EPOCH 13 done: loss 0.0073 - lr: 0.000004
+2024-09-03 22:18:06,691 DEV : loss 0.2638837397098541 - f1-score (micro avg) 0.7644
+2024-09-03 22:18:06,695 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:18:11,175 epoch 14 - iter 9/90 - loss 0.00829397 - time (sec): 4.48 - samples/sec: 1385.83 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:15,704 epoch 14 - iter 18/90 - loss 0.00671472 - time (sec): 9.01 - samples/sec: 1377.56 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:20,260 epoch 14 - iter 27/90 - loss 0.00751253 - time (sec): 13.56 - samples/sec: 1386.83 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:24,624 epoch 14 - iter 36/90 - loss 0.00876323 - time (sec): 17.93 - samples/sec: 1412.73 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:28,316 epoch 14 - iter 45/90 - loss 0.00794374 - time (sec): 21.62 - samples/sec: 1441.03 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:32,587 epoch 14 - iter 54/90 - loss 0.00775956 - time (sec): 25.89 - samples/sec: 1439.29 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:37,191 epoch 14 - iter 63/90 - loss 0.00802070 - time (sec): 30.49 - samples/sec: 1430.57 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:41,731 epoch 14 - iter 72/90 - loss 0.00771433 - time (sec): 35.03 - samples/sec: 1419.19 - lr: 0.000004 - momentum: 0.000000
+2024-09-03 22:18:46,062 epoch 14 - iter 81/90 - loss 0.00721965 - time (sec): 39.37 - samples/sec: 1415.81 - lr: 0.000003 - momentum: 0.000000
+2024-09-03 22:18:50,041 epoch 14 - iter 90/90 - loss 0.00707296 - time (sec): 43.35 - samples/sec: 1416.38 - lr: 0.000003 - momentum: 0.000000
+2024-09-03 22:18:50,042 ----------------------------------------------------------------------------------------------------
+2024-09-03 22:18:50,042 EPOCH 14 done: loss 0.0071 - lr: 0.000003
+2024-09-03 22:18:51,597 DEV : loss 0.2589911222457886 - f1-score (micro avg) 0.7741
+2024-09-03 22:18:51,601 ----------------------------------------------------------------------------------------------------
280 |
+
2024-09-03 22:18:55,754 epoch 15 - iter 9/90 - loss 0.00633966 - time (sec): 4.15 - samples/sec: 1415.98 - lr: 0.000003 - momentum: 0.000000
|
281 |
+
2024-09-03 22:18:59,824 epoch 15 - iter 18/90 - loss 0.00596012 - time (sec): 8.22 - samples/sec: 1455.43 - lr: 0.000003 - momentum: 0.000000
|
282 |
+
2024-09-03 22:19:04,258 epoch 15 - iter 27/90 - loss 0.00623353 - time (sec): 12.66 - samples/sec: 1417.82 - lr: 0.000003 - momentum: 0.000000
|
283 |
+
2024-09-03 22:19:09,005 epoch 15 - iter 36/90 - loss 0.00527114 - time (sec): 17.40 - samples/sec: 1380.87 - lr: 0.000003 - momentum: 0.000000
|
284 |
+
2024-09-03 22:19:13,130 epoch 15 - iter 45/90 - loss 0.00521481 - time (sec): 21.53 - samples/sec: 1422.56 - lr: 0.000003 - momentum: 0.000000
|
285 |
+
2024-09-03 22:19:17,904 epoch 15 - iter 54/90 - loss 0.00494592 - time (sec): 26.30 - samples/sec: 1397.10 - lr: 0.000003 - momentum: 0.000000
|
286 |
+
2024-09-03 22:19:22,027 epoch 15 - iter 63/90 - loss 0.00472614 - time (sec): 30.43 - samples/sec: 1409.77 - lr: 0.000003 - momentum: 0.000000
|
287 |
+
2024-09-03 22:19:26,657 epoch 15 - iter 72/90 - loss 0.00487358 - time (sec): 35.05 - samples/sec: 1405.03 - lr: 0.000003 - momentum: 0.000000
|
288 |
+
2024-09-03 22:19:30,886 epoch 15 - iter 81/90 - loss 0.00551019 - time (sec): 39.28 - samples/sec: 1412.28 - lr: 0.000003 - momentum: 0.000000
|
289 |
+
2024-09-03 22:19:34,800 epoch 15 - iter 90/90 - loss 0.00570350 - time (sec): 43.20 - samples/sec: 1421.22 - lr: 0.000003 - momentum: 0.000000
|
290 |
+
2024-09-03 22:19:34,800 ----------------------------------------------------------------------------------------------------
|
291 |
+
2024-09-03 22:19:34,800 EPOCH 15 done: loss 0.0057 - lr: 0.000003
|
292 |
+
2024-09-03 22:19:36,357 DEV : loss 0.27159348130226135 - f1-score (micro avg) 0.7665
|
293 |
+
2024-09-03 22:19:36,361 ----------------------------------------------------------------------------------------------------
|
294 |
+
2024-09-03 22:19:40,741 epoch 16 - iter 9/90 - loss 0.00459489 - time (sec): 4.38 - samples/sec: 1442.57 - lr: 0.000003 - momentum: 0.000000
|
295 |
+
2024-09-03 22:19:44,607 epoch 16 - iter 18/90 - loss 0.00618470 - time (sec): 8.25 - samples/sec: 1493.38 - lr: 0.000003 - momentum: 0.000000
|
296 |
+
2024-09-03 22:19:49,028 epoch 16 - iter 27/90 - loss 0.00510669 - time (sec): 12.67 - samples/sec: 1444.92 - lr: 0.000003 - momentum: 0.000000
|
297 |
+
2024-09-03 22:19:53,348 epoch 16 - iter 36/90 - loss 0.00570184 - time (sec): 16.99 - samples/sec: 1447.83 - lr: 0.000003 - momentum: 0.000000
|
298 |
+
2024-09-03 22:19:57,316 epoch 16 - iter 45/90 - loss 0.00552000 - time (sec): 20.95 - samples/sec: 1469.67 - lr: 0.000003 - momentum: 0.000000
|
299 |
+
2024-09-03 22:20:01,570 epoch 16 - iter 54/90 - loss 0.00581402 - time (sec): 25.21 - samples/sec: 1468.82 - lr: 0.000003 - momentum: 0.000000
|
300 |
+
2024-09-03 22:20:06,247 epoch 16 - iter 63/90 - loss 0.00559269 - time (sec): 29.89 - samples/sec: 1446.70 - lr: 0.000002 - momentum: 0.000000
|
301 |
+
2024-09-03 22:20:10,585 epoch 16 - iter 72/90 - loss 0.00515513 - time (sec): 34.22 - samples/sec: 1442.34 - lr: 0.000002 - momentum: 0.000000
|
302 |
+
2024-09-03 22:20:15,449 epoch 16 - iter 81/90 - loss 0.00474433 - time (sec): 39.09 - samples/sec: 1420.96 - lr: 0.000002 - momentum: 0.000000
|
303 |
+
2024-09-03 22:20:19,518 epoch 16 - iter 90/90 - loss 0.00474793 - time (sec): 43.16 - samples/sec: 1422.58 - lr: 0.000002 - momentum: 0.000000
|
304 |
+
2024-09-03 22:20:19,519 ----------------------------------------------------------------------------------------------------
|
305 |
+
2024-09-03 22:20:19,519 EPOCH 16 done: loss 0.0047 - lr: 0.000002
|
306 |
+
2024-09-03 22:20:21,072 DEV : loss 0.2844613194465637 - f1-score (micro avg) 0.7658
|
307 |
+
2024-09-03 22:20:21,077 ----------------------------------------------------------------------------------------------------
|
308 |
+
2024-09-03 22:20:25,936 epoch 17 - iter 9/90 - loss 0.00543864 - time (sec): 4.86 - samples/sec: 1376.13 - lr: 0.000002 - momentum: 0.000000
|
309 |
+
2024-09-03 22:20:30,131 epoch 17 - iter 18/90 - loss 0.00405570 - time (sec): 9.05 - samples/sec: 1424.00 - lr: 0.000002 - momentum: 0.000000
|
310 |
+
2024-09-03 22:20:34,391 epoch 17 - iter 27/90 - loss 0.00364881 - time (sec): 13.31 - samples/sec: 1433.67 - lr: 0.000002 - momentum: 0.000000
|
311 |
+
2024-09-03 22:20:38,389 epoch 17 - iter 36/90 - loss 0.00321257 - time (sec): 17.31 - samples/sec: 1454.74 - lr: 0.000002 - momentum: 0.000000
|
312 |
+
2024-09-03 22:20:43,237 epoch 17 - iter 45/90 - loss 0.00351433 - time (sec): 22.16 - samples/sec: 1419.70 - lr: 0.000002 - momentum: 0.000000
|
313 |
+
2024-09-03 22:20:47,371 epoch 17 - iter 54/90 - loss 0.00378463 - time (sec): 26.29 - samples/sec: 1428.58 - lr: 0.000002 - momentum: 0.000000
|
314 |
+
2024-09-03 22:20:51,524 epoch 17 - iter 63/90 - loss 0.00363362 - time (sec): 30.45 - samples/sec: 1431.58 - lr: 0.000002 - momentum: 0.000000
|
315 |
+
2024-09-03 22:20:55,811 epoch 17 - iter 72/90 - loss 0.00368783 - time (sec): 34.73 - samples/sec: 1430.27 - lr: 0.000002 - momentum: 0.000000
|
316 |
+
2024-09-03 22:20:59,926 epoch 17 - iter 81/90 - loss 0.00365053 - time (sec): 38.85 - samples/sec: 1431.33 - lr: 0.000002 - momentum: 0.000000
|
317 |
+
2024-09-03 22:21:03,944 epoch 17 - iter 90/90 - loss 0.00348700 - time (sec): 42.87 - samples/sec: 1432.22 - lr: 0.000002 - momentum: 0.000000
|
318 |
+
2024-09-03 22:21:03,944 ----------------------------------------------------------------------------------------------------
|
319 |
+
2024-09-03 22:21:03,944 EPOCH 17 done: loss 0.0035 - lr: 0.000002
|
320 |
+
2024-09-03 22:21:05,496 DEV : loss 0.2972029745578766 - f1-score (micro avg) 0.773
|
321 |
+
2024-09-03 22:21:05,500 ----------------------------------------------------------------------------------------------------
|
322 |
+
2024-09-03 22:21:10,081 epoch 18 - iter 9/90 - loss 0.00473264 - time (sec): 4.58 - samples/sec: 1372.21 - lr: 0.000002 - momentum: 0.000000
|
323 |
+
2024-09-03 22:21:14,856 epoch 18 - iter 18/90 - loss 0.00333959 - time (sec): 9.35 - samples/sec: 1333.79 - lr: 0.000002 - momentum: 0.000000
|
324 |
+
2024-09-03 22:21:19,144 epoch 18 - iter 27/90 - loss 0.00386776 - time (sec): 13.64 - samples/sec: 1362.51 - lr: 0.000002 - momentum: 0.000000
|
325 |
+
2024-09-03 22:21:23,545 epoch 18 - iter 36/90 - loss 0.00317074 - time (sec): 18.04 - samples/sec: 1375.56 - lr: 0.000002 - momentum: 0.000000
|
326 |
+
2024-09-03 22:21:27,632 epoch 18 - iter 45/90 - loss 0.00345585 - time (sec): 22.13 - samples/sec: 1401.58 - lr: 0.000001 - momentum: 0.000000
|
327 |
+
2024-09-03 22:21:31,759 epoch 18 - iter 54/90 - loss 0.00323514 - time (sec): 26.26 - samples/sec: 1406.92 - lr: 0.000001 - momentum: 0.000000
|
328 |
+
2024-09-03 22:21:36,213 epoch 18 - iter 63/90 - loss 0.00309887 - time (sec): 30.71 - samples/sec: 1406.77 - lr: 0.000001 - momentum: 0.000000
|
329 |
+
2024-09-03 22:21:40,796 epoch 18 - iter 72/90 - loss 0.00289858 - time (sec): 35.29 - samples/sec: 1405.24 - lr: 0.000001 - momentum: 0.000000
|
330 |
+
2024-09-03 22:21:44,868 epoch 18 - iter 81/90 - loss 0.00310884 - time (sec): 39.37 - samples/sec: 1418.45 - lr: 0.000001 - momentum: 0.000000
|
331 |
+
2024-09-03 22:21:48,652 epoch 18 - iter 90/90 - loss 0.00345453 - time (sec): 43.15 - samples/sec: 1422.77 - lr: 0.000001 - momentum: 0.000000
|
332 |
+
2024-09-03 22:21:48,653 ----------------------------------------------------------------------------------------------------
|
333 |
+
2024-09-03 22:21:48,653 EPOCH 18 done: loss 0.0035 - lr: 0.000001
|
334 |
+
2024-09-03 22:21:50,219 DEV : loss 0.3015914261341095 - f1-score (micro avg) 0.7741
|
335 |
+
2024-09-03 22:21:50,224 ----------------------------------------------------------------------------------------------------
|
336 |
+
2024-09-03 22:21:54,090 epoch 19 - iter 9/90 - loss 0.00418814 - time (sec): 3.87 - samples/sec: 1480.24 - lr: 0.000001 - momentum: 0.000000
|
337 |
+
2024-09-03 22:21:58,821 epoch 19 - iter 18/90 - loss 0.00225645 - time (sec): 8.60 - samples/sec: 1405.29 - lr: 0.000001 - momentum: 0.000000
|
338 |
+
2024-09-03 22:22:03,440 epoch 19 - iter 27/90 - loss 0.00205094 - time (sec): 13.22 - samples/sec: 1403.71 - lr: 0.000001 - momentum: 0.000000
|
339 |
+
2024-09-03 22:22:07,670 epoch 19 - iter 36/90 - loss 0.00184944 - time (sec): 17.45 - samples/sec: 1422.03 - lr: 0.000001 - momentum: 0.000000
|
340 |
+
2024-09-03 22:22:12,181 epoch 19 - iter 45/90 - loss 0.00180956 - time (sec): 21.96 - samples/sec: 1410.69 - lr: 0.000001 - momentum: 0.000000
|
341 |
+
2024-09-03 22:22:16,249 epoch 19 - iter 54/90 - loss 0.00175773 - time (sec): 26.02 - samples/sec: 1414.19 - lr: 0.000001 - momentum: 0.000000
|
342 |
+
2024-09-03 22:22:20,450 epoch 19 - iter 63/90 - loss 0.00194382 - time (sec): 30.23 - samples/sec: 1418.41 - lr: 0.000001 - momentum: 0.000000
|
343 |
+
2024-09-03 22:22:24,865 epoch 19 - iter 72/90 - loss 0.00192499 - time (sec): 34.64 - samples/sec: 1422.31 - lr: 0.000001 - momentum: 0.000000
|
344 |
+
2024-09-03 22:22:29,214 epoch 19 - iter 81/90 - loss 0.00199577 - time (sec): 38.99 - samples/sec: 1419.45 - lr: 0.000001 - momentum: 0.000000
|
345 |
+
2024-09-03 22:22:33,476 epoch 19 - iter 90/90 - loss 0.00197310 - time (sec): 43.25 - samples/sec: 1419.48 - lr: 0.000001 - momentum: 0.000000
|
346 |
+
2024-09-03 22:22:33,476 ----------------------------------------------------------------------------------------------------
|
347 |
+
2024-09-03 22:22:33,476 EPOCH 19 done: loss 0.0020 - lr: 0.000001
|
348 |
+
2024-09-03 22:22:35,029 DEV : loss 0.3135406970977783 - f1-score (micro avg) 0.7702
|
349 |
+
2024-09-03 22:22:35,033 ----------------------------------------------------------------------------------------------------
|
350 |
+
2024-09-03 22:22:39,086 epoch 20 - iter 9/90 - loss 0.00362040 - time (sec): 4.05 - samples/sec: 1462.60 - lr: 0.000001 - momentum: 0.000000
|
351 |
+
2024-09-03 22:22:43,561 epoch 20 - iter 18/90 - loss 0.00299681 - time (sec): 8.53 - samples/sec: 1423.89 - lr: 0.000001 - momentum: 0.000000
|
352 |
+
2024-09-03 22:22:47,733 epoch 20 - iter 27/90 - loss 0.00273788 - time (sec): 12.70 - samples/sec: 1465.52 - lr: 0.000000 - momentum: 0.000000
|
353 |
+
2024-09-03 22:22:51,996 epoch 20 - iter 36/90 - loss 0.00258476 - time (sec): 16.96 - samples/sec: 1454.16 - lr: 0.000000 - momentum: 0.000000
|
354 |
+
2024-09-03 22:22:56,328 epoch 20 - iter 45/90 - loss 0.00252421 - time (sec): 21.29 - samples/sec: 1447.19 - lr: 0.000000 - momentum: 0.000000
|
355 |
+
2024-09-03 22:23:00,763 epoch 20 - iter 54/90 - loss 0.00236808 - time (sec): 25.73 - samples/sec: 1445.51 - lr: 0.000000 - momentum: 0.000000
|
356 |
+
2024-09-03 22:23:05,247 epoch 20 - iter 63/90 - loss 0.00220282 - time (sec): 30.21 - samples/sec: 1433.87 - lr: 0.000000 - momentum: 0.000000
|
357 |
+
2024-09-03 22:23:10,243 epoch 20 - iter 72/90 - loss 0.00214946 - time (sec): 35.21 - samples/sec: 1412.84 - lr: 0.000000 - momentum: 0.000000
|
358 |
+
2024-09-03 22:23:14,194 epoch 20 - iter 81/90 - loss 0.00246307 - time (sec): 39.16 - samples/sec: 1418.42 - lr: 0.000000 - momentum: 0.000000
|
359 |
+
2024-09-03 22:23:18,525 epoch 20 - iter 90/90 - loss 0.00240450 - time (sec): 43.49 - samples/sec: 1411.64 - lr: 0.000000 - momentum: 0.000000
|
360 |
+
2024-09-03 22:23:18,526 ----------------------------------------------------------------------------------------------------
|
361 |
+
2024-09-03 22:23:18,526 EPOCH 20 done: loss 0.0024 - lr: 0.000000
|
362 |
+
2024-09-03 22:23:20,078 DEV : loss 0.3147675693035126 - f1-score (micro avg) 0.7702
|
363 |
+
2024-09-03 22:23:21,239 ----------------------------------------------------------------------------------------------------
|
364 |
+
2024-09-03 22:23:21,240 Loading model from best epoch ...
|
365 |
+
2024-09-03 22:23:25,153 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-MISC, B-MISC, E-MISC, I-MISC, S-ORG, B-ORG, E-ORG, I-ORG
|
366 |
+
2024-09-03 22:23:26,698
|
367 |
+
Results:
|
368 |
+
- F-score (micro) 0.7297
|
369 |
+
- F-score (macro) 0.6833
|
370 |
+
- Accuracy 0.6119
|
371 |
+
|
372 |
+
By class:
              precision    recall  f1-score   support

         ORG     0.7266    0.7949    0.7592       117
         PER     0.8158    0.9538    0.8794        65
         LOC     0.7231    0.7581    0.7402        62
        MISC     0.5600    0.2593    0.3544        54

   micro avg     0.7347    0.7248    0.7297       298
   macro avg     0.7064    0.6915    0.6833       298
weighted avg     0.7151    0.7248    0.7081       298
|
383 |
+
|
384 |
+
2024-09-03 22:23:26,698 ----------------------------------------------------------------------------------------------------
|