Commit 8cb3d0b (1 parent: 09e9de5), committed by stefan-it

model: add fine-tuned model

Files changed (1)
  1. training.log +384 -0
training.log ADDED
@@ -0,0 +1,384 @@
+ 2024-09-03 22:08:14,383 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(31103, 1024, padding_idx=0)
+         (position_embeddings): Embedding(512, 1024)
+         (token_type_embeddings): Embedding(2, 1024)
+         (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-23): 24 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSdpaSelfAttention(
+                 (query): Linear(in_features=1024, out_features=1024, bias=True)
+                 (key): Linear(in_features=1024, out_features=1024, bias=True)
+                 (value): Linear(in_features=1024, out_features=1024, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=1024, out_features=1024, bias=True)
+                 (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=1024, out_features=4096, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=4096, out_features=1024, bias=True)
+               (LayerNorm): LayerNorm((1024,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=1024, out_features=1024, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=1024, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Corpus: 2869 train + 338 dev + 370 test sentences
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Train: 2869 sentences
+ 2024-09-03 22:08:14,384 (train_with_dev=False, train_with_test=False)
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Training Params:
+ 2024-09-03 22:08:14,384 - learning_rate: "1e-05"
+ 2024-09-03 22:08:14,384 - mini_batch_size: "32"
+ 2024-09-03 22:08:14,384 - max_epochs: "20"
+ 2024-09-03 22:08:14,384 - shuffle: "True"
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Plugins:
+ 2024-09-03 22:08:14,384 - TensorboardLogger
+ 2024-09-03 22:08:14,384 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
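Note on the logged learning rates: the `lr:` column in the epoch lines of this log is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch of that schedule, assuming 90 iterations per epoch (ceil(2869 / 32), matching `iter 90/90` in the log) and the usual warmup/decay formula. The function name `linear_warmup_lr` is hypothetical, and this is not taken from Flair's scheduler code.

```python
import math


def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer steps (assumed formula)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 over the remaining steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the "Corpus", "Training Params" and "Plugins" sections.
steps_per_epoch = math.ceil(2869 / 32)
total_steps = steps_per_epoch * 20

for epoch in (1, 2, 3, 10, 15):
    lr = linear_warmup_lr(epoch * steps_per_epoch, 1e-5, total_steps, 0.1)
    print(f"end of epoch {epoch:2d}: lr = {lr:.6f}")
```

Under these assumptions the peak of 1e-05 is reached at the end of epoch 2, and the printed values round to the `lr` figures logged at `iter 90/90` of the corresponding epochs.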
+ 2024-09-03 22:08:14,384 Final evaluation on model from best epoch (best-model.pt)
+ 2024-09-03 22:08:14,384 - metric: "('micro avg', 'f1-score')"
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Computation:
+ 2024-09-03 22:08:14,384 - compute on device: cuda:0
+ 2024-09-03 22:08:14,384 - embedding storage: none
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,384 Model training base path: "flair-barner-coarse-grained-gbert_large-bs32-e20-lr1e-05-2"
+ 2024-09-03 22:08:14,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,385 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:14,385 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2024-09-03 22:08:18,474 epoch 1 - iter 9/90 - loss 3.07625880 - time (sec): 4.09 - samples/sec: 1570.98 - lr: 0.000000 - momentum: 0.000000
+ 2024-09-03 22:08:22,848 epoch 1 - iter 18/90 - loss 3.03830863 - time (sec): 8.46 - samples/sec: 1486.84 - lr: 0.000001 - momentum: 0.000000
+ 2024-09-03 22:08:27,463 epoch 1 - iter 27/90 - loss 2.92345816 - time (sec): 13.08 - samples/sec: 1448.18 - lr: 0.000001 - momentum: 0.000000
+ 2024-09-03 22:08:31,615 epoch 1 - iter 36/90 - loss 2.76830934 - time (sec): 17.23 - samples/sec: 1445.16 - lr: 0.000002 - momentum: 0.000000
+ 2024-09-03 22:08:35,829 epoch 1 - iter 45/90 - loss 2.55828324 - time (sec): 21.44 - samples/sec: 1445.58 - lr: 0.000002 - momentum: 0.000000
+ 2024-09-03 22:08:40,281 epoch 1 - iter 54/90 - loss 2.27911363 - time (sec): 25.90 - samples/sec: 1443.03 - lr: 0.000003 - momentum: 0.000000
+ 2024-09-03 22:08:43,966 epoch 1 - iter 63/90 - loss 2.04649360 - time (sec): 29.58 - samples/sec: 1451.91 - lr: 0.000003 - momentum: 0.000000
+ 2024-09-03 22:08:48,754 epoch 1 - iter 72/90 - loss 1.85199452 - time (sec): 34.37 - samples/sec: 1430.44 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:08:52,698 epoch 1 - iter 81/90 - loss 1.70122297 - time (sec): 38.31 - samples/sec: 1439.88 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:08:56,719 epoch 1 - iter 90/90 - loss 1.57972824 - time (sec): 42.33 - samples/sec: 1450.22 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:08:56,720 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:08:56,720 EPOCH 1 done: loss 1.5797 - lr: 0.000005
+ 2024-09-03 22:08:58,183 DEV : loss 0.46851032972335815 - f1-score (micro avg) 0.0
+ 2024-09-03 22:08:58,187 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:09:02,729 epoch 2 - iter 9/90 - loss 0.39452797 - time (sec): 4.54 - samples/sec: 1372.01 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:09:07,298 epoch 2 - iter 18/90 - loss 0.34943089 - time (sec): 9.11 - samples/sec: 1363.54 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:09:11,191 epoch 2 - iter 27/90 - loss 0.33048569 - time (sec): 13.00 - samples/sec: 1440.78 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:09:14,919 epoch 2 - iter 36/90 - loss 0.33742609 - time (sec): 16.73 - samples/sec: 1473.34 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:09:19,119 epoch 2 - iter 45/90 - loss 0.32836169 - time (sec): 20.93 - samples/sec: 1473.38 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:09:23,366 epoch 2 - iter 54/90 - loss 0.32386979 - time (sec): 25.18 - samples/sec: 1468.88 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:09:27,760 epoch 2 - iter 63/90 - loss 0.31846694 - time (sec): 29.57 - samples/sec: 1451.24 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:09:32,410 epoch 2 - iter 72/90 - loss 0.30906101 - time (sec): 34.22 - samples/sec: 1428.99 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:09:37,250 epoch 2 - iter 81/90 - loss 0.30259718 - time (sec): 39.06 - samples/sec: 1417.97 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:09:41,763 epoch 2 - iter 90/90 - loss 0.29697888 - time (sec): 43.57 - samples/sec: 1408.93 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:09:41,763 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:09:41,763 EPOCH 2 done: loss 0.2970 - lr: 0.000010
+ 2024-09-03 22:09:43,313 DEV : loss 0.32844364643096924 - f1-score (micro avg) 0.4444
+ 2024-09-03 22:09:43,318 saving best model
+ 2024-09-03 22:09:44,704 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:09:48,810 epoch 3 - iter 9/90 - loss 0.18021120 - time (sec): 4.10 - samples/sec: 1451.01 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:09:52,605 epoch 3 - iter 18/90 - loss 0.18710233 - time (sec): 7.90 - samples/sec: 1551.86 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:09:57,165 epoch 3 - iter 27/90 - loss 0.19642900 - time (sec): 12.46 - samples/sec: 1444.72 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:01,325 epoch 3 - iter 36/90 - loss 0.19534522 - time (sec): 16.62 - samples/sec: 1462.08 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:05,793 epoch 3 - iter 45/90 - loss 0.19526541 - time (sec): 21.09 - samples/sec: 1449.57 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:10,319 epoch 3 - iter 54/90 - loss 0.18898441 - time (sec): 25.61 - samples/sec: 1435.75 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:15,005 epoch 3 - iter 63/90 - loss 0.18734274 - time (sec): 30.30 - samples/sec: 1425.75 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:19,484 epoch 3 - iter 72/90 - loss 0.19035033 - time (sec): 34.78 - samples/sec: 1425.46 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:23,866 epoch 3 - iter 81/90 - loss 0.18770810 - time (sec): 39.16 - samples/sec: 1422.17 - lr: 0.000010 - momentum: 0.000000
+ 2024-09-03 22:10:27,920 epoch 3 - iter 90/90 - loss 0.18703645 - time (sec): 43.22 - samples/sec: 1420.65 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:10:27,920 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:10:27,921 EPOCH 3 done: loss 0.1870 - lr: 0.000009
+ 2024-09-03 22:10:29,473 DEV : loss 0.23561017215251923 - f1-score (micro avg) 0.6408
+ 2024-09-03 22:10:29,478 saving best model
+ 2024-09-03 22:10:31,205 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:10:35,117 epoch 4 - iter 9/90 - loss 0.17246494 - time (sec): 3.91 - samples/sec: 1531.46 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:10:39,356 epoch 4 - iter 18/90 - loss 0.15134387 - time (sec): 8.15 - samples/sec: 1475.32 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:10:43,735 epoch 4 - iter 27/90 - loss 0.13746185 - time (sec): 12.53 - samples/sec: 1472.62 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:10:47,797 epoch 4 - iter 36/90 - loss 0.13417880 - time (sec): 16.59 - samples/sec: 1476.25 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:10:52,926 epoch 4 - iter 45/90 - loss 0.12898354 - time (sec): 21.72 - samples/sec: 1426.82 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:10:57,315 epoch 4 - iter 54/90 - loss 0.12655651 - time (sec): 26.11 - samples/sec: 1426.80 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:01,590 epoch 4 - iter 63/90 - loss 0.12590330 - time (sec): 30.38 - samples/sec: 1426.07 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:06,149 epoch 4 - iter 72/90 - loss 0.12273481 - time (sec): 34.94 - samples/sec: 1423.24 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:10,823 epoch 4 - iter 81/90 - loss 0.12123292 - time (sec): 39.62 - samples/sec: 1408.31 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:14,547 epoch 4 - iter 90/90 - loss 0.11969701 - time (sec): 43.34 - samples/sec: 1416.55 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:14,547 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:11:14,547 EPOCH 4 done: loss 0.1197 - lr: 0.000009
+ 2024-09-03 22:11:16,092 DEV : loss 0.19088450074195862 - f1-score (micro avg) 0.7163
+ 2024-09-03 22:11:16,096 saving best model
+ 2024-09-03 22:11:17,838 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:11:22,005 epoch 5 - iter 9/90 - loss 0.07796958 - time (sec): 4.17 - samples/sec: 1427.37 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:26,793 epoch 5 - iter 18/90 - loss 0.09505982 - time (sec): 8.95 - samples/sec: 1347.83 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:30,524 epoch 5 - iter 27/90 - loss 0.09038601 - time (sec): 12.69 - samples/sec: 1419.11 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:34,896 epoch 5 - iter 36/90 - loss 0.08880525 - time (sec): 17.06 - samples/sec: 1408.49 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:39,545 epoch 5 - iter 45/90 - loss 0.08665835 - time (sec): 21.71 - samples/sec: 1396.96 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:43,572 epoch 5 - iter 54/90 - loss 0.08590365 - time (sec): 25.73 - samples/sec: 1416.95 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:47,888 epoch 5 - iter 63/90 - loss 0.08303076 - time (sec): 30.05 - samples/sec: 1426.28 - lr: 0.000009 - momentum: 0.000000
+ 2024-09-03 22:11:52,321 epoch 5 - iter 72/90 - loss 0.08046962 - time (sec): 34.48 - samples/sec: 1433.50 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:11:56,489 epoch 5 - iter 81/90 - loss 0.07746895 - time (sec): 38.65 - samples/sec: 1432.76 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:00,972 epoch 5 - iter 90/90 - loss 0.07524486 - time (sec): 43.13 - samples/sec: 1423.36 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:00,972 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:12:00,972 EPOCH 5 done: loss 0.0752 - lr: 0.000008
+ 2024-09-03 22:12:02,521 DEV : loss 0.1980796456336975 - f1-score (micro avg) 0.7235
+ 2024-09-03 22:12:02,525 saving best model
+ 2024-09-03 22:12:04,280 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:12:09,200 epoch 6 - iter 9/90 - loss 0.05086435 - time (sec): 4.92 - samples/sec: 1307.95 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:13,315 epoch 6 - iter 18/90 - loss 0.05627511 - time (sec): 9.03 - samples/sec: 1388.10 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:17,684 epoch 6 - iter 27/90 - loss 0.05226291 - time (sec): 13.40 - samples/sec: 1401.62 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:21,743 epoch 6 - iter 36/90 - loss 0.04931367 - time (sec): 17.46 - samples/sec: 1410.58 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:26,185 epoch 6 - iter 45/90 - loss 0.04709793 - time (sec): 21.90 - samples/sec: 1403.82 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:30,399 epoch 6 - iter 54/90 - loss 0.05018191 - time (sec): 26.12 - samples/sec: 1416.40 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:34,633 epoch 6 - iter 63/90 - loss 0.04949260 - time (sec): 30.35 - samples/sec: 1410.00 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:39,250 epoch 6 - iter 72/90 - loss 0.05163362 - time (sec): 34.97 - samples/sec: 1394.74 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:43,749 epoch 6 - iter 81/90 - loss 0.04998041 - time (sec): 39.47 - samples/sec: 1401.93 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:47,550 epoch 6 - iter 90/90 - loss 0.04991602 - time (sec): 43.27 - samples/sec: 1418.90 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:47,550 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:12:47,550 EPOCH 6 done: loss 0.0499 - lr: 0.000008
+ 2024-09-03 22:12:49,108 DEV : loss 0.17487134039402008 - f1-score (micro avg) 0.7658
+ 2024-09-03 22:12:49,113 saving best model
+ 2024-09-03 22:12:50,858 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:12:54,868 epoch 7 - iter 9/90 - loss 0.02538037 - time (sec): 4.01 - samples/sec: 1477.46 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:12:59,226 epoch 7 - iter 18/90 - loss 0.03372476 - time (sec): 8.37 - samples/sec: 1442.49 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:13:03,761 epoch 7 - iter 27/90 - loss 0.03282378 - time (sec): 12.90 - samples/sec: 1432.08 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:13:07,554 epoch 7 - iter 36/90 - loss 0.03431105 - time (sec): 16.69 - samples/sec: 1446.04 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:13:12,006 epoch 7 - iter 45/90 - loss 0.03225077 - time (sec): 21.15 - samples/sec: 1452.28 - lr: 0.000008 - momentum: 0.000000
+ 2024-09-03 22:13:16,824 epoch 7 - iter 54/90 - loss 0.03304733 - time (sec): 25.96 - samples/sec: 1419.88 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:21,125 epoch 7 - iter 63/90 - loss 0.03408886 - time (sec): 30.27 - samples/sec: 1428.43 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:25,202 epoch 7 - iter 72/90 - loss 0.03776201 - time (sec): 34.34 - samples/sec: 1434.95 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:29,135 epoch 7 - iter 81/90 - loss 0.03710028 - time (sec): 38.28 - samples/sec: 1448.37 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:33,099 epoch 7 - iter 90/90 - loss 0.03726568 - time (sec): 42.24 - samples/sec: 1453.46 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:33,099 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:13:33,099 EPOCH 7 done: loss 0.0373 - lr: 0.000007
+ 2024-09-03 22:13:34,650 DEV : loss 0.18173354864120483 - f1-score (micro avg) 0.7627
+ 2024-09-03 22:13:34,654 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:13:38,876 epoch 8 - iter 9/90 - loss 0.02092667 - time (sec): 4.22 - samples/sec: 1404.86 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:42,977 epoch 8 - iter 18/90 - loss 0.02438277 - time (sec): 8.32 - samples/sec: 1452.84 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:47,147 epoch 8 - iter 27/90 - loss 0.02262059 - time (sec): 12.49 - samples/sec: 1462.61 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:51,474 epoch 8 - iter 36/90 - loss 0.02522541 - time (sec): 16.82 - samples/sec: 1484.42 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:13:55,997 epoch 8 - iter 45/90 - loss 0.02481957 - time (sec): 21.34 - samples/sec: 1457.11 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:00,761 epoch 8 - iter 54/90 - loss 0.02359039 - time (sec): 26.11 - samples/sec: 1438.57 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:05,146 epoch 8 - iter 63/90 - loss 0.02355433 - time (sec): 30.49 - samples/sec: 1435.74 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:09,274 epoch 8 - iter 72/90 - loss 0.02419780 - time (sec): 34.62 - samples/sec: 1421.47 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:13,543 epoch 8 - iter 81/90 - loss 0.02375625 - time (sec): 38.89 - samples/sec: 1422.79 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:17,829 epoch 8 - iter 90/90 - loss 0.02349011 - time (sec): 43.17 - samples/sec: 1422.01 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:17,829 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:14:17,829 EPOCH 8 done: loss 0.0235 - lr: 0.000007
+ 2024-09-03 22:14:19,380 DEV : loss 0.20718424022197723 - f1-score (micro avg) 0.7504
+ 2024-09-03 22:14:19,384 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:14:23,485 epoch 9 - iter 9/90 - loss 0.02255550 - time (sec): 4.10 - samples/sec: 1531.81 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:27,762 epoch 9 - iter 18/90 - loss 0.01867817 - time (sec): 8.38 - samples/sec: 1470.40 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:32,041 epoch 9 - iter 27/90 - loss 0.01821711 - time (sec): 12.66 - samples/sec: 1460.17 - lr: 0.000007 - momentum: 0.000000
+ 2024-09-03 22:14:36,587 epoch 9 - iter 36/90 - loss 0.01817603 - time (sec): 17.20 - samples/sec: 1439.66 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:14:40,717 epoch 9 - iter 45/90 - loss 0.01812448 - time (sec): 21.33 - samples/sec: 1437.53 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:14:44,676 epoch 9 - iter 54/90 - loss 0.01706299 - time (sec): 25.29 - samples/sec: 1452.34 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:14:49,248 epoch 9 - iter 63/90 - loss 0.01744359 - time (sec): 29.86 - samples/sec: 1440.48 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:14:53,395 epoch 9 - iter 72/90 - loss 0.01723729 - time (sec): 34.01 - samples/sec: 1439.52 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:14:58,196 epoch 9 - iter 81/90 - loss 0.01710673 - time (sec): 38.81 - samples/sec: 1422.73 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:02,430 epoch 9 - iter 90/90 - loss 0.01713626 - time (sec): 43.05 - samples/sec: 1426.26 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:02,431 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:15:02,431 EPOCH 9 done: loss 0.0171 - lr: 0.000006
+ 2024-09-03 22:15:03,983 DEV : loss 0.18645833432674408 - f1-score (micro avg) 0.7696
+ 2024-09-03 22:15:03,987 saving best model
+ 2024-09-03 22:15:05,710 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:15:10,114 epoch 10 - iter 9/90 - loss 0.01487332 - time (sec): 4.40 - samples/sec: 1371.66 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:14,752 epoch 10 - iter 18/90 - loss 0.01588230 - time (sec): 9.04 - samples/sec: 1351.00 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:18,960 epoch 10 - iter 27/90 - loss 0.01385364 - time (sec): 13.25 - samples/sec: 1368.66 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:23,340 epoch 10 - iter 36/90 - loss 0.01439257 - time (sec): 17.63 - samples/sec: 1377.16 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:28,062 epoch 10 - iter 45/90 - loss 0.01401395 - time (sec): 22.35 - samples/sec: 1361.45 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:32,318 epoch 10 - iter 54/90 - loss 0.01346382 - time (sec): 26.61 - samples/sec: 1377.73 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:36,703 epoch 10 - iter 63/90 - loss 0.01382917 - time (sec): 30.99 - samples/sec: 1382.62 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:41,034 epoch 10 - iter 72/90 - loss 0.01399135 - time (sec): 35.32 - samples/sec: 1397.26 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:45,489 epoch 10 - iter 81/90 - loss 0.01359549 - time (sec): 39.78 - samples/sec: 1400.47 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:49,317 epoch 10 - iter 90/90 - loss 0.01459325 - time (sec): 43.60 - samples/sec: 1407.96 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:15:49,317 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:15:49,317 EPOCH 10 done: loss 0.0146 - lr: 0.000006
+ 2024-09-03 22:15:50,868 DEV : loss 0.2126888781785965 - f1-score (micro avg) 0.7771
+ 2024-09-03 22:15:50,872 saving best model
+ 2024-09-03 22:15:52,597 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:15:56,519 epoch 11 - iter 9/90 - loss 0.01083532 - time (sec): 3.92 - samples/sec: 1537.40 - lr: 0.000006 - momentum: 0.000000
+ 2024-09-03 22:16:00,719 epoch 11 - iter 18/90 - loss 0.01369081 - time (sec): 8.12 - samples/sec: 1519.59 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:04,964 epoch 11 - iter 27/90 - loss 0.01182479 - time (sec): 12.37 - samples/sec: 1520.67 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:09,825 epoch 11 - iter 36/90 - loss 0.01111172 - time (sec): 17.23 - samples/sec: 1467.37 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:14,205 epoch 11 - iter 45/90 - loss 0.00999335 - time (sec): 21.61 - samples/sec: 1456.64 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:18,570 epoch 11 - iter 54/90 - loss 0.00959961 - time (sec): 25.97 - samples/sec: 1445.48 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:23,295 epoch 11 - iter 63/90 - loss 0.00971446 - time (sec): 30.70 - samples/sec: 1422.94 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:27,451 epoch 11 - iter 72/90 - loss 0.00968067 - time (sec): 34.85 - samples/sec: 1427.69 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:31,509 epoch 11 - iter 81/90 - loss 0.00976322 - time (sec): 38.91 - samples/sec: 1433.25 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:35,289 epoch 11 - iter 90/90 - loss 0.01012837 - time (sec): 42.69 - samples/sec: 1438.13 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:35,289 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:16:35,289 EPOCH 11 done: loss 0.0101 - lr: 0.000005
+ 2024-09-03 22:16:36,843 DEV : loss 0.23299568891525269 - f1-score (micro avg) 0.7622
+ 2024-09-03 22:16:36,848 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:16:41,085 epoch 12 - iter 9/90 - loss 0.00751245 - time (sec): 4.24 - samples/sec: 1389.69 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:45,480 epoch 12 - iter 18/90 - loss 0.01017883 - time (sec): 8.63 - samples/sec: 1372.66 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:49,650 epoch 12 - iter 27/90 - loss 0.01071932 - time (sec): 12.80 - samples/sec: 1398.85 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:53,924 epoch 12 - iter 36/90 - loss 0.01041932 - time (sec): 17.08 - samples/sec: 1426.95 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:16:58,901 epoch 12 - iter 45/90 - loss 0.01116460 - time (sec): 22.05 - samples/sec: 1390.56 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:17:03,101 epoch 12 - iter 54/90 - loss 0.01020009 - time (sec): 26.25 - samples/sec: 1409.48 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:17:07,242 epoch 12 - iter 63/90 - loss 0.01021957 - time (sec): 30.39 - samples/sec: 1421.30 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:17:11,517 epoch 12 - iter 72/90 - loss 0.01010492 - time (sec): 34.67 - samples/sec: 1430.57 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:17:16,084 epoch 12 - iter 81/90 - loss 0.00985138 - time (sec): 39.24 - samples/sec: 1415.36 - lr: 0.000005 - momentum: 0.000000
+ 2024-09-03 22:17:20,410 epoch 12 - iter 90/90 - loss 0.00982517 - time (sec): 43.56 - samples/sec: 1409.35 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:20,411 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:17:20,411 EPOCH 12 done: loss 0.0098 - lr: 0.000004
+ 2024-09-03 22:17:21,963 DEV : loss 0.24243620038032532 - f1-score (micro avg) 0.777
+ 2024-09-03 22:17:21,967 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:17:26,137 epoch 13 - iter 9/90 - loss 0.00363056 - time (sec): 4.17 - samples/sec: 1524.30 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:30,774 epoch 13 - iter 18/90 - loss 0.00653768 - time (sec): 8.81 - samples/sec: 1458.55 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:34,943 epoch 13 - iter 27/90 - loss 0.00636075 - time (sec): 12.98 - samples/sec: 1434.60 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:39,974 epoch 13 - iter 36/90 - loss 0.00665004 - time (sec): 18.01 - samples/sec: 1404.74 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:43,874 epoch 13 - iter 45/90 - loss 0.00650295 - time (sec): 21.91 - samples/sec: 1436.15 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:48,318 epoch 13 - iter 54/90 - loss 0.00639820 - time (sec): 26.35 - samples/sec: 1437.45 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:52,663 epoch 13 - iter 63/90 - loss 0.00598547 - time (sec): 30.70 - samples/sec: 1433.65 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:17:56,496 epoch 13 - iter 72/90 - loss 0.00643427 - time (sec): 34.53 - samples/sec: 1438.07 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:01,189 epoch 13 - iter 81/90 - loss 0.00685379 - time (sec): 39.22 - samples/sec: 1418.92 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:05,134 epoch 13 - iter 90/90 - loss 0.00733702 - time (sec): 43.17 - samples/sec: 1422.30 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:05,134 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:18:05,134 EPOCH 13 done: loss 0.0073 - lr: 0.000004
+ 2024-09-03 22:18:06,691 DEV : loss 0.2638837397098541 - f1-score (micro avg) 0.7644
+ 2024-09-03 22:18:06,695 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:18:11,175 epoch 14 - iter 9/90 - loss 0.00829397 - time (sec): 4.48 - samples/sec: 1385.83 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:15,704 epoch 14 - iter 18/90 - loss 0.00671472 - time (sec): 9.01 - samples/sec: 1377.56 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:20,260 epoch 14 - iter 27/90 - loss 0.00751253 - time (sec): 13.56 - samples/sec: 1386.83 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:24,624 epoch 14 - iter 36/90 - loss 0.00876323 - time (sec): 17.93 - samples/sec: 1412.73 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:28,316 epoch 14 - iter 45/90 - loss 0.00794374 - time (sec): 21.62 - samples/sec: 1441.03 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:32,587 epoch 14 - iter 54/90 - loss 0.00775956 - time (sec): 25.89 - samples/sec: 1439.29 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:37,191 epoch 14 - iter 63/90 - loss 0.00802070 - time (sec): 30.49 - samples/sec: 1430.57 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:41,731 epoch 14 - iter 72/90 - loss 0.00771433 - time (sec): 35.03 - samples/sec: 1419.19 - lr: 0.000004 - momentum: 0.000000
+ 2024-09-03 22:18:46,062 epoch 14 - iter 81/90 - loss 0.00721965 - time (sec): 39.37 - samples/sec: 1415.81 - lr: 0.000003 - momentum: 0.000000
+ 2024-09-03 22:18:50,041 epoch 14 - iter 90/90 - loss 0.00707296 - time (sec): 43.35 - samples/sec: 1416.38 - lr: 0.000003 - momentum: 0.000000
+ 2024-09-03 22:18:50,042 ----------------------------------------------------------------------------------------------------
+ 2024-09-03 22:18:50,042 EPOCH 14 done: loss 0.0071 - lr: 0.000003
+ 2024-09-03 22:18:51,597 DEV : loss 0.2589911222457886 - f1-score (micro avg) 0.7741
279
+ 2024-09-03 22:18:51,601 ----------------------------------------------------------------------------------------------------
280
+ 2024-09-03 22:18:55,754 epoch 15 - iter 9/90 - loss 0.00633966 - time (sec): 4.15 - samples/sec: 1415.98 - lr: 0.000003 - momentum: 0.000000
281
+ 2024-09-03 22:18:59,824 epoch 15 - iter 18/90 - loss 0.00596012 - time (sec): 8.22 - samples/sec: 1455.43 - lr: 0.000003 - momentum: 0.000000
282
+ 2024-09-03 22:19:04,258 epoch 15 - iter 27/90 - loss 0.00623353 - time (sec): 12.66 - samples/sec: 1417.82 - lr: 0.000003 - momentum: 0.000000
283
+ 2024-09-03 22:19:09,005 epoch 15 - iter 36/90 - loss 0.00527114 - time (sec): 17.40 - samples/sec: 1380.87 - lr: 0.000003 - momentum: 0.000000
284
+ 2024-09-03 22:19:13,130 epoch 15 - iter 45/90 - loss 0.00521481 - time (sec): 21.53 - samples/sec: 1422.56 - lr: 0.000003 - momentum: 0.000000
285
+ 2024-09-03 22:19:17,904 epoch 15 - iter 54/90 - loss 0.00494592 - time (sec): 26.30 - samples/sec: 1397.10 - lr: 0.000003 - momentum: 0.000000
286
+ 2024-09-03 22:19:22,027 epoch 15 - iter 63/90 - loss 0.00472614 - time (sec): 30.43 - samples/sec: 1409.77 - lr: 0.000003 - momentum: 0.000000
287
+ 2024-09-03 22:19:26,657 epoch 15 - iter 72/90 - loss 0.00487358 - time (sec): 35.05 - samples/sec: 1405.03 - lr: 0.000003 - momentum: 0.000000
288
+ 2024-09-03 22:19:30,886 epoch 15 - iter 81/90 - loss 0.00551019 - time (sec): 39.28 - samples/sec: 1412.28 - lr: 0.000003 - momentum: 0.000000
289
+ 2024-09-03 22:19:34,800 epoch 15 - iter 90/90 - loss 0.00570350 - time (sec): 43.20 - samples/sec: 1421.22 - lr: 0.000003 - momentum: 0.000000
290
+ 2024-09-03 22:19:34,800 ----------------------------------------------------------------------------------------------------
291
+ 2024-09-03 22:19:34,800 EPOCH 15 done: loss 0.0057 - lr: 0.000003
292
+ 2024-09-03 22:19:36,357 DEV : loss 0.27159348130226135 - f1-score (micro avg) 0.7665
293
+ 2024-09-03 22:19:36,361 ----------------------------------------------------------------------------------------------------
294
+ 2024-09-03 22:19:40,741 epoch 16 - iter 9/90 - loss 0.00459489 - time (sec): 4.38 - samples/sec: 1442.57 - lr: 0.000003 - momentum: 0.000000
295
+ 2024-09-03 22:19:44,607 epoch 16 - iter 18/90 - loss 0.00618470 - time (sec): 8.25 - samples/sec: 1493.38 - lr: 0.000003 - momentum: 0.000000
296
+ 2024-09-03 22:19:49,028 epoch 16 - iter 27/90 - loss 0.00510669 - time (sec): 12.67 - samples/sec: 1444.92 - lr: 0.000003 - momentum: 0.000000
297
+ 2024-09-03 22:19:53,348 epoch 16 - iter 36/90 - loss 0.00570184 - time (sec): 16.99 - samples/sec: 1447.83 - lr: 0.000003 - momentum: 0.000000
298
+ 2024-09-03 22:19:57,316 epoch 16 - iter 45/90 - loss 0.00552000 - time (sec): 20.95 - samples/sec: 1469.67 - lr: 0.000003 - momentum: 0.000000
299
+ 2024-09-03 22:20:01,570 epoch 16 - iter 54/90 - loss 0.00581402 - time (sec): 25.21 - samples/sec: 1468.82 - lr: 0.000003 - momentum: 0.000000
300
+ 2024-09-03 22:20:06,247 epoch 16 - iter 63/90 - loss 0.00559269 - time (sec): 29.89 - samples/sec: 1446.70 - lr: 0.000002 - momentum: 0.000000
301
+ 2024-09-03 22:20:10,585 epoch 16 - iter 72/90 - loss 0.00515513 - time (sec): 34.22 - samples/sec: 1442.34 - lr: 0.000002 - momentum: 0.000000
302
+ 2024-09-03 22:20:15,449 epoch 16 - iter 81/90 - loss 0.00474433 - time (sec): 39.09 - samples/sec: 1420.96 - lr: 0.000002 - momentum: 0.000000
303
+ 2024-09-03 22:20:19,518 epoch 16 - iter 90/90 - loss 0.00474793 - time (sec): 43.16 - samples/sec: 1422.58 - lr: 0.000002 - momentum: 0.000000
304
+ 2024-09-03 22:20:19,519 ----------------------------------------------------------------------------------------------------
305
+ 2024-09-03 22:20:19,519 EPOCH 16 done: loss 0.0047 - lr: 0.000002
306
+ 2024-09-03 22:20:21,072 DEV : loss 0.2844613194465637 - f1-score (micro avg) 0.7658
307
+ 2024-09-03 22:20:21,077 ----------------------------------------------------------------------------------------------------
308
+ 2024-09-03 22:20:25,936 epoch 17 - iter 9/90 - loss 0.00543864 - time (sec): 4.86 - samples/sec: 1376.13 - lr: 0.000002 - momentum: 0.000000
309
+ 2024-09-03 22:20:30,131 epoch 17 - iter 18/90 - loss 0.00405570 - time (sec): 9.05 - samples/sec: 1424.00 - lr: 0.000002 - momentum: 0.000000
310
+ 2024-09-03 22:20:34,391 epoch 17 - iter 27/90 - loss 0.00364881 - time (sec): 13.31 - samples/sec: 1433.67 - lr: 0.000002 - momentum: 0.000000
311
+ 2024-09-03 22:20:38,389 epoch 17 - iter 36/90 - loss 0.00321257 - time (sec): 17.31 - samples/sec: 1454.74 - lr: 0.000002 - momentum: 0.000000
312
+ 2024-09-03 22:20:43,237 epoch 17 - iter 45/90 - loss 0.00351433 - time (sec): 22.16 - samples/sec: 1419.70 - lr: 0.000002 - momentum: 0.000000
313
+ 2024-09-03 22:20:47,371 epoch 17 - iter 54/90 - loss 0.00378463 - time (sec): 26.29 - samples/sec: 1428.58 - lr: 0.000002 - momentum: 0.000000
314
+ 2024-09-03 22:20:51,524 epoch 17 - iter 63/90 - loss 0.00363362 - time (sec): 30.45 - samples/sec: 1431.58 - lr: 0.000002 - momentum: 0.000000
315
+ 2024-09-03 22:20:55,811 epoch 17 - iter 72/90 - loss 0.00368783 - time (sec): 34.73 - samples/sec: 1430.27 - lr: 0.000002 - momentum: 0.000000
316
+ 2024-09-03 22:20:59,926 epoch 17 - iter 81/90 - loss 0.00365053 - time (sec): 38.85 - samples/sec: 1431.33 - lr: 0.000002 - momentum: 0.000000
317
+ 2024-09-03 22:21:03,944 epoch 17 - iter 90/90 - loss 0.00348700 - time (sec): 42.87 - samples/sec: 1432.22 - lr: 0.000002 - momentum: 0.000000
318
+ 2024-09-03 22:21:03,944 ----------------------------------------------------------------------------------------------------
319
+ 2024-09-03 22:21:03,944 EPOCH 17 done: loss 0.0035 - lr: 0.000002
320
+ 2024-09-03 22:21:05,496 DEV : loss 0.2972029745578766 - f1-score (micro avg) 0.773
321
+ 2024-09-03 22:21:05,500 ----------------------------------------------------------------------------------------------------
322
+ 2024-09-03 22:21:10,081 epoch 18 - iter 9/90 - loss 0.00473264 - time (sec): 4.58 - samples/sec: 1372.21 - lr: 0.000002 - momentum: 0.000000
323
+ 2024-09-03 22:21:14,856 epoch 18 - iter 18/90 - loss 0.00333959 - time (sec): 9.35 - samples/sec: 1333.79 - lr: 0.000002 - momentum: 0.000000
324
+ 2024-09-03 22:21:19,144 epoch 18 - iter 27/90 - loss 0.00386776 - time (sec): 13.64 - samples/sec: 1362.51 - lr: 0.000002 - momentum: 0.000000
325
+ 2024-09-03 22:21:23,545 epoch 18 - iter 36/90 - loss 0.00317074 - time (sec): 18.04 - samples/sec: 1375.56 - lr: 0.000002 - momentum: 0.000000
326
+ 2024-09-03 22:21:27,632 epoch 18 - iter 45/90 - loss 0.00345585 - time (sec): 22.13 - samples/sec: 1401.58 - lr: 0.000001 - momentum: 0.000000
327
+ 2024-09-03 22:21:31,759 epoch 18 - iter 54/90 - loss 0.00323514 - time (sec): 26.26 - samples/sec: 1406.92 - lr: 0.000001 - momentum: 0.000000
328
+ 2024-09-03 22:21:36,213 epoch 18 - iter 63/90 - loss 0.00309887 - time (sec): 30.71 - samples/sec: 1406.77 - lr: 0.000001 - momentum: 0.000000
329
+ 2024-09-03 22:21:40,796 epoch 18 - iter 72/90 - loss 0.00289858 - time (sec): 35.29 - samples/sec: 1405.24 - lr: 0.000001 - momentum: 0.000000
330
+ 2024-09-03 22:21:44,868 epoch 18 - iter 81/90 - loss 0.00310884 - time (sec): 39.37 - samples/sec: 1418.45 - lr: 0.000001 - momentum: 0.000000
331
+ 2024-09-03 22:21:48,652 epoch 18 - iter 90/90 - loss 0.00345453 - time (sec): 43.15 - samples/sec: 1422.77 - lr: 0.000001 - momentum: 0.000000
332
+ 2024-09-03 22:21:48,653 ----------------------------------------------------------------------------------------------------
333
+ 2024-09-03 22:21:48,653 EPOCH 18 done: loss 0.0035 - lr: 0.000001
334
+ 2024-09-03 22:21:50,219 DEV : loss 0.3015914261341095 - f1-score (micro avg) 0.7741
335
+ 2024-09-03 22:21:50,224 ----------------------------------------------------------------------------------------------------
336
+ 2024-09-03 22:21:54,090 epoch 19 - iter 9/90 - loss 0.00418814 - time (sec): 3.87 - samples/sec: 1480.24 - lr: 0.000001 - momentum: 0.000000
337
+ 2024-09-03 22:21:58,821 epoch 19 - iter 18/90 - loss 0.00225645 - time (sec): 8.60 - samples/sec: 1405.29 - lr: 0.000001 - momentum: 0.000000
338
+ 2024-09-03 22:22:03,440 epoch 19 - iter 27/90 - loss 0.00205094 - time (sec): 13.22 - samples/sec: 1403.71 - lr: 0.000001 - momentum: 0.000000
339
+ 2024-09-03 22:22:07,670 epoch 19 - iter 36/90 - loss 0.00184944 - time (sec): 17.45 - samples/sec: 1422.03 - lr: 0.000001 - momentum: 0.000000
340
+ 2024-09-03 22:22:12,181 epoch 19 - iter 45/90 - loss 0.00180956 - time (sec): 21.96 - samples/sec: 1410.69 - lr: 0.000001 - momentum: 0.000000
341
+ 2024-09-03 22:22:16,249 epoch 19 - iter 54/90 - loss 0.00175773 - time (sec): 26.02 - samples/sec: 1414.19 - lr: 0.000001 - momentum: 0.000000
342
+ 2024-09-03 22:22:20,450 epoch 19 - iter 63/90 - loss 0.00194382 - time (sec): 30.23 - samples/sec: 1418.41 - lr: 0.000001 - momentum: 0.000000
343
+ 2024-09-03 22:22:24,865 epoch 19 - iter 72/90 - loss 0.00192499 - time (sec): 34.64 - samples/sec: 1422.31 - lr: 0.000001 - momentum: 0.000000
344
+ 2024-09-03 22:22:29,214 epoch 19 - iter 81/90 - loss 0.00199577 - time (sec): 38.99 - samples/sec: 1419.45 - lr: 0.000001 - momentum: 0.000000
345
+ 2024-09-03 22:22:33,476 epoch 19 - iter 90/90 - loss 0.00197310 - time (sec): 43.25 - samples/sec: 1419.48 - lr: 0.000001 - momentum: 0.000000
346
+ 2024-09-03 22:22:33,476 ----------------------------------------------------------------------------------------------------
347
+ 2024-09-03 22:22:33,476 EPOCH 19 done: loss 0.0020 - lr: 0.000001
348
+ 2024-09-03 22:22:35,029 DEV : loss 0.3135406970977783 - f1-score (micro avg) 0.7702
349
+ 2024-09-03 22:22:35,033 ----------------------------------------------------------------------------------------------------
350
+ 2024-09-03 22:22:39,086 epoch 20 - iter 9/90 - loss 0.00362040 - time (sec): 4.05 - samples/sec: 1462.60 - lr: 0.000001 - momentum: 0.000000
351
+ 2024-09-03 22:22:43,561 epoch 20 - iter 18/90 - loss 0.00299681 - time (sec): 8.53 - samples/sec: 1423.89 - lr: 0.000001 - momentum: 0.000000
352
+ 2024-09-03 22:22:47,733 epoch 20 - iter 27/90 - loss 0.00273788 - time (sec): 12.70 - samples/sec: 1465.52 - lr: 0.000000 - momentum: 0.000000
353
+ 2024-09-03 22:22:51,996 epoch 20 - iter 36/90 - loss 0.00258476 - time (sec): 16.96 - samples/sec: 1454.16 - lr: 0.000000 - momentum: 0.000000
354
+ 2024-09-03 22:22:56,328 epoch 20 - iter 45/90 - loss 0.00252421 - time (sec): 21.29 - samples/sec: 1447.19 - lr: 0.000000 - momentum: 0.000000
355
+ 2024-09-03 22:23:00,763 epoch 20 - iter 54/90 - loss 0.00236808 - time (sec): 25.73 - samples/sec: 1445.51 - lr: 0.000000 - momentum: 0.000000
356
+ 2024-09-03 22:23:05,247 epoch 20 - iter 63/90 - loss 0.00220282 - time (sec): 30.21 - samples/sec: 1433.87 - lr: 0.000000 - momentum: 0.000000
357
+ 2024-09-03 22:23:10,243 epoch 20 - iter 72/90 - loss 0.00214946 - time (sec): 35.21 - samples/sec: 1412.84 - lr: 0.000000 - momentum: 0.000000
358
+ 2024-09-03 22:23:14,194 epoch 20 - iter 81/90 - loss 0.00246307 - time (sec): 39.16 - samples/sec: 1418.42 - lr: 0.000000 - momentum: 0.000000
359
+ 2024-09-03 22:23:18,525 epoch 20 - iter 90/90 - loss 0.00240450 - time (sec): 43.49 - samples/sec: 1411.64 - lr: 0.000000 - momentum: 0.000000
360
+ 2024-09-03 22:23:18,526 ----------------------------------------------------------------------------------------------------
361
+ 2024-09-03 22:23:18,526 EPOCH 20 done: loss 0.0024 - lr: 0.000000
362
+ 2024-09-03 22:23:20,078 DEV : loss 0.3147675693035126 - f1-score (micro avg) 0.7702
363
+ 2024-09-03 22:23:21,239 ----------------------------------------------------------------------------------------------------
364
+ 2024-09-03 22:23:21,240 Loading model from best epoch ...
365
+ 2024-09-03 22:23:25,153 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-MISC, B-MISC, E-MISC, I-MISC, S-ORG, B-ORG, E-ORG, I-ORG
366
+ 2024-09-03 22:23:26,698
367
+ Results:
368
+ - F-score (micro) 0.7297
369
+ - F-score (macro) 0.6833
370
+ - Accuracy 0.6119
371
+
372
+ By class:
373
+               precision    recall  f1-score   support
374
+
375
+          ORG    0.7266    0.7949    0.7592       117
376
+          PER    0.8158    0.9538    0.8794        65
377
+          LOC    0.7231    0.7581    0.7402        62
378
+         MISC    0.5600    0.2593    0.3544        54
379
+
380
+    micro avg    0.7347    0.7248    0.7297       298
381
+    macro avg    0.7064    0.6915    0.6833       298
382
+ weighted avg    0.7151    0.7248    0.7081       298
383
+
384
+ 2024-09-03 22:23:26,698 ----------------------------------------------------------------------------------------------------