Lazy-Val committed on
Commit
a800d1a
·
verified ·
1 Parent(s): df48126

Update spaCy pipeline

.gitattributes CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ base_transformer/model filter=lfs diff=lfs merge=lfs -text
+ fr_trf_nrp-any-py3-none-any.whl filter=lfs diff=lfs merge=lfs -text
+ ner_transformer/model filter=lfs diff=lfs merge=lfs -text
+ parser/model filter=lfs diff=lfs merge=lfs -text
+ trainable_lemmatizer/model filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ tags:
+ - spacy
+ - token-classification
+ language:
+ - fr
+ model-index:
+ - name: fr_trf_nrp
+   results:
+   - task:
+       name: NER
+       type: token-classification
+     metrics:
+     - name: NER Precision
+       type: precision
+       value: 0.9688149688
+     - name: NER Recall
+       type: recall
+       value: 0.9789915966
+     - name: NER F Score
+       type: f_score
+       value: 0.973876698
+ ---
+ | Feature | Description |
+ | --- | --- |
+ | **Name** | `fr_trf_nrp` |
+ | **Version** | `0.0.0` |
+ | **spaCy** | `>=3.8.3,<3.9.0` |
+ | **Default Pipeline** | `ner_transformer`, `ner`, `merge_entities`, `base_transformer`, `morphologizer`, `tagger`, `parser`, `trainable_lemmatizer` |
+ | **Components** | `ner_transformer`, `ner`, `merge_entities`, `base_transformer`, `morphologizer`, `tagger`, `parser`, `trainable_lemmatizer` |
+ | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
+ | **Sources** | n/a |
+ | **License** | n/a |
+ | **Author** | [n/a]() |
+
+ ### Label Scheme
+
+ <details>
+
+ <summary>View label scheme (533 labels for 4 components)</summary>
+
+ | Component | Labels |
+ | --- | --- |
+ | **`ner`** | `LOC`, `ORG`, `PER` |
+ | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `ExtPos=CCONJ\|POS=CCONJ`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `ExtPos=ADV\|POS=ADV`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=VERB\|VerbForm=Inf`, 
`Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Plur\|POS=DET`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `ExtPos=SCONJ\|POS=SCONJ`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Prs`, `Number=Plur\|POS=NOUN`, `ExtPos=ADV\|POS=ADP`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `ExtPos=ADP\|POS=ADV\|Polarity=Neg`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Dem`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Ind`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, 
`ExtPos=ADP\|Gender=Fem\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `ExtPos=PRON\|POS=ADV`, `Number=Plur\|POS=PRON\|Person=3\|PronType=Ind`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, `ExtPos=ADP\|Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON\|PronType=Neg`, `Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=3\|PronType=Ind`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Ind`, `ExtPos=ADP\|POS=ADP`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `ExtPos=ADJ\|POS=CCONJ`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|ExtPos=ADV\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=DET\|Polarity=Neg`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Ind`, `Definite=Def\|ExtPos=ADV\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `ExtPos=SCONJ\|POS=ADV`, `ExtPos=DET\|POS=ADP`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Fem\|Number=Sing\|POS=DET`, `Gender=Masc\|Number=Sing\|POS=NOUN\|Typo=Yes`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `ExtPos=CCONJ\|POS=ADV`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Gender=Fem\|Number=Sing\|POS=NOUN\|Typo=Yes`, `ExtPos=ADP\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|Polarity=Neg`, `ExtPos=CCONJ\|POS=ADP`, `Definite=Def\|ExtPos=ADV\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `Foreign=Yes\|POS=X`, `POS=SYM`, 
`Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `Definite=Ind\|ExtPos=ADV\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `ExtPos=DET\|POS=ADV\|Polarity=Neg`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `POS=ADJ\|Typo=Yes`, `POS=X`, `ExtPos=SCONJ\|POS=ADP`, `ExtPos=ADJ\|POS=X`, `ExtPos=ADJ\|POS=ADP`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `ExtPos=CCONJ\|POS=SYM`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `ExtPos=ADV\|Gender=Masc\|Number=Sing\|POS=NOUN`, `ExtPos=ADP\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|Typo=Yes\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Ind`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs\|Reflex=Yes`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `ExtPos=PRON\|POS=ADP`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, 
`Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `ExtPos=ADV\|Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|Typo=Yes\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `ExtPos=ADV\|POS=ADV\|Polarity=Neg`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Neg`, `ExtPos=ADV\|Gender=Masc\|Number=Sing\|POS=ADJ`, `ExtPos=ADV\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `ExtPos=PRON\|POS=PRON\|PronType=Rel`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs\|Typo=Yes`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|Typo=Yes\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `ExtPos=SCONJ\|Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `ExtPos=ADV\|POS=X`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `ExtPos=CCONJ\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|Typo=Yes\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `ExtPos=INTJ\|POS=INTJ`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|PronType=Rel`, 
`Gender=Fem\|Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=3\|Person[psor]=1\|Poss=Yes\|PronType=Prs`, `ExtPos=PROPN\|Gender=Masc\|Number=Sing\|POS=NOUN`, `ExtPos=PROPN\|Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art\|Typo=Yes`, `ExtPos=PROPN\|Gender=Fem\|Number=Sing\|POS=NOUN`, `ExtPos=ADV\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `ExtPos=ADJ\|Gender=Masc\|NumType=Card\|POS=NUM` |
+ | **`tagger`** | `ADJ`, `ADJ__ExtPos=ADV\|Gender=Masc\|Number=Sing`, `ADJ__Gender=Fem\|Number=Plur`, `ADJ__Gender=Fem\|Number=Plur\|NumType=Ord`, `ADJ__Gender=Fem\|Number=Sing`, `ADJ__Gender=Fem\|Number=Sing\|NumType=Ord`, `ADJ__Gender=Masc`, `ADJ__Gender=Masc\|Number=Plur`, `ADJ__Gender=Masc\|Number=Plur\|NumType=Ord`, `ADJ__Gender=Masc\|Number=Sing`, `ADJ__Gender=Masc\|Number=Sing\|NumType=Ord`, `ADJ__NumType=Ord`, `ADJ__Number=Plur`, `ADJ__Number=Sing`, `ADJ__Number=Sing\|NumType=Ord`, `ADJ__Typo=Yes`, `ADP`, `ADP_DET__Definite=Def\|ExtPos=ADV\|Gender=Masc\|Number=Sing\|PronType=Art`, `ADP_DET__Definite=Def\|Gender=Masc\|Number=Sing\|PronType=Art`, `ADP_DET__Definite=Def\|Gender=Masc\|Number=Sing\|PronType=Art\|Typo=Yes`, `ADP_DET__Definite=Def\|Number=Plur\|PronType=Art`, `ADP_PRON__Gender=Fem\|Number=Plur\|PronType=Rel`, `ADP_PRON__Gender=Masc\|Number=Plur\|PronType=Rel`, `ADP_PRON__Gender=Masc\|Number=Sing\|PronType=Rel`, `ADP__ExtPos=ADJ`, `ADP__ExtPos=ADP`, `ADP__ExtPos=ADV`, `ADP__ExtPos=CCONJ`, `ADP__ExtPos=DET`, `ADP__ExtPos=PRON`, `ADP__ExtPos=SCONJ`, `ADV`, `ADV__ExtPos=ADP\|Polarity=Neg`, `ADV__ExtPos=ADV`, `ADV__ExtPos=ADV\|Polarity=Neg`, `ADV__ExtPos=CCONJ`, `ADV__ExtPos=DET\|Polarity=Neg`, `ADV__ExtPos=PRON`, `ADV__ExtPos=SCONJ`, `ADV__Polarity=Neg`, `ADV__PronType=Int`, `AUX__Gender=Masc\|Number=Sing\|Tense=Past\|VerbForm=Part`, `AUX__Mood=Cnd\|Number=Plur\|Person=3\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Cnd\|Number=Sing\|Person=1\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Cnd\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=1\|Tense=Fut\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=1\|Tense=Imp\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=1\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=2\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=3\|Tense=Fut\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=3\|Tense=Imp\|VerbForm=Fin`, 
`AUX__Mood=Ind\|Number=Plur\|Person=3\|Tense=Past\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=3\|Tense=Pres\|Typo=Yes\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Plur\|Person=3\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=1\|Tense=Imp\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=1\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=3\|Tense=Fut\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=3\|Tense=Imp\|Typo=Yes\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=3\|Tense=Imp\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=3\|Tense=Past\|VerbForm=Fin`, `AUX__Mood=Ind\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Sub\|Number=Plur\|Person=1\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Sub\|Number=Plur\|Person=2\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Sub\|Number=Plur\|Person=3\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Sub\|Number=Sing\|Person=1\|Tense=Pres\|VerbForm=Fin`, `AUX__Mood=Sub\|Number=Sing\|Person=3\|Tense=Pres\|Typo=Yes\|VerbForm=Fin`, `AUX__Mood=Sub\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin`, `AUX__Tense=Past\|VerbForm=Part`, `AUX__Tense=Pres\|VerbForm=Part`, `AUX__VerbForm=Inf`, `CCONJ`, `CCONJ__ExtPos=ADJ`, `CCONJ__ExtPos=CCONJ`, `DET`, `DET__Definite=Def\|ExtPos=ADV\|Gender=Masc\|Number=Sing\|PronType=Art`, `DET__Definite=Def\|Gender=Fem\|Number=Sing\|PronType=Art`, `DET__Definite=Def\|Gender=Masc\|Number=Sing\|PronType=Art`, `DET__Definite=Def\|Number=Plur\|PronType=Art`, `DET__Definite=Def\|Number=Sing\|PronType=Art`, `DET__Definite=Ind\|ExtPos=ADV\|Gender=Fem\|Number=Sing\|PronType=Art`, `DET__Definite=Ind\|ExtPos=ADV\|Gender=Masc\|Number=Sing\|PronType=Art`, `DET__Definite=Ind\|Gender=Fem\|Number=Plur\|PronType=Art`, `DET__Definite=Ind\|Gender=Fem\|Number=Sing\|PronType=Art`, `DET__Definite=Ind\|Gender=Masc\|Number=Plur\|PronType=Art`, `DET__Definite=Ind\|Gender=Masc\|Number=Sing\|PronType=Art`, `DET__Definite=Ind\|Number=Plur\|PronType=Art`, 
`DET__Definite=Ind\|Number=Sing\|PronType=Art`, `DET__Gender=Fem\|Number=Plur`, `DET__Gender=Fem\|Number=Plur\|PronType=Int`, `DET__Gender=Fem\|Number=Sing`, `DET__Gender=Fem\|Number=Sing\|Polarity=Neg`, `DET__Gender=Fem\|Number=Sing\|Poss=Yes`, `DET__Gender=Fem\|Number=Sing\|PronType=Dem`, `DET__Gender=Fem\|Number=Sing\|PronType=Int`, `DET__Gender=Masc\|Number=Plur`, `DET__Gender=Masc\|Number=Sing`, `DET__Gender=Masc\|Number=Sing\|Polarity=Neg`, `DET__Gender=Masc\|Number=Sing\|PronType=Dem`, `DET__Gender=Masc\|Number=Sing\|PronType=Int`, `DET__Number=Plur`, `DET__Number=Plur\|Poss=Yes`, `DET__Number=Plur\|PronType=Dem`, `DET__Number=Sing`, `DET__Number=Sing\|Poss=Yes`, `INTJ`, `INTJ__ExtPos=INTJ`, `NOUN`, `NOUN__ExtPos=ADP\|Gender=Fem\|Number=Sing`, `NOUN__ExtPos=ADV\|Gender=Masc\|Number=Sing`, `NOUN__ExtPos=PROPN\|Gender=Fem\|Number=Sing`, `NOUN__ExtPos=PROPN\|Gender=Masc\|Number=Plur`, `NOUN__ExtPos=PROPN\|Gender=Masc\|Number=Sing`, `NOUN__Gender=Fem`, `NOUN__Gender=Fem\|Number=Plur`, `NOUN__Gender=Fem\|Number=Sing`, `NOUN__Gender=Fem\|Number=Sing\|Typo=Yes`, `NOUN__Gender=Masc`, `NOUN__Gender=Masc\|Number=Plur`, `NOUN__Gender=Masc\|Number=Plur\|NumType=Card`, `NOUN__Gender=Masc\|Number=Sing`, `NOUN__Gender=Masc\|Number=Sing\|NumType=Card`, `NOUN__Gender=Masc\|Number=Sing\|Typo=Yes`, `NOUN__NumType=Card`, `NOUN__Number=Plur`, `NOUN__Number=Sing`, `NUM`, `NUM__ExtPos=ADJ\|Gender=Masc\|NumType=Card`, `NUM__NumType=Card`, `PRON__ExtPos=ADP\|Gender=Masc\|Number=Sing\|Person=3\|PronType=Prs`, `PRON__ExtPos=ADP\|Number=Sing\|Person=3\|PronType=Prs`, `PRON__ExtPos=ADP\|Person=3\|PronType=Prs`, `PRON__ExtPos=ADV\|Gender=Masc\|Number=Sing\|Person=3\|PronType=Prs`, `PRON__ExtPos=ADV\|Number=Sing\|Person=3\|PronType=Dem`, `PRON__ExtPos=ADV\|Person=3\|PronType=Prs`, `PRON__ExtPos=CCONJ\|Person=3\|PronType=Prs`, `PRON__ExtPos=PRON\|PronType=Rel`, `PRON__Gender=Fem\|Number=Plur\|Number[psor]=Plur\|Person=3\|Person[psor]=1\|Poss=Yes\|PronType=Prs`, 
`PRON__Gender=Fem\|Number=Plur\|Person=3\|PronType=Dem`, `PRON__Gender=Fem\|Number=Plur\|Person=3\|PronType=Ind`, `PRON__Gender=Fem\|Number=Plur\|Person=3\|PronType=Prs`, `PRON__Gender=Fem\|Number=Plur\|Person=3\|PronType=Prs\|Typo=Yes`, `PRON__Gender=Fem\|Number=Plur\|PronType=Rel`, `PRON__Gender=Fem\|Number=Sing\|Person=3\|PronType=Dem`, `PRON__Gender=Fem\|Number=Sing\|Person=3\|PronType=Ind`, `PRON__Gender=Fem\|Number=Sing\|Person=3\|PronType=Prs`, `PRON__Gender=Fem\|Number=Sing\|Person=3\|PronType=Prs\|Reflex=Yes`, `PRON__Gender=Fem\|Number=Sing\|PronType=Rel`, `PRON__Gender=Masc\|Number=Plur\|Person=3\|PronType=Dem`, `PRON__Gender=Masc\|Number=Plur\|Person=3\|PronType=Ind`, `PRON__Gender=Masc\|Number=Plur\|Person=3\|PronType=Prs`, `PRON__Gender=Masc\|Number=Plur\|Person=3\|PronType=Prs\|Reflex=Yes`, `PRON__Gender=Masc\|Number=Plur\|PronType=Rel`, `PRON__Gender=Masc\|Number=Sing\|Person=3\|PronType=Dem`, `PRON__Gender=Masc\|Number=Sing\|Person=3\|PronType=Ind`, `PRON__Gender=Masc\|Number=Sing\|Person=3\|PronType=Prs`, `PRON__Gender=Masc\|Number=Sing\|PronType=Neg`, `PRON__Gender=Masc\|Number=Sing\|PronType=Rel`, `PRON__Number=Plur\|Person=1\|PronType=Prs`, `PRON__Number=Plur\|Person=1\|PronType=Prs\|Reflex=Yes`, `PRON__Number=Plur\|Person=2\|PronType=Prs`, `PRON__Number=Plur\|Person=2\|PronType=Prs\|Reflex=Yes`, `PRON__Number=Plur\|Person=3\|PronType=Ind`, `PRON__Number=Plur\|Person=3\|PronType=Prs`, `PRON__Number=Sing\|Person=1\|PronType=Prs`, `PRON__Number=Sing\|Person=1\|PronType=Prs\|Reflex=Yes`, `PRON__Number=Sing\|Person=2\|PronType=Prs`, `PRON__Number=Sing\|Person=3\|PronType=Dem`, `PRON__Number=Sing\|Person=3\|PronType=Ind`, `PRON__Number=Sing\|Person=3\|PronType=Prs`, `PRON__Number=Sing\|PronType=Neg`, `PRON__Number=Sing\|PronType=Rel`, `PRON__Person=3\|PronType=Prs`, `PRON__Person=3\|PronType=Prs\|Reflex=Yes`, `PRON__PronType=Int`, `PRON__PronType=Rel`, `PROPN`, `PROPN__Gender=Fem\|Number=Plur`, `PROPN__Gender=Fem\|Number=Sing`, `PROPN__Gender=Masc`, 
`PROPN__Gender=Masc\|Number=Plur`, `PROPN__Gender=Masc\|Number=Sing`, `PROPN__Number=Plur`, `PROPN__Number=Sing`, `PUNCT`, `SCONJ`, `SCONJ__ExtPos=SCONJ`, `SYM`, `SYM__ExtPos=CCONJ`, `VERB__ExtPos=SCONJ\|Gender=Masc\|Number=Sing\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Fem\|Number=Plur\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Fem\|Number=Plur\|Tense=Past\|VerbForm=Part`, `VERB__Gender=Fem\|Number=Plur\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Fem\|Number=Sing\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Fem\|Number=Sing\|Tense=Past\|VerbForm=Part`, `VERB__Gender=Fem\|Number=Sing\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Masc\|Number=Plur\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Masc\|Number=Plur\|Tense=Past\|VerbForm=Part`, `VERB__Gender=Masc\|Number=Plur\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Masc\|Number=Sing\|Tense=Past\|Typo=Yes\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Masc\|Number=Sing\|Tense=Past\|VerbForm=Part`, `VERB__Gender=Masc\|Number=Sing\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Gender=Masc\|Tense=Past\|VerbForm=Part`, `VERB__Gender=Masc\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Mood=Cnd\|Number=Plur\|Person=1\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Cnd\|Number=Plur\|Person=2\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Cnd\|Number=Plur\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Cnd\|Number=Sing\|Person=1\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Cnd\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Imp\|Number=Plur\|Person=1\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Imp\|Number=Plur\|Person=2\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Imp\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=1\|Tense=Fut\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=1\|Tense=Imp\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=1\|Tense=Pres\|VerbForm=Fin`, 
`VERB__Mood=Ind\|Number=Plur\|Person=2\|Tense=Fut\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=2\|Tense=Imp\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=2\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=3\|Tense=Fut\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=3\|Tense=Imp\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=3\|Tense=Past\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=3\|Tense=Pres\|Typo=Yes\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Plur\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=1\|Tense=Fut\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=1\|Tense=Imp\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=1\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=3\|Tense=Fut\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=3\|Tense=Imp\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=3\|Tense=Past\|VerbForm=Fin`, `VERB__Mood=Ind\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Ind\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Ind\|Person=3\|VerbForm=Fin`, `VERB__Mood=Ind\|VerbForm=Fin`, `VERB__Mood=Sub\|Number=Plur\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Sub\|Number=Sing\|Person=1\|Tense=Pres\|VerbForm=Fin`, `VERB__Mood=Sub\|Number=Sing\|Person=3\|Tense=Past\|VerbForm=Fin`, `VERB__Mood=Sub\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin`, `VERB__Number=Plur\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Number=Sing\|Tense=Past\|VerbForm=Part`, `VERB__Number=Sing\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Tense=Past\|VerbForm=Part`, `VERB__Tense=Past\|VerbForm=Part\|Voice=Pass`, `VERB__Tense=Pres\|VerbForm=Part`, `VERB__VerbForm=Inf`, `X`, `X__ExtPos=ADJ`, `X__ExtPos=ADV`, `X__Foreign=Yes` |
+ | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:pv`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `parataxis:insert`, `punct`, `vocative`, `xcomp` |
+
+ </details>
+
+ ### Accuracy
+
+ | Type | Score |
+ | --- | --- |
+ | `ENTS_F` | 97.39 |
+ | `ENTS_P` | 96.88 |
+ | `ENTS_R` | 97.90 |
+ | `NER_TRANSFORMER_LOSS` | 2373.89 |
+ | `NER_LOSS` | 9945.98 |
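
The three NER values in the README front matter are mutually consistent: the F score is the harmonic mean of precision and recall. A minimal Python sketch of that check, using only the numbers reported in the model card (the helper name `f_score` is illustrative and not part of spaCy):

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the model card's front matter.
ner_precision = 0.9688149688
ner_recall = 0.9789915966

ner_f = f_score(ner_precision, ner_recall)

# Expressed as a percentage with two decimals, as in the Accuracy table.
print(f"{ner_f * 100:.2f}")
```

The result agrees with the `NER F Score` value in the front matter and, after rounding, with the `ENTS_F` row of the accuracy table.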
base_transformer/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "max_batch_items":4096
+ }
base_transformer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ae82900483b15cbd017ba63a5a7e8833ffa9d295fe764b168d33281a2bcd746c
+ size 443537828
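
The three lines added for `base_transformer/model` are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, a SHA-256 object ID, and the size in bytes. A short Python sketch of reading such a pointer, assuming the simple `key value` layout shown in the diff (the function name `parse_lfs_pointer` is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ae82900483b15cbd017ba63a5a7e8833ffa9d295fe764b168d33281a2bcd746c
size 443537828
"""

pointer = parse_lfs_pointer(pointer_text)
size_mib = pointer["size_bytes"] / 2**20
print(pointer["hash_algorithm"], f"{size_mib:.1f} MiB")
```

The reported size suggests the transformer weights are a few hundred MiB, which is why the `.gitattributes` change routes the `*/model` files through LFS.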
config.cfg ADDED
@@ -0,0 +1,269 @@
1
+ [paths]
2
+ train = "./data/training/fr/reviewed_annotations-11-2024/train.spacy"
3
+ dev = "./data/training/fr/reviewed_annotations-11-2024/dev.spacy"
4
+ vectors = null
5
+ init_tok2vec = null
6
+
7
+ [system]
8
+ gpu_allocator = "pytorch"
9
+ seed = 17
10
+
11
+ [nlp]
12
+ lang = "fr"
13
+ pipeline = ["ner_transformer","ner","merge_entities","base_transformer","morphologizer","tagger","parser","trainable_lemmatizer"]
14
+ batch_size = 512
15
+ disabled = []
16
+ before_creation = null
17
+ after_creation = null
18
+ after_pipeline_creation = null
19
+ tokenizer = {"@tokenizers":"customize_tokenizer"}
20
+ vectors = {"@vectors":"spacy.Vectors.v1"}
21
+
22
+ [components]
23
+
24
+ [components.base_transformer]
25
+ factory = "transformer"
26
+ max_batch_items = 4096
27
+ set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
28
+
29
+ [components.base_transformer.model]
30
+ @architectures = "spacy-transformers.TransformerModel.v3"
31
+ name = "almanach/camembertav2-base"
32
+ mixed_precision = false
33
+
34
+ [components.base_transformer.model.get_spans]
35
+ @span_getters = "spacy-transformers.strided_spans.v1"
36
+ window = 128
37
+ stride = 96
38
+
39
+ [components.base_transformer.model.grad_scaler_config]
40
+
41
+ [components.base_transformer.model.tokenizer_config]
42
+ use_fast = true
43
+
44
+ [components.base_transformer.model.transformer_config]
45
+
46
+ [components.merge_entities]
47
+ factory = "merge_entities"
48
+
49
+ [components.morphologizer]
50
+ factory = "morphologizer"
51
+ extend = false
52
+ label_smoothing = 0.0
53
+ overwrite = true
54
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
55
+
56
+ [components.morphologizer.model]
57
+ @architectures = "spacy.Tagger.v2"
58
+ nO = null
59
+ normalize = false
60
+
61
+ [components.morphologizer.model.tok2vec]
62
+ @architectures = "spacy-transformers.TransformerListener.v1"
63
+ grad_factor = 1.0
64
+ pooling = {"@layers":"reduce_mean.v1"}
65
+ upstream = "*"
66
+
67
+ [components.ner]
68
+ factory = "ner"
69
+ incorrect_spans_key = null
70
+ moves = null
71
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
72
+ update_with_oracle_cut_size = 100
73
+
+ [components.ner.model]
+ @architectures = "spacy.TransitionBasedParser.v2"
+ state_type = "ner"
+ extra_state_tokens = false
+ hidden_width = 64
+ maxout_pieces = 2
+ use_upper = false
+ nO = null
+
+ [components.ner.model.tok2vec]
+ @architectures = "spacy-transformers.TransformerListener.v1"
+ grad_factor = 1.0
+ pooling = {"@layers":"reduce_mean.v1"}
+ upstream = "ner_transformer"
+
+ [components.ner_transformer]
+ factory = "transformer"
+ max_batch_items = 4096
+ set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
+
+ [components.ner_transformer.model]
+ @architectures = "spacy-transformers.TransformerModel.v3"
+ name = "dbmdz/bert-base-german-cased"
+ mixed_precision = false
+
+ [components.ner_transformer.model.get_spans]
+ @span_getters = "spacy-transformers.strided_spans.v1"
+ window = 128
+ stride = 96
+
+ [components.ner_transformer.model.grad_scaler_config]
+
+ [components.ner_transformer.model.tokenizer_config]
+ use_fast = true
+
+ [components.ner_transformer.model.transformer_config]
+
+ [components.parser]
+ factory = "parser"
+ learn_tokens = false
+ min_action_freq = 30
+ moves = null
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
+ update_with_oracle_cut_size = 100
+
+ [components.parser.model]
+ @architectures = "spacy.TransitionBasedParser.v2"
+ state_type = "parser"
+ extra_state_tokens = false
+ hidden_width = 128
+ maxout_pieces = 3
+ use_upper = false
+ nO = null
+
+ [components.parser.model.tok2vec]
+ @architectures = "spacy-transformers.TransformerListener.v1"
+ grad_factor = 1.0
+ pooling = {"@layers":"reduce_mean.v1"}
+ upstream = "base_transformer"
+
+ [components.tagger]
+ factory = "tagger"
+ label_smoothing = 0.0
+ neg_prefix = "!"
+ overwrite = false
+ scorer = {"@scorers":"spacy.tagger_scorer.v1"}
+
+ [components.tagger.model]
+ @architectures = "spacy.Tagger.v2"
+ nO = null
+ normalize = false
+
+ [components.tagger.model.tok2vec]
+ @architectures = "spacy-transformers.TransformerListener.v1"
+ grad_factor = 1.0
+ pooling = {"@layers":"reduce_mean.v1"}
+ upstream = "*"
+
+ [components.trainable_lemmatizer]
+ factory = "trainable_lemmatizer"
+ backoff = "orth"
+ min_tree_freq = 3
+ overwrite = false
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
+ top_k = 1
+
+ [components.trainable_lemmatizer.model]
+ @architectures = "spacy.Tagger.v2"
+ nO = null
+ normalize = false
+
+ [components.trainable_lemmatizer.model.tok2vec]
+ @architectures = "spacy-transformers.TransformerListener.v1"
+ grad_factor = 1.0
+ pooling = {"@layers":"reduce_mean.v1"}
+ upstream = "*"
+
+ [corpora]
+
+ [corpora.dev]
+ @readers = "spacy.Corpus.v1"
+ path = ${paths.dev}
+ max_length = 0
+ gold_preproc = false
+ limit = 0
+ augmenter = null
+
+ [corpora.train]
+ @readers = "spacy.Corpus.v1"
+ path = ${paths.train}
+ max_length = 0
+ gold_preproc = false
+ limit = 0
+ augmenter = null
+
+ [training]
+ accumulate_gradient = 3
+ dev_corpus = "corpora.dev"
+ train_corpus = "corpora.train"
+ seed = ${system.seed}
+ gpu_allocator = ${system.gpu_allocator}
+ dropout = 0.1
+ patience = 4800
+ max_epochs = 0
+ max_steps = 20000
+ eval_frequency = 200
+ frozen_components = []
+ annotating_components = []
+ before_to_disk = null
+ before_update = null
+
+ [training.batcher]
+ @batchers = "spacy.batch_by_padded.v1"
+ discard_oversize = false
+ buffer = 256
+ get_length = null
+
+ [training.batcher.size]
+ @schedules = "compounding.v1"
+ start = 100
+ stop = 1000
+ compound = 1.001
+ t = 0.0
+
+ [training.logger]
+ @loggers = "spacy.ConsoleLogger.v3"
+ progress_bar = "train"
+ console_output = true
+ output_file = null
+
+ [training.optimizer]
+ @optimizers = "Adam.v1"
+ beta1 = 0.9
+ beta2 = 0.999
+ L2_is_weight_decay = true
+ L2 = 0.01
+ grad_clip = 1.0
+ use_averages = false
+ eps = 0.00000001
+
+ [training.optimizer.learn_rate]
+ @schedules = "warmup_linear.v1"
+ warmup_steps = 250
+ total_steps = 20000
+ initial_rate = 0.00005
+
+ [training.score_weights]
+ ents_f = 0.2
+ ents_p = 0.0
+ ents_r = 0.0
+ ents_per_type = null
+ pos_acc = 0.1
+ morph_acc = 0.1
+ morph_per_feat = null
+ tag_acc = 0.2
+ dep_uas = 0.1
+ dep_las = 0.1
+ dep_las_per_type = null
+ sents_p = null
+ sents_r = null
+ sents_f = 0.0
+ lemma_acc = 0.2
+
+ [pretraining]
+
+ [initialize]
+ vectors = ${paths.vectors}
+ init_tok2vec = ${paths.init_tok2vec}
+ vocab_data = null
+ lookups = null
+ before_init = null
+ after_init = null
+
+ [initialize.components]
+
+ [initialize.tokenizer]
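
The `[training.batcher.size]` and `[training.optimizer.learn_rate]` blocks above configure two schedules. The sketch below reproduces their shape in plain Python, using the values from this config. It is an illustration of how these schedules are commonly described, not the library's own code: the function names `compounding_size` and `warmup_linear_rate` are hypothetical, and the exact step indexing inside spaCy/thinc may differ.

```python
def compounding_size(step, start=100.0, stop=1000.0, compound=1.001):
    """Batch size after `step` updates: grows geometrically, capped at `stop`.

    Hypothetical helper; defaults mirror [training.batcher.size].
    """
    return min(stop, start * compound ** step)


def warmup_linear_rate(step, initial_rate=5e-5, warmup_steps=250, total_steps=20000):
    """Learning rate at `step`: linear warm-up, then linear decay to zero.

    Hypothetical helper; defaults mirror [training.optimizer.learn_rate].
    """
    if step < warmup_steps:
        return initial_rate * step / max(1, warmup_steps)
    remaining = max(0.0, total_steps - step)
    return initial_rate * remaining / max(1.0, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print both schedules at a few training steps.
    for step in (0, 125, 250, 1000, 5000, 20000):
        size = compounding_size(step)
        rate = warmup_linear_rate(step)
        print(f"step={step:>6}  batch_size={size:8.1f}  learn_rate={rate:.2e}")
```

With `compound = 1.001`, the batch size takes roughly 2,300 updates to climb from 100 to the cap of 1000, and the learning rate peaks at `initial_rate` once the 250 warm-up steps are done.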
fr_trf_nrp-any-py3-none-any.whl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:21aa8991ffd5c4f6327d8c41b4deef58c781fb3c5575f8ae7169e89f6a477854
+ size 822747526
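
The wheel is stored through Git LFS, so the repository holds only the three-line pointer above rather than the binary. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this illustration and only handles the three keys shown here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict.

    Hypothetical helper: assumes one 'key value' pair per line and
    the 'version', 'oid' and 'size' keys seen in the pointer above.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:21aa8991ffd5c4f6327d8c41b4deef58c781fb3c5575f8ae7169e89f6a477854
size 822747526
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algorithm"], info["digest"][:12], f"{info['size'] / 2**20:.1f} MiB")
```

The `size` field shows that the packaged pipeline is roughly 785 MiB, so cloning without LFS support yields only this small text file.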
meta.json ADDED
@@ -0,0 +1,619 @@
+ {
+ "lang":"fr",
+ "name":"trf_nrp",
+ "version":"0.0.0",
+ "description":"",
+ "author":"",
+ "email":"",
+ "url":"",
+ "license":"",
+ "spacy_version":">=3.8.3,<3.9.0",
+ "spacy_git_version":"be0fa81",
+ "vectors":{
+ "width":0,
+ "vectors":0,
+ "keys":0,
+ "name":null
+ },
+ "labels":{
+ "ner_transformer":[
+
+ ],
+ "ner":[
+ "LOC",
+ "ORG",
+ "PER"
+ ],
+ "base_transformer":[
+
+ ],
+ "morphologizer":[
+ "POS=PROPN",
+ "Gender=Fem|Number=Sing|POS=DET|PronType=Dem",
+ "Gender=Fem|Number=Sing|POS=NOUN",
+ "Number=Plur|POS=PRON|Person=1|PronType=Prs",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "POS=SCONJ",
+ "POS=ADP",
+ "Definite=Def|Gender=Masc|Number=Sing|POS=DET|PronType=Art",
+ "NumType=Ord|POS=ADJ",
+ "Gender=Masc|Number=Sing|POS=NOUN",
+ "POS=PUNCT",
+ "Gender=Masc|Number=Sing|POS=PROPN",
+ "Number=Plur|POS=ADJ",
+ "Gender=Masc|Number=Plur|POS=NOUN",
+ "Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Art",
+ "Number=Sing|POS=ADJ",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin",
+ "POS=ADV",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Past|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art",
+ "Gender=Fem|Number=Sing|POS=PROPN",
+ "Definite=Def|Number=Sing|POS=DET|PronType=Art",
+ "NumType=Card|POS=NUM",
+ "Definite=Def|Number=Plur|POS=DET|PronType=Art",
+ "Gender=Masc|Number=Plur|POS=ADJ",
+ "POS=CCONJ",
+ "Gender=Fem|Number=Plur|POS=NOUN",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Past|VerbForm=Fin",
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "Gender=Fem|Number=Plur|POS=ADJ",
+ "POS=ADJ",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin",
+ "POS=PRON|PronType=Rel",
+ "ExtPos=CCONJ|POS=CCONJ",
+ "Number=Sing|POS=DET|Poss=Yes",
+ "Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art",
+ "ExtPos=ADV|POS=ADV",
+ "Definite=Def|Number=Plur|POS=ADP|PronType=Art",
+ "Definite=Ind|Number=Plur|POS=DET|PronType=Art",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Past|VerbForm=Fin",
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
+ "POS=VERB|VerbForm=Inf",
+ "Gender=Fem|Number=Sing|POS=ADJ",
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs",
+ "Number=Plur|POS=DET",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Masc|Number=Sing|POS=ADJ",
+ "Gender=Masc|Number=Sing|POS=DET|PronType=Dem",
+ "POS=ADV|PronType=Int",
+ "ExtPos=SCONJ|POS=SCONJ",
+ "POS=VERB|Tense=Pres|VerbForm=Part",
+ "Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Art",
+ "Gender=Masc|POS=ADJ",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin",
+ "Number=Plur|POS=DET|Poss=Yes",
+ "POS=AUX|VerbForm=Inf",
+ "Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "POS=ADV|Polarity=Neg",
+ "Definite=Ind|Number=Sing|POS=DET|PronType=Art",
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs",
+ "POS=PRON|Person=3|PronType=Prs|Reflex=Yes",
+ "Gender=Masc|POS=NOUN",
+ "POS=AUX|Tense=Past|VerbForm=Part",
+ "POS=PRON|Person=3|PronType=Prs",
+ "Number=Plur|POS=NOUN",
+ "ExtPos=ADV|POS=ADP",
+ "NumType=Ord|Number=Sing|POS=ADJ",
+ "ExtPos=ADP|POS=ADV|Polarity=Neg",
+ "POS=VERB|Tense=Past|VerbForm=Part",
+ "POS=AUX|Tense=Pres|VerbForm=Part",
+ "Number=Sing|POS=PRON|Person=3|PronType=Dem",
+ "Number=Sing|POS=NOUN",
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin",
+ "Gender=Fem|NumType=Ord|Number=Sing|POS=ADJ",
+ "Number=Plur|POS=PROPN",
+ "Number=Sing|POS=PROPN",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Dem",
+ "Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part",
+ "Gender=Masc|Number=Sing|POS=DET",
+ "Gender=Fem|Number=Sing|POS=DET|Poss=Yes",
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Ind",
+ "POS=NOUN",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin",
+ "ExtPos=ADP|Gender=Fem|Number=Sing|POS=NOUN",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
+ "ExtPos=PRON|POS=ADV",
+ "Number=Plur|POS=PRON|Person=3|PronType=Ind",
+ "Gender=Masc|NumType=Ord|Number=Plur|POS=ADJ",
+ "ExtPos=ADP|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin",
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "Number=Sing|POS=PRON|PronType=Neg",
+ "Number=Sing|POS=PRON|Person=3|PronType=Prs",
+ "Number=Sing|POS=PRON|Person=3|PronType=Ind",
+ "Mood=Ind|POS=VERB|VerbForm=Fin",
+ "Number=Plur|POS=DET|PronType=Dem",
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Ind",
+ "ExtPos=ADP|POS=ADP",
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Dem",
+ "Number=Sing|POS=PRON|Person=2|PronType=Prs",
+ "Gender=Masc|Number=Sing|POS=PRON|PronType=Rel",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin",
+ "ExtPos=ADJ|POS=CCONJ",
+ "Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
+ "Definite=Ind|ExtPos=ADV|Gender=Masc|Number=Sing|POS=DET|PronType=Art",
+ "Gender=Masc|NumType=Ord|Number=Sing|POS=ADJ",
+ "POS=NUM",
+ "Gender=Fem|POS=NOUN",
+ "Number=Plur|POS=PRON|Person=3|PronType=Prs",
+ "Gender=Masc|Number=Sing|POS=DET|Polarity=Neg",
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "Number=Sing|POS=PRON|Person=1|PronType=Prs",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
+ "Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs|Reflex=Yes",
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Ind",
+ "Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|POS=ADP|PronType=Art",
+ "Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
+ "POS=INTJ",
+ "Number=Plur|POS=PRON|Person=2|PronType=Prs",
+ "ExtPos=SCONJ|POS=ADV",
+ "ExtPos=DET|POS=ADP",
+ "Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Art",
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
+ "NumType=Card|POS=NOUN",
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "POS=PRON|PronType=Int",
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs",
+ "Gender=Fem|Number=Sing|POS=DET",
+ "Gender=Masc|Number=Sing|POS=NOUN|Typo=Yes",
+ "Mood=Cnd|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Plur|POS=DET",
+ "Mood=Sub|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Art",
+ "Mood=Cnd|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Masc|Number=Plur|POS=PROPN",
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Dem",
+ "Number=Sing|POS=DET",
+ "Gender=Masc|NumType=Card|Number=Plur|POS=NOUN",
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Dem",
+ "Mood=Ind|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=PRON|PronType=Rel",
+ "ExtPos=CCONJ|POS=ADV",
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs|Reflex=Yes",
+ "Gender=Fem|Number=Sing|POS=NOUN|Typo=Yes",
+ "ExtPos=ADP|Number=Sing|POS=PRON|Person=3|PronType=Prs",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin",
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=DET|Polarity=Neg",
+ "ExtPos=CCONJ|POS=ADP",
+ "Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|POS=DET|PronType=Art",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin",
+ "Gender=Masc|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part",
+ "Foreign=Yes|POS=X",
+ "POS=SYM",
+ "Mood=Imp|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin",
+ "Gender=Masc|Number=Sing|POS=DET|PronType=Int",
+ "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
+ "POS=DET",
+ "Gender=Masc|Number=Plur|POS=PRON|PronType=Rel",
+ "Definite=Ind|ExtPos=ADV|Gender=Fem|Number=Sing|POS=DET|PronType=Art",
+ "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
+ "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
+ "ExtPos=DET|POS=ADV|Polarity=Neg",
+ "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "POS=ADJ|Typo=Yes",
+ "POS=X",
+ "ExtPos=SCONJ|POS=ADP",
+ "ExtPos=ADJ|POS=X",
+ "ExtPos=ADJ|POS=ADP",
+ "POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "ExtPos=CCONJ|POS=SYM",
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Sing|POS=DET|PronType=Int",
+ "Gender=Masc|Number=Plur|POS=DET",
+ "Gender=Fem|Number=Plur|POS=PRON|PronType=Rel",
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=NOUN",
+ "ExtPos=ADP|POS=PRON|Person=3|PronType=Prs",
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|Typo=Yes|VerbForm=Fin",
+ "Gender=Fem|NumType=Ord|Number=Plur|POS=ADJ",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Fut|VerbForm=Fin",
+ "Mood=Imp|POS=VERB|Tense=Pres|VerbForm=Fin",
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Ind",
+ "Number=Plur|POS=PRON|Person=2|PronType=Prs|Reflex=Yes",
+ "Mood=Cnd|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
+ "Number=Plur|POS=PRON|Person=1|PronType=Prs|Reflex=Yes",
+ "Gender=Masc|NumType=Card|Number=Sing|POS=NOUN",
+ "ExtPos=PRON|POS=ADP",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Fut|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin",
+ "Number=Sing|POS=PRON|Person=1|PronType=Prs|Reflex=Yes",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin",
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "Mood=Sub|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
+ "ExtPos=ADV|POS=ADV|Polarity=Neg",
+ "Gender=Masc|Number=Sing|POS=PRON|PronType=Neg",
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=ADJ",
+ "ExtPos=ADV|Number=Sing|POS=PRON|Person=3|PronType=Dem",
+ "ExtPos=PRON|POS=PRON|PronType=Rel",
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs|Typo=Yes",
+ "Gender=Masc|POS=PROPN",
+ "Mood=Cnd|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "Mood=Sub|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin",
+ "ExtPos=SCONJ|Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin",
+ "ExtPos=ADV|POS=X",
+ "Mood=Cnd|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin",
+ "ExtPos=CCONJ|POS=PRON|Person=3|PronType=Prs",
+ "Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "Mood=Sub|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin",
+ "ExtPos=INTJ|POS=INTJ",
+ "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
+ "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
+ "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part",
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
+ "Number=Sing|POS=PRON|PronType=Rel",
+ "Gender=Fem|Number=Plur|Number[psor]=Plur|POS=PRON|Person=3|Person[psor]=1|Poss=Yes|PronType=Prs",
+ "ExtPos=PROPN|Gender=Masc|Number=Sing|POS=NOUN",
+ "ExtPos=PROPN|Gender=Masc|Number=Plur|POS=NOUN",
+ "Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art|Typo=Yes",
+ "ExtPos=PROPN|Gender=Fem|Number=Sing|POS=NOUN",
+ "ExtPos=ADV|POS=PRON|Person=3|PronType=Prs",
+ "Gender=Fem|Number=Plur|POS=PROPN",
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "ExtPos=ADJ|Gender=Masc|NumType=Card|POS=NUM"
+ ],
+ "tagger":[
+ "ADJ",
+ "ADJ__ExtPos=ADV|Gender=Masc|Number=Sing",
+ "ADJ__Gender=Fem|Number=Plur",
+ "ADJ__Gender=Fem|Number=Plur|NumType=Ord",
+ "ADJ__Gender=Fem|Number=Sing",
+ "ADJ__Gender=Fem|Number=Sing|NumType=Ord",
+ "ADJ__Gender=Masc",
+ "ADJ__Gender=Masc|Number=Plur",
+ "ADJ__Gender=Masc|Number=Plur|NumType=Ord",
+ "ADJ__Gender=Masc|Number=Sing",
+ "ADJ__Gender=Masc|Number=Sing|NumType=Ord",
+ "ADJ__NumType=Ord",
+ "ADJ__Number=Plur",
+ "ADJ__Number=Sing",
+ "ADJ__Number=Sing|NumType=Ord",
+ "ADJ__Typo=Yes",
+ "ADP",
+ "ADP_DET__Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
+ "ADP_DET__Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
+ "ADP_DET__Definite=Def|Gender=Masc|Number=Sing|PronType=Art|Typo=Yes",
+ "ADP_DET__Definite=Def|Number=Plur|PronType=Art",
+ "ADP_PRON__Gender=Fem|Number=Plur|PronType=Rel",
+ "ADP_PRON__Gender=Masc|Number=Plur|PronType=Rel",
+ "ADP_PRON__Gender=Masc|Number=Sing|PronType=Rel",
+ "ADP__ExtPos=ADJ",
+ "ADP__ExtPos=ADP",
+ "ADP__ExtPos=ADV",
+ "ADP__ExtPos=CCONJ",
+ "ADP__ExtPos=DET",
+ "ADP__ExtPos=PRON",
+ "ADP__ExtPos=SCONJ",
+ "ADV",
+ "ADV__ExtPos=ADP|Polarity=Neg",
+ "ADV__ExtPos=ADV",
+ "ADV__ExtPos=ADV|Polarity=Neg",
+ "ADV__ExtPos=CCONJ",
+ "ADV__ExtPos=DET|Polarity=Neg",
+ "ADV__ExtPos=PRON",
+ "ADV__ExtPos=SCONJ",
+ "ADV__Polarity=Neg",
+ "ADV__PronType=Int",
+ "AUX__Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part",
+ "AUX__Mood=Cnd|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Cnd|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Cnd|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=1|Tense=Fut|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=1|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Fut|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=1|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Imp|Typo=Yes|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Sing|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Tense=Past|VerbForm=Part",
+ "AUX__Tense=Pres|VerbForm=Part",
+ "AUX__VerbForm=Inf",
+ "CCONJ",
+ "CCONJ__ExtPos=ADJ",
+ "CCONJ__ExtPos=CCONJ",
+ "DET",
+ "DET__Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
+ "DET__Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Def|Number=Plur|PronType=Art",
+ "DET__Definite=Def|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|ExtPos=ADV|Gender=Fem|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|Gender=Fem|Number=Plur|PronType=Art",
+ "DET__Definite=Ind|Gender=Fem|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|Gender=Masc|Number=Plur|PronType=Art",
+ "DET__Definite=Ind|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|Number=Plur|PronType=Art",
+ "DET__Definite=Ind|Number=Sing|PronType=Art",
+ "DET__Gender=Fem|Number=Plur",
+ "DET__Gender=Fem|Number=Plur|PronType=Int",
+ "DET__Gender=Fem|Number=Sing",
+ "DET__Gender=Fem|Number=Sing|Polarity=Neg",
+ "DET__Gender=Fem|Number=Sing|Poss=Yes",
+ "DET__Gender=Fem|Number=Sing|PronType=Dem",
+ "DET__Gender=Fem|Number=Sing|PronType=Int",
+ "DET__Gender=Masc|Number=Plur",
+ "DET__Gender=Masc|Number=Sing",
+ "DET__Gender=Masc|Number=Sing|Polarity=Neg",
+ "DET__Gender=Masc|Number=Sing|PronType=Dem",
+ "DET__Gender=Masc|Number=Sing|PronType=Int",
+ "DET__Number=Plur",
+ "DET__Number=Plur|Poss=Yes",
+ "DET__Number=Plur|PronType=Dem",
+ "DET__Number=Sing",
+ "DET__Number=Sing|Poss=Yes",
+ "INTJ",
+ "INTJ__ExtPos=INTJ",
+ "NOUN",
+ "NOUN__ExtPos=ADP|Gender=Fem|Number=Sing",
+ "NOUN__ExtPos=ADV|Gender=Masc|Number=Sing",
+ "NOUN__ExtPos=PROPN|Gender=Fem|Number=Sing",
+ "NOUN__ExtPos=PROPN|Gender=Masc|Number=Plur",
+ "NOUN__ExtPos=PROPN|Gender=Masc|Number=Sing",
+ "NOUN__Gender=Fem",
+ "NOUN__Gender=Fem|Number=Plur",
+ "NOUN__Gender=Fem|Number=Sing",
+ "NOUN__Gender=Fem|Number=Sing|Typo=Yes",
+ "NOUN__Gender=Masc",
+ "NOUN__Gender=Masc|Number=Plur",
+ "NOUN__Gender=Masc|Number=Plur|NumType=Card",
+ "NOUN__Gender=Masc|Number=Sing",
+ "NOUN__Gender=Masc|Number=Sing|NumType=Card",
+ "NOUN__Gender=Masc|Number=Sing|Typo=Yes",
+ "NOUN__NumType=Card",
+ "NOUN__Number=Plur",
+ "NOUN__Number=Sing",
+ "NUM",
+ "NUM__ExtPos=ADJ|Gender=Masc|NumType=Card",
+ "NUM__NumType=Card",
+ "PRON__ExtPos=ADP|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADP|Number=Sing|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADP|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADV|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADV|Number=Sing|Person=3|PronType=Dem",
+ "PRON__ExtPos=ADV|Person=3|PronType=Prs",
+ "PRON__ExtPos=CCONJ|Person=3|PronType=Prs",
+ "PRON__ExtPos=PRON|PronType=Rel",
+ "PRON__Gender=Fem|Number=Plur|Number[psor]=Plur|Person=3|Person[psor]=1|Poss=Yes|PronType=Prs",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Dem",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Ind",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Prs",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Prs|Typo=Yes",
+ "PRON__Gender=Fem|Number=Plur|PronType=Rel",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Dem",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Ind",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Prs",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "PRON__Gender=Fem|Number=Sing|PronType=Rel",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Dem",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Ind",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Prs",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Prs|Reflex=Yes",
+ "PRON__Gender=Masc|Number=Plur|PronType=Rel",
+ "PRON__Gender=Masc|Number=Sing|Person=3|PronType=Dem",
+ "PRON__Gender=Masc|Number=Sing|Person=3|PronType=Ind",
+ "PRON__Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "PRON__Gender=Masc|Number=Sing|PronType=Neg",
+ "PRON__Gender=Masc|Number=Sing|PronType=Rel",
+ "PRON__Number=Plur|Person=1|PronType=Prs",
+ "PRON__Number=Plur|Person=1|PronType=Prs|Reflex=Yes",
+ "PRON__Number=Plur|Person=2|PronType=Prs",
+ "PRON__Number=Plur|Person=2|PronType=Prs|Reflex=Yes",
+ "PRON__Number=Plur|Person=3|PronType=Ind",
+ "PRON__Number=Plur|Person=3|PronType=Prs",
+ "PRON__Number=Sing|Person=1|PronType=Prs",
+ "PRON__Number=Sing|Person=1|PronType=Prs|Reflex=Yes",
+ "PRON__Number=Sing|Person=2|PronType=Prs",
+ "PRON__Number=Sing|Person=3|PronType=Dem",
+ "PRON__Number=Sing|Person=3|PronType=Ind",
+ "PRON__Number=Sing|Person=3|PronType=Prs",
+ "PRON__Number=Sing|PronType=Neg",
+ "PRON__Number=Sing|PronType=Rel",
+ "PRON__Person=3|PronType=Prs",
+ "PRON__Person=3|PronType=Prs|Reflex=Yes",
+ "PRON__PronType=Int",
+ "PRON__PronType=Rel",
+ "PROPN",
+ "PROPN__Gender=Fem|Number=Plur",
+ "PROPN__Gender=Fem|Number=Sing",
+ "PROPN__Gender=Masc",
+ "PROPN__Gender=Masc|Number=Plur",
+ "PROPN__Gender=Masc|Number=Sing",
+ "PROPN__Number=Plur",
+ "PROPN__Number=Sing",
+ "PUNCT",
+ "SCONJ",
+ "SCONJ__ExtPos=SCONJ",
+ "SYM",
+ "SYM__ExtPos=CCONJ",
+ "VERB__ExtPos=SCONJ|Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Plur|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Plur|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Fem|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Sing|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Plur|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Sing|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Masc|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Mood=Cnd|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Imp|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Imp|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=1|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=1|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=2|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Person=3|VerbForm=Fin",
+ "VERB__Mood=Ind|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Number=Sing|Tense=Past|VerbForm=Part",
+ "VERB__Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Tense=Past|VerbForm=Part",
+ "VERB__Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Tense=Pres|VerbForm=Part",
+ "VERB__VerbForm=Inf",
+ "X",
+ "X__ExtPos=ADJ",
+ "X__ExtPos=ADV",
+ "X__Foreign=Yes"
+ ],
525
+ "parser":[
526
+ "ROOT",
527
+ "acl",
528
+ "acl:relcl",
529
+ "advcl",
530
+ "advmod",
531
+ "amod",
532
+ "appos",
533
+ "aux:pass",
534
+ "aux:tense",
535
+ "case",
536
+ "cc",
537
+ "ccomp",
538
+ "conj",
539
+ "cop",
540
+ "csubj",
541
+ "dep",
542
+ "det",
543
+ "expl:comp",
544
+ "expl:pass",
545
+ "expl:pv",
546
+ "expl:subj",
547
+ "fixed",
548
+ "flat:foreign",
549
+ "flat:name",
550
+ "iobj",
551
+ "mark",
552
+ "nmod",
553
+ "nsubj",
554
+ "nsubj:pass",
555
+ "nummod",
556
+ "obj",
557
+ "obl:agent",
558
+ "obl:arg",
559
+ "obl:mod",
560
+ "parataxis",
561
+ "parataxis:insert",
562
+ "punct",
563
+ "vocative",
564
+ "xcomp"
565
+ ]
566
+ },
567
+ "pipeline":[
568
+ "ner_transformer",
569
+ "ner",
570
+ "merge_entities",
571
+ "base_transformer",
572
+ "morphologizer",
573
+ "tagger",
574
+ "parser",
575
+ "trainable_lemmatizer"
576
+ ],
577
+ "components":[
578
+ "ner_transformer",
579
+ "ner",
580
+ "merge_entities",
581
+ "base_transformer",
582
+ "morphologizer",
583
+ "tagger",
584
+ "parser",
585
+ "trainable_lemmatizer"
586
+ ],
587
+ "disabled":[
588
+
589
+ ],
590
+ "performance":{
591
+ "ents_f":0.973876698,
592
+ "ents_p":0.9688149688,
593
+ "ents_r":0.9789915966,
594
+ "ents_per_type":{
595
+ "ORG":{
596
+ "p":0.9562043796,
597
+ "r":0.9562043796,
598
+ "f":0.9562043796
599
+ },
600
+ "LOC":{
601
+ "p":0.9813084112,
602
+ "r":0.9952606635,
603
+ "f":0.9882352941
604
+ },
605
+ "PER":{
606
+ "p":0.9615384615,
607
+ "r":0.9765625,
608
+ "f":0.9689922481
609
+ }
610
+ },
611
+ "ner_transformer_loss":23.7388525084,
612
+ "ner_loss":99.4598413509
613
+ },
614
+ "requirements":[
615
+ "spacy-transformers>=1.3.5,<1.4.0",
616
+ "spacy>=3.8.3,<3.9.0",
617
+ "spacy>=3.8.3,<3.9.0"
618
+ ]
619
+ }
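A note on the `performance` block above: the reported `f` values look like the usual harmonic mean of precision and recall. The sketch below is not part of the commit; it assumes that definition and checks it against the numbers copied from `meta.json`. The helper name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the "performance" block of meta.json.
reported = {
    "overall": (0.9688149688, 0.9789915966, 0.973876698),
    "ORG": (0.9562043796, 0.9562043796, 0.9562043796),
    "LOC": (0.9813084112, 0.9952606635, 0.9882352941),
    "PER": (0.9615384615, 0.9765625, 0.9689922481),
}

for label, (p, r, f) in reported.items():
    # Tolerance covers the 10-digit rounding used in the file.
    assert abs(f_score(p, r) - f) < 1e-6, label
```

If the assumption holds, every reported score is reproduced to within rounding, so the per-type and overall figures are mutually consistent.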
morphologizer/cfg ADDED
@@ -0,0 +1,497 @@
1
+ {
2
+ "extend":false,
3
+ "label_smoothing":0.0,
4
+ "labels_morph":{
5
+ "POS=PROPN":"",
6
+ "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
7
+ "Gender=Fem|Number=Sing|POS=NOUN":"Gender=Fem|Number=Sing",
8
+ "Number=Plur|POS=PRON|Person=1|PronType=Prs":"Number=Plur|Person=1|PronType=Prs",
9
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
10
+ "POS=SCONJ":"",
11
+ "POS=ADP":"",
12
+ "Definite=Def|Gender=Masc|Number=Sing|POS=DET|PronType=Art":"Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
13
+ "NumType=Ord|POS=ADJ":"NumType=Ord",
14
+ "Gender=Masc|Number=Sing|POS=NOUN":"Gender=Masc|Number=Sing",
15
+ "POS=PUNCT":"",
16
+ "Gender=Masc|Number=Sing|POS=PROPN":"Gender=Masc|Number=Sing",
17
+ "Number=Plur|POS=ADJ":"Number=Plur",
18
+ "Gender=Masc|Number=Plur|POS=NOUN":"Gender=Masc|Number=Plur",
19
+ "Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Definite=Ind|Gender=Fem|Number=Sing|PronType=Art",
20
+ "Number=Sing|POS=ADJ":"Number=Sing",
21
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Imp|VerbForm=Fin",
22
+ "POS=ADV":"",
23
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Past|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
24
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
25
+ "Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
26
+ "Gender=Fem|Number=Sing|POS=PROPN":"Gender=Fem|Number=Sing",
27
+ "Definite=Def|Number=Sing|POS=DET|PronType=Art":"Definite=Def|Number=Sing|PronType=Art",
28
+ "NumType=Card|POS=NUM":"NumType=Card",
29
+ "Definite=Def|Number=Plur|POS=DET|PronType=Art":"Definite=Def|Number=Plur|PronType=Art",
30
+ "Gender=Masc|Number=Plur|POS=ADJ":"Gender=Masc|Number=Plur",
31
+ "POS=CCONJ":"",
32
+ "Gender=Fem|Number=Plur|POS=NOUN":"Gender=Fem|Number=Plur",
33
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Past|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
34
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
35
+ "Gender=Fem|Number=Plur|POS=ADJ":"Gender=Fem|Number=Plur",
36
+ "POS=ADJ":"",
37
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
38
+ "POS=PRON|PronType=Rel":"PronType=Rel",
39
+ "ExtPos=CCONJ|POS=CCONJ":"ExtPos=CCONJ",
40
+ "Number=Sing|POS=DET|Poss=Yes":"Number=Sing|Poss=Yes",
41
+ "Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art":"Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
42
+ "ExtPos=ADV|POS=ADV":"ExtPos=ADV",
43
+ "Definite=Def|Number=Plur|POS=ADP|PronType=Art":"Definite=Def|Number=Plur|PronType=Art",
44
+ "Definite=Ind|Number=Plur|POS=DET|PronType=Art":"Definite=Ind|Number=Plur|PronType=Art",
45
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Past|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
46
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
47
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
48
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part",
49
+ "POS=VERB|VerbForm=Inf":"VerbForm=Inf",
50
+ "Gender=Fem|Number=Sing|POS=ADJ":"Gender=Fem|Number=Sing",
51
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Gender=Masc|Number=Sing|Person=3|PronType=Prs",
52
+ "Number=Plur|POS=DET":"Number=Plur",
53
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
54
+ "Gender=Masc|Number=Sing|POS=ADJ":"Gender=Masc|Number=Sing",
55
+ "Gender=Masc|Number=Sing|POS=DET|PronType=Dem":"Gender=Masc|Number=Sing|PronType=Dem",
56
+ "POS=ADV|PronType=Int":"PronType=Int",
57
+ "ExtPos=SCONJ|POS=SCONJ":"ExtPos=SCONJ",
58
+ "POS=VERB|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
59
+ "Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Art":"Definite=Ind|Gender=Masc|Number=Sing|PronType=Art",
60
+ "Gender=Masc|POS=ADJ":"Gender=Masc",
61
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Fut|VerbForm=Fin",
62
+ "Number=Plur|POS=DET|Poss=Yes":"Number=Plur|Poss=Yes",
63
+ "POS=AUX|VerbForm=Inf":"VerbForm=Inf",
64
+ "Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Gender=Masc|Tense=Past|VerbForm=Part|Voice=Pass",
65
+ "POS=ADV|Polarity=Neg":"Polarity=Neg",
66
+ "Definite=Ind|Number=Sing|POS=DET|PronType=Art":"Definite=Ind|Number=Sing|PronType=Art",
67
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs":"Gender=Fem|Number=Sing|Person=3|PronType=Prs",
68
+ "POS=PRON|Person=3|PronType=Prs|Reflex=Yes":"Person=3|PronType=Prs|Reflex=Yes",
69
+ "Gender=Masc|POS=NOUN":"Gender=Masc",
70
+ "POS=AUX|Tense=Past|VerbForm=Part":"Tense=Past|VerbForm=Part",
71
+ "POS=PRON|Person=3|PronType=Prs":"Person=3|PronType=Prs",
72
+ "Number=Plur|POS=NOUN":"Number=Plur",
73
+ "ExtPos=ADV|POS=ADP":"ExtPos=ADV",
74
+ "NumType=Ord|Number=Sing|POS=ADJ":"NumType=Ord|Number=Sing",
75
+ "ExtPos=ADP|POS=ADV|Polarity=Neg":"ExtPos=ADP|Polarity=Neg",
76
+ "POS=VERB|Tense=Past|VerbForm=Part":"Tense=Past|VerbForm=Part",
77
+ "POS=AUX|Tense=Pres|VerbForm=Part":"Tense=Pres|VerbForm=Part",
78
+ "Number=Sing|POS=PRON|Person=3|PronType=Dem":"Number=Sing|Person=3|PronType=Dem",
79
+ "Number=Sing|POS=NOUN":"Number=Sing",
80
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part",
81
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Gender=Masc|Number=Plur|Person=3|PronType=Prs",
82
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
83
+ "Gender=Fem|NumType=Ord|Number=Sing|POS=ADJ":"Gender=Fem|NumType=Ord|Number=Sing",
84
+ "Number=Plur|POS=PROPN":"Number=Plur",
85
+ "Number=Sing|POS=PROPN":"Number=Sing",
86
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Imp|VerbForm=Fin",
87
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
88
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Dem":"Gender=Masc|Number=Plur|Person=3|PronType=Dem",
89
+ "Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part":"Gender=Masc|Tense=Past|VerbForm=Part",
90
+ "Gender=Masc|Number=Sing|POS=DET":"Gender=Masc|Number=Sing",
91
+ "Gender=Fem|Number=Sing|POS=DET|Poss=Yes":"Gender=Fem|Number=Sing|Poss=Yes",
92
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Ind":"Gender=Masc|Number=Sing|Person=3|PronType=Ind",
93
+ "POS=NOUN":"",
94
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
95
+ "ExtPos=ADP|Gender=Fem|Number=Sing|POS=NOUN":"ExtPos=ADP|Gender=Fem|Number=Sing",
96
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
97
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
98
+ "ExtPos=PRON|POS=ADV":"ExtPos=PRON",
99
+ "Number=Plur|POS=PRON|Person=3|PronType=Ind":"Number=Plur|Person=3|PronType=Ind",
100
+ "Gender=Masc|NumType=Ord|Number=Plur|POS=ADJ":"Gender=Masc|NumType=Ord|Number=Plur",
101
+ "ExtPos=ADP|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs":"ExtPos=ADP|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
102
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Fut|VerbForm=Fin",
103
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Gender=Fem|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
104
+ "Number=Sing|POS=PRON|PronType=Neg":"Number=Sing|PronType=Neg",
105
+ "Number=Sing|POS=PRON|Person=3|PronType=Prs":"Number=Sing|Person=3|PronType=Prs",
106
+ "Number=Sing|POS=PRON|Person=3|PronType=Ind":"Number=Sing|Person=3|PronType=Ind",
107
+ "Mood=Ind|POS=VERB|VerbForm=Fin":"Mood=Ind|VerbForm=Fin",
108
+ "Number=Plur|POS=DET|PronType=Dem":"Number=Plur|PronType=Dem",
109
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Ind":"Gender=Masc|Number=Plur|Person=3|PronType=Ind",
110
+ "ExtPos=ADP|POS=ADP":"ExtPos=ADP",
111
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Dem":"Gender=Masc|Number=Sing|Person=3|PronType=Dem",
112
+ "Number=Sing|POS=PRON|Person=2|PronType=Prs":"Number=Sing|Person=2|PronType=Prs",
113
+ "Gender=Masc|Number=Sing|POS=PRON|PronType=Rel":"Gender=Masc|Number=Sing|PronType=Rel",
114
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
115
+ "ExtPos=ADJ|POS=CCONJ":"ExtPos=ADJ",
116
+ "Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
117
+ "Definite=Ind|ExtPos=ADV|Gender=Masc|Number=Sing|POS=DET|PronType=Art":"Definite=Ind|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
118
+ "Gender=Masc|NumType=Ord|Number=Sing|POS=ADJ":"Gender=Masc|NumType=Ord|Number=Sing",
119
+ "POS=NUM":"",
120
+ "Gender=Fem|POS=NOUN":"Gender=Fem",
121
+ "Number=Plur|POS=PRON|Person=3|PronType=Prs":"Number=Plur|Person=3|PronType=Prs",
122
+ "Gender=Masc|Number=Sing|POS=DET|Polarity=Neg":"Gender=Masc|Number=Sing|Polarity=Neg",
123
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
124
+ "Number=Sing|POS=PRON|Person=1|PronType=Prs":"Number=Sing|Person=1|PronType=Prs",
125
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
126
+ "Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin":"Mood=Sub|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
127
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":"Gender=Fem|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
128
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Ind":"Gender=Fem|Number=Sing|Person=3|PronType=Ind",
129
+ "Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|POS=ADP|PronType=Art":"Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
130
+ "Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
131
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part",
132
+ "POS=INTJ":"",
133
+ "Number=Plur|POS=PRON|Person=2|PronType=Prs":"Number=Plur|Person=2|PronType=Prs",
134
+ "ExtPos=SCONJ|POS=ADV":"ExtPos=SCONJ",
135
+ "ExtPos=DET|POS=ADP":"ExtPos=DET",
136
+ "Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Art":"Definite=Ind|Gender=Fem|Number=Plur|PronType=Art",
137
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Gender=Fem|Number=Plur|Tense=Past|VerbForm=Part",
138
+ "NumType=Card|POS=NOUN":"NumType=Card",
139
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":"Gender=Fem|Number=Sing|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
140
+ "POS=PRON|PronType=Int":"PronType=Int",
141
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs":"Gender=Fem|Number=Plur|Person=3|PronType=Prs",
142
+ "Gender=Fem|Number=Sing|POS=DET":"Gender=Fem|Number=Sing",
143
+ "Gender=Masc|Number=Sing|POS=NOUN|Typo=Yes":"Gender=Masc|Number=Sing|Typo=Yes",
144
+ "Mood=Cnd|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
145
+ "Gender=Fem|Number=Plur|POS=DET":"Gender=Fem|Number=Plur",
146
+ "Mood=Sub|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
147
+ "Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Art":"Definite=Ind|Gender=Masc|Number=Plur|PronType=Art",
148
+ "Mood=Cnd|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
149
+ "Gender=Masc|Number=Plur|POS=PROPN":"Gender=Masc|Number=Plur",
150
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
151
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Dem":"Gender=Fem|Number=Sing|Person=3|PronType=Dem",
152
+ "Number=Sing|POS=DET":"Number=Sing",
153
+ "Gender=Masc|NumType=Card|Number=Plur|POS=NOUN":"Gender=Masc|NumType=Card|Number=Plur",
154
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Dem":"Gender=Fem|Number=Plur|Person=3|PronType=Dem",
155
+ "Mood=Ind|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Ind|Person=3|Tense=Pres|VerbForm=Fin",
156
+ "Gender=Fem|Number=Sing|POS=PRON|PronType=Rel":"Gender=Fem|Number=Sing|PronType=Rel",
157
+ "ExtPos=CCONJ|POS=ADV":"ExtPos=CCONJ",
158
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":"Gender=Masc|Number=Plur|Person=3|PronType=Prs|Reflex=Yes",
159
+ "Gender=Fem|Number=Sing|POS=NOUN|Typo=Yes":"Gender=Fem|Number=Sing|Typo=Yes",
160
+ "ExtPos=ADP|Number=Sing|POS=PRON|Person=3|PronType=Prs":"ExtPos=ADP|Number=Sing|Person=3|PronType=Prs",
161
+ "Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=1|Tense=Imp|VerbForm=Fin",
162
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
163
+ "Gender=Fem|Number=Sing|POS=DET|Polarity=Neg":"Gender=Fem|Number=Sing|Polarity=Neg",
164
+ "ExtPos=CCONJ|POS=ADP":"ExtPos=CCONJ",
165
+ "Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|POS=DET|PronType=Art":"Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
166
+ "Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
167
+ "Gender=Masc|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":"Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part",
168
+ "Foreign=Yes|POS=X":"Foreign=Yes",
169
+ "POS=SYM":"",
170
+ "Mood=Imp|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
171
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
172
+ "Gender=Masc|Number=Sing|POS=DET|PronType=Int":"Gender=Masc|Number=Sing|PronType=Int",
173
+ "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
174
+ "POS=DET":"",
175
+ "Gender=Masc|Number=Plur|POS=PRON|PronType=Rel":"Gender=Masc|Number=Plur|PronType=Rel",
176
+ "Definite=Ind|ExtPos=ADV|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Definite=Ind|ExtPos=ADV|Gender=Fem|Number=Sing|PronType=Art",
177
+ "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
178
+ "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
179
+ "ExtPos=DET|POS=ADV|Polarity=Neg":"ExtPos=DET|Polarity=Neg",
180
+ "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
181
+ "POS=ADJ|Typo=Yes":"Typo=Yes",
182
+ "POS=X":"",
183
+ "ExtPos=SCONJ|POS=ADP":"ExtPos=SCONJ",
184
+ "ExtPos=ADJ|POS=X":"ExtPos=ADJ",
185
+ "ExtPos=ADJ|POS=ADP":"ExtPos=ADJ",
186
+ "POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Tense=Past|VerbForm=Part|Voice=Pass",
187
+ "ExtPos=CCONJ|POS=SYM":"ExtPos=CCONJ",
188
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
189
+ "Mood=Ind|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
190
+ "Gender=Fem|Number=Sing|POS=DET|PronType=Int":"Gender=Fem|Number=Sing|PronType=Int",
191
+ "Gender=Masc|Number=Plur|POS=DET":"Gender=Masc|Number=Plur",
192
+ "Gender=Fem|Number=Plur|POS=PRON|PronType=Rel":"Gender=Fem|Number=Plur|PronType=Rel",
193
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=NOUN":"ExtPos=ADV|Gender=Masc|Number=Sing",
194
+ "ExtPos=ADP|POS=PRON|Person=3|PronType=Prs":"ExtPos=ADP|Person=3|PronType=Prs",
195
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":"Gender=Masc|Number=Sing|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
196
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|Typo=Yes|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=3|Tense=Imp|Typo=Yes|VerbForm=Fin",
197
+ "Gender=Fem|NumType=Ord|Number=Plur|POS=ADJ":"Gender=Fem|NumType=Ord|Number=Plur",
198
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Fut|VerbForm=Fin",
199
+ "Mood=Imp|POS=VERB|Tense=Pres|VerbForm=Fin":"Mood=Imp|Tense=Pres|VerbForm=Fin",
200
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Ind":"Gender=Fem|Number=Plur|Person=3|PronType=Ind",
201
+ "Number=Plur|POS=PRON|Person=2|PronType=Prs|Reflex=Yes":"Number=Plur|Person=2|PronType=Prs|Reflex=Yes",
202
+ "Mood=Cnd|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
203
+ "Number=Plur|POS=PRON|Person=1|PronType=Prs|Reflex=Yes":"Number=Plur|Person=1|PronType=Prs|Reflex=Yes",
204
+ "Gender=Masc|NumType=Card|Number=Sing|POS=NOUN":"Gender=Masc|NumType=Card|Number=Sing",
205
+ "ExtPos=PRON|POS=ADP":"ExtPos=PRON",
206
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
207
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=1|Tense=Fut|VerbForm=Fin",
208
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=1|Tense=Fut|VerbForm=Fin",
209
+ "Number=Sing|POS=PRON|Person=1|PronType=Prs|Reflex=Yes":"Number=Sing|Person=1|PronType=Prs|Reflex=Yes",
210
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=1|Tense=Imp|VerbForm=Fin",
211
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=1|Tense=Imp|VerbForm=Fin",
212
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs":"ExtPos=ADV|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
213
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=1|Tense=Imp|VerbForm=Fin",
214
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
215
+ "Mood=Sub|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
216
+ "ExtPos=ADV|POS=ADV|Polarity=Neg":"ExtPos=ADV|Polarity=Neg",
217
+ "Gender=Masc|Number=Sing|POS=PRON|PronType=Neg":"Gender=Masc|Number=Sing|PronType=Neg",
218
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=ADJ":"ExtPos=ADV|Gender=Masc|Number=Sing",
219
+ "ExtPos=ADV|Number=Sing|POS=PRON|Person=3|PronType=Dem":"ExtPos=ADV|Number=Sing|Person=3|PronType=Dem",
220
+ "ExtPos=PRON|POS=PRON|PronType=Rel":"ExtPos=PRON|PronType=Rel",
221
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs|Typo=Yes":"Gender=Fem|Number=Plur|Person=3|PronType=Prs|Typo=Yes",
222
+ "Gender=Masc|POS=PROPN":"Gender=Masc",
223
+ "Mood=Cnd|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
224
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
225
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":"Gender=Masc|Number=Plur|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
226
+ "Mood=Sub|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
227
+ "ExtPos=SCONJ|Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"ExtPos=SCONJ|Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
228
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=1|Tense=Fut|VerbForm=Fin",
229
+ "ExtPos=ADV|POS=X":"ExtPos=ADV",
230
+ "Mood=Cnd|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Cnd|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
231
+ "ExtPos=CCONJ|POS=PRON|Person=3|PronType=Prs":"ExtPos=CCONJ|Person=3|PronType=Prs",
232
+ "Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin":"Mood=Sub|Number=Sing|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
233
+ "Mood=Sub|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
234
+ "ExtPos=INTJ|POS=INTJ":"ExtPos=INTJ",
235
+ "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
236
+ "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
237
+ "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":"Number=Sing|Tense=Past|VerbForm=Part",
238
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
239
+ "Number=Sing|POS=PRON|PronType=Rel":"Number=Sing|PronType=Rel",
240
+ "Gender=Fem|Number=Plur|Number[psor]=Plur|POS=PRON|Person=3|Person[psor]=1|Poss=Yes|PronType=Prs":"Gender=Fem|Number=Plur|Number[psor]=Plur|Person=3|Person[psor]=1|Poss=Yes|PronType=Prs",
241
+ "ExtPos=PROPN|Gender=Masc|Number=Sing|POS=NOUN":"ExtPos=PROPN|Gender=Masc|Number=Sing",
242
+ "ExtPos=PROPN|Gender=Masc|Number=Plur|POS=NOUN":"ExtPos=PROPN|Gender=Masc|Number=Plur",
243
+ "Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art|Typo=Yes":"Definite=Def|Gender=Masc|Number=Sing|PronType=Art|Typo=Yes",
244
+ "ExtPos=PROPN|Gender=Fem|Number=Sing|POS=NOUN":"ExtPos=PROPN|Gender=Fem|Number=Sing",
245
+ "ExtPos=ADV|POS=PRON|Person=3|PronType=Prs":"ExtPos=ADV|Person=3|PronType=Prs",
246
+ "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
247
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":"Gender=Fem|Number=Plur|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
248
+ "ExtPos=ADJ|Gender=Masc|NumType=Card|POS=NUM":"ExtPos=ADJ|Gender=Masc|NumType=Card"
249
+ },
250
+ "labels_pos":{
251
+ "POS=PROPN":96,
252
+ "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":90,
253
+ "Gender=Fem|Number=Sing|POS=NOUN":92,
254
+ "Number=Plur|POS=PRON|Person=1|PronType=Prs":95,
255
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
256
+ "POS=SCONJ":98,
257
+ "POS=ADP":85,
258
+ "Definite=Def|Gender=Masc|Number=Sing|POS=DET|PronType=Art":90,
259
+ "NumType=Ord|POS=ADJ":84,
260
+ "Gender=Masc|Number=Sing|POS=NOUN":92,
261
+ "POS=PUNCT":97,
262
+ "Gender=Masc|Number=Sing|POS=PROPN":96,
263
+ "Number=Plur|POS=ADJ":84,
264
+ "Gender=Masc|Number=Plur|POS=NOUN":92,
265
+ "Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
266
+ "Number=Sing|POS=ADJ":84,
267
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin":100,
268
+ "POS=ADV":86,
269
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Past|VerbForm=Fin":87,
270
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
271
+ "Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
272
+ "Gender=Fem|Number=Sing|POS=PROPN":96,
273
+ "Definite=Def|Number=Sing|POS=DET|PronType=Art":90,
274
+ "NumType=Card|POS=NUM":93,
275
+ "Definite=Def|Number=Plur|POS=DET|PronType=Art":90,
276
+ "Gender=Masc|Number=Plur|POS=ADJ":84,
277
+ "POS=CCONJ":89,
278
+ "Gender=Fem|Number=Plur|POS=NOUN":92,
279
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Past|VerbForm=Fin":100,
280
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
281
+ "Gender=Fem|Number=Plur|POS=ADJ":84,
282
+ "POS=ADJ":84,
283
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin":100,
284
+ "POS=PRON|PronType=Rel":95,
285
+ "ExtPos=CCONJ|POS=CCONJ":89,
286
+ "Number=Sing|POS=DET|Poss=Yes":90,
287
+ "Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art":85,
288
+ "ExtPos=ADV|POS=ADV":86,
289
+ "Definite=Def|Number=Plur|POS=ADP|PronType=Art":85,
290
+ "Definite=Ind|Number=Plur|POS=DET|PronType=Art":90,
291
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Past|VerbForm=Fin":87,
292
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
293
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
294
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
295
+ "POS=VERB|VerbForm=Inf":100,
296
+ "Gender=Fem|Number=Sing|POS=ADJ":84,
297
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
298
+ "Number=Plur|POS=DET":90,
299
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
300
+ "Gender=Masc|Number=Sing|POS=ADJ":84,
301
+ "Gender=Masc|Number=Sing|POS=DET|PronType=Dem":90,
302
+ "POS=ADV|PronType=Int":86,
303
+ "ExtPos=SCONJ|POS=SCONJ":98,
304
+ "POS=VERB|Tense=Pres|VerbForm=Part":100,
305
+ "Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Art":90,
306
+ "Gender=Masc|POS=ADJ":84,
307
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin":100,
308
+ "Number=Plur|POS=DET|Poss=Yes":90,
309
+ "POS=AUX|VerbForm=Inf":87,
310
+ "Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
311
+ "POS=ADV|Polarity=Neg":86,
312
+ "Definite=Ind|Number=Sing|POS=DET|PronType=Art":90,
313
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
314
+ "POS=PRON|Person=3|PronType=Prs|Reflex=Yes":95,
315
+ "Gender=Masc|POS=NOUN":92,
316
+ "POS=AUX|Tense=Past|VerbForm=Part":87,
317
+ "POS=PRON|Person=3|PronType=Prs":95,
318
+ "Number=Plur|POS=NOUN":92,
319
+ "ExtPos=ADV|POS=ADP":85,
320
+ "NumType=Ord|Number=Sing|POS=ADJ":84,
321
+ "ExtPos=ADP|POS=ADV|Polarity=Neg":86,
322
+ "POS=VERB|Tense=Past|VerbForm=Part":100,
323
+ "POS=AUX|Tense=Pres|VerbForm=Part":87,
324
+ "Number=Sing|POS=PRON|Person=3|PronType=Dem":95,
325
+ "Number=Sing|POS=NOUN":92,
326
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
327
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
328
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin":100,
329
+ "Gender=Fem|NumType=Ord|Number=Sing|POS=ADJ":84,
330
+ "Number=Plur|POS=PROPN":96,
331
+ "Number=Sing|POS=PROPN":96,
332
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin":87,
333
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
334
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Dem":95,
335
+ "Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part":100,
336
+ "Gender=Masc|Number=Sing|POS=DET":90,
337
+ "Gender=Fem|Number=Sing|POS=DET|Poss=Yes":90,
338
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Ind":95,
339
+ "POS=NOUN":92,
340
+ "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin":100,
341
+ "ExtPos=ADP|Gender=Fem|Number=Sing|POS=NOUN":92,
342
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin":87,
343
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
344
+ "ExtPos=PRON|POS=ADV":86,
345
+ "Number=Plur|POS=PRON|Person=3|PronType=Ind":95,
346
+ "Gender=Masc|NumType=Ord|Number=Plur|POS=ADJ":84,
347
+ "ExtPos=ADP|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
348
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin":87,
349
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
350
+ "Number=Sing|POS=PRON|PronType=Neg":95,
351
+ "Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
352
+ "Number=Sing|POS=PRON|Person=3|PronType=Ind":95,
353
+ "Mood=Ind|POS=VERB|VerbForm=Fin":100,
354
+ "Number=Plur|POS=DET|PronType=Dem":90,
355
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Ind":95,
356
+ "ExtPos=ADP|POS=ADP":85,
357
+ "Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Dem":95,
358
+ "Number=Sing|POS=PRON|Person=2|PronType=Prs":95,
359
+ "Gender=Masc|Number=Sing|POS=PRON|PronType=Rel":95,
360
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin":87,
361
+ "ExtPos=ADJ|POS=CCONJ":89,
362
+ "Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
363
+ "Definite=Ind|ExtPos=ADV|Gender=Masc|Number=Sing|POS=DET|PronType=Art":90,
364
+ "Gender=Masc|NumType=Ord|Number=Sing|POS=ADJ":84,
365
+ "POS=NUM":93,
366
+ "Gender=Fem|POS=NOUN":92,
367
+ "Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
+ "Gender=Masc|Number=Sing|POS=DET|Polarity=Neg":90,
+ "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
+ "Number=Sing|POS=PRON|Person=1|PronType=Prs":95,
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
+ "Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin":100,
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":95,
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Ind":95,
+ "Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|POS=ADP|PronType=Art":85,
+ "Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "POS=INTJ":91,
+ "Number=Plur|POS=PRON|Person=2|PronType=Prs":95,
+ "ExtPos=SCONJ|POS=ADV":86,
+ "ExtPos=DET|POS=ADP":85,
+ "Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Art":90,
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "NumType=Card|POS=NOUN":92,
+ "Gender=Fem|Number=Sing|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":100,
+ "POS=PRON|PronType=Int":95,
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs":95,
+ "Gender=Fem|Number=Sing|POS=DET":90,
+ "Gender=Masc|Number=Sing|POS=NOUN|Typo=Yes":92,
+ "Mood=Cnd|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
+ "Gender=Fem|Number=Plur|POS=DET":90,
+ "Mood=Sub|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
+ "Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Art":90,
+ "Mood=Cnd|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Masc|Number=Plur|POS=PROPN":96,
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Dem":95,
+ "Number=Sing|POS=DET":90,
+ "Gender=Masc|NumType=Card|Number=Plur|POS=NOUN":92,
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Dem":95,
+ "Mood=Ind|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Fem|Number=Sing|POS=PRON|PronType=Rel":95,
+ "ExtPos=CCONJ|POS=ADV":86,
+ "Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs|Reflex=Yes":95,
+ "Gender=Fem|Number=Sing|POS=NOUN|Typo=Yes":92,
+ "ExtPos=ADP|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
+ "Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin":87,
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Fem|Number=Sing|POS=DET|Polarity=Neg":90,
+ "ExtPos=CCONJ|POS=ADP":85,
+ "Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|POS=DET|PronType=Art":90,
+ "Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":87,
+ "Gender=Masc|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part":87,
+ "Foreign=Yes|POS=X":101,
+ "POS=SYM":99,
+ "Mood=Imp|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin":100,
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Masc|Number=Sing|POS=DET|PronType=Int":90,
+ "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
+ "POS=DET":90,
+ "Gender=Masc|Number=Plur|POS=PRON|PronType=Rel":95,
+ "Definite=Ind|ExtPos=ADV|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
+ "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
+ "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
+ "ExtPos=DET|POS=ADV|Polarity=Neg":86,
+ "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
+ "POS=ADJ|Typo=Yes":84,
+ "POS=X":101,
+ "ExtPos=SCONJ|POS=ADP":85,
+ "ExtPos=ADJ|POS=X":101,
+ "ExtPos=ADJ|POS=ADP":85,
+ "POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
+ "ExtPos=CCONJ|POS=SYM":99,
+ "Mood=Cnd|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin":100,
+ "Mood=Ind|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
+ "Gender=Fem|Number=Sing|POS=DET|PronType=Int":90,
+ "Gender=Masc|Number=Plur|POS=DET":90,
+ "Gender=Fem|Number=Plur|POS=PRON|PronType=Rel":95,
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=NOUN":92,
+ "ExtPos=ADP|POS=PRON|Person=3|PronType=Prs":95,
+ "Gender=Masc|Number=Sing|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":100,
+ "Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|Typo=Yes|VerbForm=Fin":87,
+ "Gender=Fem|NumType=Ord|Number=Plur|POS=ADJ":84,
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Fut|VerbForm=Fin":100,
+ "Mood=Imp|POS=VERB|Tense=Pres|VerbForm=Fin":100,
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Ind":95,
+ "Number=Plur|POS=PRON|Person=2|PronType=Prs|Reflex=Yes":95,
+ "Mood=Cnd|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
+ "Number=Plur|POS=PRON|Person=1|PronType=Prs|Reflex=Yes":95,
+ "Gender=Masc|NumType=Card|Number=Sing|POS=NOUN":92,
+ "ExtPos=PRON|POS=ADP":85,
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":87,
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Fut|VerbForm=Fin":87,
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin":100,
+ "Number=Sing|POS=PRON|Person=1|PronType=Prs|Reflex=Yes":95,
+ "Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin":100,
+ "Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin":87,
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs":95,
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin":100,
+ "Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin":100,
+ "Mood=Sub|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
+ "ExtPos=ADV|POS=ADV|Polarity=Neg":86,
+ "Gender=Masc|Number=Sing|POS=PRON|PronType=Neg":95,
+ "ExtPos=ADV|Gender=Masc|Number=Sing|POS=ADJ":84,
+ "ExtPos=ADV|Number=Sing|POS=PRON|Person=3|PronType=Dem":95,
+ "ExtPos=PRON|POS=PRON|PronType=Rel":95,
+ "Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs|Typo=Yes":95,
+ "Gender=Masc|POS=PROPN":96,
+ "Mood=Cnd|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
+ "Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin":87,
+ "Gender=Masc|Number=Plur|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":100,
+ "Mood=Sub|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":87,
+ "ExtPos=SCONJ|Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
+ "Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin":100,
+ "ExtPos=ADV|POS=X":101,
+ "Mood=Cnd|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":87,
+ "ExtPos=CCONJ|POS=PRON|Person=3|PronType=Prs":95,
+ "Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin":87,
+ "Mood=Sub|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin":87,
+ "ExtPos=INTJ|POS=INTJ":91,
+ "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
+ "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
+ "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part":100,
+ "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
+ "Number=Sing|POS=PRON|PronType=Rel":95,
+ "Gender=Fem|Number=Plur|Number[psor]=Plur|POS=PRON|Person=3|Person[psor]=1|Poss=Yes|PronType=Prs":95,
+ "ExtPos=PROPN|Gender=Masc|Number=Sing|POS=NOUN":92,
+ "ExtPos=PROPN|Gender=Masc|Number=Plur|POS=NOUN":92,
+ "Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art|Typo=Yes":85,
+ "ExtPos=PROPN|Gender=Fem|Number=Sing|POS=NOUN":92,
+ "ExtPos=ADV|POS=PRON|Person=3|PronType=Prs":95,
+ "Gender=Fem|Number=Plur|POS=PROPN":96,
+ "Gender=Fem|Number=Plur|POS=VERB|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass":100,
+ "ExtPos=ADJ|Gender=Masc|NumType=Card|POS=NUM":93
+ },
+ "overwrite":true
+ }
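
The keys in the mapping above are pipe-separated `Feature=Value` strings, and each one carries a `POS=` entry next to an integer value. The following is a minimal sketch, not part of the commit, of how such keys could be split into features and how the integers could be grouped by POS. The helper names `parse_label` and `pos_ids` are hypothetical, and the sample entries are copied from the diff. That the integers are spaCy's internal POS symbol IDs is an assumption; the sketch only shows that, in the sampled entries, each POS value is paired with a single integer.

```python
from __future__ import annotations

import json

# A few entries copied verbatim from the mapping in the diff above.
SAMPLE = json.loads("""
{
  "Number=Plur|POS=PRON|Person=3|PronType=Prs": 95,
  "Gender=Masc|Number=Sing|POS=DET|Polarity=Neg": 90,
  "POS=INTJ": 91,
  "ExtPos=SCONJ|POS=ADV": 86,
  "POS=PRON|PronType=Int": 95,
  "Foreign=Yes|POS=X": 101
}
""")


def parse_label(label: str) -> dict[str, str]:
    """Split 'Feat=Val|Feat=Val' into a {feature: value} dict (hypothetical helper)."""
    return dict(pair.split("=", 1) for pair in label.split("|"))


def pos_ids(mapping: dict[str, int]) -> dict[str, set[int]]:
    """Collect the integer codes seen for each POS value (hypothetical helper)."""
    seen: dict[str, set[int]] = {}
    for label, code in mapping.items():
        pos = parse_label(label)["POS"]
        seen.setdefault(pos, set()).add(code)
    return seen


if __name__ == "__main__":
    for pos, codes in sorted(pos_ids(SAMPLE).items()):
        print(pos, sorted(codes))
```

Splitting on the first `=` only keeps bracketed feature names such as `Number[psor]` intact.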
morphologizer/model ADDED
Binary file (751 kB)
ner/cfg ADDED
@@ -0,0 +1,13 @@
+ {
+ "moves":null,
+ "update_with_oracle_cut_size":100,
+ "multitasks":[
+
+ ],
+ "min_action_freq":1,
+ "learn_tokens":false,
+ "beam_width":1,
+ "beam_density":0.0,
+ "beam_update_prob":0.0,
+ "incorrect_spans_key":null
+ }
ner/model ADDED
Binary file (220 kB)
ner/moves ADDED
@@ -0,0 +1 @@
+ {"moves":{"0":{},"1":{"ORG":3036,"PER":2084,"LOC":1914},"2":{"ORG":3036,"PER":2084,"LOC":1914},"3":{"ORG":3036,"PER":2084,"LOC":1914},"4":{"ORG":3036,"PER":2084,"LOC":1914,"":1},"5":{"":1}},"cfg":{"neg_key":null}}
ner_transformer/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "max_batch_items":4096
+ }
ner_transformer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ec954b90f0e7d6ddb6e3473315aa58d75ce25f4afb0340776ec337a7817ea614
+ size 440759145
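
The three added lines above are a Git LFS pointer: a small text file of `key value` lines that stands in for the real binary. As a rough illustration, not part of the commit, such a pointer could be read like this. The function name `parse_lfs_pointer` and the returned field names are hypothetical, and the sketch assumes the pointer carries exactly the `version`, `oid`, and `size` keys shown in the diff.

```python
# Pointer text copied from the ner_transformer/model diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ec954b90f0e7d6ddb6e3473315aa58d75ce25f4afb0340776ec337a7817ea614
size 440759145
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of an LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algo"], info["size"])
```

The `size` field shows that the actual transformer weights are about 440 MB, while the pointer stored in Git is only three lines.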
parser/cfg ADDED
@@ -0,0 +1,13 @@
+ {
+ "moves":null,
+ "update_with_oracle_cut_size":100,
+ "multitasks":[
+
+ ],
+ "min_action_freq":30,
+ "learn_tokens":false,
+ "beam_width":1,
+ "beam_density":0.0,
+ "beam_update_prob":0.0,
+ "incorrect_spans_key":null
+ }
parser/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0c8c07ef954f23f206051c3c8ecc076fc8646c31b44d65c6d2c4550f24d05ee2
+ size 1014411
parser/moves ADDED
@@ -0,0 +1 @@
+ {"moves":{"0":{"":26178},"1":{"":20735},"2":{"case":7357,"det":6066,"punct":2497,"nsubj":1946,"cc":1205,"advmod":1181,"mark":1031,"aux:tense":673,"amod":668,"nummod":621,"aux:pass":549,"obl:mod":500,"nsubj:pass":442,"cop":375,"expl:pv":172,"obj":169,"expl:subj":165,"advcl":136,"iobj":97,"nmod":84,"expl:pass":40,"vocative":35,"expl:comp":32,"dep":0},"3":{"nmod":5196,"punct":3204,"amod":2045,"conj":1533,"obj":1393,"obl:mod":1207,"obl:arg":1142,"acl":766,"xcomp":630,"advmod":573,"flat:name":547,"fixed":378,"acl:relcl":369,"ccomp":339,"appos":335,"advcl":269,"obl:agent":202,"parataxis":107,"flat:foreign":97,"nsubj":96,"parataxis:insert":80,"dep":48,"csubj":45},"4":{"ROOT":2231}},"cfg":{"neg_key":null}}
tagger/cfg ADDED
@@ -0,0 +1,254 @@
+ {
+ "label_smoothing":0.0,
+ "labels":[
+ "ADJ",
+ "ADJ__ExtPos=ADV|Gender=Masc|Number=Sing",
+ "ADJ__Gender=Fem|Number=Plur",
+ "ADJ__Gender=Fem|Number=Plur|NumType=Ord",
+ "ADJ__Gender=Fem|Number=Sing",
+ "ADJ__Gender=Fem|Number=Sing|NumType=Ord",
+ "ADJ__Gender=Masc",
+ "ADJ__Gender=Masc|Number=Plur",
+ "ADJ__Gender=Masc|Number=Plur|NumType=Ord",
+ "ADJ__Gender=Masc|Number=Sing",
+ "ADJ__Gender=Masc|Number=Sing|NumType=Ord",
+ "ADJ__NumType=Ord",
+ "ADJ__Number=Plur",
+ "ADJ__Number=Sing",
+ "ADJ__Number=Sing|NumType=Ord",
+ "ADJ__Typo=Yes",
+ "ADP",
+ "ADP_DET__Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
+ "ADP_DET__Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
+ "ADP_DET__Definite=Def|Gender=Masc|Number=Sing|PronType=Art|Typo=Yes",
+ "ADP_DET__Definite=Def|Number=Plur|PronType=Art",
+ "ADP_PRON__Gender=Fem|Number=Plur|PronType=Rel",
+ "ADP_PRON__Gender=Masc|Number=Plur|PronType=Rel",
+ "ADP_PRON__Gender=Masc|Number=Sing|PronType=Rel",
+ "ADP__ExtPos=ADJ",
+ "ADP__ExtPos=ADP",
+ "ADP__ExtPos=ADV",
+ "ADP__ExtPos=CCONJ",
+ "ADP__ExtPos=DET",
+ "ADP__ExtPos=PRON",
+ "ADP__ExtPos=SCONJ",
+ "ADV",
+ "ADV__ExtPos=ADP|Polarity=Neg",
+ "ADV__ExtPos=ADV",
+ "ADV__ExtPos=ADV|Polarity=Neg",
+ "ADV__ExtPos=CCONJ",
+ "ADV__ExtPos=DET|Polarity=Neg",
+ "ADV__ExtPos=PRON",
+ "ADV__ExtPos=SCONJ",
+ "ADV__Polarity=Neg",
+ "ADV__PronType=Int",
+ "AUX__Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part",
+ "AUX__Mood=Cnd|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Cnd|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Cnd|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=1|Tense=Fut|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=1|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Fut|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=1|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Imp|Typo=Yes|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Imp|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "AUX__Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Sing|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "AUX__Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "AUX__Tense=Past|VerbForm=Part",
+ "AUX__Tense=Pres|VerbForm=Part",
+ "AUX__VerbForm=Inf",
+ "CCONJ",
+ "CCONJ__ExtPos=ADJ",
+ "CCONJ__ExtPos=CCONJ",
+ "DET",
+ "DET__Definite=Def|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
+ "DET__Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Def|Number=Plur|PronType=Art",
+ "DET__Definite=Def|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|ExtPos=ADV|Gender=Fem|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|ExtPos=ADV|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|Gender=Fem|Number=Plur|PronType=Art",
+ "DET__Definite=Ind|Gender=Fem|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|Gender=Masc|Number=Plur|PronType=Art",
+ "DET__Definite=Ind|Gender=Masc|Number=Sing|PronType=Art",
+ "DET__Definite=Ind|Number=Plur|PronType=Art",
+ "DET__Definite=Ind|Number=Sing|PronType=Art",
+ "DET__Gender=Fem|Number=Plur",
+ "DET__Gender=Fem|Number=Plur|PronType=Int",
+ "DET__Gender=Fem|Number=Sing",
+ "DET__Gender=Fem|Number=Sing|Polarity=Neg",
+ "DET__Gender=Fem|Number=Sing|Poss=Yes",
+ "DET__Gender=Fem|Number=Sing|PronType=Dem",
+ "DET__Gender=Fem|Number=Sing|PronType=Int",
+ "DET__Gender=Masc|Number=Plur",
+ "DET__Gender=Masc|Number=Sing",
+ "DET__Gender=Masc|Number=Sing|Polarity=Neg",
+ "DET__Gender=Masc|Number=Sing|PronType=Dem",
+ "DET__Gender=Masc|Number=Sing|PronType=Int",
+ "DET__Number=Plur",
+ "DET__Number=Plur|Poss=Yes",
+ "DET__Number=Plur|PronType=Dem",
+ "DET__Number=Sing",
+ "DET__Number=Sing|Poss=Yes",
+ "INTJ",
+ "INTJ__ExtPos=INTJ",
+ "NOUN",
+ "NOUN__ExtPos=ADP|Gender=Fem|Number=Sing",
+ "NOUN__ExtPos=ADV|Gender=Masc|Number=Sing",
+ "NOUN__ExtPos=PROPN|Gender=Fem|Number=Sing",
+ "NOUN__ExtPos=PROPN|Gender=Masc|Number=Plur",
+ "NOUN__ExtPos=PROPN|Gender=Masc|Number=Sing",
+ "NOUN__Gender=Fem",
+ "NOUN__Gender=Fem|Number=Plur",
+ "NOUN__Gender=Fem|Number=Sing",
+ "NOUN__Gender=Fem|Number=Sing|Typo=Yes",
+ "NOUN__Gender=Masc",
+ "NOUN__Gender=Masc|Number=Plur",
+ "NOUN__Gender=Masc|Number=Plur|NumType=Card",
+ "NOUN__Gender=Masc|Number=Sing",
+ "NOUN__Gender=Masc|Number=Sing|NumType=Card",
+ "NOUN__Gender=Masc|Number=Sing|Typo=Yes",
+ "NOUN__NumType=Card",
+ "NOUN__Number=Plur",
+ "NOUN__Number=Sing",
+ "NUM",
+ "NUM__ExtPos=ADJ|Gender=Masc|NumType=Card",
+ "NUM__NumType=Card",
+ "PRON__ExtPos=ADP|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADP|Number=Sing|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADP|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADV|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "PRON__ExtPos=ADV|Number=Sing|Person=3|PronType=Dem",
+ "PRON__ExtPos=ADV|Person=3|PronType=Prs",
+ "PRON__ExtPos=CCONJ|Person=3|PronType=Prs",
+ "PRON__ExtPos=PRON|PronType=Rel",
+ "PRON__Gender=Fem|Number=Plur|Number[psor]=Plur|Person=3|Person[psor]=1|Poss=Yes|PronType=Prs",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Dem",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Ind",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Prs",
+ "PRON__Gender=Fem|Number=Plur|Person=3|PronType=Prs|Typo=Yes",
+ "PRON__Gender=Fem|Number=Plur|PronType=Rel",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Dem",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Ind",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Prs",
+ "PRON__Gender=Fem|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "PRON__Gender=Fem|Number=Sing|PronType=Rel",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Dem",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Ind",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Prs",
+ "PRON__Gender=Masc|Number=Plur|Person=3|PronType=Prs|Reflex=Yes",
+ "PRON__Gender=Masc|Number=Plur|PronType=Rel",
+ "PRON__Gender=Masc|Number=Sing|Person=3|PronType=Dem",
+ "PRON__Gender=Masc|Number=Sing|Person=3|PronType=Ind",
+ "PRON__Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "PRON__Gender=Masc|Number=Sing|PronType=Neg",
+ "PRON__Gender=Masc|Number=Sing|PronType=Rel",
+ "PRON__Number=Plur|Person=1|PronType=Prs",
+ "PRON__Number=Plur|Person=1|PronType=Prs|Reflex=Yes",
+ "PRON__Number=Plur|Person=2|PronType=Prs",
+ "PRON__Number=Plur|Person=2|PronType=Prs|Reflex=Yes",
+ "PRON__Number=Plur|Person=3|PronType=Ind",
+ "PRON__Number=Plur|Person=3|PronType=Prs",
+ "PRON__Number=Sing|Person=1|PronType=Prs",
+ "PRON__Number=Sing|Person=1|PronType=Prs|Reflex=Yes",
+ "PRON__Number=Sing|Person=2|PronType=Prs",
+ "PRON__Number=Sing|Person=3|PronType=Dem",
+ "PRON__Number=Sing|Person=3|PronType=Ind",
+ "PRON__Number=Sing|Person=3|PronType=Prs",
+ "PRON__Number=Sing|PronType=Neg",
+ "PRON__Number=Sing|PronType=Rel",
+ "PRON__Person=3|PronType=Prs",
+ "PRON__Person=3|PronType=Prs|Reflex=Yes",
+ "PRON__PronType=Int",
+ "PRON__PronType=Rel",
+ "PROPN",
+ "PROPN__Gender=Fem|Number=Plur",
+ "PROPN__Gender=Fem|Number=Sing",
+ "PROPN__Gender=Masc",
+ "PROPN__Gender=Masc|Number=Plur",
+ "PROPN__Gender=Masc|Number=Sing",
+ "PROPN__Number=Plur",
+ "PROPN__Number=Sing",
+ "PUNCT",
+ "SCONJ",
+ "SCONJ__ExtPos=SCONJ",
+ "SYM",
+ "SYM__ExtPos=CCONJ",
+ "VERB__ExtPos=SCONJ|Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Plur|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Plur|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Fem|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Sing|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Plur|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Sing|Tense=Past|Typo=Yes|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Gender=Masc|Tense=Past|VerbForm=Part",
+ "VERB__Gender=Masc|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Mood=Cnd|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Cnd|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Imp|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Imp|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=1|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=1|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=2|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Pres|Typo=Yes|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Fut|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Imp|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Ind|Person=3|VerbForm=Fin",
+ "VERB__Mood=Ind|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "VERB__Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "VERB__Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Number=Sing|Tense=Past|VerbForm=Part",
+ "VERB__Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Tense=Past|VerbForm=Part",
+ "VERB__Tense=Past|VerbForm=Part|Voice=Pass",
+ "VERB__Tense=Pres|VerbForm=Part",
+ "VERB__VerbForm=Inf",
+ "X",
+ "X__ExtPos=ADJ",
+ "X__ExtPos=ADV",
+ "X__Foreign=Yes"
+ ],
+ "neg_prefix":"!",
+ "overwrite":false
+ }
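
The tagger labels above follow a visible pattern: a tag such as `ADJ` or `ADP_DET`, optionally followed by a double underscore and a pipe-separated list of `Feature=Value` pairs. The sketch below, which is not part of the commit, splits a label along that pattern. The helper name `split_label` is hypothetical, and reading the part before `__` as the tag and the part after it as morphological features is an assumption based only on the label shapes in this file.

```python
def split_label(label):
    """Split 'TAG__Feat=Val|Feat=Val' into (tag, {feature: value}).

    Hypothetical helper: labels without '__' yield an empty feature dict.
    """
    # partition() splits on the first '__', so a single underscore inside
    # the tag (as in 'ADP_DET') stays part of the tag.
    tag, _, feats = label.partition("__")
    features = dict(pair.split("=", 1) for pair in feats.split("|")) if feats else {}
    return tag, features


if __name__ == "__main__":
    # Labels copied from the tagger/cfg diff above.
    for label in ("PUNCT", "ADJ__Gender=Fem|Number=Plur", "ADP_DET__Definite=Def|Number=Plur|PronType=Art"):
        print(split_label(label))
```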
tagger/model ADDED
Binary file (760 kB)
tokenizer ADDED
@@ -0,0 +1 @@
+ ��prefix_search�e^§|^%|^=|^—|^–|^\+(?![0-9])|^…|^……|^,|^:|^;|^\!|^\?|^¿|^؟|^¡|^\(|^\)|^\[|^\]|^\{|^\}|^<|^>|^_|^#|^\*|^&|^。|^?|^!|^,|^、|^;|^:|^~|^·|^।|^،|^۔|^؛|^٪|^\.\.+|^…|^\'|^"|^”|^“|^`|^‘|^´|^’|^‚|^,|^„|^»|^«|^「|^」|^『|^』|^(|^)|^〔|^〕|^【|^】|^《|^》|^〈|^〉|^〈|^〉|^⟦|^⟧|^\$|^£|^€|^¥|^฿|^US\$|^C\$|^A\$|^₽|^﷼|^₴|^₠|^₡|^₢|^₣|^₤|^₥|^₦|^₧|^₨|^₩|^₪|^₫|^€|^₭|^₮|^₯|^₰|^₱|^₲|^₳|^₴|^₵|^₶|^₷|^₸|^₹|^₺|^₻|^₼|^₽|^₾|^₿|^[\u00A6\u00A9\u00AE\u00B0\u0482\u058D\u058E\u060E\u060F\u06DE\u06E9\u06FD\u06FE\u07F6\u09FA\u0B70\u0BF3-\u0BF8\u0BFA\u0C7F\u0D4F\u0D79\u0F01-\u0F03\u0F13\u0F15-\u0F17\u0F1A-\u0F1F\u0F34\u0F36\u0F38\u0FBE-\u0FC5\u0FC7-\u0FCC\u0FCE\u0FCF\u0FD5-\u0FD8\u109E\u109F\u1390-\u1399\u1940\u19DE-\u19FF\u1B61-\u1B6A\u1B74-\u1B7C\u2100\u2101\u2103-\u2106\u2108\u2109\u2114\u2116\u2117\u211E-\u2123\u2125\u2127\u2129\u212E\u213A\u213B\u214A\u214C\u214D\u214F\u218A\u218B\u2195-\u2199\u219C-\u219F\u21A1\u21A2\u21A4\u21A5\u21A7-\u21AD\u21AF-\u21CD\u21D0\u21D1\u21D3\u21D5-\u21F3\u2300-\u2307\u230C-\u231F\u2322-\u2328\u232B-\u237B\u237D-\u239A\u23B4-\u23DB\u23E2-\u2426\u2440-\u244A\u249C-\u24E9\u2500-\u25B6\u25B8-\u25C0\u25C2-\u25F7\u2600-\u266E\u2670-\u2767\u2794-\u27BF\u2800-\u28FF\u2B00-\u2B2F\u2B45\u2B46\u2B4D-\u2B73\u2B76-\u2B95\u2B98-\u2BC8\u2BCA-\u2BFE\u2CE5-\u2CEA\u2E80-\u2E99\u2E9B-\u2EF3\u2F00-\u2FD5\u2FF0-\u2FFB\u3004\u3012\u3013\u3020\u3036\u3037\u303E\u303F\u3190\u3191\u3196-\u319F\u31C0-\u31E3\u3200-\u321E\u322A-\u3247\u3250\u3260-\u327F\u328A-\u32B0\u32C0-\u32FE\u3300-\u33FF\u4DC0-\u4DFF\uA490-\uA4C6\uA828-\uA82B\uA836\uA837\uA839\uAA77-\uAA79\uFDFD\uFFE4\uFFE8\uFFED\uFFEE\uFFFC\uFFFD\U00010137-\U0001013F\U00010179-\U00010189\U0001018C-\U0001018E\U00010190-\U0001019B\U000101A0\U000101D0-\U000101FC\U00010877\U00010878\U00010AC8\U0001173F\U00016B3C-\U00016B3F\U00016B45\U0001BC9C\U0001D000-\U0001D0F5\U0001D100-\U0001D126\U0001D129-\U0001D164\U0001D16A-\U0001D16C\U0001D183\U0001D184\U0001D18C-\U0001D1A9\U0001D1AE-\U0001D1E8\U0001D200-\U0001D241\U0001D245\U0001D300-\U0001D356\U
0001D800-\U0001D9FF\U0001DA37-\U0001DA3A\U0001DA6D-\U0001DA74\U0001DA76-\U0001DA83\U0001DA85\U0001DA86\U0001ECAC\U0001F000-\U0001F02B\U0001F030-\U0001F093\U0001F0A0-\U0001F0AE\U0001F0B1-\U0001F0BF\U0001F0C1-\U0001F0CF\U0001F0D1-\U0001F0F5\U0001F110-\U0001F16B\U0001F170-\U0001F1AC\U0001F1E6-\U0001F202\U0001F210-\U0001F23B\U0001F240-\U0001F248\U0001F250\U0001F251\U0001F260-\U0001F265\U0001F300-\U0001F3FA\U0001F400-\U0001F6D4\U0001F6E0-\U0001F6EC\U0001F6F0-\U0001F6F9\U0001F700-\U0001F773\U0001F780-\U0001F7D8\U0001F800-\U0001F80B\U0001F810-\U0001F847\U0001F850-\U0001F859\U0001F860-\U0001F887\U0001F890-\U0001F8AD\U0001F900-\U0001F90B\U0001F910-\U0001F93E\U0001F940-\U0001F970\U0001F973-\U0001F976\U0001F97A\U0001F97C-\U0001F9A2\U0001F9B0-\U0001F9B9\U0001F9C0-\U0001F9C2\U0001F9D0-\U0001F9FF\U0001FA60-\U0001FA6D]|^(?:(d|l|n|D|L|N)['’])(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])�suffix_search�3�…$|……
$|,$|:$|;$|\!$|\?$|¿$|؟$|¡$|\($|\)$|\[$|\]$|\{$|\}$|<$|>$|_$|#$|\*$|&$|。$|?$|!$|,$|、$|;$|:$|~$|·$|।$|،$|۔$|؛$|٪$|…$|\'$|"$|”$|“$|`$|‘$|´$|’$|‚$|,$|„$|»$|«$|「$|」$|『$|』$|($|)$|〔$|〕$|【$|】$|《$|》$|〈$|〉$|〈$|〉$|⟦$|⟧$|(?<=[0-9])\+$|(?<=°[FfCcKk])\.$|(?<=[0-9])%$|(?<=[0-9])(?:\$|£|€|¥|฿|US\$|C\$|A\$|₽|﷼|₴|₠|₡|₢|₣|₤|₥|₦|₧|₨|₩|₪|₫|€|₭|₮|₯|₰|₱|₲|₳|₴|₵|₶|₷|₸|₹|₺|₻|₼|₽|₾|₿)$|(?<=[0-9])(?:km|km²|km³|m|m²|m³|dm|dm²|dm³|cm|cm²|cm³|mm|mm²|mm³|ha|µm|nm|yd|in|ft|kg|g|mg|µg|t|lb|oz|m/s|km/h|kmh|mph|hPa|Pa|mbar|mb|MB|kb|KB|gb|GB|tb|TB|T|G|M|K|%|км|км²|км³|м|м²|м³|дм|дм²|дм³|см|см²|см³|мм|мм²|мм³|нм|кг|г|мг|м/с|км/ч|кПа|Па|мбар|Кб|КБ|кб|Мб|М��|мб|Гб|ГБ|гб|Тб|ТБ|тбكم|كم²|كم³|م|م²|م³|سم|سم²|سم³|مم|مم²|مم³|كم|غرام|جرام|جم|كغ|ملغ|كوب|اكواب)$|(?<=[0-9a-z\uFF41-\uFF5A\u00DF-\u00F6\u00F8-\u00FF\u0101\u0103\u0105\u0107\u0109\u010B\u010D\u010F\u0111\u0113\u0115\u0117\u0119\u011B\u011D\u011F\u0121\u0123\u0125\u0127\u0129\u012B\u012D\u012F\u0131\u0133\u0135\u0137\u0138\u013A\u013C\u013E\u0140\u0142\u0144\u0146\u0148\u0149\u014B\u014D\u014F\u0151\u0153\u0155\u0157\u0159\u015B\u015D\u015F\u0161\u0163\u0165\u0167\u0169\u016B\u016D\u016F\u0171\u0173\u0175\u0177\u017A\u017C\u017E\u017F\u0180\u0183\u0185\u0188\u018C\u018D\u0192\u0195\u0199-\u019B\u019E\u01A1\u01A3\u01A5\u01A8\u01AA\u01AB\u01AD\u01B0\u01B4\u01B6\u01B9\u01BA\u01BD-\u01BF\u01C6\u01C9\u01CC\u01CE\u01D0\u01D2\u01D4\u01D6\u01D8\u01DA\u01DC\u01DD\u01DF\u01E1\u01E3\u01E5\u01E7\u01E9\u01EB\u01ED\u01EF\u01F0\u01F3\u01F5\u01F9\u01FB\u01FD\u01FF\u0201\u0203\u0205\u0207\u0209\u020B\u020D\u020F\u0211\u0213\u0215\u0217\u0219\u021B\u021D\u021F\u0221\u0223\u0225\u0227\u0229\u022B\u022D\u022F\u0231\u0233-\u0239\u023C\u023F\u0240\u0242\u0247\u0249\u024B\u024D\u024F\u2C61\u2C65\u2C66\u2C68\u2C6A\u2C6C\u2C71\u2C73\u2C74\u2C76-\u2C7B\uA723\uA725\uA727\uA729\uA72B\uA72D\uA72F-\uA731\uA733\uA735\uA737\uA739\uA73B\uA73D\uA73F\uA741\uA743\uA745\uA747\uA749\uA74B\uA74D\uA74F\uA751\uA753\uA755\uA757\uA759\uA75B\uA75D\uA75F\uA761\uA763\uA765\uA767\uA769\uA76B\uA76D\u
A76F\uA771-\uA778\uA77A\uA77C\uA77F\uA781\uA783\uA785\uA787\uA78C\uA78E\uA791\uA793-\uA795\uA797\uA799\uA79B\uA79D\uA79F\uA7A1\uA7A3\uA7A5\uA7A7\uA7A9\uA7AF\uA7B5\uA7B7\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E01\u1E03\u1E05\u1E07\u1E09\u1E0B\u1E0D\u1E0F\u1E11\u1E13\u1E15\u1E17\u1E19\u1E1B\u1E1D\u1E1F\u1E21\u1E23\u1E25\u1E27\u1E29\u1E2B\u1E2D\u1E2F\u1E31\u1E33\u1E35\u1E37\u1E39\u1E3B\u1E3D\u1E3F\u1E41\u1E43\u1E45\u1E47\u1E49\u1E4B\u1E4D\u1E4F\u1E51\u1E53\u1E55\u1E57\u1E59\u1E5B\u1E5D\u1E5F\u1E61\u1E63\u1E65\u1E67\u1E69\u1E6B\u1E6D\u1E6F\u1E71\u1E73\u1E75\u1E77\u1E79\u1E7B\u1E7D\u1E7F\u1E81\u1E83\u1E85\u1E87\u1E89\u1E8B\u1E8D\u1E8F\u1E91\u1E93\u1E95-\u1E9D\u1E9F\u1EA1\u1EA3\u1EA5\u1EA7\u1EA9\u1EAB\u1EAD\u1EAF\u1EB1\u1EB3\u1EB5\u1EB7\u1EB9\u1EBB\u1EBD\u1EBF\u1EC1\u1EC3\u1EC5\u1EC7\u1EC9\u1ECB\u1ECD\u1ECF\u1ED1\u1ED3\u1ED5\u1ED7\u1ED9\u1EDB\u1EDD\u1EDF\u1EE1\u1EE3\u1EE5\u1EE7\u1EE9\u1EEB\u1EED\u1EEF\u1EF1\u1EF3\u1EF5\u1EF7\u1EF9\u1EFB\u1EFD\u1EFFёа-яәөүҗңһα-ωάέίόώήύа-щюяіїєґѓѕјљњќѐѝ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F%²\-\+(?:\'"”“`‘´’‚,„»«「」『』()〔〕【】《》〈〉〈〉⟦⟧)])\.$|(?<=[A-Z\uFF21-\uFF3A\u00C0-\u00D6\u00D8-\u00DE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011
A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E\u2C7F\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uA7B8\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1
EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFEЁА-ЯӘӨҮҖҢҺΑ-ΩΆΈΊΌΏΉΎА-ЩЮЯІЇЄҐЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F][A-Z\uFF21-\uFF3A\u00C0-\u00D6\u00D8-\u00DE\u0100\u0102\u0104\u0106\u0108\u010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E\u2C7F\uA722\uA724\uA726\uA728\uA7
2A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uA7B8\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFEЁА-ЯӘӨҮҖҢҺΑ-ΩΆΈΊΌΏΉΎА-ЩЮЯІЇЄҐЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])\.$|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00
F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])[-–—‐‑](ce|clés|elle|en|il|ils|je|là|moi|nous|on|t|vous|CE|CLÉS|ELLE|EN|IL|ILS|JE|LÀ|MOI|NOUS|ON|T|VOUS)$|\.\.\.+$|(?<=\d)[\.]$|(?<=[\.])[\]\)]$|[\)\]](?=[\(\[\.A-Za-zäöüÄÖÜàòèéìù0-9]+)$|(?<=')\.\.$|\.\.\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){1})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){2})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){3})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){4})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){5})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){6})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){7})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){8})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){9})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){10})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){11})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){12})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){13})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){14})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){15})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){16})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){17})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){18})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){19})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]
\.){20})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){21})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){22})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){23})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){24})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){25})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){26})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){27})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){28})\.$|(?<=([A-Za-zäöüÄÖÜàòèéìù]\.){29})\.$|(?<=[A-Z])\.$�infix_finditer�Z\.\.+|…|[\u00A6\u00A9\u00AE\u00B0\u0482\u058D\u058E\u060E\u060F\u06DE\u06E9\u06FD\u06FE\u07F6\u09FA\u0B70\u0BF3-\u0BF8\u0BFA\u0C7F\u0D4F\u0D79\u0F01-\u0F03\u0F13\u0F15-\u0F17\u0F1A-\u0F1F\u0F34\u0F36\u0F38\u0FBE-\u0FC5\u0FC7-\u0FCC\u0FCE\u0FCF\u0FD5-\u0FD8\u109E\u109F\u1390-\u1399\u1940\u19DE-\u19FF\u1B61-\u1B6A\u1B74-\u1B7C\u2100\u2101\u2103-\u2106\u2108\u2109\u2114\u2116\u2117\u211E-\u2123\u2125\u2127\u2129\u212E\u213A\u213B\u214A\u214C\u214D\u214F\u218A\u218B\u2195-\u2199\u219C-\u219F\u21A1\u21A2\u21A4\u21A5\u21A7-\u21AD\u21AF-\u21CD\u21D0\u21D1\u21D3\u21D5-\u21F3\u2300-\u2307\u230C-\u231F\u2322-\u2328\u232B-\u237B\u237D-\u239A\u23B4-\u23DB\u23E2-\u2426\u2440-\u244A\u249C-\u24E9\u2500-\u25B6\u25B8-\u25C0\u25C2-\u25F7\u2600-\u266E\u2670-\u2767\u2794-\u27BF\u2800-\u28FF\u2B00-\u2B2F\u2B45\u2B46\u2B4D-\u2B73\u2B76-\u2B95\u2B98-\u2BC8\u2BCA-\u2BFE\u2CE5-\u2CEA\u2E80-\u2E99\u2E9B-\u2EF3\u2F00-\u2FD5\u2FF0-\u2FFB\u3004\u3012\u3013\u3020\u3036\u3037\u303E\u303F\u3190\u3191\u3196-\u319F\u31C0-\u31E3\u3200-\u321E\u322A-\u3247\u3250\u3260-\u327F\u328A-\u32B0\u32C0-\u32FE\u3300-\u33FF\u4DC0-\u4DFF\uA490-\uA4C6\uA828-\uA82B\uA836\uA837\uA839\uAA77-\uAA79\uFDFD\uFFE4\uFFE8\uFFED\uFFEE\uFFFC\uFFFD\U00010137-\U0001013F\U00010179-\U00010189\U0001018C-\U0001018E\U00010190-\U0001019B\U000101A0\U000101D0-\U000101FC\U00010877\U00010878\U00010AC8\U0001173F\U00016B3C-\U00016B3F\U00016B45\U0001BC9C\U0001D000-\U0001D0F5\U0001D100-\U0001D126\U0001D129-\U0001D164\U0001D16A-\U0001D16C\U0001D183\U0001D184\U0001D18C-\U0001D1A9\U0001D1AE-\U0001D1E8\U0001D200-\U0001D241\U0001D245\U0001D300-\U0001D356\U0001D800-\U0001D9FF\U000
1DA37-\U0001DA3A\U0001DA6D-\U0001DA74\U0001DA76-\U0001DA83\U0001DA85\U0001DA86\U0001ECAC\U0001F000-\U0001F02B\U0001F030-\U0001F093\U0001F0A0-\U0001F0AE\U0001F0B1-\U0001F0BF\U0001F0C1-\U0001F0CF\U0001F0D1-\U0001F0F5\U0001F110-\U0001F16B\U0001F170-\U0001F1AC\U0001F1E6-\U0001F202\U0001F210-\U0001F23B\U0001F240-\U0001F248\U0001F250\U0001F251\U0001F260-\U0001F265\U0001F300-\U0001F3FA\U0001F400-\U0001F6D4\U0001F6E0-\U0001F6EC\U0001F6F0-\U0001F6F9\U0001F700-\U0001F773\U0001F780-\U0001F7D8\U0001F800-\U0001F80B\U0001F810-\U0001F847\U0001F850-\U0001F859\U0001F860-\U0001F887\U0001F890-\U0001F8AD\U0001F900-\U0001F90B\U0001F910-\U0001F93E\U0001F940-\U0001F970\U0001F973-\U0001F976\U0001F97A\U0001F97C-\U0001F9A2\U0001F9B0-\U0001F9B9\U0001F9C0-\U0001F9C2\U0001F9D0-\U0001F9FF\U0001FA60-\U0001FA6D]|(?<=[a-z\uFF41-\uFF5A\u00DF-\u00F6\u00F8-\u00FF\u0101\u0103\u0105\u0107\u0109\u010B\u010D\u010F\u0111\u0113\u0115\u0117\u0119\u011B\u011D\u011F\u0121\u0123\u0125\u0127\u0129\u012B\u012D\u012F\u0131\u0133\u0135\u0137\u0138\u013A\u013C\u013E\u0140\u0142\u0144\u0146\u0148\u0149\u014B\u014D\u014F\u0151\u0153\u0155\u0157\u0159\u015B\u015D\u015F\u0161\u0163\u0165\u0167\u0169\u016B\u016D\u016F\u0171\u0173\u0175\u0177\u017A\u017C\u017E\u017F\u0180\u0183\u0185\u0188\u018C\u018D\u0192\u0195\u0199-\u019B\u019E\u01A1\u01A3\u01A5\u01A8\u01AA\u01AB\u01AD\u01B0\u01B4\u01B6\u01B9\u01BA\u01BD-\u01BF\u01C6\u01C9\u01CC\u01CE\u01D0\u01D2\u01D4\u01D6\u01D8\u01DA\u01DC\u01DD\u01DF\u01E1\u01E3\u01E5\u01E7\u01E9\u01EB\u01ED\u01EF\u01F0\u01F3\u01F5\u01F9\u01FB\u01FD\u01FF\u0201\u0203\u0205\u0207\u0209\u020B\u020D\u020F\u0211\u0213\u0215\u0217\u0219\u021B\u021D\u021F\u0221\u0223\u0225\u0227\u0229\u022B\u022D\u022F\u0231\u0233-\u0239\u023C\u023F\u0240\u0242\u0247\u0249\u024B\u024D\u024F\u2C61\u2C65\u2C66\u2C68\u2C6A\u2C6C\u2C71\u2C73\u2C74\u2C76-\u2C7B\uA723\uA725\uA727\uA729\uA72B\uA72D\uA72F-\uA731\uA733\uA735\uA737\uA739\uA73B\uA73D\uA73F\uA741\uA743\uA745\uA747\uA749\uA74B\uA74D\uA74F\uA751\uA753\uA755\uA757\uA7
59\uA75B\uA75D\uA75F\uA761\uA763\uA765\uA767\uA769\uA76B\uA76D\uA76F\uA771-\uA778\uA77A\uA77C\uA77F\uA781\uA783\uA785\uA787\uA78C\uA78E\uA791\uA793-\uA795\uA797\uA799\uA79B\uA79D\uA79F\uA7A1\uA7A3\uA7A5\uA7A7\uA7A9\uA7AF\uA7B5\uA7B7\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E01\u1E03\u1E05\u1E07\u1E09\u1E0B\u1E0D\u1E0F\u1E11\u1E13\u1E15\u1E17\u1E19\u1E1B\u1E1D\u1E1F\u1E21\u1E23\u1E25\u1E27\u1E29\u1E2B\u1E2D\u1E2F\u1E31\u1E33\u1E35\u1E37\u1E39\u1E3B\u1E3D\u1E3F\u1E41\u1E43\u1E45\u1E47\u1E49\u1E4B\u1E4D\u1E4F\u1E51\u1E53\u1E55\u1E57\u1E59\u1E5B\u1E5D\u1E5F\u1E61\u1E63\u1E65\u1E67\u1E69\u1E6B\u1E6D\u1E6F\u1E71\u1E73\u1E75\u1E77\u1E79\u1E7B\u1E7D\u1E7F\u1E81\u1E83\u1E85\u1E87\u1E89\u1E8B\u1E8D\u1E8F\u1E91\u1E93\u1E95-\u1E9D\u1E9F\u1EA1\u1EA3\u1EA5\u1EA7\u1EA9\u1EAB\u1EAD\u1EAF\u1EB1\u1EB3\u1EB5\u1EB7\u1EB9\u1EBB\u1EBD\u1EBF\u1EC1\u1EC3\u1EC5\u1EC7\u1EC9\u1ECB\u1ECD\u1ECF\u1ED1\u1ED3\u1ED5\u1ED7\u1ED9\u1EDB\u1EDD\u1EDF\u1EE1\u1EE3\u1EE5\u1EE7\u1EE9\u1EEB\u1EED\u1EEF\u1EF1\u1EF3\u1EF5\u1EF7\u1EF9\u1EFB\u1EFD\u1EFFёа-яәөүҗңһα-ωάέίόώήύа-щюяіїєґѓѕјљњќѐѝ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F\'"”“`‘´’‚,„»«「」『』()〔〕【】《》〈〉〈〉⟦⟧])\.(?=[A-Z\uFF21-\uFF3A\u00C0-\u00D6\u00D8-\u00DE\u0100\u0102\u0104\u0106\u0108\u
010A\u010C\u010E\u0110\u0112\u0114\u0116\u0118\u011A\u011C\u011E\u0120\u0122\u0124\u0126\u0128\u012A\u012C\u012E\u0130\u0132\u0134\u0136\u0139\u013B\u013D\u013F\u0141\u0143\u0145\u0147\u014A\u014C\u014E\u0150\u0152\u0154\u0156\u0158\u015A\u015C\u015E\u0160\u0162\u0164\u0166\u0168\u016A\u016C\u016E\u0170\u0172\u0174\u0176\u0178\u0179\u017B\u017D\u0181\u0182\u0184\u0186\u0187\u0189-\u018B\u018E-\u0191\u0193\u0194\u0196-\u0198\u019C\u019D\u019F\u01A0\u01A2\u01A4\u01A6\u01A7\u01A9\u01AC\u01AE\u01AF\u01B1-\u01B3\u01B5\u01B7\u01B8\u01BC\u01C4\u01C7\u01CA\u01CD\u01CF\u01D1\u01D3\u01D5\u01D7\u01D9\u01DB\u01DE\u01E0\u01E2\u01E4\u01E6\u01E8\u01EA\u01EC\u01EE\u01F1\u01F4\u01F6-\u01F8\u01FA\u01FC\u01FE\u0200\u0202\u0204\u0206\u0208\u020A\u020C\u020E\u0210\u0212\u0214\u0216\u0218\u021A\u021C\u021E\u0220\u0222\u0224\u0226\u0228\u022A\u022C\u022E\u0230\u0232\u023A\u023B\u023D\u023E\u0241\u0243-\u0246\u0248\u024A\u024C\u024E\u2C60\u2C62-\u2C64\u2C67\u2C69\u2C6B\u2C6D-\u2C70\u2C72\u2C75\u2C7E\u2C7F\uA722\uA724\uA726\uA728\uA72A\uA72C\uA72E\uA732\uA734\uA736\uA738\uA73A\uA73C\uA73E\uA740\uA742\uA744\uA746\uA748\uA74A\uA74C\uA74E\uA750\uA752\uA754\uA756\uA758\uA75A\uA75C\uA75E\uA760\uA762\uA764\uA766\uA768\uA76A\uA76C\uA76E\uA779\uA77B\uA77D\uA77E\uA780\uA782\uA784\uA786\uA78B\uA78D\uA790\uA792\uA796\uA798\uA79A\uA79C\uA79E\uA7A0\uA7A2\uA7A4\uA7A6\uA7A8\uA7AA-\uA7AE\uA7B0-\uA7B4\uA7B6\uA7B8\u1E00\u1E02\u1E04\u1E06\u1E08\u1E0A\u1E0C\u1E0E\u1E10\u1E12\u1E14\u1E16\u1E18\u1E1A\u1E1C\u1E1E\u1E20\u1E22\u1E24\u1E26\u1E28\u1E2A\u1E2C\u1E2E\u1E30\u1E32\u1E34\u1E36\u1E38\u1E3A\u1E3C\u1E3E\u1E40\u1E42\u1E44\u1E46\u1E48\u1E4A\u1E4C\u1E4E\u1E50\u1E52\u1E54\u1E56\u1E58\u1E5A\u1E5C\u1E5E\u1E60\u1E62\u1E64\u1E66\u1E68\u1E6A\u1E6C\u1E6E\u1E70\u1E72\u1E74\u1E76\u1E78\u1E7A\u1E7C\u1E7E\u1E80\u1E82\u1E84\u1E86\u1E88\u1E8A\u1E8C\u1E8E\u1E90\u1E92\u1E94\u1E9E\u1EA0\u1EA2\u1EA4\u1EA6\u1EA8\u1EAA\u1EAC\u1EAE\u1EB0\u1EB2\u1EB4\u1EB6\u1EB8\u1EBA\u1EBC\u1EBE\u1EC0\u1EC2\u1EC4\u1EC6\u1EC8\u1ECA\u1ECC\u1ECE\u1ED0
\u1ED2\u1ED4\u1ED6\u1ED8\u1EDA\u1EDC\u1EDE\u1EE0\u1EE2\u1EE4\u1EE6\u1EE8\u1EEA\u1EEC\u1EEE\u1EF0\u1EF2\u1EF4\u1EF6\u1EF8\u1EFA\u1EFC\u1EFEЁА-ЯӘӨҮҖҢҺΑ-ΩΆΈΊΌΏΉΎА-ЩЮЯІЇЄҐЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F\'"”“`‘´’‚,„»«「」『』()〔〕【】《》〈〉〈〉⟦⟧])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\
u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F]),(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\
u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])(?:-|–|—|--|---|——|~)(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A
\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F0-9])[:<>=/](?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u
2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EFF\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F]['’])(?=[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0100-\u017F\u0180-\u01BF\u01C4-\u024F\u2C60-\u2C7B\u2C7E\u2C7F\uA722-\uA76F\uA771-\uA787\uA78B-\uA78E\uA790-\uA7B9\uA7FA\uAB30-\uAB5A\uAB60-\uAB64\u0250-\u02AF\u1D00-\u1D25\u1D6B-\u1D77\u1D79-\u1D9A\u1E00-\u1EFFёа-яЁА-ЯәөүҗңһӘӨҮҖҢҺα-ωάέίόώήύΑ-ΩΆΈΊΌΏΉΎа-щюяіїєґА-ЩЮЯІЇЄҐѓѕјљњќѐѝЃЅЈЉЊЌЀЍ\u1200-\u137F\u0980-\u09FF\u0591-\u05F4\uFB1D-\uFB4F\u0620-\u064A\u066E-\u06D5\u06E5-\u06FF\u0750-\u077F\u08A0-\u08BD\uFB50-\uFBB1\uFBD3-\uFD3D\uFD50-\uFDC7\uFDF0-\uFDFB\uFE70-\uFEFC\U0001EE00-\U0001EEBB\u0D80-\u0DFF\u0900-\u097F\u0C80-\u0CFF\u0B80-\u0BFF\u0C00-\u0C7F\uAC00-\uD7AF\u1100-\u11FF\u3040-\u309F\u30A0-\u30FFー\u4E00-\u62FF\u6300-\u77FF\u7800-\u8CFF\u8D00-\u9FFF\u3400-\u4DBF\U00020000-\U000215FF\U00021600-\U000230FF\U00023100-\U000245FF\U00024600-\U000260FF\U00026100-\U000275FF\U00027600-\U000290FF\U00029100-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0002B820-\U0002CEAF\U0002CEB0-\U0002EBEF\u2E80-\u2EF
F\u2F00-\u2FDF\u2FF0-\u2FFF\u3000-\u303F\u31C0-\u31EF\u3200-\u32FF\u3300-\u33FF\uF900-\uFAFF\uFE30-\uFE4F\U0001F200-\U0001F2FF\U0002F800-\U0002FA1F])|(?<=[0-9])[+\*^](?=[0-9])|[\(\[\]\)]|(?<=\.--)\.|\.(?=[A-Za-zäöüÄÖÜàòèéìù]{3,20})|'\.\.|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){3})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){4})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){5})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){6})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){7})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){8})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){9})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){10})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){11})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){12})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){13})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){14})\.(?!(ch|at|de|com
|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){15})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){16})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){17})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){18})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){19})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){20})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){21})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){22})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){23})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){24})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){25})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){26})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-Zäöü
ÄÖÜ]){27})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){28})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|(?<!www\.)(?<=([a-zA-ZäöüÄÖÜ]){29})\.(?!(ch|at|de|com|edu|org|gov|net|fr|uk|be|es|pl|it|eu|nl|ba|cz|dk|al|ad|bg|by|fi|gr|ie|li|lu|no|pt|ro|rs|ru|se|si|sk))|[A-Z](?=\. )|(?<=([0-3][1-9]\.[0-1][1-9]\.[1-2][0-9]{3}))\.�token_match�^\[$�url_match��exceptions�B�Br..��A�Br.�A�.�CO..��A�CO.�A�.�Ch..��A�Ch.�A�.�Chr..��A�Chr.�A�.�Cie..��A�Cie.�A�.�Co..��A�Co.�A�.�Corp..��A�Corp.�A�.�Dr..��A�Dr.�A�.�Fa..��A�Fa.�A�.�Gen..��A�Gen.�A�.�HRegV..��A�HRegV.�A�.�Inc..��A�Inc.�A�.�Ing..��A�Ing.�A�.�Inh..��A�Inh.�A�.�Int..��A�Int.�A�.�Jr..��A�Jr.�A�.�LTD..��A�LTD.�A�.�Liq..��A�Liq.�A�.�Ltd..��A�Ltd.�A�.�M.Sc..��A�M.Sc.�A�.�Psy-K..��A�Psy-K.�A�.�R.l.��A�R.l.�S.a.g.l.��A�S.a.g.l.�S.a.r.l.��A�S.a.r.l.�S.c.r.l.��A�S.c.r.l.�S.r.��A�S.r.�S.r.l.��A�S.r.l.�S.à.r.l.��A�S.à.r.l.�Std..��A�Std.�A�.�Ti..��A�Ti.�A�.�Var..��A�Var.�A�.�ag..��A�ag.�A�.�ass..��A�ass.�A�.�b.v..��A�b.v.�A�.�cf.��A�cf.�co..��A�co.�A�.�d.o.o..��A�d.o.o.�A�.�dgl..��A�dgl.�A�.�dipl.fed..��A�dipl.fed.�A�.�div..��A�div.�A�.�ecc..��A�ecc.�A�.�ect..��A�ect.�A�.�ehf..��A�ehf.�A�.�env..��A�env.�A�.�etc.��A�etc.�fed..��A�fed.�A�.�g.l.��A�g.l.�iu..��A�iu.�A�.�ltd..��A�ltd.�A�.�méd..��A�méd.�A�.�r.l.��A�r.l.�s.a..��A�s.a.�A�.�s.a.r.l.��A�s.a.r.l.�s.à.r.l.��A�s.à.r.l.�s.àr.l.��A�s.àr.l.�sen..��A�sen.�A�.�sf..��A�sf.�A�.�succ..��A�succ.�A�.�u.a.��A�u.a.�u.a.m.��A�u.a.m.�u.d.g.��A�u.d.g.�u.s.w.��A�u.s.w.�u.ä.��A�u.ä.�usw.��A�usw.�v.k.s.s..��A�v.k.s.s.�A�.�ä..��A�ä.�A�.�faster_heuristics�
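Note on the serialized suffix rules above: the same lookbehind is written out once per repetition count (`{1}` through `{29}`), presumably because Python's `re` module only accepts fixed-width lookbehind assertions. The sketch below illustrates that technique with a simplified character class; `LETTER`, `dotted_abbrev_suffix`, and `variable_width_is_rejected` are hypothetical names for illustration and are not part of this commit.

```python
import re

# Simplified, hypothetical stand-in for the letter class used in the suffix rules.
LETTER = r"[A-Za-zäöüÄÖÜàòèéìù]"


def dotted_abbrev_suffix(max_repeats: int) -> str:
    """Build one fixed-width lookbehind alternative per repetition count."""
    return "|".join(
        rf"(?<=(?:{LETTER}\.){{{n}}})\.$" for n in range(1, max_repeats + 1)
    )


def variable_width_is_rejected() -> bool:
    """Check that an open-ended quantifier inside a lookbehind fails to compile."""
    try:
        re.compile(rf"(?<=(?:{LETTER}\.)+)\.$")
    except re.error:
        return True
    return False


# Matches a trailing period that follows one to three "letter + period" groups.
pattern = re.compile(dotted_abbrev_suffix(3))
```

With this pattern, a string such as `S.a..` has its final period matched as a suffix, while `abc.` is left alone because no "letter + period" group precedes the final period.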
trainable_lemmatizer/cfg ADDED
@@ -0,0 +1,330 @@
+ {
+   "labels":[
+     0,
+     3,
+     5,
+     7,
+     9,
+     11,
+     13,
+     15,
+     19,
+     21,
+     23,
+     25,
+     27,
+     29,
+     31,
+     32,
+     34,
+     36,
+     37,
+     40,
+     42,
+     44,
+     46,
+     48,
+     50,
+     53,
+     56,
+     59,
+     62,
+     66,
+     68,
+     70,
+     73,
+     75,
+     77,
+     78,
+     80,
+     82,
+     84,
+     85,
+     87,
+     91,
+     93,
+     95,
+     98,
+     99,
+     102,
+     104,
+     105,
+     107,
+     108,
+     110,
+     113,
+     117,
+     119,
+     120,
+     122,
+     124,
+     126,
+     127,
+     128,
+     130,
+     132,
+     133,
+     135,
+     136,
+     137,
+     139,
+     141,
+     143,
+     145,
+     146,
+     149,
+     151,
+     154,
+     155,
+     157,
+     159,
+     161,
+     163,
+     164,
+     166,
+     167,
+     169,
+     171,
+     173,
+     175,
+     177,
+     179,
+     182,
+     184,
+     186,
+     188,
+     190,
+     193,
+     195,
+     196,
+     198,
+     201,
+     203,
+     207,
+     211,
+     213,
+     215,
+     217,
+     219,
+     221,
+     223,
+     224,
+     227,
+     229,
+     231,
+     233,
+     235,
+     236,
+     239,
+     240,
+     241,
+     242,
+     245,
+     247,
+     248,
+     249,
+     254,
+     257,
+     258,
+     261,
+     262,
+     264,
+     266,
+     268,
+     269,
+     272,
+     274,
+     276,
+     277,
+     279,
+     282,
+     284,
+     286,
+     288,
+     291,
+     292,
+     294,
+     296,
+     297,
+     300,
+     302,
+     304,
+     306,
+     310,
+     312,
+     314,
+     315,
+     316,
+     318,
+     320,
+     322,
+     324,
+     326,
+     328,
+     330,
+     332,
+     335,
+     337,
+     340,
+     343,
+     345,
+     349,
+     351,
+     354,
+     356,
+     358,
+     359,
+     360,
+     361,
+     365,
+     368,
+     370,
+     371,
+     374,
+     377,
+     379,
+     382,
+     383,
+     384,
+     386,
+     388,
+     390,
+     392,
+     393,
+     394,
+     396,
+     397,
+     398,
+     400,
+     403,
+     406,
+     408,
+     410,
+     412,
+     413,
+     415,
+     417,
+     418,
+     420,
+     421,
+     422,
+     424,
+     426,
+     427,
+     431,
+     433,
+     436,
+     438,
+     439,
+     441,
+     442,
+     444,
+     446,
+     447,
+     449,
+     452,
+     457,
+     460,
+     462,
+     463,
+     464,
+     465,
+     467,
+     468,
+     469,
+     470,
+     471,
+     472,
+     473,
+     474,
+     475,
+     477,
+     479,
+     481,
+     482,
+     483,
+     485,
+     487,
+     488,
+     489,
+     491,
+     493,
+     494,
+     495,
+     496,
+     497,
+     498,
+     499,
+     500,
+     501,
+     502,
+     503,
+     504,
+     506,
+     507,
+     508,
+     509,
+     510,
+     511,
+     512,
+     513,
+     515,
+     516,
+     517,
+     518,
+     519,
+     521,
+     523,
+     525,
+     529,
+     530,
+     532,
+     534,
+     536,
+     537,
+     538,
+     542,
+     544,
+     547,
+     549,
+     551,
+     554,
+     556,
+     559,
+     561,
+     564,
+     565,
+     569,
+     571,
+     573,
+     575,
+     577,
+     580,
+     581,
+     584,
+     586,
+     588,
+     590,
+     591,
+     592,
+     594,
+     597,
+     599,
+     602,
+     603,
+     605,
+     607,
+     608,
+     610,
+     612,
+     614,
+     616,
+     617,
+     618,
+     619,
+     620,
+     622,
+     624,
+     626
+   ]
+ }
trainable_lemmatizer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7e805d213d14e2ffc16874198aea9f567442bb90a708f6b184fb094c7a266d20
+ size 1003429
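The three added lines for `trainable_lemmatizer/model` are a Git LFS pointer rather than the model weights themselves: a spec version, a content hash, and a byte size. As a minimal sketch of how such a pointer could be read, the helper below splits it into fields; `parse_lfs_pointer` is a hypothetical name used for illustration and is not part of this repository or of the Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7e805d213d14e2ffc16874198aea9f567442bb90a708f6b184fb094c7a266d20
size 1003429
"""

info = parse_lfs_pointer(pointer)
```

Here `info["size"]` is the size in bytes of the real model file that Git LFS would fetch in place of the pointer.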
trainable_lemmatizer/trees ADDED
Binary file (101 kB).
use_custom_tokenizer.py ADDED
@@ -0,0 +1,13 @@
+ from spacy.util import registry
+
+ from commercial_registry_ner.spacy.custom_tokenizer.custom_tokenizer import (
+     custom_tokenizer,
+ )
+
+
+ @registry.tokenizers("customize_tokenizer")
+ def make_customize_tokenizer():
+     def customize_tokenizer(nlp):
+         return custom_tokenizer(nlp)
+
+     return customize_tokenizer
vocab/key2row ADDED
@@ -0,0 +1 @@
+
vocab/lookups.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76be8b528d0075f7aae98d6fa57a6d3c83ae480a8469e668d7b0af968995ac71
+ size 1
vocab/strings.json ADDED
The diff for this file is too large to render. See raw diff
 
vocab/vectors ADDED
Binary file (128 Bytes).
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
+ {
+   "mode":"default"
+ }