metadata

language:
  - bar
library_name: flair
pipeline_tag: token-classification
base_model: deepset/gbert-large
widget:
  - text: >-
      Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo
      Minga und liagt im gleichnoming Landkroas .
tags:
  - flair
  - token-classification
  - sequence-tagger-model
  - arxiv:2403.12749
  - O'zapft is!
  - 🥨
license: apache-2.0

Flair NER Model for Recognizing Named Entities in Bavarian Dialectal Data (Wikipedia)

This (unofficial) Flair NER model was trained on annotated Bavarian Wikipedia articles from the BarNER dataset, which was introduced in the LREC-COLING 2024 paper "Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data" (also available on arXiv) by Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova and Barbara Plank.

The released dataset is used in its coarse-grained setting, which is shown in Table 3 of the paper. The model recognizes the following named entity types:

PER
LOC
ORG
MISC

Fine-Tuning

We perform a hyper-parameter search over the following parameters:

Batch Sizes: [32, 16]
Learning Rates: [7e-06, 8e-06, 9e-06, 1e-05]
Epochs: [20]
Subword Pooling: [first]
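
The search space above can be enumerated as a small grid. The following is a minimal sketch, not the original training script: the variable names are hypothetical, and the naming scheme (`bs<batch size>-e<epochs>-lr<learning rate>`) is inferred from the configuration names used in the result tables of this card.

```python
from itertools import product

# Search space as listed above (subword pooling is fixed to "first").
batch_sizes = [32, 16]
learning_rates = [7e-06, 8e-06, 9e-06, 1e-05]
epochs = [20]

# Build one configuration name per grid point, e.g. "bs32-e20-lr1e-05".
# The name format is an assumption inferred from the result tables.
configs = [
    f"bs{bs}-e{ep}-lr{lr}"
    for bs, ep, lr in product(batch_sizes, epochs, learning_rates)
]

for name in configs:
    print(name)
```

The grid has 2 × 4 × 1 = 8 points, which matches the eight configurations reported on the development set.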

As the base model, we use GBERT Large. Each configuration is trained with three different seeds, and we report the averaged F1-Score on the development set:

Configuration	Run 1	Run 2	Run 3	Avg.
`bs32-e20-lr1e-05`	76.96	77.00	77.71	77.22 ± 0.34
`bs32-e20-lr8e-06`	76.75	76.21	77.38	76.78 ± 0.48
`bs16-e20-lr1e-05`	76.81	76.29	76.02	76.37 ± 0.33
`bs32-e20-lr7e-06`	75.44	76.71	75.90	76.02 ± 0.52
`bs32-e20-lr9e-06`	75.69	75.99	76.2	75.96 ± 0.21
`bs16-e20-lr8e-06`	74.82	76.83	76.14	75.93 ± 0.83
`bs16-e20-lr7e-06`	76.77	74.82	76.04	75.88 ± 0.80
`bs16-e20-lr9e-06`	76.55	74.25	76.54	75.78 ± 1.08
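
The "Avg." column can be recomputed from the three runs. The sketch below is an illustration rather than the original evaluation code: the helper name `summarize` is hypothetical, and the use of the population standard deviation is an assumption, chosen because it appears to reproduce the reported ± values.

```python
from statistics import mean, pstdev

# Development-set F1-Scores per seed, copied from the table above.
dev_runs = {
    "bs32-e20-lr1e-05": [76.96, 77.00, 77.71],
    "bs32-e20-lr8e-06": [76.75, 76.21, 77.38],
    "bs16-e20-lr9e-06": [76.55, 74.25, 76.54],
}


def summarize(scores):
    """Format scores as 'mean ± std' with two decimals.

    Uses the population standard deviation (assumption, see lead-in).
    """
    return f"{mean(scores):.2f} ± {pstdev(scores):.2f}"


for name, scores in dev_runs.items():
    print(name, summarize(scores))
```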

The hyper-parameter configuration bs32-e20-lr1e-05 yields the best results on the development set, so we use this configuration to report the averaged F1-Score on the test set:

Configuration	Run 1	Run 2	Run 3	Avg.
`bs32-e20-lr1e-05`	72.10	74.33	72.97	73.13 ± 0.92

Our averaged result on the test set is higher than the 72.17 reported in the original paper (see Table 5, in-domain training results).

For upload, we used the best-performing model on the development set (Run 3 of bs32-e20-lr1e-05, with an F1-Score of 77.71). It achieves 72.97 on the final test set.

Flair Demo

The following snippet shows how to use the fine-tuned NER model with Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("stefan-it/flair-barner-wiki-coarse-gbert-large")

# make example sentence
sentence = Sentence("Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo Minga und liagt im gleichnoming Landkroas .")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
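
For downstream processing, it can be handy to turn the predicted spans into plain dictionaries instead of printing them. The helper below is a sketch under stated assumptions: the function name `spans_to_records` is hypothetical, and it relies only on the `get_spans`, `get_label`, `text`, `value` and `score` members that Flair exposes on sentences, spans and labels.

```python
def spans_to_records(sentence, label_type="ner"):
    """Collect predicted spans as a list of plain dictionaries.

    `sentence` is expected to be a Flair Sentence that has already been
    passed to `tagger.predict(...)`.
    """
    records = []
    for span in sentence.get_spans(label_type):
        label = span.get_label(label_type)
        records.append(
            {
                "text": span.text,            # surface form of the entity
                "label": label.value,         # e.g. PER, LOC, ORG or MISC
                "score": float(label.score),  # model confidence
            }
        )
    return records
```

After `tagger.predict(sentence)` in the snippet above, `spans_to_records(sentence)` would return one dictionary per recognized entity.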