Arabic NER Model for AQMAR dataset

Training ran for 86 epochs (early-stopped from a maximum of 150) with a mini-batch size of 48, using fastText and Flair forward and backward embeddings. The learning rate started at 0.3 and was annealed (factor 0.5, patience 3) down to a final rate of 2e-05.
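For reference, here is a minimal sketch of how this setup could be reproduced with the Flair trainer API of that era. The data directory, column layout, and output path are assumptions, and the stock `ar-forward`/`ar-backward` checkpoints stand in for the character language models shown in the configuration dump below:

```python
# Hypothetical training sketch (Flair 0.8-era API); paths are placeholders.
from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# AQMAR-style CoNLL column format: token, NER tag (assumed layout)
columns = {0: 'text', 1: 'ner'}
corpus: Corpus = ColumnCorpus('data/aqmar', columns)  # hypothetical path

tag_dictionary = corpus.make_tag_dictionary(tag_type='ner')

# fastText Arabic word embeddings (300d) stacked with Flair
# forward/backward character LM embeddings (2048d each) = 4396d total
embeddings = StackedEmbeddings([
    WordEmbeddings('ar'),
    FlairEmbeddings('ar-forward'),
    FlairEmbeddings('ar-backward'),
])

tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type='ner')

trainer = ModelTrainer(tagger, corpus)
trainer.train('models/arabic-ner',          # hypothetical output path
              learning_rate=0.3,            # annealed by 0.5 after 3 bad epochs
              mini_batch_size=48,
              anneal_factor=0.5,
              patience=3,
              max_epochs=150)               # early-stopped around epoch 86
```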

Original dataset: AQMAR (American and Qatari Modeling of Arabic), a named-entity-annotated corpus of Arabic Wikipedia articles.

Results:

  • F1-score (micro): 0.9323
  • F1-score (macro): 0.9272

| Class | True Positives | False Positives | False Negatives | Precision | Recall | F1 |
|-------|---------------:|----------------:|----------------:|----------:|-------:|---:|
| LOC   | 164 | 7  | 13 | 0.9591 | 0.9266 | 0.9425 |
| MISC  | 398 | 22 | 37 | 0.9476 | 0.9149 | 0.9310 |
| ORG   | 65  | 6  | 9  | 0.9155 | 0.8784 | 0.8966 |
| PER   | 199 | 13 | 13 | 0.9387 | 0.9387 | 0.9387 |
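The micro score pools TP/FP/FN counts over all classes before computing F1, while the macro score is the unweighted mean of the per-class F1 values. A small sanity-check sketch (the helper below is illustrative, not part of the model's tooling) recovers both numbers from the counts in the table:

```python
# Recompute micro- and macro-F1 from the per-class counts above.
counts = {'LOC': (164, 7, 13), 'MISC': (398, 22, 37),
          'ORG': (65, 6, 9), 'PER': (199, 13, 13)}

def f1(tp, fp, fn):
    p, r = tp / (tp + fp), tp / (tp + fn)
    return 2 * p * r / (p + r)

# macro-F1: unweighted mean of per-class F1 scores
print(sum(f1(*c) for c in counts.values()) / len(counts))  # ~0.9272

# micro-F1: pool TP/FP/FN over all classes, then compute F1 once
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
print(f1(tp, fp, fn))                                      # ~0.9323
```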

Usage

```python
from flair.data import Sentence
from flair.models import SequenceTagger
from icecream import ic

# English model for the English example, Arabic model for the Arabic example
tagger = SequenceTagger.load('ner')
arTagger = SequenceTagger.load('megantosh/flair-arabic-MSA-aqmar')

sentence = Sentence('George Washington went to Washington .')
arSentence = Sentence('عمرو عادلي أستاذ للاقتصاد السياسي المساعد في الجامعة الأمريكية  بالقاهرة .')

# predict NER tags
tagger.predict(sentence)
arTagger.predict(arSentence)

# print sentences with predicted tags
ic(sentence.to_tagged_string())
ic(arSentence.to_tagged_string())
```
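Note that `to_tagged_string()` is a method and must be called with parentheses; it returns the tokenized sentence with the predicted labels inlined after each tagged token.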

Example

See an example from a similar NER model in Flair.

Model Configuration

```
SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('ar')
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.1, inplace=False)
        (encoder): Embedding(7125, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=7125, bias=True)
      )
    )
    (list_embedding_2): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.1, inplace=False)
        (encoder): Embedding(7125, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=7125, bias=True)
      )
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=4396, out_features=4396, bias=True)
  (rnn): LSTM(4396, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=14, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)
2021-03-31 22:19:50,654 ----------------------------------------------------------------------------------------------------
2021-03-31 22:19:50,654 Corpus: "Corpus: 3025 train + 336 dev + 373 test sentences"
2021-03-31 22:19:50,654 ----------------------------------------------------------------------------------------------------
2021-03-31 22:19:50,654 Parameters:
2021-03-31 22:19:50,654  - learning_rate: "0.3"
2021-03-31 22:19:50,654  - mini_batch_size: "48"
2021-03-31 22:19:50,654  - patience: "3"
2021-03-31 22:19:50,654  - anneal_factor: "0.5"
2021-03-31 22:19:50,654  - max_epochs: "150"
2021-03-31 22:19:50,654  - shuffle: "True"
2021-03-31 22:19:50,654  - train_with_dev: "False"
2021-03-31 22:19:50,654  - batch_growth_annealing: "False"
2021-03-31 22:19:50,655 ----------------------------------------------------------------------------------------------------
```

Because right-to-left text is rendered inside a left-to-right context, some formatting errors might occur, and your code might display differently from what is shown above (link accessed on 2020-10-27).
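As a workaround when printing predictions to a mixed-direction terminal or log, Unicode directional isolates can keep an Arabic span visually intact. This is a general Unicode technique, not something Flair does for you; the token and tag below are illustrative:

```python
# Wrap RTL spans in Unicode directional isolates so surrounding LTR text
# (tags, arrows, punctuation) does not visually reorder around them.
RLI, PDI = '\u2067', '\u2069'  # RIGHT-TO-LEFT ISOLATE, POP DIRECTIONAL ISOLATE

token = 'القاهرة'  # an Arabic token from the example sentence
print(f'{RLI}{token}{PDI} -> LOC')  # tag shown is illustrative
```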

Citation

If you use this model, please consider citing this work:

```bibtex
@unpublished{MMHU21,
  author = {M. Megahed},
  title  = {Sequence Labeling Architectures in Diglossia},
  year   = {2021},
  doi    = {10.13140/RG.2.2.34961.10084},
  url    = {https://www.researchgate.net/publication/358956953_Sequence_Labeling_Architectures_in_Diglossia_-_a_case_study_of_Arabic_and_its_dialects}
}
```