---
license: mit
language:
- en
metrics:
- f1
- precision
- recall
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- history
- historical
- holocaust
- war
---

# LaBSE-Malach-Multilabel

A multilabel text classification model fine-tuned on a small English subset (Malach ASR) of the Visual History Archive.
It is based on LaBSE pretrained weights, but it uses the general Hugging Face `transformers` framework rather than sentence-transformers.
Input text segments averaged ~350 words.

Given an input string, the model predicts probabilities for 1063 keyword IDs from the VHA ontology.
When the predictions are encoded as a binary vector, probabilities >= 0.5 are typically treated as "True".
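
The thresholding step can be sketched in plain Python. This is a minimal illustration, not the model's actual inference code: it assumes the classification head emits one raw logit per keyword ID and that probabilities are obtained with an element-wise sigmoid, which is the usual multilabel setup. The names `sigmoid`, `to_binary_vector`, and `example_logits` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    exp_x = math.exp(x)
    return exp_x / (1.0 + exp_x)


def to_binary_vector(logits, threshold=0.5):
    """Return a 0/1 list with 1 wherever sigmoid(logit) >= threshold."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


# Hypothetical logits for four keyword IDs (the real model has 1063 outputs).
example_logits = [2.0, -1.0, 0.0, -3.5]
binary = to_binary_vector(example_logits)
print(binary)
```

A logit of exactly 0 maps to a probability of 0.5, so it counts as "True" under the `>=` comparison.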

Because the training data is small, the most likely predictions are usually correct, but their probabilities tend to fall below the 0.5 threshold.
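
One way to work with low-confidence outputs is to rank the probabilities and keep the top few, regardless of the threshold. The sketch below is only an assumption about how one might post-process the output. `top_k_keywords` and `example_probs` are hypothetical names, and the returned indices are positions in the output vector, not VHA keyword labels.

```python
def top_k_keywords(probabilities, k=5):
    """Return (index, probability) pairs for the k highest probabilities."""
    ranked = sorted(
        enumerate(probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical probabilities for five keyword IDs, all below the 0.5 threshold.
example_probs = [0.12, 0.41, 0.03, 0.33, 0.07]
top = top_k_keywords(example_probs, k=2)
print(top)
```

The value of `k` is a free choice here; nothing in this card prescribes a particular cutoff.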

The mapping from keyword IDs to labels will be added to the repository later.