---
license: mit
language:
- en
metrics:
- f1
- precision
- recall
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- history
- historical
- holocaust
- war
---

# LaBSE-Malach-Multilabel

A multilabel text classification model fine-tuned on a small English subset (Malach ASR) of the Visual History Archive.
It is based on LaBSE pretrained weights but is loaded through the general Hugging Face `transformers` framework, not sentence-transformers.
Input text segments contained about 350 words on average.

Given an input string, the model predicts probabilities for 1063 keyword IDs from the VHA ontology.
To encode the predictions as a binary vector, probabilities >= 0.5 are typically treated as "True".
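
The thresholding step can be sketched in plain Python. This is a minimal illustration, not the author's code: it assumes the classification head returns one raw logit per keyword ID and that probabilities are obtained with an element-wise sigmoid (the usual setup for multilabel heads). The names `sigmoid`, `to_binary_vector`, and `example_logits` are hypothetical, and the logits are made-up values, not model output.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def to_binary_vector(logits, threshold=0.5):
    """Map raw logits to a 0/1 vector: 1 where the probability >= threshold."""
    probs = [sigmoid(v) for v in logits]
    return [1 if p >= threshold else 0 for p in probs]


# Made-up logits for four labels (the real model has 1063).
example_logits = [2.0, -1.0, 0.0, -3.5]
binary = to_binary_vector(example_logits)
```

A logit of exactly 0 maps to a probability of 0.5, so it counts as "True" under the `>=` rule.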

Because the training data is small, the most likely predictions are usually correct but often fall below the 0.5 threshold.
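
One possible way to work around this is to fall back to the highest-ranked labels when nothing clears the threshold. The sketch below is an assumption about how a user might post-process the output, not part of the model itself. `select_keywords` and `fallback_k` are hypothetical names, and the returned indices are positions in the output vector, not VHA keyword IDs.

```python
def select_keywords(probs, threshold=0.5, fallback_k=3):
    """Return (index, probability) pairs, most probable first.

    Labels at or above the threshold are returned if there are any;
    otherwise the fallback_k most probable labels are returned instead.
    """
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    above = [i for i in ranked if probs[i] >= threshold]
    chosen = above if above else ranked[:fallback_k]
    return [(i, probs[i]) for i in chosen]


# Made-up probabilities: none reaches 0.5, so the top two are returned.
low_confidence = select_keywords([0.1, 0.42, 0.05, 0.3], fallback_k=2)
```

Lowering `threshold` is an alternative; the top-k fallback simply guarantees a non-empty result for every input.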

The mapping from keyword IDs to labels will be added to the repository later.