---
license: cc-by-sa-4.0
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: roberta-large-finetuned-abbr
  results: []
language:
- en
---
# roberta-large-finetuned-abbr-unfiltered-plod

This model is a fine-tuned version of [roberta-large](https://huggingface.co./roberta-large) on the [PLODv2 unfiltered dataset](https://github.com/shenbinqian/PLODv2-CLM4AbbrDetection).
It is released with our LREC-COLING 2024 publication [Using character-level models for efficient abbreviation and long-form detection](https://aclanthology.org/2024.lrec-main.270/). It achieves the following results on the test set:
Results on abbreviations:
- Precision: 0.8916
- Recall: 0.9152
- F1: 0.9033

Results on long forms:
- Precision: 0.8607
- Recall: 0.9142
- F1: 0.8867
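
Entity-level scores of this kind are computed per label type over full spans. As a hedged illustration only (not the exact evaluation script behind the numbers above), such scores can be obtained with `seqeval`; the label names in the toy example are assumptions based on the PLOD annotation scheme.

```python
# Hedged sketch: per-type precision/recall/F1 over entity spans with seqeval.
# The "AC" (abbreviation) and "LF" (long form) label names are assumptions, and
# the toy sequences below are illustrative, not taken from the PLODv2 test set.
from seqeval.metrics import classification_report

y_true = [["B-AC", "O", "B-LF", "I-LF", "O"]]
y_pred = [["B-AC", "O", "B-LF", "O", "O"]]

# Prints span-level precision, recall and F1 for each label type (AC, LF).
print(classification_report(y_true, y_pred))
```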
## Model description

This model is [roberta-large](https://huggingface.co./roberta-large) with a token-classification head, fine-tuned on the PLODv2 unfiltered dataset to label abbreviations (short forms) and their long forms in text.
## Intended uses & limitations

The model is intended for detecting abbreviations and their long forms in English text, for example as a pre-processing step for abbreviation expansion or glossary building. It was fine-tuned on scientific text from the PLOD corpus, so performance on other domains or languages may be lower.
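
The snippet below is a minimal usage sketch with the `transformers` token-classification pipeline. The repository id is an assumption based on this card's title, and the aggregated output handling is the standard pipeline behaviour; adjust both to the actual checkpoint and label set.

```python
# Minimal sketch: run the fine-tuned checkpoint as a token-classification pipeline.
# The model id below is an assumption based on this card's title; replace it with
# the actual Hub id of this repository if it differs.
from transformers import pipeline

abbr_detector = pipeline(
    "token-classification",
    model="surrey-nlp/roberta-large-finetuned-abbr-unfiltered-plod",  # assumed id
    aggregation_strategy="simple",
)

text = "Light dissolved inorganic carbon (DIC) was measured alongside chlorophyll a."
for span in abbr_detector(text):
    # Each aggregated span carries the predicted label group, the matched text and a score.
    print(span["entity_group"], span["word"], round(span["score"], 3))
```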
## Training and evaluation data

The model was trained and evaluated on the unfiltered version of the PLODv2 dataset for abbreviation and long-form detection; see the [PLODv2 repository](https://github.com/shenbinqian/PLODv2-CLM4AbbrDetection) for the data files and splits.
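
As a rough illustration, a PLOD-style unfiltered split can be loaded with the `datasets` library. The Hub identifier and column names below are assumptions; the PLODv2 repository linked above documents the authoritative files, formats and splits.

```python
# Hedged sketch: load an unfiltered PLOD-style dataset with Hugging Face `datasets`.
# The dataset id and the "tokens"/"ner_tags" column names are assumptions; check
# the PLODv2 repository for the actual data files and their format.
from datasets import load_dataset

plod = load_dataset("surrey-nlp/PLOD-unfiltered")  # assumed dataset id

print(plod)                       # available splits and their sizes
sample = plod["train"][0]
print(sample["tokens"][:10])      # assumed token column
print(sample["ner_tags"][:10])    # assumed BIO-style label column
```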
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding `TrainingArguments` follows the list):
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
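
For orientation, the hyperparameters above map onto the following `TrainingArguments`; the model, tokenized PLODv2 splits, data collator and metric function are omitted, so this is a sketch of the configuration rather than the full training script. Unset options keep the library defaults, which include the Adam betas and epsilon quoted above.

```python
# Sketch: the documented hyperparameters expressed as Hugging Face TrainingArguments.
# Only values listed in this card are set explicitly; the output directory name is
# an assumption, and the rest of the training pipeline is intentionally omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-large-finetuned-abbr",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=6,
    lr_scheduler_type="linear",
    seed=42,
)

print(training_args.learning_rate, training_args.num_train_epochs)
```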
### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:------:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.167 | 0.25 | 7000 | 0.1616 | 0.9484 | 0.9366 | 0.9424 | 0.9376 |
| 0.1673 | 0.49 | 14000 | 0.1459 | 0.9504 | 0.9370 | 0.9437 | 0.9389 |
| 0.1472 | 0.74 | 21000 | 0.1560 | 0.9531 | 0.9373 | 0.9451 | 0.9398 |
| 0.1519 | 0.98 | 28000 | 0.1434 | 0.9551 | 0.9382 | 0.9466 | 0.9415 |
| 0.1388 | 1.23 | 35000 | 0.1472 | 0.9516 | 0.9374 | 0.9444 | 0.9400 |
| 0.1291 | 1.48 | 42000 | 0.1416 | 0.9557 | 0.9403 | 0.9479 | 0.9431 |
| 0.1298 | 1.72 | 49000 | 0.1394 | 0.9577 | 0.9459 | 0.9517 | 0.9470 |
| 0.1269 | 1.97 | 56000 | 0.1401 | 0.9587 | 0.9446 | 0.9516 | 0.9468 |
| 0.1128 | 2.21 | 63000 | 0.1410 | 0.9568 | 0.9497 | 0.9533 | 0.9486 |
| 0.1154 | 2.46 | 70000 | 0.1366 | 0.9583 | 0.9495 | 0.9539 | 0.9493 |
| 0.1138 | 2.71 | 77000 | 0.1413 | 0.9600 | 0.9502 | 0.9551 | 0.9506 |
| 0.1117 | 2.95 | 84000 | 0.1313 | 0.9605 | 0.9501 | 0.9552 | 0.9508 |
| 0.0997 | 3.2 | 91000 | 0.1503 | 0.9577 | 0.9527 | 0.9552 | 0.9507 |
| 0.1008 | 3.44 | 98000 | 0.1360 | 0.9587 | 0.9536 | 0.9561 | 0.9515 |
| 0.0909 | 3.69 | 105000 | 0.1435 | 0.9619 | 0.9520 | 0.9569 | 0.9525 |
| 0.0903 | 3.93 | 112000 | 0.1482 | 0.9619 | 0.9522 | 0.9570 | 0.9528 |
| 0.075 | 4.18 | 119000 | 0.1603 | 0.9616 | 0.9546 | 0.9581 | 0.9537 |
| 0.0804 | 4.43 | 126000 | 0.1512 | 0.9600 | 0.9560 | 0.9580 | 0.9536 |
| 0.0811 | 4.67 | 133000 | 0.1435 | 0.9628 | 0.9543 | 0.9585 | 0.9540 |
| 0.0778 | 4.92 | 140000 | 0.1384 | 0.9616 | 0.9566 | 0.9591 | 0.9548 |
| 0.065 | 5.16 | 147000 | 0.1640 | 0.9622 | 0.9567 | 0.9595 | 0.9550 |
| 0.0607 | 5.41 | 154000 | 0.1755 | 0.9632 | 0.9562 | 0.9597 | 0.9554 |
| 0.0587 | 5.66 | 161000 | 0.1643 | 0.9622 | 0.9575 | 0.9599 | 0.9555 |
| 0.062 | 5.9 | 168000 | 0.1663 | 0.9628 | 0.9569 | 0.9598 | 0.9556 |

### Framework versions

- Transformers 4.16.2
- Pytorch 1.11.0
- Datasets 2.1.0
- Tokenizers 0.10.3