Mariia
committed on
Commit
•
638e249
1
Parent(s):
25e9432
Update README.md
README.md
CHANGED
@@ -1,3 +1,38 @@
|
|
1 |
---
|
2 |
license: afl-3.0
3 |
---
1 |
---
|
2 |
license: afl-3.0
|
3 |
+
language:
|
4 |
+
- es
|
5 |
+
tags:
|
6 |
+
- biomedical
|
7 |
+
- social media
|
8 |
+
- ner
|
9 |
+
metrics:
|
10 |
+
- f1
|
11 |
+
widget:
|
12 |
+
- text: "La semana que viene estaremos en el I Congreso para personas con cáncer y familiares ☺ #aecc #Congreso #finde "
|
13 |
+
example_title: "Oncology"
|
14 |
+
- text: "No dejéis de leer esta interesantísima entrada del Dr. Martínez-Lage donde reivindica los errores médicos a la hora de diagnosticar #Alzheimer u otros tipos de #demencias."
|
15 |
+
example_title: "Alzheimer"
|
16 |
+
- text: "Cada vez hay más CCAA que se suman la regulación de #desfibriladores (#DESA) en espacios deportivos, lamentamos este caso de parada cardíaca que afectó de nuevo a un deportista."
|
17 |
+
example_title: "cardiac arrest"
|
18 |
+
- text: "La jaqueca o la migraña puede llegar a ser muy desesperante, algunas veces los remedios para dolor de cabeza de origen farmacéutico son ineficientes y por más analgésicos que tomemos el malestar no cede."
|
19 |
+
example_title: "Migraine"
|
20 |
+
- text: "Os sorprenderíais la de mensajes que me llegan cada día (sobre todo cuando se acerca el verano) preguntándome como eliminar la celulitis, como hacer que desaparezca mágicamente la grasita… "
|
21 |
+
example_title: "Celulitis"
|
22 |
---
|
23 |
+
|
24 |
+
# Disease mention recognizer for Spanish Social Media texts 🦠💬
|
25 |
+
This resource derives from the participation of the SINAI team in the [Mining Social Media Content for Disease Mention (SocialDisNER)](https://temu.bsc.es/socialdisner/) shared task. The task focused on recognizing disease mentions in tweets written in Spanish, with the aim of using Twitter as a proxy to better understand the societal perception of disease. It directed community effort toward developing named entity recognition (NER) approaches that detect **all kinds** of disease mentions in social media text.
|
26 |
+
|
27 |
+
Our approach is based on a [model pre-trained on general-domain text](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne). To leverage the large-scale additional [Silver Standard data](https://zenodo.org/record/6803567/preview/SocialDisNER_LargeScale_additionaldata.zip#tree_item0) with automatically generated labels provided by the task's organisers, we designed a two-stage fine-tuning framework. The figure below illustrates the fine-tuning process:
|
28 |
+
|
29 |
+
# Results
|
30 |
+
The model contained in this repository forms the foundation of the NER system presented by the SINAI team at SocialDisNER. Combined with [`pysentimiento`](https://github.com/pysentimiento/pysentimiento) data pre-processing and rule-based post-processing of submissions, it obtained encouraging results in the official evaluation, which are summarised in the table below.
|
31 |
+
|
32 |
+
| Precision | Recall | F1-score |
|
33 |
+
|-----------|--------|----------|
|
34 |
+
| 0.756     | 0.795  | 0.770    |
|
35 |
+
|
36 |
+
|
37 |
+
# System description paper and citation
|
38 |
+
The system description paper will be published at the Social Media Mining for Health Applications (#SMM4H) workshop, held at COLING 2022 in October 2022.
|