# mT5-base-turkish-qa
This model is a fine-tuned version of google/mt5-base on the ucsahin/TR-Extractive-QA-82K dataset. It achieves the following results on the evaluation set:
- Loss: 0.5109
- Rouge1: 79.3283
- Rouge2: 68.0845
- Rougel: 79.3474
- Rougelsum: 79.2937
## Model description

The mT5-base model was fine-tuned on a manually curated Turkish dataset consisting of 65K training samples of ("question", "answer", "context") triplets.
## Intended uses & limitations

The intended use of the model is extractive question answering.

To use the inference widget, enter your input in the following format:

```
Soru: question_text
Metin: context_text
```

The model generates a response in the format below ("Soru", "Metin", and "Cevap" are Turkish for "Question", "Text", and "Answer"):

```
Cevap: answer_text
```
Use with Transformers:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from datasets import load_dataset

# Load the dataset
qa_tr_datasets = load_dataset("ucsahin/TR-Extractive-QA-82K")

# Load model and tokenizer
model_checkpoint = "ucsahin/mT5-base-turkish-qa"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

# Run inference on the first ten test samples
inference_dataset = qa_tr_datasets["test"].select(range(10))

for sample in inference_dataset:
    input_question = "Soru: " + sample["question"]
    input_context = "Metin: " + sample["context"]
    # Encode the question/context pair, truncating to the 512-token limit
    tokenized_inputs = tokenizer(input_question, input_context, max_length=512, truncation=True, return_tensors="pt")
    outputs = model.generate(input_ids=tokenized_inputs["input_ids"], max_new_tokens=32)
    output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(f"Reference answer: {sample['answer']}, Model answer: {output_text[0]}")
```
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
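These settings correspond to a standard Hugging Face `Seq2SeqTrainer` run. The sketch below shows how they might be expressed as `Seq2SeqTrainingArguments`; the training script itself is not part of this card, so any argument not listed above (e.g. `output_dir`, `predict_with_generate`) is an assumption added for illustration.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters above. Only the values listed in the card
# are known; the rest are assumed for illustration.
training_args = Seq2SeqTrainingArguments(
    output_dir="mT5-base-turkish-qa",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    predict_with_generate=True,  # assumed: required to compute ROUGE during eval
)
```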
## Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 2.0454 | 0.13 | 500  | 0.6771 | 73.1040 | 59.8915 | 73.1819 | 73.0558 |
| 0.8012 | 0.26 | 1000 | 0.6012 | 76.3357 | 64.1967 | 76.3796 | 76.2688 |
| 0.7703 | 0.39 | 1500 | 0.5844 | 76.8932 | 65.2509 | 76.9932 | 76.9418 |
| 0.6783 | 0.51 | 2000 | 0.5587 | 76.7284 | 64.8453 | 76.7416 | 76.6720 |
| 0.6546 | 0.64 | 2500 | 0.5362 | 78.2261 | 66.5893 | 78.2515 | 78.2142 |
| 0.6289 | 0.77 | 3000 | 0.5133 | 78.6917 | 67.1534 | 78.6852 | 78.6319 |
| 0.6292 | 0.9  | 3500 | 0.5109 | 79.3283 | 68.0845 | 79.3474 | 79.2937 |
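The ROUGE columns are the standard rouge1/rouge2/rougeL/rougeLsum metrics; the table's values appear to be these scores scaled by 100. The card does not include the evaluation code, but a minimal sketch of computing such scores with the Hugging Face `evaluate` library could look like this:

```python
import evaluate

# Sketch (assumed, not from the card): score decoded model outputs
# against reference answers with ROUGE.
rouge = evaluate.load("rouge")

predictions = ["Cevap: Ankara"]  # hypothetical decoded model output
references = ["Cevap: Ankara"]   # hypothetical reference answer
print(rouge.compute(predictions=predictions, references=references))
# -> {'rouge1': 1.0, 'rouge2': 1.0, 'rougeL': 1.0, 'rougeLsum': 1.0}
```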
## Framework versions
- Transformers 4.36.2
- Pytorch 2.1.0+cu118
- Datasets 2.16.1
- Tokenizers 0.15.0