---
license: mit
base_model: gpt2
tags:
  - generated_from_trainer
model-index:
  - name: gpt2_finetuned_10000recipe_chicken
    results: []
---

# gpt2_finetuned_10000recipe_chicken

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the [RecipeNLG](https://github.com/Glorf/recipenlg/tree/main) dataset, subset to recipes containing chicken. It achieves the following results on the evaluation set:

- Loss: 1.5802
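
The model can be loaded with the standard `transformers` text-generation pipeline. A minimal sketch, assuming the repo id matches this card's title; the prompt format is purely illustrative, since the card does not document how recipes were serialized during fine-tuning:

```python
from transformers import pipeline

# Repo id assumed from this card's title; adjust if the model lives elsewhere.
generator = pipeline(
    "text-generation",
    model="JunF1122/gpt2_finetuned_10000recipe_chicken",
)

# Illustrative prompt only; the fine-tuning recipe format is not documented.
prompt = "Ingredients: chicken breast, garlic, soy sauce"
outputs = generator(prompt, max_new_tokens=120, do_sample=True, top_p=0.95)
print(outputs[0]["generated_text"])
```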

## Model description

This model is a fine-tuned version of gpt2 trained on roughly 10,000 chicken recipes extracted from the RecipeNLG dataset. The final training loss was 1.3647, with a corresponding validation loss of 1.5802 (see the training results below).

## Intended uses & limitations

This model is intended for personal and educational use.

## Training and evaluation data

The model was trained on 10,043 recipes and evaluated on a held-out set of 100 recipes.
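
The card does not describe how the chicken subset was built. Below is a hypothetical sketch of such a filter, assuming the Hub's `recipe_nlg` loader (which requires a manual download) and its `title`/`ingredients` fields; the split sizes come from this card, but the exact procedure is an assumption:

```python
from datasets import load_dataset

# recipe_nlg requires downloading the dataset manually from the project page
# and pointing data_dir at the folder containing the CSV.
raw = load_dataset("recipe_nlg", data_dir="path/to/recipenlg")["train"]

def mentions_chicken(example):
    # Flag recipes whose title or ingredient list mentions chicken.
    text = example["title"] + " " + " ".join(example["ingredients"])
    return "chicken" in text.lower()

chicken = raw.filter(mentions_chicken)

# Carve out 10,043 training and 100 evaluation recipes (sizes from this card).
split = chicken.select(range(10_143)).train_test_split(test_size=100, seed=42)
train_data, eval_data = split["train"], split["test"]
```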

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto the `transformers` Trainer API follows the list:

- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
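
A minimal sketch, assuming the standard `transformers` Trainer API. The output directory, tokenization, and the `train_data`/`eval_data` objects (tokenized versions of the split sketched above) are assumptions, not documented by this card; the Adam betas and epsilon listed above are the `TrainingArguments` defaults:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

args = TrainingArguments(
    output_dir="gpt2_finetuned_10000recipe_chicken",  # assumed path
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    evaluation_strategy="epoch",  # matches the per-epoch results below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_data,  # tokenized chicken-recipe datasets (assumed)
    eval_dataset=eval_data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```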

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.866         | 1.0   | 2511 | 1.7299          |
| 1.5425        | 2.0   | 5022 | 1.6135          |
| 1.3647        | 3.0   | 7533 | 1.5802          |
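
Since the reported loss is the mean per-token cross-entropy of a causal language model, the final validation loss corresponds to a perplexity of roughly exp(1.5802) ≈ 4.86:

```python
import math

# Perplexity of a causal LM is the exponential of its mean cross-entropy loss.
print(math.exp(1.5802))  # ≈ 4.86
```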

### Framework versions

- Transformers 4.31.0
- Pytorch 2.0.1+cpu
- Datasets 2.14.4
- Tokenizers 0.11.0

## Reference

```bibtex
@inproceedings{bien-etal-2020-recipenlg,
    title = "{R}ecipe{NLG}: A Cooking Recipes Dataset for Semi-Structured Text Generation",
    author = "Bie{\'n}, Micha{\l} and Gilski, Micha{\l} and Maciejewska, Martyna and Taisner, Wojciech and Wisniewski, Dawid and Lawrynowicz, Agnieszka",
    booktitle = "Proceedings of the 13th International Conference on Natural Language Generation",
    month = dec,
    year = "2020",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.inlg-1.4",
    pages = "22--28",
}
```