
How to Best Fine-Tune TaxoLLaMA for German Lexical Semantics?

#1
by Safer99 - opened

Hello Hugging Face Community! 👋

I'm working on fine-tuning the TaxoLLaMA model to support German taxonomy-related tasks (e.g., generating hypernyms for hyponyms in German).

I'd appreciate any guidance, shared experiences, or resources! Thank you in advance. 😊

Hello!

Thank you for your interest in our model and paper! Sure, we are happy to share our insights.
Firstly, our model is tuned with QLoRA, so you can start from the existing checkpoint and continue training only the LoRA adapters. However, pay attention to the hyperparameters: setting the learning rate too high can cause the model to collapse and output plain text instead of hypernyms. We have also noticed issues with batch size: sometimes a higher or lower batch size gives better results, and it is hard to predict why, so heuristics like "larger batch, better results" may not hold.
Overall, our recommendation is to tune the training hyperparameters carefully, as they can seriously affect performance.
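Since the learning rate and batch size interact in hard-to-predict ways, it can help to run a small grid over both and compare validation metrics per configuration. Here is a minimal sketch of such a grid; the specific values and the `launch_training` helper are illustrative assumptions, not recommendations from the paper:

```python
# Hypothetical hyperparameter grid for continued LoRA-adapter training.
# Plug each config into your own QLoRA fine-tuning script and compare
# validation results; none of these values come from the TaxoLLaMA paper.
from itertools import product

learning_rates = [1e-5, 5e-5, 1e-4]  # keep LRs low: too high can collapse output
batch_sizes = [4, 8, 16]             # batch-size effects were hard to predict

grid = [
    {"learning_rate": lr, "batch_size": bs}
    for lr, bs in product(learning_rates, batch_sizes)
]

for cfg in grid:
    # launch_training(cfg) is a hypothetical entry point into your own
    # QLoRA training loop; log validation metrics per config instead of printing.
    print(cfg)
```

Even a coarse grid like this (nine runs) is often enough to spot the collapse regime and a reasonable batch size before committing to longer training.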

As a first step toward adapting to German, you might try a few-shot setting: in our paper we observed fast language adaptation with in-context learning, sometimes outperforming all previous baselines. It is a cheap and easy way to start!
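A few-shot prompt for German hypernym generation could be assembled like this. The instruction wording, the example pairs, and the `Hyponym: ... | Hyperonym:` layout are assumptions for illustration; adapt them to the exact prompt format TaxoLLaMA was trained with:

```python
# Sketch: build a few-shot prompt for German hypernym generation.
# Example pairs and formatting are illustrative assumptions.
FEW_SHOT_EXAMPLES = [
    ("Dackel", "Hund"),   # dachshund -> dog
    ("Eiche", "Baum"),    # oak -> tree
    ("Apfel", "Frucht"),  # apple -> fruit
]

def build_prompt(hyponym: str) -> str:
    """Return a few-shot prompt asking for the hypernym of `hyponym`."""
    lines = ["Gib das Hyperonym des folgenden Wortes an."]
    for hypo, hyper in FEW_SHOT_EXAMPLES:
        lines.append(f"Hyponym: {hypo} | Hyperonym: {hyper}")
    # Leave the final hypernym slot empty for the model to complete.
    lines.append(f"Hyponym: {hyponym} | Hyperonym:")
    return "\n".join(lines)

print(build_prompt("Rose"))
```

You would pass the resulting string to the model's `generate` call and read the completion after the final `Hyperonym:` as the predicted hypernym.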

Thank you for the insights! I'll start with few-shot learning for German and carefully adjust the hyperparameters. Much appreciated!
