How to Best Fine-Tune TaxoLLaMA for German Lexical Semantics?
Hello Hugging Face Community!
I'm working on fine-tuning the TaxoLLaMA model to support German taxonomy-related tasks (e.g., generating hypernyms for hyponyms in German).
I'd appreciate any guidance, shared experiences, or resources! Thank you in advance.
Hello!
Thank you for your interest in our model and paper! Sure, we are happy to share our insights.
Firstly, our model is tuned with QLoRA, so you can start from the existing checkpoint and continue training only the LoRA adapters. However, be careful with the hyperparameters: setting the learning rate too high can cause the model to collapse and output plain text instead of hypernyms. We have also noticed issues with batch size; sometimes a higher or lower batch size gives better results, and it is hard to predict why, so heuristics like "larger batch, better results" may not hold.
Overall, our recommendation is to tune the training hyperparameters carefully, as they can seriously affect performance. A minimal sketch of continuing adapter training is shown below.
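Here is a rough sketch of that setup with transformers and peft. The base model id, the adapter path, and all hyperparameter values are assumptions (TaxoLLaMA builds on a LLaMA-2 base per the paper, but check the model card for the exact ids), so treat this as a starting point rather than the exact training recipe:

```python
import torch
from transformers import (
    AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
)
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-7b-hf"      # assumed base model; verify on the model card
ADAPTER_PATH = "path/to/taxollama-adapter"   # placeholder for the released LoRA adapter

# 4-bit quantization, as in QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the existing adapter with is_trainable=True so that only the
# LoRA weights are updated during the German fine-tuning run.
model = PeftModel.from_pretrained(base, ADAPTER_PATH, is_trainable=True)
model.print_trainable_parameters()

# Illustrative, conservative hyperparameters -- tune these on your data;
# a learning rate that is too high can make the model collapse.
training_args = TrainingArguments(
    output_dir="taxollama-de",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    logging_steps=10,
)
# Plug `model`, `tokenizer`, and `training_args` into transformers.Trainer
# (or trl's SFTTrainer) together with your German hyponym-hypernym pairs.
```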
As a first step toward German adaptation, you might want to try a few-shot setting: in our paper we observed fast language adaptation with in-context learning, sometimes outperforming all previous baselines. It is a cheap and easy way to start, as the sketch below shows!
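For the few-shot route, here is a small illustrative sketch, reusing the `model` and `tokenizer` loaded above. The prompt template is hypothetical; match it to the exact input format the paper uses:

```python
# Few-shot (in-context) hypernym generation for German. The template
# below is illustrative; the paper's actual prompt format may differ.
few_shot_prompt = (
    "hyponym: Hund | hypernyms: Tier, Haustier\n"
    "hyponym: Rose | hypernyms: Blume, Pflanze\n"
    "hyponym: Apfel | hypernyms:"
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20, do_sample=False)

# Decode only the newly generated tokens (the predicted hypernyms).
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```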
Thank you for the insights! I'll start with few-shot learning for German and carefully adjust hyperparameters. Much appreciated!