Beginner questions about dataset formatting and hardware
#11, opened by LogicBombaklot
I would like to fine-tune ALMA-R on a new low-resource language. I am very new to LLMs, but I am intrigued by the possibilities.
Can you please advise on how my monolingual and parallel datasets should be formatted for fine-tuning?
Can you also advise on how much VRAM I will need, and whether I can use multiple GPUs?
Thank you, and great job on the model and the papers. I found CPO really interesting.