Non-English language capabilities
Curious to know how well it performs in non-English and non-Latin-script languages, as a base model for multilingual fine-tuning.
It would be nice to see a list of the languages included in the training data, along with the token counts in millions.
It would also be interesting to run this benchmark: https://huggingface.co./datasets/caro-holt/MultiQ
It measures accuracy across different languages plus fidelity (replying in the same language as the query).
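The fidelity part of that evaluation is easy to compute once you have detected the language of each reply. A minimal sketch (the function name and the pair-based input format are my own assumptions, not from MultiQ itself):

```python
from collections import defaultdict

def fidelity_by_language(records):
    """records: iterable of (query_lang, reply_lang) pairs,
    e.g. ("de", "en") for a German query answered in English.
    Returns, per query language, the fraction of replies
    given in the same language as the query."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for query_lang, reply_lang in records:
        totals[query_lang] += 1
        if reply_lang == query_lang:
            matches[query_lang] += 1
    return {lang: matches[lang] / totals[lang] for lang in totals}

# Hypothetical example: German answered in German twice, once in English;
# one Italian query answered in Italian.
scores = fidelity_by_language(
    [("de", "de"), ("de", "en"), ("de", "de"), ("it", "it")]
)
print(scores)  # → {'de': 0.6666666666666666, 'it': 1.0}
```

The reply-language labels themselves would come from any language-identification tool of your choice; the metric above is agnostic to how they are produced.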
For Italian:

| Model | ARC-c | HellaSwag | MMLU |
|---|---|---|---|
| Llama 3 8B Instruct | 44.3 | 59.9 | 55.7 |
@FinancialSupport How did you run these tests, so I can try them for other languages?
In a very quick test on private German and French data, it beats ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ.