How do I test an LLM for my unique needs?
If you work in finance, law, or medicine, generic benchmarks are not enough.
This blog post uses Argilla, Distilabel, and 🌤️Lighteval to generate an evaluation dataset and evaluate models.
https://github.com/argilla-io/argilla-cookbook/blob/main/domain-eval/README.md