Draig-Fach-v0.1
This is a proof of concept model*
Model Details
Model Description
Draig-Fach-v0.1 is an instruction fine-tuned small language model based on the Mistral-7b architecture, specifically developed to understand and generate the Welsh language. This model represents an effort to support and preserve the Welsh language, leveraging AI and machine learning technologies. It has been trained on a bespoke dataset compiled from a variety of sources, including literature, websites, and conversational transcripts in Welsh.
- Developed by: Eryrilabs.com
- License: cc-by-nc-4.0
- Finetuned from model: mistralai/Mistral-7B-Instruct-v0.2
How to use
You can use this model directly with a Hugging Face pipeline:
from transformers import pipeline, Conversation
import torch
base_model_name = "EryriLabs/Draig-Fach-v0.1"
chatbot = pipeline("conversational", model=base_model_name, torch_dtype=torch.float16, device_map="auto")
conversation = Conversation("Sut wyt ti?")
conversation = chatbot(conversation)
print(conversation.messages[-1]["content"])
Uses
Draig-Fach-v0.1 is intended for:
Natural language understanding and generation in Welsh
Supporting developers and researchers interested in the Welsh language
Serving as a tool for education and language preservation
Bias, Risks, and Limitations
As a proof of concept, Draig-Fach-v0.1 has several limitations:
The model's understanding and generation capabilities in Welsh are basic and may not accurately reflect complex nuances.
Performance may vary across different types of Welsh text, especially with colloquialisms or regional dialects.
Some Welsh sentences might not make complete sense and the model does hallucinate at times.
Training Details
Training Data
The small set of training data for Draig-Fach-v0.1 was sourced from a variety of Welsh language materials, including but not limited to:
Published literature
Online articles
Conversational transcripts
Training
The model was fine-tuned on a Mistral-7b base, utilizing a custom dataset specifically curated for this project. Fine-tuning was conducted with an emphasis on understanding and generating conversational Welsh.
About EryriLabs
EryriLabs® is a dynamic tech startup located in the picturesque heart of Snowdonia, also known as Eryri in Welsh. At EryriLabs, our specialisation lies in creating tailor-made LLM models that cater to the unique requirements of our clients
Let us know if you use our model. Also, if you need any help or more information, feel free to contact us at [email protected]
- Downloads last month
- 14