---
library_name: transformers
tags:
- llm
- 7b
license: cc-by-nc-4.0
---

# Draig-Fach-v0.1

**This is a proof-of-concept model.**
![Draig-Fach](draig-fach.jpg)
## Model Details

### Model Description

Draig-Fach-v0.1 is an instruction fine-tuned small language model based on the Mistral-7B architecture, developed specifically to understand and generate the Welsh language. This model represents an effort to support and preserve the Welsh language by leveraging AI and machine learning technologies. It has been trained on a bespoke dataset compiled from a variety of sources, including literature, websites, and conversational transcripts in Welsh.

- **Developed by:** Eryrilabs.com
- **License:** cc-by-nc-4.0
- **Finetuned from model:** mistralai/Mistral-7B-Instruct-v0.2

## How to use

You can use this model directly with a Hugging Face pipeline:

```python
from transformers import pipeline, Conversation
import torch

base_model_name = "EryriLabs/Draig-Fach-v0.1"

chatbot = pipeline(
    "conversational",
    model=base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

conversation = Conversation("Sut wyt ti?")  # "How are you?"
conversation = chatbot(conversation)
print(conversation.messages[-1]["content"])
```

## Uses

Draig-Fach-v0.1 is intended for:

- Natural language understanding and generation in Welsh
- Supporting developers and researchers interested in the Welsh language
- Serving as a tool for education and language preservation

## Bias, Risks, and Limitations

As a proof of concept, Draig-Fach-v0.1 has several limitations:

- The model's understanding and generation capabilities in Welsh are basic and may not capture complex nuances.
- Performance may vary across different types of Welsh text, especially colloquialisms or regional dialects.
- Some generated Welsh sentences may not make complete sense, and the model does hallucinate at times.
## Training Details

### Training Data

The small training set for Draig-Fach-v0.1 was sourced from a variety of Welsh-language materials, including but not limited to:

- Published literature
- Online articles
- Conversational transcripts

### Training

The model was fine-tuned from a Mistral-7B base on a custom dataset curated specifically for this project. Fine-tuning was conducted with an emphasis on understanding and generating conversational Welsh.

## About EryriLabs

EryriLabs® is a dynamic tech startup located in the picturesque heart of Snowdonia, known in Welsh as Eryri. At EryriLabs, we specialise in creating tailor-made LLM models that cater to the unique requirements of our clients.

Let us know if you use our model. If you need any help or more information, feel free to contact us at queries@eryrilabs.com.