---
library_name: transformers
tags:
- llm
- 7b
license: cc-by-nc-4.0
---

# Draig-Fach-v0.1

**This is a proof-of-concept model.**
![Draig-Fach](draig-fach.jpg)
## Model Details

### Model Description

Draig-Fach-v0.1 is an instruction fine-tuned small language model based on the Mistral-7B architecture, developed specifically to understand and generate the Welsh language. This model represents an effort to support and preserve the Welsh language by leveraging AI and machine learning technologies. It has been trained on a bespoke dataset compiled from a variety of sources, including literature, websites, and conversational transcripts in Welsh.

- **Developed by:** Eryrilabs.com
- **License:** cc-by-nc-4.0
- **Finetuned from model:** mistralai/Mistral-7B-Instruct-v0.2

## How to use

You can use this model directly with a Hugging Face pipeline:

```python
from transformers import pipeline, Conversation
import torch

base_model_name = "EryriLabs/Draig-Fach-v0.1"

chatbot = pipeline(
    "conversational",
    model=base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

conversation = Conversation("Sut wyt ti?")  # "How are you?"
conversation = chatbot(conversation)
print(conversation.messages[-1]["content"])
```

## Uses

Draig-Fach-v0.1 is intended for:

- Natural language understanding and generation in Welsh
- Supporting developers and researchers interested in the Welsh language
- Serving as a tool for education and language preservation

## Bias, Risks, and Limitations

As a proof of concept, Draig-Fach-v0.1 has several limitations:

- The model's understanding and generation capabilities in Welsh are basic and may not capture complex nuances.
- Performance may vary across different types of Welsh text, especially colloquialisms or regional dialects.
- Some generated Welsh sentences may not make complete sense, and the model does hallucinate at times.
## Training Details

### Training Data

The small training set for Draig-Fach-v0.1 was sourced from a variety of Welsh-language materials, including but not limited to:

- Published literature
- Online articles
- Conversational transcripts

### Training

The model was fine-tuned from a Mistral-7B base on a custom dataset curated specifically for this project. Fine-tuning was conducted with an emphasis on understanding and generating conversational Welsh.

## About EryriLabs

EryriLabs® is a dynamic tech startup located in the picturesque heart of Snowdonia, known in Welsh as Eryri. At EryriLabs, we specialise in creating tailor-made LLM models that cater to the unique requirements of our clients.

Let us know if you use our model. If you need any help or more information, feel free to contact us at queries@eryrilabs.com.