taidopurason
commited on
Added a link to Alpaca-est
Browse files
README.md
CHANGED
@@ -14,7 +14,7 @@ Llama-2-7B finetuned in two stages:
|
|
14 |
1. 5B tokens of CulturaX with 75% of documents in Estonain and 25% in English (see [Llammas-base](https://huggingface.co/tartuNLP/Llammas-base)),
|
15 |
2. Alpaca-cleaned, Alpaca-est, OASST1 top-1 English conversations, CoT and FLAN-V2 following open-instruct (both 10,000), WMT18 English-Estonian translation development data (as documents), general MTee validation English-Estonian held-out data.
|
16 |
|
17 |
-
Alpaca-est is an instruction dataset generated for Estonian with *gpt-3.5-turbo-0613*, following Alpaca. More details in our [paper](https://arxiv.org/abs/2404.04042).
|
18 |
|
19 |
|
20 |
### Using the model
|
|
|
14 |
1. 5B tokens of CulturaX with 75% of documents in Estonain and 25% in English (see [Llammas-base](https://huggingface.co/tartuNLP/Llammas-base)),
|
15 |
2. Alpaca-cleaned, Alpaca-est, OASST1 top-1 English conversations, CoT and FLAN-V2 following open-instruct (both 10,000), WMT18 English-Estonian translation development data (as documents), general MTee validation English-Estonian held-out data.
|
16 |
|
17 |
+
[Alpaca-est](https://github.com/TartuNLP/alpaca-est) is an instruction dataset generated for Estonian with *gpt-3.5-turbo-0613*, following Alpaca. More details in our [paper](https://arxiv.org/abs/2404.04042).
|
18 |
|
19 |
|
20 |
### Using the model
|