Reasoning behind the data set used

by appliedstuff - opened Jul 28, 2023

Jul 28, 2023

Hi,
I just checked the data set you used for the German adaptation and found that the training examples are in either German or English. Is that how one gets the model to understand another language (i.e., German)?

Do you have some background on how you came up with your approach: how you decide what data to use and how you want to make the model usable in German?

Did you check the quality? Are there any metrics to measure your results?

flozi00

Owner Jul 28, 2023

I am always improving the dataset and keep training more and more models.
For example, I have already reached eval losses of less than 0.2.
Since I am focused on German, the largest part of this dataset is German.
If you have special use cases that need to support two languages at the same quality, I would either add translation prompts (translate x to y) between these languages or translate each entry into both languages, so each entry is doubled but in separate languages.
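
The two options mentioned here can be sketched roughly as follows. This is only a minimal illustration, not the code used to build the actual dataset: the field names (`de`, `en`, `prompt`, `response`, `lang`) and the wording of the translation prompt are assumptions, and the sketch assumes that each entry is already available in both languages.

```python
from typing import Dict, List


def add_translation_prompts(pairs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Option 1: turn each parallel German/English pair into two
    'translate x to y' training examples, one per direction.
    The prompt wording is a hypothetical template."""
    examples = []
    for pair in pairs:
        examples.append({
            "prompt": f"Translate from German to English: {pair['de']}",
            "response": pair["en"],
        })
        examples.append({
            "prompt": f"Translate from English to German: {pair['en']}",
            "response": pair["de"],
        })
    return examples


def double_entries(entries: List[Dict[str, Dict[str, str]]]) -> List[Dict[str, str]]:
    """Option 2: emit every entry once per language, so the dataset
    doubles in size but each copy stays monolingual."""
    doubled = []
    for entry in entries:
        for lang in ("de", "en"):
            doubled.append({
                "lang": lang,
                "prompt": entry["prompt"][lang],
                "response": entry["response"][lang],
            })
    return doubled
```

With option 1 the model sees explicit cross-lingual supervision; with option 2 it sees the same task in both languages without ever mixing them inside a single example.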

ulymp

Jul 29, 2023

I have not been able to try this out successfully yet. But does it even make sense to fine-tune Llama-2 in any language other than English? AFAIK it was only trained on English-language sources.

IMO Falcon would be much better suited for this task since it has been trained on multiple languages, including German.

flozi00

Owner Jul 29, 2023

Llama can be adapted to German pretty fast: less than 12 hours to reach good answers.
The Llama 2 13B model reaches an eval loss of 0.18, while Falcon 7B is around 0.3.
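
To put the two eval losses on a more intuitive scale, they can be converted to perplexity. This is a small sketch that assumes the reported eval loss is the mean per-token cross-entropy in nats, which the thread does not state explicitly. Note also that Llama 2 and Falcon use different tokenizers, so per-token numbers are not strictly comparable between the two models.

```python
import math


def perplexity(eval_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(eval_loss)


# Eval losses as quoted in this thread
for name, loss in [("Llama 2 13B", 0.18), ("Falcon 7B", 0.3)]:
    print(f"{name}: eval loss {loss} -> perplexity {perplexity(loss):.2f}")
```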

appliedstuff

Jul 30, 2023

> Llama can be adapted to German pretty fast: less than 12 hours to reach good answers.
> The Llama 2 13B model reaches an eval loss of 0.18, while Falcon 7B is around 0.3.

Great, let's get a discussion started here. Good to hear that fine-tuning takes so little time, and the loss seems to be a good indicator that the model is learning some German :-)

But what do you mean by "reach good answers"? When I look at TheBloke's GGML version of your model (see https://huggingface.co/TheBloke/llama-2-13B-German-Assistant-v2-GGML/discussions/1), the results are not usable. Could you ask the original model what it replies to the question "Wie weit ist der Abstand zwischen Erde und Sonne?" ("How far is the distance between Earth and Sun?")

That would be very helpful! Thank you so much for your work!!! Very interesting!

flozi00

Owner Jul 30, 2023

At the moment my inference instance is offline, but we are working on getting it online again. Then you can test it there.
