Verah's picture
Update README.md
fea0882 verified
metadata
license: apache-2.0

This is a linear model merge of:

60% https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

40% https://huggingface.co/stabilityai/japanese-stablelm-instruct-gamma-7b

I recommend following the Mistral chat template and prompting in English.

Evaluation

Tested on correct en-jp translation identification on the first 10k rows of https://huggingface.co/datasets/Verah/tatoeba_dedupe_en-jp_2024-March-01

Desired behaviour is to not accept any translation when we deliberaly test incorrect pairings from the dataset, and to not reject any translation when shown only correctly paired examples.

Model False Admissions False Rejections
Mistral Instruct 41 600
(This Model) 13 1839
JP Stable LM Gamma 9679 138
Hermes2DPO 20 598

I made the test harder by concatenating 3 paired sentences together, in the false admissions case 1 out of those 3 was incorrectly paired.

Model False Admissions False* Rejections
(This Model) 89 5508
Hermes2DPO 537 1458

This model also wanted to reject many "correct" translations, however 3 unrelated sentences back to back isn't a very correct thing to be doing, either.