Model Card for Finnish-Viking-Alpaca-V1-7B

This model is mpasila/Finnish-Viking-Alpaca-V1-LoRA-7B merged into its base model, LumiOpen/Viking-7B.

The LoRA was trained for 1 epoch in 4-bit with text-generation-webui, using LumiOpen/Viking-7B as the base model. The training dataset is pinzhenchen/alpaca-cleaned-fi.

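It uses the Alpaca prompt format, but with the opening instruction translated into Finnish (the `### Instruction:`, `### Input:` and `### Response:` headers stay in English):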

{
    "instruction,output": "Alla on ohje, jossa kuvataan tehtävä. Kirjoita vastaus, joka täyttää pyynnön asianmukaisesti.\n\n### Instruction:\n%instruction%\n\n### Response:\n%output%",
    "instruction,input,output": "Alla on ohje, jossa kuvataan tehtävä ja joka on yhdistetty kontekstia lisäävään syötteeseen. Kirjoita vastaus, joka täyttää pyynnön asianmukaisesti.\n\n### Instruction:\n%instruction%\n\n### Input:\n%input%\n\n### Response:\n%output%"
}
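
As a minimal sketch of how the format above might be applied at inference time, the following Python fills in the placeholders and leaves `%output%` empty so that the prompt ends right after `### Response:`. The helper name `build_prompt` and its parameters are hypothetical; only the two template strings come from this card.

```python
import re

# Templates copied verbatim from the format above.
TEMPLATES = {
    "instruction,output": (
        "Alla on ohje, jossa kuvataan tehtävä. Kirjoita vastaus, joka täyttää "
        "pyynnön asianmukaisesti.\n\n### Instruction:\n%instruction%\n\n"
        "### Response:\n%output%"
    ),
    "instruction,input,output": (
        "Alla on ohje, jossa kuvataan tehtävä ja joka on yhdistetty kontekstia "
        "lisäävään syötteeseen. Kirjoita vastaus, joka täyttää pyynnön "
        "asianmukaisesti.\n\n### Instruction:\n%instruction%\n\n"
        "### Input:\n%input%\n\n### Response:\n%output%"
    ),
}


def build_prompt(instruction: str, input_text: str = "", output: str = "") -> str:
    """Fill the Alpaca-style template (hypothetical helper, not part of the model repo).

    With output left empty, the result ends after "### Response:" and can be
    passed to the model as a generation prompt.
    """
    key = "instruction,input,output" if input_text else "instruction,output"
    fields = {"instruction": instruction, "input": input_text, "output": output}
    # Substitute all placeholders in a single pass so that user text that
    # happens to contain "%output%" or similar is not substituted again.
    return re.sub(
        r"%(instruction|input|output)%",
        lambda match: fields[match.group(1)],
        TEMPLATES[key],
    )


if __name__ == "__main__":
    print(build_prompt("Kerro lyhyesti, mikä on revontuli."))
```

The same function can rebuild a full training example by passing the reference answer as `output`.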

Merged using this Colab notebook. It might not be the best way to merge a LoRA trained in 4-bit onto a float16 model, but I wanted to get something working quickly. Feel free to try a better merge if you want.
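
For anyone who wants to redo the merge, here is a rough sketch of one common approach using PEFT's `merge_and_unload`. It is not the Colab notebook mentioned above and may differ from what that notebook does. It assumes the LoRA repository is a standard PEFT adapter and that the machine has enough memory to hold the 7B base model in float16; the function name `merge_lora` and the output directory are made up for illustration.

```python
BASE_MODEL_ID = "LumiOpen/Viking-7B"
LORA_ID = "mpasila/Finnish-Viking-Alpaca-V1-LoRA-7B"


def merge_lora(base_id: str = BASE_MODEL_ID,
               lora_id: str = LORA_ID,
               out_dir: str = "merged-model") -> str:
    """Merge a LoRA adapter into a float16 base model and save the result.

    Hypothetical helper: assumes `lora_id` is a PEFT-format adapter.
    """
    # Imported here so the heavy dependencies are only needed when merging.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in float16 rather than 4-bit, so the adapter
    # weights are folded into full-precision-ish weights.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

    # Attach the adapter, then fold its weights into the base layers.
    model = PeftModel.from_pretrained(base, lora_id)
    merged = model.merge_and_unload()

    # Save the merged weights together with the base model's tokenizer.
    merged.save_pretrained(out_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_lora()` downloads both repositories, so it is best run on a machine with plenty of disk space and RAM.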

Evaluation

| Model | Size | Type | FIN-bench (score) |
|---|---|---|---|
| mpasila/Finnish-Viking-Alpaca-V1-7B | 7B | Instruct | 0.3943 |
| mpasila/Finnish-Alpaca-Small-7B | 7B | Instruct | 0.3586 |
| mpasila/Finnish-Alpaca-Tiny-V2-7B | 7B | Instruct | 0.4654 |
| mpasila/Alpacazord-Viking-7B | 7B | Instruct | 0.4123 |
| mpasila/NordicAlpaca-Finnish-V1-7B | 7B | Instruct | 0.3891 |
| Finnish-NLP/llama-7b-finnish-instruct-v0.1 | 7B | Instruct | 0.4365 |
| Finnish-NLP/llama-7b-finnish-instruct-v0.2 | 7B | Instruct | 0.3993 |
| Finnish-NLP/llama-7b-finnish | 7B | Base | 0.2350 |
| LumiOpen/Viking-7B (1000B) | 7B | Base | 0.3721 |
| HPLT/gpt-7b-nordic-prerelease | 7B | Base | 0.3169 |

Source

FIN-bench scores:

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| bigbench_analogies | 0 | multiple_choice_grade | 0.6308 | ± 0.0425 |
| bigbench_arithmetic_1_digit_addition | 0 | multiple_choice_grade | 0.6400 | ± 0.0482 |
| bigbench_arithmetic_1_digit_division | 0 | multiple_choice_grade | 0.7391 | ± 0.0936 |
| bigbench_arithmetic_1_digit_multiplication | 0 | multiple_choice_grade | 0.2800 | ± 0.0451 |
| bigbench_arithmetic_1_digit_subtraction | 0 | multiple_choice_grade | 0.5000 | ± 0.0503 |
| bigbench_arithmetic_2_digit_addition | 0 | multiple_choice_grade | 0.1800 | ± 0.0386 |
| bigbench_arithmetic_2_digit_division | 0 | multiple_choice_grade | 0.4800 | ± 0.0502 |
| bigbench_arithmetic_2_digit_multiplication | 0 | multiple_choice_grade | 0.0800 | ± 0.0273 |
| bigbench_arithmetic_2_digit_subtraction | 0 | multiple_choice_grade | 0.2500 | ± 0.0435 |
| bigbench_arithmetic_3_digit_addition | 0 | multiple_choice_grade | 0.1800 | ± 0.0386 |
| bigbench_arithmetic_3_digit_division | 0 | multiple_choice_grade | 0.2500 | ± 0.0435 |
| bigbench_arithmetic_3_digit_multiplication | 0 | multiple_choice_grade | 0.1700 | ± 0.0378 |
| bigbench_arithmetic_3_digit_subtraction | 0 | multiple_choice_grade | 0.5000 | ± 0.0503 |
| bigbench_arithmetic_4_digit_addition | 0 | multiple_choice_grade | 0.2600 | ± 0.0441 |
| bigbench_arithmetic_4_digit_division | 0 | multiple_choice_grade | 0.2500 | ± 0.0435 |
| bigbench_arithmetic_4_digit_multiplication | 0 | multiple_choice_grade | 0.2100 | ± 0.0409 |
| bigbench_arithmetic_4_digit_subtraction | 0 | multiple_choice_grade | 0.5200 | ± 0.0502 |
| bigbench_arithmetic_5_digit_addition | 0 | multiple_choice_grade | 0.3900 | ± 0.0490 |
| bigbench_arithmetic_5_digit_division | 0 | multiple_choice_grade | 0.1600 | ± 0.0368 |
| bigbench_arithmetic_5_digit_multiplication | 0 | multiple_choice_grade | 0.1000 | ± 0.0302 |
| bigbench_arithmetic_5_digit_subtraction | 0 | multiple_choice_grade | 0.6100 | ± 0.0490 |
| bigbench_cause_and_effect_one_sentence | 0 | multiple_choice_grade | 0.6471 | ± 0.0676 |
| bigbench_cause_and_effect_one_sentence_no_prompt | 0 | multiple_choice_grade | 0.6863 | ± 0.0656 |
| bigbench_cause_and_effect_two_sentences | 0 | multiple_choice_grade | 0.3922 | ± 0.0690 |
| bigbench_emotions | 0 | multiple_choice_grade | 0.2812 | ± 0.0357 |
| bigbench_empirical_judgments | 0 | multiple_choice_grade | 0.2828 | ± 0.0455 |
| bigbench_general_knowledge | 0 | multiple_choice_grade | 0.4000 | ± 0.0590 |
| bigbench_hhh_alignment_harmless | 0 | multiple_choice_grade | 0.3621 | ± 0.0637 |
| bigbench_hhh_alignment_helpful | 0 | multiple_choice_grade | 0.3559 | ± 0.0629 |
| bigbench_hhh_alignment_honest | 0 | multiple_choice_grade | 0.3729 | ± 0.0635 |
| bigbench_hhh_alignment_other | 0 | multiple_choice_grade | 0.5581 | ± 0.0766 |
| bigbench_intent_recognition | 0 | multiple_choice_grade | 0.1879 | ± 0.0149 |
| bigbench_misconceptions | 0 | multiple_choice_grade | 0.5373 | ± 0.0432 |
| bigbench_paraphrase | 0 | multiple_choice_grade | 0.5150 | ± 0.0354 |
| bigbench_sentence_ambiguity | 0 | multiple_choice_grade | 0.5000 | ± 0.0651 |
| bigbench_similarities_abstraction | 0 | multiple_choice_grade | 0.7368 | ± 0.0508 |
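
The 36 per-task values above average out to the headline FIN-bench score reported for this model in the Evaluation section. The short check below shows the arithmetic. Whether FIN-bench officially defines its aggregate as an unweighted mean over tasks is an assumption here; the card only shows that the numbers agree to four decimals.

```python
from statistics import mean

# multiple_choice_grade values, in the same order as the task list above.
TASK_SCORES = [
    0.6308,                          # analogies
    0.6400, 0.7391, 0.2800, 0.5000,  # arithmetic, 1 digit (add, div, mul, sub)
    0.1800, 0.4800, 0.0800, 0.2500,  # arithmetic, 2 digits
    0.1800, 0.2500, 0.1700, 0.5000,  # arithmetic, 3 digits
    0.2600, 0.2500, 0.2100, 0.5200,  # arithmetic, 4 digits
    0.3900, 0.1600, 0.1000, 0.6100,  # arithmetic, 5 digits
    0.6471, 0.6863, 0.3922,          # cause_and_effect (three variants)
    0.2812, 0.2828, 0.4000,          # emotions, empirical_judgments, general_knowledge
    0.3621, 0.3559, 0.3729, 0.5581,  # hhh_alignment (harmless, helpful, honest, other)
    0.1879, 0.5373, 0.5150,          # intent_recognition, misconceptions, paraphrase
    0.5000, 0.7368,                  # sentence_ambiguity, similarities_abstraction
]

# Unweighted mean over tasks (assumed aggregation, see note above).
average = mean(TASK_SCORES)
print(f"{len(TASK_SCORES)} tasks, mean = {average:.4f}")  # 36 tasks, mean = 0.3943
```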

Framework versions

  • PEFT 0.8.2

Model size: 7.55B params (Safetensors, tensor type FP16)