Finnish-NLP/llama-7b-finnish-instruct-v0.2

@@ -1,201 +1,225 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: apache-2.0
+tags:
+- finnish
+- llama
+inference: true
+pipeline_tag: text-generation
 ---
+# Llama-7b-instruct-v0.2 for Finnish
+- This is the v0.2 release of our instruction-finetuned model, based on https://huggingface.co/Finnish-NLP/llama-7b-finnish
+- The model was trained for 3 epochs on 21946 samples; for this release we chose the checkpoint at 8000 steps.
+- DPO and SFT+DPO variants are in the pipeline. We are also investigating and testing different merging techniques.
+For finetuning we select well-known, widely used datasets and then filter and translate them with several methods.
+For this version we used a mix of 21946 samples in total from the following datasets:
+ - LIMA from https://github.com/TurkuNLP/finnish-instructions
+ - Dolly from https://github.com/TurkuNLP/finnish-instructions
+ - OASST from https://github.com/TurkuNLP/finnish-instructions
+ - Ultrachat https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized/viewer/default/train_sft translated with DeepL
+ - facebook/belebele Finnish subset
+ - google/boolq translated with DeepL
+ - LDJnr/Capybara translated with DeepL
+ - allenai/ai2_arc translated with DeepL
+### How to use
+Here is an example of using this model with Unsloth with some generation arguments you can modify:
+```python
+import torch
+from unsloth import FastLlamaModel
+from transformers import AutoModelForCausalLM, AutoTokenizer
+max_seq_length = 2048
+dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
+load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
+# Choose exactly one of the two loading paths below
+use_unsloth = True
+use_transformers = False
+# LOADING MODEL USING TRANSFORMERS assumes at least 16GB of memory. Tested with this configuration
+# If you have less memory use load_in_4bit or load_in_8bit as needed
+if use_transformers:
+  major_version, minor_version = torch.cuda.get_device_capability()
+  model = AutoModelForCausalLM.from_pretrained("Finnish-NLP/llama-7b-finnish-instruct-v0.2", device_map='cuda:0', torch_dtype = torch.bfloat16 if major_version >=8 else torch.float16)
+  tokenizer = AutoTokenizer.from_pretrained("Finnish-NLP/llama-7b-finnish-instruct-v0.2")
+# USING UNSLOTH, tested with load_in_4bit
+if use_unsloth:
+  model, tokenizer = FastLlamaModel.from_pretrained(
+      model_name = "Finnish-NLP/llama-7b-finnish-instruct-v0.2",
+      max_seq_length = max_seq_length,
+      dtype = dtype,
+      load_in_4bit = load_in_4bit
+  )
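+# Prompt template. English gloss of the system line: "You are an AI assistant. Next you will get a question or a task.
+# Write the answer as best you can so that it fulfils the requirements of the question or task."
+alpaca_prompt = """<|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.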
+<|ihminen|> Kysymys/Tehtävä:
+{}
+<|avustaja|> Vastauksesi:
+"""
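+sample_questions = [
+    "Ketkä ovat Aku Ankan luona asuvat kolme ankanpoikaa?",  # Who are the three ducklings living with Donald Duck?
+    "Mikä on Suomen korkein tunturi?",  # What is the highest fell in Finland?
+    "Suomi soti Neuvostoliittoa vastaan talvisodan 1939-1940. Kuinka monta päivää sota kesti?",  # How many days did the Winter War of 1939-1940 last?
+    "Luettele viisi yleistä Suomessa yleisesti käytettyä pojan nimeä. Nimet:",  # List five boys' names commonly used in Finland. Names:
+    "Luettele lyhyt, maksimissaan 50 sanan mittainen runo Suomesta. Runo:",  # List a short poem about Finland, at most 50 words. Poem:
+]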
+from transformers import GenerationConfig
+generation_config = GenerationConfig(
+    pad_token_id=tokenizer.eos_token_id,
+    eos_token_id=tokenizer.convert_tokens_to_ids("<|loppu|>"),
+)
+model.eval()
+for sample_question in sample_questions:
+  inputs = tokenizer(
+      [alpaca_prompt.format(sample_question)],  # fill the question into the prompt template
+      return_tensors = "pt",
+  ).to("cuda")
+  with torch.no_grad():
+      generated_ids = model.generate(
+          input_ids=inputs["input_ids"],
+          attention_mask=inputs["attention_mask"],
+          generation_config=generation_config,
+          temperature=0.1,
+          penalty_alpha=0.6,
+          top_k=3,
+          do_sample=True,
+          repetition_penalty=1.28,
+          min_length=10,
+          max_new_tokens=200,
+      )
+  generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)[0]
+  print(len(generated_ids[0]))
+  print("KYSYMYS:")
+  print(generated_text.split('<|avustaja|>')[0])
+  print("VASTAUS:")
+  print(generated_text.split('<|avustaja|> Vastauksesi:')[1])
+  print('##################################')
+'''
+-->
+<s><|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.
+<|ihminen|> Kysymys/Tehtävä:
+ Aku Ankan luona asuu kolme ankanpoikaa. Mitkä ovat heidän nimet?
+VASTAUS:
+ Ankka Akun kanssa asuvat pojat ovat nimeltään Tupu, Hupu ja Lupu <|loppu|>
+##################################
+KYSYMYS:
+<s><|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.
+<|ihminen|> Kysymys/Tehtävä:
+ Mikä on Suomen korkein tunturi?
+VASTAUS:
+ Suomen korkein tunturihuippu on Haltitunturi (1 324 metriä). <|loppu|>
+##################################
+KYSYMYS:
+<s><|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.
+<|ihminen|> Kysymys/Tehtävä:
+ Suomi soti Neuvostoliittoa vastaan talvisodan 1939-1940. Kuinka monta päivää sota kesti?
+VASTAUS:
+ Talvisodan aikana Neuvostoliitto hyökkäsi Suomeen 30. marraskuuta ja 13. maaliskuuta välisenä aikana. Tämä tarkoittaa, että talvisota kesti 105 päivää. <|loppu|>
+##################################
+KYSYMYS:
+<s><|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.
+<|ihminen|> Kysymys/Tehtävä:
+ Luettele viisi yleistä Suomessa yleisesti käytettyä pojan nimeä. Nimet:
+VASTAUS:
+ Yleisiä suomalaisia poikien nimiä ovat Eino, Onni, Olavi, Väinö ja Ilmari. <|loppu|>
+##################################
+KYSYMYS:
+<s><|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.
+<|ihminen|> Kysymys/Tehtävä:
+ Luettele lyhyt, maksimissaan 50 sanan mittainen runo Suomesta. Runo:
+VASTAUS:
+ Olipa kerran kaunis maa,
+ jossa ihmiset elivät sopusoinnussa.
+ Se oli täynnä metsiä ja järviä,
+ ja siellä asui onnellisia ja ystävällisiä ihmisiä. <|loppu|>
+'''
+```
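The prompt format used in the example above can be isolated into two small helpers, which makes it easier to reuse without a GPU at hand. This is only a minimal sketch: the names `build_prompt` and `extract_answer` are hypothetical and not part of this repository, and it assumes the decoded text still contains the `<|avustaja|> Vastauksesi:` marker, as in the sample output shown above.

```python
# Prompt template copied verbatim from the usage example in this model card.
PROMPT_TEMPLATE = """<|alku|> Olet tekoälyavustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.
<|ihminen|> Kysymys/Tehtävä:
{}
<|avustaja|> Vastauksesi:
"""

ANSWER_MARKER = "<|avustaja|> Vastauksesi:"
END_MARKER = "<|loppu|>"


def build_prompt(question: str) -> str:
    """Insert a question or task into the prompt template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(question)


def extract_answer(generated_text: str) -> str:
    """Return the text after the answer marker, cut at the end marker.

    Falls back to the whole text if the answer marker is missing,
    instead of raising an IndexError like a plain split(...)[1] would.
    """
    _, found, tail = generated_text.partition(ANSWER_MARKER)
    answer = tail if found else generated_text
    return answer.split(END_MARKER)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt("Mikä on Suomen korkein tunturi?")
    # Simulated model output, only to demonstrate the parsing step.
    print(extract_answer(prompt + " Esimerkkivastaus. <|loppu|>"))
```

Using `str.partition` here is a design choice for robustness: if the marker tokens are stripped during decoding, the helper returns the full text rather than failing.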
+### Limitations and bias
+The training data used for this model contains a lot of content from the internet, which is far from neutral.
+Therefore, the model can have biased predictions. This bias will also affect all fine-tuned versions of this model.
+To reduce toxic content, the pretrained version of this model was trained on a dataset filtered with a toxicity classifier, but this cannot truly eliminate all toxic text.
+### Finetuning
+Training was conducted on an RTX 4080 using the Unsloth framework https://github.com/unslothai/unsloth \
+The training script is available in this repo.
+## Evaluation results
+This model was evaluated using [FIN-bench by TurkuNLP](https://github.com/TurkuNLP/FIN-bench) in a zero-shot setting, but \
+the evaluation script had some problems running successfully, so the results reported below should be viewed with some caution.
+[llama-7b-finnish-instruct-v0.2](https://huggingface.co/Finnish-NLP/llama-7b-finnish-instruct-v0.2):
+|                      Task                      |Version|       Metric        |Value |   |Stderr|
+|------------------------------------------------|------:|---------------------|-----:|---|-----:|
+|bigbench_analogies                              |      0|multiple_choice_grade|0.5385|±  |0.0439|
+|bigbench_arithmetic_1_digit_addition            |      0|multiple_choice_grade|0.3400|±  |0.0476|
+|bigbench_arithmetic_1_digit_division            |      0|multiple_choice_grade|0.4783|±  |0.1065|
+|bigbench_arithmetic_1_digit_multiplication      |      0|multiple_choice_grade|0.5200|±  |0.0502|
+|bigbench_arithmetic_1_digit_subtraction         |      0|multiple_choice_grade|0.3400|±  |0.0476|
+|bigbench_arithmetic_2_digit_addition            |      0|multiple_choice_grade|0.3200|±  |0.0469|
+|bigbench_arithmetic_2_digit_division            |      0|multiple_choice_grade|0.3400|±  |0.0476|
+|bigbench_arithmetic_2_digit_multiplication      |      0|multiple_choice_grade|0.2200|±  |0.0416|
+|bigbench_arithmetic_2_digit_subtraction         |      0|multiple_choice_grade|0.2800|±  |0.0451|
+|bigbench_arithmetic_3_digit_addition            |      0|multiple_choice_grade|0.3000|±  |0.0461|
+|bigbench_arithmetic_3_digit_division            |      0|multiple_choice_grade|0.2500|±  |0.0435|
+|bigbench_arithmetic_3_digit_multiplication      |      0|multiple_choice_grade|0.2200|±  |0.0416|
+|bigbench_arithmetic_3_digit_subtraction         |      0|multiple_choice_grade|0.4000|±  |0.0492|
+|bigbench_arithmetic_4_digit_addition            |      0|multiple_choice_grade|0.3500|±  |0.0479|
+|bigbench_arithmetic_4_digit_division            |      0|multiple_choice_grade|0.2600|±  |0.0441|
+|bigbench_arithmetic_4_digit_multiplication      |      0|multiple_choice_grade|0.2100|±  |0.0409|
+|bigbench_arithmetic_4_digit_subtraction         |      0|multiple_choice_grade|0.4400|±  |0.0499|
+|bigbench_arithmetic_5_digit_addition            |      0|multiple_choice_grade|0.4500|±  |0.0500|
+|bigbench_arithmetic_5_digit_division            |      0|multiple_choice_grade|0.1800|±  |0.0386|
+|bigbench_arithmetic_5_digit_multiplication      |      0|multiple_choice_grade|0.2000|±  |0.0402|
+|bigbench_arithmetic_5_digit_subtraction         |      0|multiple_choice_grade|0.5000|±  |0.0503|
+|bigbench_cause_and_effect_one_sentence          |      0|multiple_choice_grade|0.5294|±  |0.0706|
+|bigbench_cause_and_effect_one_sentence_no_prompt|      0|multiple_choice_grade|0.8627|±  |0.0487|
+|bigbench_cause_and_effect_two_sentences         |      0|multiple_choice_grade|0.4314|±  |0.0700|
+|bigbench_emotions                               |      0|multiple_choice_grade|0.4750|±  |0.0396|
+|bigbench_empirical_judgments                    |      0|multiple_choice_grade|0.4141|±  |0.0498|
+|bigbench_general_knowledge                      |      0|multiple_choice_grade|0.4429|±  |0.0598|
+|bigbench_hhh_alignment_harmless                 |      0|multiple_choice_grade|0.3793|±  |0.0643|
+|bigbench_hhh_alignment_helpful                  |      0|multiple_choice_grade|0.3220|±  |0.0614|
+|bigbench_hhh_alignment_honest                   |      0|multiple_choice_grade|0.3898|±  |0.0640|
+|bigbench_hhh_alignment_other                    |      0|multiple_choice_grade|0.5581|±  |0.0766|
+|bigbench_intent_recognition                     |      0|multiple_choice_grade|0.2717|±  |0.0169|
+|bigbench_misconceptions                         |      0|multiple_choice_grade|0.5373|±  |0.0432|
+|bigbench_paraphrase                             |      0|multiple_choice_grade|0.5000|±  |0.0354|
+|bigbench_sentence_ambiguity                     |      0|multiple_choice_grade|0.5333|±  |0.0649|
+|bigbench_similarities_abstraction               |      0|multiple_choice_grade|0.5921|±  |0.0567|
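When reading the Value and Stderr columns together, a rough interval can help judge how much weight a single score carries. The sketch below assumes a normal approximation (value ± 1.96 × stderr for roughly 95% coverage); this is an assumption for illustration, not something stated by FIN-bench, and the function name `confidence_interval` is hypothetical. The example numbers are taken from the `bigbench_analogies` row of the table above.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval for a score, clipped to [0, 1].

    Assumes a normal approximation; z = 1.96 corresponds to roughly 95% coverage.
    """
    half_width = z * stderr
    return max(0.0, value - half_width), min(1.0, value + half_width)


if __name__ == "__main__":
    # Value and Stderr from the bigbench_analogies row of the results table.
    low, high = confidence_interval(0.5385, 0.0439)
    print(f"bigbench_analogies: {low:.4f} to {high:.4f}")
```

Rows with a large Stderr, such as `bigbench_arithmetic_1_digit_division`, give correspondingly wide intervals, so small differences between tasks should not be over-interpreted.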
+## Team Members
+- Aapo Tanskanen, [Hugging Face profile](https://huggingface.co/aapot), [LinkedIn profile](https://www.linkedin.com/in/aapotanskanen/)
+- Rasmus Toivanen, [Hugging Face profile](https://huggingface.co/RASMUS), [LinkedIn profile](https://www.linkedin.com/in/rasmustoivanen/)
+Feel free to contact us for more details 🤗