ehristoforu committed on
Commit 27dd2b5
1 Parent(s): 27f21b4

Update README.md

Files changed (1)
  1. README.md +85 -39
README.md CHANGED
@@ -8,15 +8,14 @@ base_model:
 - cognitivecomputations/dolphin-2.9-llama3-8b
 - NeuralNovel/Llama-3-NeuralPaca-8b
 datasets:
- - IlyaGusev/ru_turbo_saiga
- - IlyaGusev/ru_sharegpt_cleaned
- - IlyaGusev/oasst1_ru_main_branch
- - IlyaGusev/gpt_roleplay_realm
- - lksy/ru_instruct_gpt4
- - mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha
- - tatsu-lab/alpaca
- - vicgalle/configurable-system-prompt-multitask
-
 library_name: transformers
 tags:
 - llama
@@ -27,43 +26,90 @@ tags:
 language:
 - en
 - ru
 ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [ehristoforu/0001lp](https://huggingface.co/ehristoforu/0001lp)
- * [vicgalle/Configurable-Llama-3-8B-v0.2](https://huggingface.co/vicgalle/Configurable-Llama-3-8B-v0.2)
- * [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
- * [NeuralNovel/Llama-3-NeuralPaca-8b](https://huggingface.co/NeuralNovel/Llama-3-NeuralPaca-8b)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- slices:
- - sources:
-   - model: ehristoforu/0001lp
-     layer_range: [0, 32]
- - sources:
-   - model: NeuralNovel/Llama-3-NeuralPaca-8b
-     layer_range: [24, 32]
- - sources:
-   - model: cognitivecomputations/dolphin-2.9-llama3-8b
-     layer_range: [26, 32]
- - sources:
-   - model: vicgalle/Configurable-Llama-3-8B-v0.2
-     layer_range: [28, 32]
- merge_method: passthrough
- dtype: bfloat16
- ```
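As a sanity check on the "8B to 12B" sizing, note that the passthrough method simply stacks the listed layer slices. The sketch below estimates the resulting depth and parameter count, assuming standard Meta-Llama-3-8B dimensions (hidden 4096, MLP 14336, vocab 128256, 8 KV heads of dim 128) and treating the `layer_range` values above as end-exclusive, as mergekit does:

```python
# Half-open [start, end) layer ranges copied from the mergekit config above.
slices = [(0, 32), (24, 32), (26, 32), (28, 32)]

# Passthrough merging concatenates the slices, so depth is the sum of slice sizes.
total_layers = sum(end - start for start, end in slices)

# Assumed Meta-Llama-3-8B dimensions (not stated in this card).
hidden, mlp, vocab = 4096, 14336, 128256
kv_dim = 8 * 128  # grouped-query attention: 8 KV heads of head dim 128

attn = hidden * hidden * 2 + hidden * kv_dim * 2  # q/o plus smaller k/v projections
ffn = hidden * mlp * 3                            # gate, up, down projections
per_layer = attn + ffn
embeddings = vocab * hidden * 2                   # input embeddings + untied lm_head

total_params = total_layers * per_layer + embeddings
print(total_layers, round(total_params / 1e9, 2))  # → 50 11.96
```

Under these assumptions the stack comes out to 50 layers and roughly 12B parameters, consistent with the "8B to 12B" framing.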
 - cognitivecomputations/dolphin-2.9-llama3-8b
 - NeuralNovel/Llama-3-NeuralPaca-8b
 datasets:
+ - mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha
+ - tatsu-lab/alpaca
+ - vicgalle/configurable-system-prompt-multitask
+ - IlyaGusev/ru_turbo_saiga
+ - IlyaGusev/ru_sharegpt_cleaned
+ - IlyaGusev/oasst1_ru_main_branch
+ - IlyaGusev/gpt_roleplay_realm
+ - lksy/ru_instruct_gpt4
 library_name: transformers
 tags:
 - llama
 language:
 - en
 - ru
+ pipeline_tag: text-generation
 ---
 
+ # Llama3 from 8B to 12B
+
+ We built this model by merging several capable Llama 3 models into a single, stronger one.
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** [@ehristoforu](https://huggingface.co/ehristoforu)
+ - **Model type:** Text Generation (conversational)
+ - **Language(s) (NLP):** English, Russian
+ - **Finetuned from model [optional]:** [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```py
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_id = "ehristoforu/llama-3-12b-instruct"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to(model.device)
+
+ terminators = [
+     tokenizer.eos_token_id,
+     tokenizer.convert_tokens_to_ids("<|eot_id|>")
+ ]
+
+ outputs = model.generate(
+     input_ids,
+     max_new_tokens=256,
+     eos_token_id=terminators,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+ )
+ response = outputs[0][input_ids.shape[-1]:]
+ print(tokenizer.decode(response, skip_special_tokens=True))
+ ```
+
+ ## About the merge
+
+ Base model: Meta-Llama-3-8B-Instruct
+
+ Merge models:
+ - Muhammad2003/Llama3-8B-OpenHermes-DPO
+ - IlyaGusev/saiga_llama3_8b
+ - NousResearch/Meta-Llama-3-8B-Instruct
+ - abacusai/Llama-3-Smaug-8B
+ - vicgalle/Configurable-Llama-3-8B-v0.2
+ - cognitivecomputations/dolphin-2.9-llama3-8b
+ - NeuralNovel/Llama-3-NeuralPaca-8b
+
+ Merge datasets:
+ - mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha
+ - tatsu-lab/alpaca
+ - vicgalle/configurable-system-prompt-multitask
+ - IlyaGusev/ru_turbo_saiga
+ - IlyaGusev/ru_sharegpt_cleaned
+ - IlyaGusev/oasst1_ru_main_branch
+ - IlyaGusev/gpt_roleplay_realm
+ - lksy/ru_instruct_gpt4