mlabonne committed
Commit 6fa3518
Parent(s): 13c9759

Upload folder using huggingface_hub

Files changed (1): README.md (+109 −2)
@@ -1,3 +1,110 @@
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/X1tDlFYMMFPNI_YkDXYbE.png)
-
- # Meta-Llama-3-220B-Instruct
+ ---
+ license: other
+ tags:
+ - merge
+ - mergekit
+ - lazymergekit
+ base_model:
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ - meta-llama/Meta-Llama-3-120B-Instruct
+ ---
+
+ # Meta-Llama-3-220B-Instruct
+
+ Meta-Llama-3-220B-Instruct is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+ * [meta-llama/Meta-Llama-3-120B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-120B-Instruct)
+
+ ## 🧩 Configuration
+
+ ```yaml
+ slices:
+ - sources:
+   - layer_range: [0, 20]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [10, 30]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [20, 40]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [30, 50]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [40, 60]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [50, 70]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [60, 80]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [70, 90]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [80, 100]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [90, 110]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [100, 120]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [110, 130]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ - sources:
+   - layer_range: [120, 140]
+     model: meta-llama/Meta-Llama-3-120B-Instruct
+ merge_method: passthrough
+ dtype: float16
+ ```
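Since `passthrough` simply stacks every slice in order, the merged depth follows directly from the layer ranges above. The following is a minimal sanity-check sketch: the slice list mirrors the YAML config, and treating 140 as the base model's layer count is an inference from the final range, not something stated in the card.

```python
# Layer ranges copied from the mergekit config above:
# [0, 20], [10, 30], ..., [120, 140] (each slice is 20 layers, offset by 10)
slices = [(start, start + 20) for start in range(0, 130, 10)]

# Passthrough concatenates the slices as-is, so the 10-layer overlaps
# are duplicated in the output model rather than shared.
merged_depth = sum(end - start for start, end in slices)

print(len(slices), merged_depth)  # 13 slices -> 260 layers in the merged model
```

With 260 layers versus the assumed 140 of the base model, the parameter count scales by roughly 260/140 ≈ 1.86, which is how a 120B self-merge lands near 220B.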
+
+ ## 💻 Usage
+
+ ```python
+ # Install dependencies (Jupyter/Colab syntax; drop the leading "!" in a shell)
+ !pip install -qU transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "mlabonne/Meta-Llama-3-220B-Instruct"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ # Format the conversation with the model's chat template
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```