Commit
•
20db41b
1
Parent(s):
c9ca570
Update README.md
README.md
CHANGED
@@ -1,7 +1,18 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
-
|
5 |
slices:
|
6 |
- sources:
|
7 |
- layer_range: [0, 8]
|
@@ -26,4 +37,4 @@ slices:
|
|
26 |
model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
|
27 |
merge_method: passthrough
|
28 |
dtype: bfloat16
|
29 |
-
|
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
+
Self-merges of Meta-Llama-3-70b that expand the parameter count to 120B have been reported to improve performance.
|
5 |
+
To further improve the accuracy of [karakuri-ai/karakuri-lm-8x7b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-chat-v0.1), a high-quality Japanese LLM, we performed a self-extension merge that increases "num_hidden_layers" from 32 to 56.
|
6 |
+
|
7 |
+
It was inspired by large merges like:
|
8 |
+
- [Meta-Llama-3-120B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct/)
|
9 |
+
- [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b)
|
10 |
+
- [nsfwthrowitaway69/Venus-120b-v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
|
11 |
+
- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
|
12 |
+
- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0).
|
13 |
+
|
14 |
+
|
15 |
+
```
|
16 |
slices:
|
17 |
- sources:
|
18 |
- layer_range: [0, 8]
|
|
|
37 |
model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
|
38 |
merge_method: passthrough
|
39 |
dtype: bfloat16
|
40 |
+
```
|
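As a rough check on the layer arithmetic described in the README text (32 to 56 hidden layers), the following Python sketch counts how many layers a passthrough merge would stack. It assumes `layer_range` is a half-open `[start, end)` interval. The overlapping ranges are hypothetical: the diff elides every slice after `[0, 8]`, so they are not claimed to be the actual configuration. The function name `passthrough_layer_count` is also made up for illustration.

```python
from typing import List, Tuple


def passthrough_layer_count(layer_ranges: List[Tuple[int, int]]) -> int:
    """Return the number of layers obtained by stacking the given slices.

    Assumption: each range is half-open, i.e. [start, end) copies
    layers start .. end - 1 from the source model.
    """
    total = 0
    for start, end in layer_ranges:
        if not 0 <= start < end:
            raise ValueError(f"invalid layer range: [{start}, {end}]")
        total += end - start
    return total


# Hypothetical slices: 8-layer windows advanced by 4 layers over a
# 32-layer source model. Only the first range, [0, 8], appears in the diff.
hypothetical_ranges = [(start, start + 8) for start in range(0, 25, 4)]

if __name__ == "__main__":
    print(hypothetical_ranges)
    print(passthrough_layer_count(hypothetical_ranges))
```

Under these assumptions, seven overlapping 8-layer slices give 56 layers, which matches the target `num_hidden_layers` stated in the README; other slice layouts could reach the same total.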