---
license: apache-2.0
---
Strong performance has been reported for models whose parameter count was expanded by self-merging Meta-Llama-3-70B into a 120B model.
Following the same idea, and to further improve the high-quality Japanese LLM [karakuri-ai/karakuri-lm-8x7b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-chat-v0.1), we performed a self-extension merge that expands `num_hidden_layers` from 32 to 56 (seven overlapping 8-layer slices).
This work was inspired by large self-merges such as:
- [Meta-Llama-3-120B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct/)
- [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b)
- [nsfwthrowitaway69/Venus-120b-v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)
```yaml
slices:
- sources:
  - layer_range: [0, 8]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [4, 12]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [8, 16]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [12, 20]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [16, 24]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [20, 28]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [24, 32]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
merge_method: passthrough
dtype: bfloat16
```
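
The config above can be run with mergekit's `mergekit-yaml` CLI and the resulting checkpoint loaded with `transformers`. The sketch below is a minimal, untested example; the output directory name is a placeholder, and it assumes the base model's tokenizer and chat template are copied into the merged output.

```python
# Minimal sketch, assuming mergekit and transformers are installed and the
# config above is saved as config.yaml. The merge itself is run via the CLI:
#   mergekit-yaml config.yaml ./karakuri-lm-8x7b-chat-56layer --cuda
# "./karakuri-lm-8x7b-chat-56layer" is a hypothetical output directory.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_path = "./karakuri-lm-8x7b-chat-56layer"  # placeholder local path

tokenizer = AutoTokenizer.from_pretrained(merged_path)
model = AutoModelForCausalLM.from_pretrained(
    merged_path,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
    device_map="auto",
)

# Assumes the chat template was carried over from the base model's tokenizer.
messages = [{"role": "user", "content": "日本の首都はどこですか？"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```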