aixsatoshi/Ex-karakuri-8x12B-chat-v1

aixsatoshi committed on May 8

Commit 20db41b (1 parent: c9ca570)

Update README.md

Files changed (1)

README.md +13 -2

@@ -1,7 +1,18 @@
 ---
 license: apache-2.0
 ---
-'''
 slices:
 - sources:
   - layer_range: [0, 8]
@@ -26,4 +37,4 @@ slices:
     model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
 merge_method: passthrough
 dtype: bfloat16
-''''

 ---
 license: apache-2.0
 ---
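+Self-merging Meta-Llama-3-70b to scale it up to 120B parameters has been reported to improve performance.
+To further improve the accuracy of [karakuri-ai/karakuri-lm-8x7b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-chat-v0.1), a high-quality Japanese LLM, this model applies a self-extension merge that increases "num_hidden_layers" from 32 to 56.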
+It was inspired by large merges like:
+- [Meta-Llama-3-120B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct/)
+- [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b)
+- [nsfwthrowitaway69/Venus-120b-v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
+- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
+- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0).
+```
 slices:
 - sources:
   - layer_range: [0, 8]
     model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
 merge_method: passthrough
 dtype: bfloat16
+```
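
The README states that the passthrough merge raises "num_hidden_layers" from 32 to 56, but the diff shows only the first slice, `[0, 8]`. The sketch below illustrates one way such a depth could arise, assuming half-open `layer_range` values and a hypothetical layout of 8-layer windows advanced 4 layers at a time. The helper names `sliding_slices` and `merged_depth`, and every range after the first, are assumptions for illustration and are not taken from the actual config.

```python
# Hypothetical slice layout for a passthrough self-merge.
# Only the first range, [0, 8], appears in the README diff;
# the window size and stride used for the remaining ranges are assumed.

def sliding_slices(num_layers, window, stride):
    """Return half-open (start, end) layer ranges of overlapping windows."""
    return [(s, s + window) for s in range(0, num_layers - window + 1, stride)]


def merged_depth(slices):
    """A passthrough merge stacks slices in order, so depth is the sum of lengths."""
    return sum(end - start for start, end in slices)


slices = sliding_slices(num_layers=32, window=8, stride=4)
print(slices)
print(merged_depth(slices))
```

Under these assumptions the layout yields seven overlapping slices whose combined depth matches the 56 layers reported in the README; other layouts could reach the same total.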