JosephusCheung committed on
Commit 330f2fa
Parent: 8530c98

Update README.md

Files changed (1): README.md (+2, -0)
README.md CHANGED
@@ -5,6 +5,8 @@ license: gpl-3.0
 
 A Chat Model, Testing only, no performance guaranteeeee...
 
+In short: CausalLM / Qwen 8x7B MoE in Mixtral Arch, 8 real experts in different domains.
+
 Only intended for conceptual validation, however the expert models do not seem to be working as expected. The model could output text and complete the conversation normally, but the performance of the expert model was not significant.
 
 There are 8 completely different expert models based on Qwen-7B / CausalLM, six of which are specific domain models that have seen 50~100 billion tokens, including: a Toolformer/Agent expert model, a multilingual translation expert model, a mathematics expert model, a visual expert model, a coding and computer expert model, and an uncensored knowledge model — together forming the MoE model along with Qwen-Chat and Qwen-Base.
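
For orientation only, the sketch below shows what an 8-expert, Mixtral-architecture configuration along the lines described above could look like in `transformers`. The expert count follows the README text; the 7B-class dimensions and top-2 routing are assumptions for illustration, not values taken from this repository.

```python
# Minimal sketch of a Mixtral-style config with 8 experts.
# All hyperparameter values below are illustrative assumptions,
# not the actual settings of this model repository.
from transformers import MixtralConfig

config = MixtralConfig(
    num_local_experts=8,     # eight domain experts, as described in the README
    num_experts_per_tok=2,   # Mixtral-style top-2 routing (assumed)
    hidden_size=4096,        # 7B-class width (assumed)
    num_hidden_layers=32,    # 7B-class depth (assumed)
    num_attention_heads=32,
)
print(config.num_local_experts)  # -> 8
```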