
Model Card for aixsatoshi/calm2-7b-chat-7b-moe

This model is a Mixture of Experts (MoE) composition in which cyberagent/calm2-7b serves as the base model and cyberagent/calm2-7b-chat is incorporated as a chat expert. The model is designed to combine the general-purpose language processing capabilities of calm2-7b with the specialized conversational abilities of calm2-7b-chat.

Model Details

The model uses the following expert models for generating responses (a sketch of the corresponding merge configuration follows this list):

  1. Source Model: cyberagent/calm2-7b-chat
     • Positive Prompts: ["USER: ", "ASSISTANT: "]
     • This expert handles chat-based contexts, taking both user and assistant turns into account.
  2. Source Model: cyberagent/calm2-7b
     • Positive Prompts: [""]
     • This expert handles text without a specific chat context, serving as a general-purpose language model.
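The expert list above maps onto a mergekit-style MoE configuration. The configuration file actually used to build this model is not published in this card, so the snippet below is a hypothetical reconstruction only; the schema keys (base_model, gate_mode, experts, positive_prompts) follow mergekit's MoE config format, and the gate_mode and dtype values are assumptions.

```python
# Hypothetical reconstruction of the mergekit MoE config implied by the expert
# list above (not the published build file; gate_mode and dtype are assumptions).
import yaml

moe_config = {
    "base_model": "cyberagent/calm2-7b",       # foundational base model
    "gate_mode": "hidden",                     # assumed default gating mode
    "dtype": "bfloat16",                       # matches the BF16 checkpoint
    "experts": [
        {
            "source_model": "cyberagent/calm2-7b-chat",
            "positive_prompts": ["USER: ", "ASSISTANT: "],  # route chat-style turns here
        },
        {
            "source_model": "cyberagent/calm2-7b",
            "positive_prompts": [""],                       # catch-all for general text
        },
    ],
}

# Write the config so it could be passed to the mergekit-moe command-line tool.
with open("calm2-moe.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(moe_config, f, sort_keys=False, allow_unicode=True)
```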

Model size: 11.3B parameters
Context length: 32,768 tokens
Tensor type: BF16
Language(s): Japanese, English
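A minimal inference sketch with Hugging Face transformers is shown below. It assumes a standard causal LM checkpoint, enough GPU memory for an ~11B-parameter model in BF16, and uses the USER:/ASSISTANT: prompt format listed for the chat expert; the generation parameters are illustrative, not tuned recommendations.

```python
# Minimal usage sketch (assumes transformers, torch, and sufficient GPU memory).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixsatoshi/calm2-7b-chat-7b-moe"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is stored in BF16
    device_map="auto",
)

# Prompt follows the calm2-7b-chat convention used for expert routing.
prompt = "USER: 日本の首都はどこですか？\nASSISTANT: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```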

Limitations and Considerations

While this MoE model integrates the strengths of cyberagent/calm2-7b-chat and cyberagent/calm2-7b, it is an experimental composition and has not been fine-tuned after merging. Users are therefore advised to perform their own tuning and optimization to adapt the model to their specific use cases and requirements.
