# Model Card: calm2-7b + calm2-7b-chat MoE
This model is a Mixture of Experts (MoE) composition in which cyberagent/calm2-7b serves as the base model and cyberagent/calm2-7b-chat is incorporated as the chat expert. It is designed to combine the general-purpose language capabilities of calm2-7b with the specialized conversational abilities of calm2-7b-chat.
## Model Details
The model routes between the following expert (source) models when generating responses (a reproduction sketch follows the list):

**Source model: cyberagent/calm2-7b-chat**
- Positive prompts: `["USER: ", "ASSISTANT: "]`
- Routed to for chat-style contexts, taking both user and assistant turns into account.

**Source model: cyberagent/calm2-7b**
- Positive prompts: `[""]`
- Routed to for text without a specific chat context, serving as the general-purpose language model.
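The expert list above mirrors the fields of a mergekit-moe configuration. The sketch below shows how a merge like this could be reproduced; only the source models and positive prompts come from this card, while `gate_mode`, `dtype`, and the use of mergekit itself are assumptions.

```python
# Minimal reproduction sketch using mergekit-moe (https://github.com/arcee-ai/mergekit).
# Assumptions: gate_mode, dtype, and that mergekit produced this merge; the source
# models and positive prompts are taken from the card above.
import subprocess

CONFIG = """\
base_model: cyberagent/calm2-7b
gate_mode: hidden   # assumption: route by hidden-state similarity to the prompts
dtype: bfloat16     # assumption
experts:
  - source_model: cyberagent/calm2-7b-chat
    positive_prompts:
      - "USER: "
      - "ASSISTANT: "
  - source_model: cyberagent/calm2-7b
    positive_prompts:
      - ""
"""

with open("moe-config.yaml", "w") as f:
    f.write(CONFIG)

# mergekit-moe reads the config and writes the merged model to ./merged
subprocess.run(["mergekit-moe", "moe-config.yaml", "./merged"], check=True)
```

A merge of this shape duplicates only the MLP blocks per expert while sharing attention weights, which is consistent with the total parameter count listed below being well under two full copies of a 7B model.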
- Model size: 11.3B parameters
- Context length: 32,768 tokens
- Language(s): Japanese, English
## Model Sources
- Repository: https://huggingface.co/cyberagent/calm2-7b
- Repository: https://huggingface.co/cyberagent/calm2-7b-chat
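For reference, a minimal inference sketch. The repository id below is a placeholder, not this model's actual name, and the USER:/ASSISTANT: turn format follows the chat expert's positive prompts listed above.

```python
# Minimal inference sketch; "your-namespace/calm2-moe" is a placeholder repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-namespace/calm2-moe"  # placeholder: substitute the real repository id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Prompts in the USER:/ASSISTANT: shape should route toward the chat expert.
prompt = "USER: What changes will AI bring to our daily lives?\nASSISTANT: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```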
## Limitations and Considerations
While this MoE model integrates the strengths of cyberagent/calm2-7b-chat and cyberagent/calm2-7b, it is an experimental composition and has not been fine-tuned after merging. Users are therefore advised to perform their own tuning and evaluation to adapt the model to their specific use cases and requirements.