
Model Card for aixsatoshi/calm2-7b-chat-7b-moe

This model is a Mixture of Experts (MoE) composition in which cyberagent/calm2-7b serves as the base model and cyberagent/calm2-7b-chat is incorporated as a chat expert. The model is designed to combine the general-purpose language processing capabilities of calm2-7b with the specialized conversational abilities of calm2-7b-chat.

Model Details

The model uses the following expert models for generating responses (a sketch of the corresponding merge configuration follows this list):

  1. Source Model: cyberagent/calm2-7b-chat
     • Positive Prompts: ["USER: ", "ASSISTANT: "]
     • This expert handles chat-based contexts, taking both user and assistant turns into account.
  2. Source Model: cyberagent/calm2-7b
     • Positive Prompts: [""]
     • This expert handles text without a specific chat context, serving as a general-purpose language model.
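The expert list above maps onto a mergekit-style MoE configuration. The configuration file actually used to build this model is not published in this card, so the snippet below is a hypothetical reconstruction only; the schema keys (base_model, gate_mode, experts, positive_prompts) follow mergekit's MoE config format, and the gate_mode and dtype values are assumptions.

```python
# Hypothetical reconstruction of the mergekit MoE config implied by the expert
# list above (not the published build file; gate_mode and dtype are assumptions).
import yaml

moe_config = {
    "base_model": "cyberagent/calm2-7b",       # foundational base model
    "gate_mode": "hidden",                     # assumed default gating mode
    "dtype": "bfloat16",                       # matches the BF16 checkpoint
    "experts": [
        {
            "source_model": "cyberagent/calm2-7b-chat",
            "positive_prompts": ["USER: ", "ASSISTANT: "],  # route chat-style turns here
        },
        {
            "source_model": "cyberagent/calm2-7b",
            "positive_prompts": [""],                       # catch-all for general text
        },
    ],
}

# Write the config so it could be passed to the mergekit-moe command-line tool.
with open("calm2-moe.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(moe_config, f, sort_keys=False, allow_unicode=True)
```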

Model size: 11.3B parameters
Context length: 32,768 tokens
Tensor type: BF16
Language(s): Japanese, English
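A minimal inference sketch with Hugging Face transformers is shown below. It assumes a standard causal LM checkpoint, enough GPU memory for an ~11B-parameter model in BF16, and uses the USER:/ASSISTANT: prompt format listed for the chat expert; the generation parameters are illustrative, not tuned recommendations.

```python
# Minimal usage sketch (assumes transformers, torch, and sufficient GPU memory).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixsatoshi/calm2-7b-chat-7b-moe"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is stored in BF16
    device_map="auto",
)

# Prompt follows the calm2-7b-chat convention used for expert routing.
prompt = "USER: 日本の首都はどこですか？\nASSISTANT: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```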

Limitations and Considerations

While this MoE model integrates the strengths of cyberagent/calm2-7b-chat and cyberagent/calm2-7b, it is an experimental composition and has not been fine-tuned after merging. Users are therefore advised to perform their own tuning and optimization to adapt the model to their specific use cases and requirements.
