Model Card for aixsatoshi/calm2-7b-chat-7b-moe

This model is a Mixture of Experts (MoE) model in which cyberagent/calm2-7b serves as the base model and cyberagent/calm2-7b-chat is incorporated as a chat model. It is designed to combine the general-purpose language capabilities of calm2-7b with the conversational abilities of calm2-7b-chat.

Model Details

The model uses the following expert models for generating responses:

  1. Source Model: cyberagent/calm2-7b-chat

    • Positive Prompts: ["USER: ", "ASSISTANT: "]
      • This source model is utilized to provide responses in a chat-based context, taking both user and assistant inputs into account.
  2. Source Model: cyberagent/calm2-7b

    • Positive Prompts: [""]
      • This source model contributes to generating responses without specific chat context, serving as a general-purpose language model.
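Because the chat expert is keyed on the "USER: " and "ASSISTANT: " prefixes, chat-style inputs presumably work best when they follow the same layout. The following is a minimal sketch of a prompt builder. The function name `build_prompt`, the `history` argument, and the newline separator between turns are illustrative assumptions, not part of this model card.

```python
from typing import List, Optional, Tuple

# Prefixes taken from the positive prompts listed above.
USER_PREFIX = "USER: "
ASSISTANT_PREFIX = "ASSISTANT: "


def build_prompt(user_message: str,
                 history: Optional[List[Tuple[str, str]]] = None) -> str:
    """Format a conversation as alternating USER/ASSISTANT lines.

    `history` is a hypothetical list of (user, assistant) pairs from earlier
    turns. Joining turns with a newline is an assumption.
    """
    lines = []
    for past_user, past_assistant in history or []:
        lines.append(USER_PREFIX + past_user)
        lines.append(ASSISTANT_PREFIX + past_assistant)
    lines.append(USER_PREFIX + user_message)
    # End on the bare assistant prefix so the model continues from there.
    lines.append(ASSISTANT_PREFIX)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What is the capital of Japan?"))
```

The resulting string can be tokenized and passed to a text-generation call as-is. For plain completion without chat context, the raw text can be used without any prefix, matching the empty positive prompt of the base expert.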

Model size: 11.3B
Context length: 32768
Language(s): Japanese, English
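
The 11.3B figure is smaller than two full 7B models, which is what one would expect if only the feed-forward blocks are duplicated per expert while attention and embeddings are shared. The sketch below is a rough parameter count under that assumption. The dimensions (hidden size 4096, intermediate size 11008, 32 layers, vocabulary 65024, untied embeddings, no grouped-query attention) are assumed Llama-style values for calm2-7b and are not stated in this card.

```python
def llama_style_params(hidden: int, intermediate: int, layers: int,
                       vocab: int, num_experts: int = 1) -> int:
    """Rough parameter count for a Llama-style decoder.

    With num_experts > 1, only the MLP block is replicated per expert and a
    small linear router is added per layer (an assumed Mixtral-style layout).
    """
    attention = 4 * hidden * hidden        # q, k, v, o projections
    mlp = 3 * hidden * intermediate        # gate, up, down projections
    norms = 2 * hidden                     # two RMSNorm weights per layer
    router = hidden * num_experts if num_experts > 1 else 0
    per_layer = attention + num_experts * mlp + norms + router
    embeddings = 2 * vocab * hidden        # input embedding + output head
    final_norm = hidden
    return layers * per_layer + embeddings + final_norm


# Assumed dimensions for calm2-7b (not taken from this model card).
DIMS = dict(hidden=4096, intermediate=11008, layers=32, vocab=65024)

dense_total = llama_style_params(**DIMS)
moe_total = llama_style_params(**DIMS, num_experts=2)

if __name__ == "__main__":
    print(f"dense: {dense_total / 1e9:.2f}B, 2-expert MoE: {moe_total / 1e9:.2f}B")
```

Under these assumptions the two-expert total comes to roughly 11.3B, consistent with the size listed above.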

Limitations and Considerations

Although this MoE model integrates the strengths of cyberagent/calm2-7b-chat and cyberagent/calm2-7b, it is experimental and has not been fine-tuned after composition. Users are advised to perform their own tuning and optimization to adapt the model to their specific use cases and requirements.


Space using aixsatoshi/calm2-7b-chat-7b-moe 1