---
license: apache-2.0
---

Currently uploading as of 6:13 AM EST; should be done in an hour or so.

Mistral-22b-V.01 Release Announcement 🚀

Date: April 11

Creator: Nicolas Mejia-Petit

Overview

Just one day after the release of Mixtral-8x22B, we are excited to introduce our handcrafted experimental model, Mistral-22b-V.01. This model distills knowledge equally from all of the experts into a single, dense 22B model. It is not a single extracted expert; rather, it is a compressed MoE model, converted into a dense 22B model. This is the first working MoE-to-dense model conversion.
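The announcement does not spell out the conversion procedure, so below is a minimal sketch of one plausible interpretation: uniformly averaging each layer's expert MLP weights into a single dense MLP. The parameter names follow the Hugging Face Mixtral checkpoint layout (`block_sparse_moe.experts.{e}.w1/w2/w3`); the function itself is illustrative, and the actual Mistral-22b-V.01 recipe ("equal knowledge distilled from all experts") may well differ, e.g. distillation rather than plain weight averaging.

```python
import torch

def merge_experts_to_dense(moe_state: dict, num_layers: int, num_experts: int = 8) -> dict:
    """Sketch: uniformly average each layer's expert MLP weights into one dense MLP.

    Assumes Hugging Face Mixtral parameter naming; this is NOT the confirmed
    Mistral-22b-V.01 procedure, just one way an equal-weight MoE-to-dense
    conversion could look.
    """
    dense_state = {}
    for layer in range(num_layers):
        # w1 = gate projection, w2 = down projection, w3 = up projection
        for proj in ("w1", "w2", "w3"):
            keys = [
                f"model.layers.{layer}.block_sparse_moe.experts.{e}.{proj}.weight"
                for e in range(num_experts)
            ]
            # Equal weighting: every expert contributes 1/num_experts.
            merged = torch.stack([moe_state[k] for k in keys]).mean(dim=0)
            dense_state[f"model.layers.{layer}.mlp.{proj}.weight"] = merged
    # Note: the MoE router weights (block_sparse_moe.gate) have no dense
    # counterpart and would simply be dropped in a merge like this.
    return dense_state
```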

Capabilities

  • Math Proficiency: The model exhibits strong mathematical abilities.
  • Reasoning: It possesses decent reasoning skills.
  • User Interaction: It responds effectively to user prompts.

Experimental Nature

Please note that Mistral-22b-V.01 is an experimental model. It has been fine-tuned on fewer examples than the model set for release tomorrow. We encourage you to explore its capabilities and provide feedback.

Upcoming Release: V.2

Stay tuned for the release of V.2 tomorrow, which will feature enhancements in:

  • Multi-turn conversations
  • Multi-turn coding
  • JSON mode
  • Agent abilities

Background

The decision to release this experimental version was prompted by someone attempting to replicate the experiment based on my tweets. We wanted to ensure our community has access to the official version first.

Stay Updated

Keep an eye out for V.2; it's going to be a game-changer! It is currently training and will be done in the next ~24 hours. 🌟 Paper Coming Soon 🌟