Bam4d committed
Commit eee9341
1 Parent(s): f592c5f

Update small model card

Files changed (1)
  1. README.md +13 -44
README.md CHANGED
@@ -1,52 +1,21 @@
- # **Model Details**

- The Mistral AI-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral AI-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

- **Model Developers** Mistral AI.

- **Variations** None.

- **Input** Text only.
-
- **Output** Text only.
-
- **Model Architecture** Mistral AI-7B-v0.1 is a transformer model, with the following architecture choices:
  - Grouped-Query Attention
  - Sliding-Window Attention
  - Byte-fallback BPE tokenizer

- **Model Dates** Mistral AI-7B-v0.1 was trained between June and September 2023.
-
- **Status** This is a static model. Future models will have new version numbers.
-
- **License** Apache 2.0 license.
-
- **Research Paper** TODO: Coming soon.
-
- **Where to send questions or comments about the model** TODO: How do people send comments?
-
- # **Intended Use**
- **Intended Use Cases** Mistral AI-7B-v0.1 is for commercial and research use. It can be adapted for a variety of natural language generation tasks.
-
- # **Evaluation Results**
- We report the standard benchmark results for Mistral AI-7B-v0.1. We use a custom evaluation library to produce the results.
-
- | Model | Size | hellaswag | winogrande | piqa | boolq | arc_easy | arc_challenge | naturalqs | naturalqs_5shot | triviaqa_5shot | triviaqa | humaneval_pass@1 | mbpp_pass@1 | mmlu | math | gsm8k |
- |-----------------|------|-----------|------------|--------|--------|----------|---------------|-----------|-----------------|----------------|----------|------------------|-------------|--------|--------|--------|
- | Mistral-7B-v0.1 | 7B | 81.19% | 75.53% | 82.92% | 83.52% | 80.01% | 55.38% | 23.96% | 28.92% | 69.88% | 63.22% | 29.88% | 47.86% | 59.99% | 11.94% | 39.35% |
-
- **Theme-based grouping**
- - Commonsense Reasoning: 0-shot average of Hellaswag, Winogrande, PIQA, SIQA, OpenbookQA, ARC-Easy, ARC-Challenge, and CommonsenseQA.
-
- - World Knowledge: 5-shot average of NaturalQuestions and TriviaQA.
-
- - Reading Comprehension: 0-shot average of BoolQ and QuAC.
-
- - Math: Average of 8-shot GSM8K with maj@8 and 4-shot MATH with maj@4
-
- - Code: Average of 0-shot Humaneval and 3-shot MBPP
-
- - Popular aggregated results: 5-shot MMLU, 3-shot BBH, and 3-5-shot AGI Eval (English multiple-choice questions only)
-
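To make the theme-based grouping above concrete, the sketch below averages the 0-shot commonsense scores that do appear in the benchmark table and illustrates what maj@k voting means for the math tasks. It is only an illustration: the card's Commonsense Reasoning grouping also includes SIQA, OpenbookQA, and CommonsenseQA (not listed in the table), and `majority_vote` is a hypothetical helper, not part of the evaluation library mentioned above.

```python
from collections import Counter

# 0-shot commonsense accuracies copied from the benchmark table above (percent).
# The card's "Commonsense Reasoning" grouping also averages SIQA, OpenbookQA and
# CommonsenseQA, which the table does not list, so this mean is illustrative only.
commonsense_scores = {
    "hellaswag": 81.19,
    "winogrande": 75.53,
    "piqa": 82.92,
    "arc_easy": 80.01,
    "arc_challenge": 55.38,
}
theme_average = sum(commonsense_scores.values()) / len(commonsense_scores)
print(f"Illustrative commonsense average: {theme_average:.2f}%")

# maj@k (GSM8K maj@8, MATH maj@4) scores the most frequent of k sampled answers.
def majority_vote(sampled_answers):
    """Return the most common final answer among k sampled completions."""
    return Counter(sampled_answers).most_common(1)[0][0]

# Hypothetical example: 8 sampled final answers for one GSM8K problem.
print(majority_vote(["42", "42", "41", "42", "40", "42", "42", "7"]))
```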
- # **Ethical Considerations and Limitations**
- TODO: what do we say here?
 
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ ---

+ # Model Card for Mistral-7B-v0.1

+ The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
+ Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

+ For full details of this model, please read our [Release blog post](https://mistral.ai/news/announcing-mistral-7b-v0.1/).

+ ## Model Architecture
+ Mistral-7B-v0.1 is a transformer model with the following architecture choices:
  - Grouped-Query Attention
  - Sliding-Window Attention
  - Byte-fallback BPE tokenizer
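For readers who want to see how these architecture choices surface in practice, here is a minimal loading sketch. It assumes the Hugging Face `transformers` library (a version with Mistral support) and the `mistralai/Mistral-7B-v0.1` repository id; the config attribute names are `transformers` conventions rather than anything stated in this card.

```python
# Minimal sketch, assuming `transformers` (with Mistral support) and the
# `mistralai/Mistral-7B-v0.1` checkpoint; the weights are several GB, so this
# is illustrative rather than something to run casually.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"

config = AutoConfig.from_pretrained(model_id)
# Grouped-query attention: fewer key/value heads than query heads.
print("attention heads:", config.num_attention_heads)
print("key/value heads:", config.num_key_value_heads)
# Sliding-window attention: each token attends within a fixed local window.
print("sliding window:", config.sliding_window)

# Byte-fallback BPE tokenizer (SentencePiece-based) and the model itself.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tokenizer("My favourite condiment is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```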
 
+ ## Model Developers
+ The Mistral AI Team:
+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.