license: other
license_name: mrl
license_link: https://mistral.ai/licenses/MRL-0.1.md
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
Mistral-Large-218B-Instruct
Mistral-Large-218B-Instruct is an advanced dense Large Language Model (LLM) with 218 billion parameters, featuring state-of-the-art reasoning, knowledge, and coding capabilities.
Self-merged from the original Mistral Large 2, see mergekit config below.
Key features
- Massive scale: With 218 billion parameters, this model pushes the boundaries of language model capabilities.
- Multi-lingual by design: Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.
- Proficient in coding: Trained on 80+ coding languages such as Python, Java, C, C++, JavaScript, and Bash, as well as more specific languages like Swift and Fortran.
- Agentic-centric: Best-in-class agentic capabilities with native function calling and JSON outputting.
- Advanced Reasoning: State-of-the-art mathematical and reasoning capabilities.
- Mistral Research License: Allows usage and modification for research and non-commercial purposes.
- Large Context: Features a large 128k context window for handling extensive input.
Metrics
Note: The following metrics are based on the original model and may differ for this 218B parameter version. Updated benchmarks will be provided when available.
Base Pretrained Benchmarks
Benchmark | Score |
---|---|
MMLU | 84.0% |
Base Pretrained Multilingual Benchmarks (MMLU)
Benchmark | Score |
---|---|
French | 82.8% |
German | 81.6% |
Spanish | 82.7% |
Italian | 82.7% |
Dutch | 80.7% |
Portuguese | 81.6% |
Russian | 79.0% |
Korean | 60.1% |
Japanese | 78.8% |
Chinese | 74.8% |
Instruction Benchmarks
Benchmark | Score |
---|---|
MT Bench | 8.63 |
Wild Bench | 56.3 |
Arena Hard | 73.2 |
Code & Reasoning Benchmarks
Benchmark | Score |
---|---|
Human Eval | 92% |
Human Eval Plus | 87% |
MBPP Base | 80% |
MBPP Plus | 69% |
Math Benchmarks
Benchmark | Score |
---|---|
GSM8K | 93% |
Math Instruct (0-shot, no CoT) | 70% |
Math Instruct (0-shot, CoT) | 71.5% |
Usage
This model can be used with standard LLM frameworks and libraries. Specific usage instructions will be provided upon release.
Hardware Requirements
Given the size of this model (218B parameters), it requires substantial computational resources for inference:
- Recommended: 8xH100 (640GB)
- Alternatively: Distributed inference setup across multiple machines.
Limitations
- This model does not have built-in moderation mechanisms. Users should implement appropriate safeguards for deployment in production environments.
- Due to its size, inference may be computationally expensive and require significant hardware resources.
- As with all large language models, it may exhibit biases present in its training data.
- The model's outputs should be critically evaluated, especially for sensitive applications.
Notes
This was just a fun testing model, merged with the merge.py
script in the base of the repo. Find GGUFs at leafspark/Mistral-Large-218B-Instruct-GGUF
Compatible mergekit
config:
slices:
- sources:
- layer_range: [0, 20]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [10, 30]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [20, 40]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [30, 50]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [40, 60]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [50, 70]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [60, 80]
model: mistralai/Mistral-Large-Instruct-2407
- sources:
- layer_range: [70, 87]
model: mistralai/Mistral-Large-Instruct-2407
merge_method: passthrough
dtype: bfloat16