---
license: apache-2.0
license_link: https://github.com/mistralai/mistral-common/blob/main/LICENCE
library: llama.cpp
library_link: https://github.com/ggerganov/llama.cpp
base_model:
- mistralai/Mixtral-8x7B-v0.1
language:
- fr
- it
- de
- es
- en
pipeline_tag: text-generation
tags:
- nlp
- code
- gguf
- sparse
- mixture-of-experts
- code-generation
---

## Mixtral 8x7B Instruct v0.1

### Quantized Model Files

The Mixtral 8x7B Sparse Mixture of Experts (SMoE) model is provided here in two GGUF quantizations:

- **ggml-model-q4_0.gguf**: 4-bit quantization for the smallest memory footprint and compute overhead, at some cost in output quality.
- **ggml-model-q8_0.gguf**: 8-bit quantization, offering near-original quality at roughly twice the size of the 4-bit file.

These quantizations allow deployment on a range of hardware, from lightweight consumer devices to large-scale inference servers; a loading example follows below.
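As a concrete starting point, the sketch below loads one of these GGUF files through the `llama-cpp-python` bindings for `llama.cpp`. The model path, prompt, and sampling parameters are illustrative assumptions, not fixed by this repository.

```python
# Minimal sketch (assumes: pip install llama-cpp-python and a downloaded GGUF file).
from llama_cpp import Llama

llm = Llama(
    model_path="ggml-model-q4_0.gguf",  # or ggml-model-q8_0.gguf for higher fidelity
    n_ctx=32768,                        # Mixtral supports a 32k-token context window
    n_gpu_layers=-1,                    # offload all layers to the GPU if one is available
)

# Mixtral Instruct expects the [INST] ... [/INST] prompt format.
out = llm(
    "[INST] Write a Python function that reverses a string. [/INST]",
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```

The 4-bit file is usually the right default for local experimentation; switch to the 8-bit file when output quality matters more than memory.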

### Model Information

Mixtral 8x7B is a generative Sparse Mixture of Experts (SMoE) model designed to deliver high-quality outputs with significant computational efficiency. A learned router selects a small subset of the experts for each token, so only a fraction of the parameters run per forward pass, reducing computational cost while preserving the quality of a much larger dense model.
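To make the routing idea concrete, here is a minimal, framework-agnostic sketch of top-2 expert gating in NumPy. The layer sizes are toy assumptions, not Mixtral's actual dimensions, and the real model applies this routing inside every transformer block.

```python
import numpy as np

# Toy dimensions (assumptions for illustration only).
HIDDEN, N_EXPERTS, TOP_K = 64, 8, 2

rng = np.random.default_rng(0)
router_w = rng.standard_normal((HIDDEN, N_EXPERTS))                 # router projection
experts = [rng.standard_normal((HIDDEN, HIDDEN)) for _ in range(N_EXPERTS)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token's hidden state through its top-2 experts."""
    logits = x @ router_w                     # score every expert
    top = np.argsort(logits)[-TOP_K:]         # keep the two best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts
    # Only TOP_K of the N_EXPERTS expert matmuls execute; the rest are skipped,
    # which is where the compute savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

print(moe_layer(rng.standard_normal(HIDDEN)).shape)  # -> (64,)
```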

**Key Features:**

- **Architecture:** Decoder-only SMoE with 46.7B total parameters, of which only 12.9B are active per token.
- **Context Window:** Supports up to 32k tokens, making it suitable for long-context applications.
- **Multilingual Capabilities:** Trained on French, Italian, German, Spanish, and English, giving robust coverage across these five languages.
- **Performance:** Matches or exceeds Llama 2 70B and GPT-3.5 on several industry-standard benchmarks.
- **Fine-Tuning Potential:** Optimized for instruction following; fine-tuning yields strong improvements in dialogue quality and safety alignment.

- **Developer**: Mistral AI
- **Training Data**: Open web data, curated for quality and diverse representation.
- **Application Areas**: Code generation, multilingual dialogue, and long-context processing.

### Core Library

The GGUF files in this repository target [`llama.cpp`](https://github.com/ggerganov/llama.cpp) and its bindings. From the source weights, Mixtral 8x7B Instruct can also be deployed with Hugging Face `transformers` or `vLLM`; current integration work focuses on `transformers`.

- **Primary framework**: `transformers`
- **Alternate framework**: `vLLM` (for specialized inference optimizations)
- **Model availability**: Source weights and pre-converted formats are available under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
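For the source weights, a typical `transformers` loading pattern is sketched below. The checkpoint id refers to the upstream Mistral AI instruct release, and the dtype/device settings are assumptions about available hardware.

```python
# Sketch: load the upstream instruct checkpoint with Hugging Face transformers.
# Assumes sufficient GPU memory (or accelerate-style offloading) is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision roughly halves memory use
    device_map="auto",          # spread layers across available devices
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```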

### Safety and Responsible Use

Mixtral 8x7B has been trained with an emphasis on ethical use and safety. Two mechanisms are available:

1. **Guardrails for Sensitive Content**: Optional system prompts that steer outputs away from harmful content (see the sketch after this list).
2. **Self-Reflection Prompting**: A prompting mechanism for internal assessment of generated outputs, letting the model classify its own responses as suitable or unsuitable before they are surfaced.
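One way to apply such a guardrail is to prepend a safety system prompt to every chat request, as in the sketch below. The prompt text is an assumption modeled on Mistral's published guidance; substitute your own policy wording.

```python
# Sketch: guardrailing via a system prompt (llama-cpp-python chat API).
# SAFETY_PROMPT is an illustrative assumption, not an official string.
from llama_cpp import Llama

llm = Llama(model_path="ggml-model-q4_0.gguf", n_ctx=4096)

SAFETY_PROMPT = (
    "Always assist with care, respect, and truth. Respond with utmost utility "
    "yet securely. Avoid harmful, unethical, prejudiced, or negative content."
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SAFETY_PROMPT},
        {"role": "user", "content": "How do I choose a strong passphrase?"},
    ],
    max_tokens=200,
)
print(result["choices"][0]["message"]["content"])
```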

Developers should always consider additional tuning or filtering depending on their application and context.