3. **Mention state**: "Use Map for storage" guides the model toward correct patterns
4. **Temperature 0.1** for reliable code, **0.7** for creative variations
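
The two temperature presets above can be captured as small parameter dicts. `sampling_params` is a hypothetical helper (not part of this repo), and the keys and `top_p` value are illustrative assumptions following common `transformers`-style generation arguments:

```python
# Hypothetical helper mapping the two presets above to common
# transformers-style generation arguments. top_p is illustrative.
def sampling_params(mode: str) -> dict:
    presets = {
        "code": {"temperature": 0.1, "top_p": 0.95, "do_sample": True},      # reliable code
        "creative": {"temperature": 0.7, "top_p": 0.95, "do_sample": True},  # creative variations
    }
    return presets[mode]

# e.g. model.generate(**inputs, **sampling_params("code"))
```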

## Hardware Requirements

This is a **MoE (Mixture of Experts) model**: 30B total parameters but only **3B active** per forward pass, making it much lighter than a dense 30B model.

| Setup | VRAM | Precision | Works? |
|-------|------|-----------|--------|
| 1x RTX 5090 / A100 40GB | 32-40GB | INT8 | ✅ Recommended |
| 2x RTX 5090 / 1x A100 80GB | 64-80GB | bf16 | ✅ Full precision |
| 1x RTX 5080 / 4090 / 4080 | 16-24GB | AWQ 4-bit | ✅ Quantized |
| Apple M4 Pro/Max | 36-128GB unified | MLX / llama.cpp | ✅ |
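
A rough sanity check on the VRAM column: with MoE, all 30B weights must still be resident even though only 3B are active per token, so weight memory scales with *total* parameters times bytes per parameter. This is a back-of-envelope sketch that ignores KV cache and activation overhead:

```python
# Back-of-envelope weight-memory estimate for a 30B-parameter MoE model.
# Expert sparsity saves compute per token, not weight memory: every
# expert must fit in (V)RAM, so we multiply by TOTAL parameters.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "awq-4bit": 0.5}

def weight_memory_gb(total_params: float, precision: str) -> float:
    return total_params * BYTES_PER_PARAM[precision] / 1e9

for prec in ("bf16", "int8", "awq-4bit"):
    print(f"{prec}: ~{weight_memory_gb(30e9, prec):.0f} GB")
```

INT8 comes out at ~30 GB, matching the 32-40GB row; the headroom in each row goes to KV cache and activations.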

**Supported frameworks:**

- `transformers` + `peft` (recommended, tested)
- `vLLM` for serving
- `llama.cpp` / `Ollama` (with GGUF conversion)
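
For the recommended `transformers` path, an INT8 load might look like the sketch below. The model id is a placeholder (substitute this repo's id), and `BitsAndBytesConfig(load_in_8bit=True)` assumes the `bitsandbytes` backend is installed at load time:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "YOUR_MODEL_ID"  # placeholder: substitute this repo's model id

def int8_config() -> BitsAndBytesConfig:
    # INT8 weight quantization via bitsandbytes (the "Recommended" row above)
    return BitsAndBytesConfig(load_in_8bit=True)

def load_model(model_id: str = MODEL_ID):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=int8_config(),
        device_map="auto",  # spread layers across available GPUs
    )
    return tokenizer, model
```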

> **Note:** This model is NOT compatible with Unsloth due to MoE architecture limitations.

## Known Limitations

- Standalone function prompts without context may reference undefined types