language:
  - en
license: apache-2.0
tags:
  - Mixtral
  - instruct
  - finetune
  - chatml
  - DPO
  - RLHF
  - gpt4
  - synthetic data
  - distillation
  - mlx
base_model: mistralai/Mixtral-8x7B-v0.1
model-index:
  - name: Nous-Hermes-2-Mixtral-8x7B-DPO
    results: []

mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit

This model was converted to MLX format from NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO. Refer to the original model card for more details on the model.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit")
response = generate(model, tokenizer, prompt="hello", verbose=True)

Use with the mlx_lm CLI

pip install -U mlx-lm
python3 -m mlx_lm.generate --model mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit --prompt "<|im_start|>system\nYou are an accurate, educational, and helpful information assistant<|im_end|>\n<|im_start|>user\nWhat is the difference between AWQ and GPTQ quantization?<|im_end|>\n<|im_start|>assistant\n" --max-tokens 2048
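
The prompt passed on the command line above follows the ChatML layout: each turn is wrapped in `<|im_start|>{role}` and `<|im_end|>`, and the string ends with an open assistant turn. A minimal sketch of assembling the same string in Python is shown below. The helper name `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical, not part of mlx-lm, and the layout is inferred from the CLI example rather than taken from the model's official chat template.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper: the layout mirrors the CLI example in this README.
    """
    parts = []
    for message in messages:
        # Each turn: <|im_start|>role, newline, content, <|im_end|>, newline
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn so the model writes the reply
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are an accurate, educational, and helpful information assistant",
    },
    {
        "role": "user",
        "content": "What is the difference between AWQ and GPTQ quantization?",
    },
]
prompt = build_chatml_prompt(messages)

# The lines below need `pip install mlx-lm` and download the model weights,
# so they are left commented out; they reuse the calls shown earlier.
# from mlx_lm import load, generate
# model, tokenizer = load("mlx-community/Nous-Hermes-2-Mixtral-8x7B-DPO-4bit")
# response = generate(model, tokenizer, prompt=prompt, verbose=True)
```

Building the string from a message list keeps the special tokens in one place, which makes it harder to forget the closing `<|im_end|>` or the trailing assistant turn when adding more turns.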