
I don't know why this model has so many downloads. Please share your use cases, thanks.

This model has since been improved with DPO; the result is cloudyu/Pluto_24B_DPO_200

  • Metrics improved by DPO [figure: metrics improvement]

Mixtral MOE 4x7B

A mixture-of-experts (MoE) merge of the following models, built with mergekit:

Metrics

GPU code example

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

## v2 models
model_path = "cloudyu/Mixtral_7Bx4_MOE_24B"

tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
# load_in_4bit quantizes the weights at load time; it needs the bitsandbytes package
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float32, device_map='auto', local_files_only=False, load_in_4bit=True
)
print(model)
prompt = input("please input prompt:")
# An empty prompt ends the loop
while len(prompt) > 0:
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

  generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
  )
  # The decoded text contains the prompt followed by the completion
  print(tokenizer.decode(generation_output[0]))
  prompt = input("please input prompt:")
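
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated tokens, so the loop above prints the prompt again with every reply. A minimal sketch of slicing the prompt off before decoding, using plain lists in place of tensors. `strip_prompt` is a hypothetical helper name, not part of transformers, and the token ids are made up for illustration.

```python
def strip_prompt(output_ids, prompt_len):
    """Return only the tokens generated after the prompt.

    output_ids: 1-D sequence of token ids (prompt followed by new tokens).
    prompt_len: number of prompt tokens, e.g. input_ids.shape[1].
    """
    return output_ids[prompt_len:]


# Stand-in ids for illustration; real ids come from the tokenizer and the model.
prompt_ids = [1, 4521, 878]
output_ids = prompt_ids + [306, 626, 2]
new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)
```

In the GPU example this would look like `tokenizer.decode(strip_prompt(generation_output[0], input_ids.shape[1]), skip_special_tokens=True)`, since tensor slicing follows the same rule as list slicing here.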

CPU example

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

## v2 models
model_path = "cloudyu/Mixtral_7Bx4_MOE_24B"

tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
# float32 weights for 24.2B parameters need roughly 97 GB of RAM
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float32, device_map='cpu', local_files_only=False
)
print(model)
prompt = input("please input prompt:")
# An empty prompt ends the loop
while len(prompt) > 0:
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids

  generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
  )
  # The decoded text contains the prompt followed by the completion
  print(tokenizer.decode(generation_output[0]))
  prompt = input("please input prompt:")
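
The two examples differ mainly in how much memory the weights take. A rough sketch of that arithmetic, assuming the 24.2B parameter count reported for this model and counting weights only (no activations, KV cache, or quantization metadata). The bytes-per-parameter table and the `weight_memory_gb` helper are illustrative, not part of any library.

```python
# Approximate bytes per parameter for each storage format (assumed values).
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "float16": 2.0, "4bit": 0.5}


def weight_memory_gb(num_params, dtype):
    """Estimate weight storage in GB (10**9 bytes) for a given format."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 24.2e9  # model size reported for this model

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

By this estimate the float32 CPU example needs on the order of 97 GB of RAM, while 4-bit loading brings the weights down to roughly 12 GB. Real usage will be somewhat higher in both cases.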

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 68.83 |
| AI2 Reasoning Challenge (25-Shot) | 65.27 |
| HellaSwag (10-Shot) | 85.28 |
| MMLU (5-Shot) | 62.84 |
| TruthfulQA (0-shot) | 59.85 |
| Winogrande (5-shot) | 77.66 |
| GSM8k (5-shot) | 62.09 |
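
The Avg. value matches the unweighted mean of the six benchmark scores. A quick check, with the scores copied from the results above:

```python
# Benchmark scores as listed in the evaluation results.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 65.27,
    "HellaSwag (10-Shot)": 85.28,
    "MMLU (5-Shot)": 62.84,
    "TruthfulQA (0-shot)": 59.85,
    "Winogrande (5-shot)": 77.66,
    "GSM8k (5-shot)": 62.09,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 68.83
```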

Model size: 24.2B params (Safetensors)
Tensor type: BF16, FP16