
Benchmarks?

#2
by rombodawg - opened

Can we get this submitted to the Open LLM Leaderboard? A HumanEval score would be nice too.

Looks like someone submitted it to the leaderboard. I can run some additional benchmarks once the DPO version finishes, to compare both. However, there seems to be some sort of issue with the model's performance on GSM8K.

jondurbin changed discussion status to closed
