Eval results?

#2
by rombodawg - opened

I know this isn't a finished model, but I'm still curious about the benchmarks, mainly MMLU and HumanEval.

LumiOpen org

We will be publishing more detailed results soon, but MMLU on the final checkpoint is 46.29 and HumanEval pass@10 is 37.20. We also hope to release an instruction-tuned version, but we are still evaluating open dataset options.

jonabur changed discussion status to closed
