How was the evaluation executed?

#3
by iarbel - opened

Thanks for sharing this model and data. I've read through the article, but I couldn't find a clear reference to the benchmark. It states that "we center our analysis on the legal domain, with a specific focus on: international law, professional law, and jurisprudence. Those tasks respectively contain 120, 1500, and 110 examples."
How can I find these examples to benchmark models?

Equall.AI org

cais/mmlu? Here? We did not use this one, though.
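
For anyone trying to reproduce the counts quoted in the question, here is a minimal sketch of how the three legal subsets could be pulled from cais/mmlu with the `datasets` library. This is only an assumption about where comparable examples live; as noted above, the thread does not confirm this is the exact source used for the paper's evaluation.

```python
# Sketch: load the three MMLU legal subsets from the Hugging Face Hub.
# Assumes the `datasets` library is installed (pip install datasets).
from datasets import load_dataset

for subset in ["international_law", "professional_law", "jurisprudence"]:
    ds = load_dataset("cais/mmlu", subset, split="test")
    # Sizes should roughly match the 120 / 1500 / 110 figures quoted above.
    print(subset, len(ds))
    # Each row has: question, subject, choices (4 options),
    # and answer (index of the correct choice).
```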

PierreColombo changed discussion status to closed
