Evaluate LLMs using Kazakh multiple-choice (MC) tasks
VLMEvalKit Evaluation Results Collection
Browse and submit language model benchmarks
Run automatic evaluations and display their logs