Yotam Perlitz

per

AI & ML interests

None yet

Recent Activity

authored a paper 3 months ago

Holmes: Benchmark the Linguistic Competence of Language Models

authored a paper 3 months ago

JuStRank: Benchmarking LLM Judges for System Ranking

published an article 3 months ago

Bamba: Inference-Efficient Hybrid Mamba2 Model

Organizations

per's activity

authored 2 papers 3 months ago

Holmes: Benchmark the Linguistic Competence of Language Models

Paper • 2404.18923 • Published Apr 29, 2024

JuStRank: Benchmarking LLM Judges for System Ranking

Paper • 2412.09569 • Published Dec 12, 2024 • 20

published an article 3 months ago

Bamba: Inference-Efficient Hybrid Mamba2 Model

Article • Published Dec 18, 2024 • 45

liked a Space 3 months ago

Safety BAT

updated a Space 3 months ago

JuStRank

Display ranked judges for rating systems

commented on a paper 3 months ago

JuStRank: Benchmarking LLM Judges for System Ranking

Paper • 2412.09569 • Published Dec 12, 2024 • 20

upvoted a paper 3 months ago

JuStRank: Benchmarking LLM Judges for System Ranking

Paper • 2412.09569 • Published Dec 12, 2024 • 20

liked a Space 3 months ago

JuStRank

Display ranked judges for rating systems

liked a Space 4 months ago

Open LLM Leaderboard

Track, rank and evaluate open LLMs and chatbots

updated a Space 4 months ago

BenchBench Leaderboad

liked a Space 5 months ago

BenchBench Leaderboad

Rate new benchmarks against existing ones

updated a Space 5 months ago

BenchBench Leaderboad

Rate new benchmarks against existing ones

upvoted a paper 7 months ago

Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation

Paper • 2407.13696 • Published Jul 18, 2024 • 5

authored 2 papers 8 months ago

Efficient Benchmarking (of Language Models)

Paper • 2308.11696 • Published Aug 22, 2023

Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation

Paper • 2407.13696 • Published Jul 18, 2024 • 5

New activity in SEACrowd/flores200 8 months ago

fix small bug in instructions

#1 opened 8 months ago by

updated a collection 8 months ago

✨ Highlights

4 items • Updated Aug 15, 2024 • 1

New activity in per/benchbench 8 months ago

Update README.md

#1 opened 8 months ago by

liked a Space 8 months ago

BenchBench Leaderboad

upvoted a paper about 1 year ago

Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI

Paper • 2401.14019 • Published Jan 25, 2024 • 23