George Cameron

georgewritescode

https://artificialanalysis.ai/

Recent Activity

updated a Space 4 days ago

ArtificialAnalysis/Video-Generation-Arena-Leaderboard

liked a Space 30 days ago

ArtificialAnalysis/Video-Generation-Arena-Leaderboard

Articles

Launching the Artificial Analysis Text to Image Leaderboard & Arena

Jun 6

• 11

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

May 3

• 13

georgewritescode's activity

updated a Space 4 days ago

Video Generation Leaderboard

Leaderboard and arena of Video Generation models

liked a Space 30 days ago

Video Generation Leaderboard

Leaderboard and arena of Video Generation models

updated a Space 5 months ago

Visualization of GPT-4o breaking away from the quality & speed trade-off curve that LLMs have followed thus far ✂️

Key GPT-4o takeaways
‣ GPT-4o not only offers the highest quality but also sits among the fastest LLMs
‣ For speed/latency-sensitive use cases, where Claude 3 Haiku and Mixtral 8x7B were previously the leaders, GPT-4o is now a compelling option (though significantly more expensive)
‣ Previously, Groq was the only provider to break from the curve, using its own LPU chips. OpenAI has done it on Nvidia hardware (one can imagine the potential for GPT-4o on Groq)
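
One way to make "breaking away from the curve" concrete is to treat the trade-off curve as a Pareto frontier over quality and speed. The sketch below is only an illustration of that idea, not how the chart was produced: the model names and scores are hypothetical, and higher is assumed to be better on both axes.

```python
def pareto_frontier(models):
    """Return names of models that no other model beats on both quality and speed.

    `models` is a list of (name, quality, speed) tuples; higher is better on both.
    """
    frontier = []
    for name, quality, speed in models:
        dominated = any(
            other_q >= quality and other_s >= speed
            and (other_q > quality or other_s > speed)
            for _, other_q, other_s in models
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical (name, quality index, tokens/s) values, for illustration only.
models = [
    ("model-a", 60, 120),  # fast, lower quality
    ("model-b", 75, 60),
    ("model-c", 85, 25),   # high quality, slow
    ("model-d", 90, 100),  # high quality AND fast
]
frontier = pareto_frontier(models)
print(frontier)  # ['model-a', 'model-d']
```

In this toy data, model-d removes model-b and model-c from the frontier, which is the kind of shift the chart describes.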

👉 How did they do it? We will follow up with more analysis, but potential approaches include a very large but sparse MoE model (similar to Snowflake's Arctic) and improvements in data quality (which likely drove much of Llama 3's impressive quality relative to its parameter count)

Note: Throughput represents the median across providers over the last 14 days of measurements (taken 8 times per day)
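
A minimal sketch of that aggregation, assuming all measurements for a model are pooled across providers before the median is taken (the post does not say exactly how providers are combined). The record layout, provider names, and numbers are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def median_throughput(measurements, now, window_days=14):
    """Median tokens/s over all measurements taken within the last `window_days`."""
    cutoff = now - timedelta(days=window_days)
    recent = [m["tokens_per_s"] for m in measurements if m["timestamp"] >= cutoff]
    return median(recent) if recent else None


now = datetime(2024, 5, 14)
# Hypothetical measurements for one model served by two providers.
samples = [
    {"provider": "provider-a", "timestamp": now - timedelta(days=1), "tokens_per_s": 80.0},
    {"provider": "provider-b", "timestamp": now - timedelta(days=3), "tokens_per_s": 100.0},
    {"provider": "provider-a", "timestamp": now - timedelta(days=10), "tokens_per_s": 90.0},
    # Older than 14 days, so it is excluded from the median.
    {"provider": "provider-b", "timestamp": now - timedelta(days=20), "tokens_per_s": 500.0},
]
result = median_throughput(samples, now)
print(result)  # 90.0
```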

The data is available on our HF leaderboard, ArtificialAnalysis/LLM-Performance-Leaderboard, and the graphs are available on our website

New activity in ArtificialAnalysis/LLM-Performance-Leaderboard 7 months ago

small typo

#1 opened 7 months ago by

clem

upvoted an article 7 months ago

Article

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

May 3

• 13

posted an update 7 months ago

Post

Excited to bring our benchmarking leaderboard of >100 LLM API endpoints to HF!

Speed and price are often just as important as quality when building applications with LLMs. We bring together the data you need to weigh all three when picking a model and API provider.

Coverage:
‣ Quality (Index of evals, MMLU, Chatbot Arena, HumanEval, MT-Bench)
‣ Throughput (tokens/s: median, P5, P25, P75, P95)
‣ Latency (TTFT: median, P5, P25, P75, P95)
‣ Context window
‣ OpenAI library compatibility
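
For readers unfamiliar with the percentile columns, here is a rough sketch of how a median, P5, P25, P75 and P95 summary could be derived from raw throughput or TTFT samples. It uses Python's standard `statistics.quantiles`; the interpolation method and the sample values are assumptions, and the leaderboard's actual calculation may differ.

```python
from statistics import quantiles


def summarize(samples):
    """Summarize samples (e.g. tokens/s or TTFT in seconds) as selected percentiles."""
    # n=100 yields 99 cut points; cut point k (1-based) is the k-th percentile.
    cuts = quantiles(samples, n=100, method="inclusive")
    return {
        "P5": cuts[4],
        "P25": cuts[24],
        "median": cuts[49],
        "P75": cuts[74],
        "P95": cuts[94],
    }


# Hypothetical samples: the integers 0..100, so each percentile equals its rank.
summary = summarize(list(range(101)))
print(summary)
```

A wide gap between P5 and P95 suggests inconsistent performance over the measurement period, which is why the spread is reported alongside the median.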

Link to Space: ArtificialAnalysis/LLM-Performance-Leaderboard

Blog post: https://huggingface.co/blog/leaderboard-artificial-analysis