nyuuzyou PRO

AI & ML interests

None yet

Recent Activity

liked a model 34 minutes ago
meta-llama/Llama-4-Scout-17B-16E-Instruct
reacted to merterbak's post with πŸ”₯ about 1 hour ago
Meta has unveiled its Llama 4 🦙 family of models, featuring native multimodality and a mixture-of-experts architecture. Two models are available now:

Models 🤗: https://huggingface.co/collections/meta-llama/llama-4-67f0c30d9fe03840bc9d0164
Blog post: https://ai.meta.com/blog/llama-4-multimodal-intelligence/
HF's blog post: https://huggingface.co/blog/llama4-release

- 🧠 Native multimodality: processes text and images in a unified architecture
- 🔍 Mixture-of-experts: the first Llama models to use MoE, for much greater efficiency
- 📏 Super long context: up to 10M tokens
- 🌐 Multilingual power: trained on 200 languages, with 10x more multilingual tokens than Llama 3 (including over 100 languages with over 1 billion tokens each)

🔹 Llama 4 Scout
- 17B active parameters (109B total), 16 experts
- 10M context window
- Fits on a single H100 GPU
- Beats Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1

🔹 Llama 4 Maverick
- 17B active parameters (400B total), 128 experts
- Fits on a single DGX H100 node (8x H100)
- 1M context window
- Outperforms GPT-4o and Gemini 2.0 Flash
- ELO score of 1417 on LMArena, currently the second-best model on the arena

🔹 Llama 4 Behemoth (coming soon)
- 288B active parameters (2T total), 16 experts
- Teacher model for Scout and Maverick
- Outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM benchmarks
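The efficiency claim in the post comes down to simple arithmetic: in a mixture-of-experts model, only a fraction of the total parameters is active for any given token. A minimal sketch using the parameter counts quoted above (the function name is my own, not from any library):

```python
def moe_active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Fraction of a mixture-of-experts model's parameters active per token."""
    return active_params_b / total_params_b

# Figures from the post above (parameter counts in billions):
scout = moe_active_fraction(17, 109)     # Llama 4 Scout, 16 experts
maverick = moe_active_fraction(17, 400)  # Llama 4 Maverick, 128 experts
print(f"Scout activates {scout:.0%} of its weights per token")
print(f"Maverick activates {maverick:.1%} of its weights per token")
```

Per-token compute scales with the active parameters, which is why Scout can run on a single H100 despite its 109B total size.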

Organizations

Social Post Explorers, AI Starter Pack

nyuuzyou's activity

upvoted an article 11 days ago

XetHub is joining Hugging Face!

β€’ 89
upvoted 2 articles about 2 months ago

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

β€’ 55

Open-R1: a fully open reproduction of DeepSeek-R1

β€’ 835