Smiley

Smiley0707

AI & ML interests

None yet

Recent Activity

liked a Space about 11 hours ago
webml-community/kokoro-webgpu
liked a Space about 11 hours ago
tencent/Hunyuan3D-2
liked a Space about 11 hours ago
nanotron/ultrascale-playbook

Organizations

ZeroGPU Explorers

Smiley0707's activity

reacted to merve's post with 🔥 11 days ago
Interesting releases in open AI this week, let's recap 🤠 merve/feb-7-releases-67a5f7d7f172d8bfe0dd66f4

🤖 Robotics
> Pi0, the first open-source foundation vision-language-action model, was released in LeRobot (Apache 2.0)

💬 LLMs
> Groundbreaking: s1 is a simpler approach to test-time scaling. The release comes with the small s1K dataset of 1k question-reasoning trace pairs (from Gemini-Thinking Exp); they fine-tune Qwen2.5-32B-Instruct to get s1-32B, which outperforms o1-preview on math 🤯 s1-32B and s1K are out!
> Adyen released DABstep, a new benchmark, along with its leaderboard demo, for agents doing data analysis
> Krutrim released Krutrim-2 instruct, a new 12B model based on NeMo12B trained and aligned on Indic languages, a new multilingual sentence embedding model (based on STSB-XLM-R), and a translation model for Indic languages

👀 Multimodal
> PKU released Align-DS-V, a model aligned using their new technique called LLF for all modalities (image-text-audio), along with the dataset Align Anything
> OLA-7B is a new any-to-any model by Tencent that can take text, image, video, and audio data with a context window of 32k tokens and output text and speech in English and Chinese
> Krutrim released Chitrarth, a new vision language model for Indic languages and English

🖼️ Vision
> BiRefNet_HR is a new higher resolution BiRefNet for background removal

🗣️ Audio
> kyutai released Hibiki, a real-time speech-to-speech translation model 🤯 It's available for French-English translation
> Krutrim released Dhwani, a new STT model for Indic languages
> They also released a new dataset for STT-TTS

🖼️ Image Generation
> Lumina released Lumina-Image-2.0, a 2B-parameter flow-based DiT for text-to-image generation
> Tencent released Hunyuan3D-2, a 3D asset generation model based on DiT and Hunyuan3D-Paint
> boreal-hl-v1 is a new boring photorealistic image generation LoRA based on Hunyuan