
Daniel Serrano

dnlserrano

dnlserrano's activity

reacted to merve's post with 🔥 about 2 months ago
Another great week in open ML!
Here's a small recap 🫰🏻

Model releases
⏯️ Video Language Models
AI at Meta released Vision-CAIR/LongVU_Qwen2_7B, a new state-of-the-art long-video language model based on DINOv2, SigLIP, Qwen2 and Llama 3.2

💬 Small language models
Hugging Face released HuggingFaceTB/SmolLM2-1.7B, a family of new smol language models with an Apache 2.0 license that come in sizes 135M, 360M and 1.7B, along with datasets.
Meta released facebook/MobileLLM-1B, a new family of on-device LLMs of sizes 125M, 350M and 600M

🖼️ Image Generation
Stability AI released stabilityai/stable-diffusion-3.5-medium, a 2B model with a commercially permissive license

🖼️💬Any-to-Any
gpt-omni/mini-omni2, the closest reproduction of GPT-4o, has been released! It is a new LLM that can take image, text and audio input and output speech.

Dataset releases
🖼️ Spawning/PD12M, a new captioning dataset of 12.4 million examples generated using Florence-2
reacted to merve's post with 🔥 3 months ago
Meta AI vision has been cooking @facebook
They shipped multiple models and demos for their papers at @ECCV 🤗

Here's a compilation of my top picks:
- Sapiens is a family of foundation models for human-centric depth estimation, segmentation and more; all models have open weights and demos 👏

All models have their demos and even TorchScript checkpoints!
A collection of models and demos: facebook/sapiens-66d22047daa6402d565cb2fc
- VFusion3D is a state-of-the-art model for consistent 3D generation from images

Model: facebook/vfusion3d
Demo: facebook/VFusion3D

- CoTracker is the state-of-the-art point (pixel) tracking model

Demo: facebook/cotracker
Model: facebook/cotracker