singhsidhukuldeep posted an update Aug 6, 2024
πŸ—“οΈ Remember when last April, @Meta released Segment Anything Model (SAM) paper and it was too good to be true. 🀯

They have now released Segment Anything Model 2 (SAM 2), and it's mind-blowingly great! 🚀

SAM 2 is the first unified model for segmenting objects across images and videos. You can use a click, box, or mask as the prompt to select an object in any image or any frame of a video. 🖼️📹
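To make the "click selects an object" idea concrete, here is a toy sketch (not SAM 2's actual API; all names are illustrative): given a label map, a click prompt returns the connected region under the clicked pixel via a flood fill.

```python
# Toy illustration of a "click" prompt selecting one object:
# flood-fill the connected region of a label map under the click.
import numpy as np
from collections import deque

def mask_from_click(labels: np.ndarray, click: tuple) -> np.ndarray:
    """Return a boolean mask of the connected region containing the click."""
    target = labels[click]
    mask = np.zeros(labels.shape, dtype=bool)
    queue = deque([click])
    while queue:
        y, x = queue.popleft()
        if mask[y, x] or labels[y, x] != target:
            continue
        mask[y, x] = True
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < labels.shape[0] and 0 <= nx < labels.shape[1]:
                queue.append((ny, nx))
    return mask

# A 4x4 label map with three regions; clicking (0, 0) selects the "1" region.
labels = np.array([[1, 1, 2, 2],
                   [1, 1, 2, 2],
                   [3, 3, 2, 2],
                   [3, 3, 2, 2]])
print(mask_from_click(labels, (0, 0)).sum())  # → 4
```

The real model of course predicts the mask from pixels rather than a given label map, but the prompt-to-mask contract is the same: one click in, one object mask out.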

SAM consists of an image encoder that encodes the image and a prompt encoder that encodes the prompts; the outputs of both are fed to a mask decoder, which generates the masks. 🎭
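The data flow above can be sketched schematically, with random projections standing in for the real networks (all shapes and function names here are illustrative, not SAM's actual architecture):

```python
# Schematic sketch of the encoder/encoder -> decoder flow described above.
# Random projections stand in for the learned networks.
import numpy as np

rng = np.random.default_rng(0)

def image_encoder(image):
    # image (H, W, 3) -> dense per-pixel features (H, W, 64)
    return image @ rng.standard_normal((3, 64))

def prompt_encoder(points):
    # N point prompts (N, 2) -> prompt embeddings (N, 64)
    return points @ rng.standard_normal((2, 64))

def mask_decoder(img_feats, prompt_embs):
    # Score each pixel's feature against each prompt embedding,
    # then threshold: one binary mask per prompt.
    logits = np.einsum("hwc,nc->nhw", img_feats, prompt_embs)
    return logits > 0

image = rng.standard_normal((32, 32, 3))
points = np.array([[10.0, 12.0]])  # a single click prompt
masks = mask_decoder(image_encoder(image), prompt_encoder(points))
print(masks.shape)  # (1, 32, 32)
```

The design point this illustrates: the heavy image encoding runs once per image, while prompts are cheap to encode, so you can interactively refine clicks without re-encoding the image.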

The biggest jump from SAM to SAM 2 is the use of memory to keep masking consistent across video frames. They call this masklet prediction! 🧠
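The gist of carrying memory across frames can be shown with a toy stand-in (this is purely illustrative; SAM 2's actual mechanism is learned memory attention, not the simple logit averaging used here):

```python
# Toy stand-in for cross-frame mask memory: blend each frame's raw mask
# logits with an average of recent fused logits so predictions stay
# temporally consistent. NOT SAM 2's actual memory-attention mechanism.
import numpy as np
from collections import deque

class MaskletTracker:
    def __init__(self, memory_size: int = 4, memory_weight: float = 0.5):
        self.memory = deque(maxlen=memory_size)  # recent fused logits
        self.memory_weight = memory_weight

    def step(self, frame_logits: np.ndarray) -> np.ndarray:
        """Fuse the current frame's logits with the memory bank."""
        if self.memory:
            past = np.mean(self.memory, axis=0)
            fused = ((1 - self.memory_weight) * frame_logits
                     + self.memory_weight * past)
        else:
            fused = frame_logits  # first frame: nothing to remember yet
        self.memory.append(fused)
        return fused > 0  # binary mask for this frame

tracker = MaskletTracker()
rng = np.random.default_rng(1)
masks = [tracker.step(rng.standard_normal((8, 8))) for _ in range(5)]
print(len(masks), masks[0].shape)  # 5 (8, 8)
```

Even this crude smoothing shows why memory helps: a one-frame glitch gets outvoted by the recent history instead of breaking the track.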

They have also released the dataset, SA-V. It is truly huge, with 190.9K manual annotations and 451.7K automatic ones! 📊

📄 Paper: https://ai.meta.com/research/publications/sam-2-segment-anything-in-images-and-videos/

πŸ“ Blog: https://ai.meta.com/sam2/

🔗 Demo: https://sam2.metademolab.com/demo

💾 Model Weights: https://github.com/facebookresearch/segment-anything-2/blob/main/checkpoints/download_ckpts.sh

πŸ“ Dataset: https://ai.meta.com/datasets/segment-anything-video-downloads/

Tried it on a basic Minecraft video and the tracking was not so good. oof