silwals committed on
Commit
8a47f31
1 Parent(s): f23b1d7

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -8,6 +8,7 @@ Last updated: 2023-04-07
 Version: 1.0
 
 Code: https://github.com/facebookresearch/eai-vc
+
 Other Links: VC-1 Website, VC-1 Blogpost, VC-1 Paper, VC-1 Demo
 The VC-1 model is a vision transformer (ViT) pre-trained on over 4,000 hours of egocentric videos from 7 different sources, together with ImageNet. The model is trained using Masked Auto-Encoding (MAE) and is available in two sizes: ViT-B and ViT-L. The model is intended for use for EmbodiedAI tasks, such as object manipulation and indoor navigation.
 The VC-1 model is a vision transformer (ViT) pre-trained on over 4,000 hours of egocentric videos from 7 different sources, together with ImageNet. The model is trained using Masked Auto-Encoding (MAE) and is available in two sizes: ViT-B and ViT-L. The model is intended for use for EmbodiedAI tasks, such as object manipulation and indoor navigation.
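The README text in this diff describes VC-1 as an MAE-pretrained ViT (ViT-B or ViT-L) meant to embed visual observations for embodied-AI policies. A minimal sketch of loading the ViT-B variant and embedding one observation follows; it assumes the `vc_models` package from the linked eai-vc repository exposes `model_utils.load_model` and a `VC1_BASE_NAME` constant, and that its transforms accept CHW tensors. These names are assumptions, not part of this commit, and may differ from the repository's actual API.

```python
# Sketch only: assumes vc_models from https://github.com/facebookresearch/eai-vc
# provides model_utils.load_model and VC1_BASE_NAME (hypothetical if the API differs).
import torch
from vc_models.models.vit import model_utils

# Assumed to return the encoder, its embedding size, eval-time transforms, and metadata.
model, embd_size, model_transforms, model_info = model_utils.load_model(
    model_utils.VC1_BASE_NAME  # ViT-B variant; a VC1_LARGE_NAME constant is assumed for ViT-L
)

# Embed a single RGB observation for a downstream manipulation/navigation policy.
obs = torch.rand(3, 250, 250)             # stand-in for a camera frame (CHW tensor assumed OK)
x = model_transforms(obs).unsqueeze(0)    # preprocess and add a batch dimension
with torch.no_grad():
    embedding = model(x)                  # expected shape: (1, embd_size)
```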