kpyu/video-blip-flan-t5-xl-ego4d
Tags: Image-to-Text, Transformers, PyTorch, English, blip-2, text2text-generation, vision, video-to-text, image-captioning, video-captioning, visual-question-answering, Inference Endpoints
arXiv: 2301.12597, 2210.11416
License: mit
Branch: main
1 contributor, History: 4 commits
Latest commit: kpyu, "Update README.md" (b494c30), 12 months ago
File                               Size       LFS  Last commit                               Updated
.gitattributes                     1.48 kB    -    initial commit                            12 months ago
README.md                          1.39 kB    -    Update README.md                          12 months ago
config.json                        7.81 kB    -    Upload VideoBlipForConditionalGeneration  12 months ago
preprocessor_config.json           432 Bytes  -    Upload processor                          12 months ago
pytorch_model-00001-of-00002.bin   9.44 GB    LFS  Upload VideoBlipForConditionalGeneration  12 months ago
pytorch_model-00002-of-00002.bin   6.33 GB    LFS  Upload VideoBlipForConditionalGeneration  12 months ago
pytorch_model.bin.index.json       128 kB     -    Upload VideoBlipForConditionalGeneration  12 months ago
special_tokens_map.json            2.2 kB     -    Upload processor                          12 months ago
tokenizer.json                     2.42 MB    -    Upload processor                          12 months ago
tokenizer_config.json              2.39 kB    -    Upload processor                          12 months ago
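
To estimate how much disk space a full download needs, the sizes in the file listing above can be summed. The sketch below is only an illustration: the sizes are copied by hand from the listing, the helper names (parse_size, total_bytes) are hypothetical, and it assumes the page reports sizes in decimal units (1 GB = 10^9 bytes), which the listing itself does not state.

```python
# Sizes copied by hand from the file listing; nothing is fetched from the Hub.
# Assumption: decimal units (1 kB = 10**3 bytes, 1 GB = 10**9 bytes).
UNITS = {"Bytes": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}

LISTED_SIZES = {
    ".gitattributes": "1.48 kB",
    "README.md": "1.39 kB",
    "config.json": "7.81 kB",
    "preprocessor_config.json": "432 Bytes",
    "pytorch_model-00001-of-00002.bin": "9.44 GB",
    "pytorch_model-00002-of-00002.bin": "6.33 GB",
    "pytorch_model.bin.index.json": "128 kB",
    "special_tokens_map.json": "2.2 kB",
    "tokenizer.json": "2.42 MB",
    "tokenizer_config.json": "2.39 kB",
}


def parse_size(text: str) -> float:
    """Convert a string such as '9.44 GB' into a byte count."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def total_bytes(sizes: dict) -> float:
    """Sum the parsed sizes of every file in the mapping."""
    return sum(parse_size(size) for size in sizes.values())


# The two .bin files are the sharded weights; everything else is small metadata.
weight_shards = {n: s for n, s in LISTED_SIZES.items() if n.endswith(".bin")}
weights_gb = total_bytes(weight_shards) / UNITS["GB"]
repo_gb = total_bytes(LISTED_SIZES) / UNITS["GB"]

print(f"weight shards: {weights_gb:.2f} GB")
print(f"whole repository: {repo_gb:.2f} GB")
```

The two weight shards account for almost all of the total; the configuration and tokenizer files together add only a few megabytes.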