Video-Text-to-Text
Transformers
English
video-understanding
vision-language-model
qwen2-vl
whisper
structured-extraction
operator-curated
evidence-bound-retrieval
Instructions to use ramene/mae-video-ingestion with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ramene/mae-video-ingestion with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("ramene/mae-video-ingestion", dtype="auto") - Notebooks
- Google Colab
- Kaggle
Welcome to the community
The community tab is the place to discuss and collaborate with the HF community!