Spaces · Runtime error
shoubin committed
Commit 1202c22 · 1 parent: 7e8784c

udpate_readme
.DS_Store CHANGED

Binary files a/.DS_Store and b/.DS_Store differ
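(.DS_Store is macOS Finder metadata, so this binary change carries no project content. Nothing in this commit does so, but a common follow-up is to stop tracking the file entirely; a minimal sketch, not part of the diff above:)

```bash
# Sketch only: untrack macOS Finder metadata (not performed by this commit).
echo ".DS_Store" >> .gitignore
git rm --cached .DS_Store    # remove from the index but keep the local file
git commit -m "ignore .DS_Store"
```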
README.md CHANGED

@@ -1,112 +1,8 @@
-
-
-
-
-
-
-
-
-<img src="./assets/chain.png" alt="teaser image" width="800"/>
-
-
-# Code structure
-```bash
-
-# Data & Data Preprocessing
-./sevila_data
-
-# Pretrained Checkpoints
-./sevila_checkpoints
-
-# SeViLA code
-./lavis/
-
-# running scripts for SeViLa localizer/answerer training/inference
-./run_scripts
-
-```
-
-# Setup
-
-## Install Dependencies
-
-1. (Optional) Creating conda environment
-
-```bash
-conda create -n sevila python=3.8
-conda activate sevila
-```
-
-2. build from source
-
-```bash
-pip install -e .
-```
-
-## Download Pretrained Models
-We pre-train SeViLA localizer on QVHighlights and hold checkpoints via [Huggingface](https://huggingface.co/Shoubin/SeViLA/resolve/main/sevila_pretrained.pth).
-Download checkpoints and put it under /sevila_checkpoints.
-The checkpoints (814.55M) contains pre-trained localizer and zero-shot answerer.
-
-
-
-# Dataset Preparation
-We test our model on:
-+ [NExT-QA](https://doc-doc.github.io/docs/nextqa.html)
-
-+ [STAR](https://star.csail.mit.edu/)
-
-+ [How2QA](https://value-benchmark.github.io/index.html)
-
-+ [TVQA](https://tvqa.cs.unc.edu/)
-
-+ [VLEP](https://value-benchmark.github.io/index.html)
-
-+ [QVHighlights](https://github.com/jayleicn/moment_detr)
-
-please download original data and preprocess them via our [scripts](sevila_data/) under ./sevila_data/ .
-
-
-# Training and Inference
-We provideo SeViLA training and inference script examples as following:
-## 1) Localizer Pre-training
-```bash
-sh run_scripts/sevila/pre-train/pretrain_qvh.sh
-```
-
-## 2) Localizer Self-refinement
-
-```bash
-sh run_scripts/sevila/refinement/nextqa_sr.sh
-```
-
-## 3) Answerer Fine-tuning
-
-```bash
-sh run_scripts/sevila/finetune/nextqa_ft.sh
-```
-
-## 4) Inference
-
-```bash
-sh run_scripts/sevila/inference/nextqa_infer.sh
-```
-
-
-# Acknowledgments
-We thank the developers of [LAVIS](https://github.com/salesforce/LAVIS), [BLIP-2](https://github.com/salesforce/LAVIS/tree/main/projects/blip2), [CLIP](https://github.com/openai/CLIP), [All-in-one](https://github.com/showlab/all-in-one), for their public code release.
-
-
-# Reference
-Please cite our paper if you use our models in your works:
-
-
-```bibtex
-@misc{yu2023selfchained,
-  title={Self-Chained Image-Language Model for Video Localization and Question Answering},
-  author={Shoubin Yu and Jaemin Cho and Prateek Yadav and Mohit Bansal},
-  year={2023},
-  eprint={2305.06988},
-  archivePrefix={arXiv},
-  primaryClass={cs.CV}
-}
+title: SeViLA Demo
+emoji: ⛓️
+colorFrom: blue
+colorTo: purple
+sdk: gradio
+sdk_version: 3.19.1
+app_file: app.py
+pinned: false
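(The eight added lines are the YAML frontmatter that Hugging Face Spaces reads from the top of README.md: this Space is a Gradio app pinned to sdk_version 3.19.1, with app.py as its entry point. Spaces launches it by installing the pinned Gradio and running app_file. As a rough sketch of reproducing the demo locally — the repository URL below is a guess from the committer's username, and a requirements.txt is assumed to exist, neither being stated in the diff:)

```bash
# Hypothetical local run of this Space; URL and requirements.txt are assumptions.
git clone https://huggingface.co/spaces/Shoubin/SeViLA
cd SeViLA
pip install gradio==3.19.1         # matches sdk_version in the new frontmatter
pip install -r requirements.txt    # assumed; Spaces installs this automatically if present
python app.py                      # app_file declared in the new frontmatter
```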