Commit 59e5ee4 by luxmorocco (1 parent: 108b1ba)

Delete README.md

README.md DELETED
@@ -1,68 +0,0 @@
# YOLO-World + EfficientViT SAM

🤗 [HuggingFace Space](https://huggingface.co/spaces/curt-park/yolo-world-with-efficientvit-sam)

![example_0](https://github.com/Curt-Park/yolo-world-with-efficientvit-sam/assets/14961526/326bde19-d535-4be5-829e-782fce0c1d00)

## Prerequisites
This project is developed and tested on Python 3.10.

```bash
# Create and activate a Python 3.10 environment.
conda create -n yolo-world-with-efficientvit-sam python=3.10 -y
conda activate yolo-world-with-efficientvit-sam
# Set up packages.
make setup
```
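
Since the project is only stated to be tested on Python 3.10, a small guard like the one below could be run inside the activated environment before `make setup`. This is a minimal sketch, not part of the project: the helper name `is_tested_python` and the decision to treat every other version as untested are assumptions made for illustration.

```python
import sys

# Assumption: the README only says "developed and tested on Python 3.10",
# so any other major.minor version is treated here as untested.
TESTED_VERSION = (3, 10)


def is_tested_python(version_info=sys.version_info) -> bool:
    """Return True if the major.minor version matches the tested one."""
    return tuple(version_info[:2]) == TESTED_VERSION


if __name__ == "__main__":
    if not is_tested_python():
        major, minor = sys.version_info[:2]
        print(f"Warning: running Python {major}.{minor}; the project is tested on 3.10.")
```

The check only warns rather than exits, because the README does not say that other versions fail.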

## How to Run
```bash
python app.py
```

Open http://127.0.0.1:7860/ in your web browser.
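
The app may take a while to load its models before the page responds. The sketch below polls the address above until it answers, using only the standard library. The function names `is_up` and `wait_for_server` and the default timeouts are hypothetical choices for this example and are not part of the project.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if `url` answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False


def wait_for_server(url, timeout=60.0, interval=1.0, probe=is_up) -> bool:
    """Call `probe(url)` every `interval` seconds until it succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_server("http://127.0.0.1:7860/")
    print("server is up" if ready else "server did not respond in time")
```

The `probe` parameter exists so the polling loop can be exercised without a running server.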

![example_1](https://github.com/Curt-Park/yolo-world-with-efficientvit-sam/assets/14961526/9388e4ee-6f71-4428-b17c-d218fd059949)

## Core Components

### YOLO-World
[YOLO-World](https://github.com/AILab-CVC/YOLO-World) is an open-vocabulary object detection model with high efficiency.
On the challenging LVIS dataset, YOLO-World achieves 35.4 AP with 52.0 FPS on V100,
which outperforms many state-of-the-art methods in terms of both accuracy and speed.
![image](https://github.com/Curt-Park/yolo-world-with-efficientvit-sam/assets/14961526/8a4a17bd-918d-478a-8451-f58e4a2dce79)
<img width="1024" src="https://github.com/Curt-Park/yolo-world-with-efficientvit-sam/assets/14961526/fce57405-e18d-45f3-bea8-fc3971faf975">
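
To put the reported throughput in perspective, the snippet below converts frames per second into an average per-image time. It is plain arithmetic on the 52.0 FPS figure quoted above and assumes images are processed one at a time; the helper name `fps_to_latency_ms` is made up for this example.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert a throughput in frames per second to average milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # 52.0 FPS is the YOLO-World figure quoted for a V100.
    print(f"{fps_to_latency_ms(52.0):.1f} ms per image")  # prints "19.2 ms per image"
```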

### EfficientViT SAM
[EfficientViT SAM](https://github.com/mit-han-lab/efficientvit) is a new family of accelerated segment anything models.
Thanks to its lightweight and hardware-efficient core building block,
it delivers a 48.9× measured TensorRT speedup on an A100 GPU over SAM-ViT-H without sacrificing performance.

<img width="1024" src="https://github.com/Curt-Park/yolo-world-with-efficientvit-sam/assets/14961526/9eec003f-47c9-43a5-86b0-82d6689e1bf9">
<img width="1024" src="https://github.com/Curt-Park/yolo-world-with-efficientvit-sam/assets/14961526/d79973bb-0d80-4b64-a175-252de56d0d09">
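
One plausible way to combine the two components is to let the detector propose boxes for the requested categories and then pass each box to the segmenter as a prompt. The README does not describe the actual wiring, so the sketch below is an assumption. `detect_then_segment`, `toy_detector`, `toy_segmenter`, and the `min_score` threshold are hypothetical stand-ins and do not call the real YOLO-World or EfficientViT SAM APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2/y2 exclusive
Mask = List[List[bool]]          # mask[y][x]
Image = List[List[int]]          # toy grayscale image, image[y][x]


@dataclass
class Detection:
    label: str
    score: float
    box: Box


def detect_then_segment(
    image: Image,
    categories: Sequence[str],
    detector: Callable[[Image, Sequence[str]], List[Detection]],
    segmenter: Callable[[Image, Box], Mask],
    min_score: float = 0.3,  # hypothetical confidence threshold
) -> List[Tuple[Detection, Mask]]:
    """Detect objects for `categories`, then segment inside each kept box."""
    kept = [d for d in detector(image, categories) if d.score >= min_score]
    return [(d, segmenter(image, d.box)) for d in kept]


# Hypothetical stand-ins so the sketch runs without any model weights.
def toy_detector(image: Image, categories: Sequence[str]) -> List[Detection]:
    h, w = len(image), len(image[0])
    return [
        Detection(categories[0], 0.9, (0, 0, w // 2, h // 2)),
        Detection(categories[-1], 0.1, (0, 0, w, h)),  # dropped by min_score
    ]


def toy_segmenter(image: Image, box: Box) -> Mask:
    x1, y1, x2, y2 = box
    return [
        [x1 <= x < x2 and y1 <= y < y2 for x in range(len(image[0]))]
        for y in range(len(image))
    ]


if __name__ == "__main__":
    image = [[0] * 4 for _ in range(4)]
    for det, mask in detect_then_segment(image, ["dog", "cat"], toy_detector, toy_segmenter):
        print(det.label, det.box, sum(map(sum, mask)), "mask pixels")
```

Passing the detector and segmenter in as callables keeps the composition independent of either library, which is why the real models are not imported here.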

## Powered By
```
@misc{zhang2024efficientvitsam,
  title={EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss},
  author={Zhuoyang Zhang and Han Cai and Song Han},
  year={2024},
  eprint={2402.05008},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@article{cheng2024yolow,
  title={YOLO-World: Real-Time Open-Vocabulary Object Detection},
  author={Cheng, Tianheng and Song, Lin and Ge, Yixiao and Liu, Wenyu and Wang, Xinggang and Shan, Ying},
  journal={arXiv preprint arXiv:2401.17270},
  year={2024}
}

@article{cai2022efficientvit,
  title={Efficientvit: Enhanced linear attention for high-resolution low-computation visual recognition},
  author={Cai, Han and Gan, Chuang and Han, Song},
  journal={arXiv preprint arXiv:2205.14756},
  year={2022}
}
```