---
license: other
---

# Overview
This project aims to support visually impaired individuals in their daily navigation.

This project combines the [YOLO](https://ultralytics.com/yolov8) model and [LLaMa 2 7b](https://huggingface.co/meta-llama/Llama-2-7b) for navigation.

YOLO is trained on bounding-box data from the [AI Hub](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=189).
The output of YOLO (the bounding boxes) is converted into a list of the form `[[class_of_obj_1, xmin, xmax, ymin, ymax, size], [class_of...] ...]` and appended to the input question, as sketched below.
The LLM is trained to navigate on the [LearnItAnyway/Visual-Navigation-21k](https://huggingface.co/datasets/LearnItAnyway/Visual-Navigation-21k) multi-turn dataset.
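As a rough illustration, the sketch below flattens YOLOv8 detections into that list format with the `ultralytics` API. The checkpoint path, the image path, and the choice of box area for the `size` field are assumptions made for this example; the conversion used in this project may differ.

```python
from ultralytics import YOLO

# Placeholder checkpoint path; the project uses a YOLO model
# fine-tuned on the AI Hub bounding-box data.
yolo = YOLO("yolov8n.pt")

def detections_to_list(image_path):
    """Flatten YOLO detections into [[class, xmin, xmax, ymin, ymax, size], ...]."""
    result = yolo(image_path)[0]
    objects = []
    for box in result.boxes:
        xmin, ymin, xmax, ymax = box.xyxy[0].tolist()
        class_name = result.names[int(box.cls)]
        size = (xmax - xmin) * (ymax - ymin)  # assumption: size = box area in pixels
        objects.append([class_name, xmin, xmax, ymin, ymax, size])
    return objects

# Example call (the image path is a placeholder):
# print(detections_to_list("street_scene.jpg"))
```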
## Usage
We show how to use the model in [yolo_llama_visnav_test.ipynb](https://huggingface.co/LearnItAnyway/YOLO_LLaMa_7B_VisNav/blob/main/yolo_llama_visnav_test.ipynb).
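For quick reference, a minimal inference sketch in the spirit of the notebook might look like the following. The prompt template, the generation settings, and loading the checkpoint with `AutoModelForCausalLM` are assumptions; the notebook remains the authoritative example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "LearnItAnyway/YOLO_LLaMa_7B_VisNav"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16, device_map="auto")

# Stand-in for the YOLO step above: one detected object as
# [class, xmin, xmax, ymin, ymax, size].
detections = [["person", 120.0, 260.0, 80.0, 420.0, 47600.0]]

# Assumed prompt format: detections prepended to the user's question.
prompt = f"Objects: {detections}\nQuestion: What is ahead of me, and how should I move?\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```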