miguelcarv
committed on
Commit
•
6100b6f
1
Parent(s):
019db21
Update README.md
README.md
CHANGED
@@ -1,8 +1,3 @@
|
|
1 |
-
---
|
2 |
-
language:
|
3 |
-
- en
|
4 |
-
pipeline_tag: image-text-to-text
|
5 |
-
---
|
6 |
# Φ Pheye - a family of efficient small vision-language models
|
7 |
|
8 |
- These models train a fraction of the number of parameters other models of similar sizes train
|
@@ -18,20 +13,22 @@ pipeline_tag: image-text-to-text
|
|
18 |
| MoE-LLaVA-2.7B×4 | 336 | 5.3B | 5.9M | 77.1 | - | 50.2 | - |
|
19 |
| moondream1 | 384 | 1.86B | 3.9M | 74.7 | - | 35.6 |
|
20 |
| moondream2 | 384 | 1.86B | - | 77.7 | 92.5 | 49.7 | 120.2 |
|
21 |
-
| [Pheye-x4 🤗](https://huggingface.co/miguelcarv/Pheye-x4-448) | 448 | 295M | 2.9M | 75.2 | 110.
|
22 |
| [Pheye-x4 🤗](https://huggingface.co/miguelcarv/Pheye-x4-672) | 672 | 295M | 2.9M | 75.5 | 110.8 | 49.2 | 111.9 |
|
23 |
| [Pheye-x2 🤗](https://huggingface.co/miguelcarv/Pheye-x2-448) | 448 | 578M | 2.9M | 76.0 | 111.8 | 47.3 | 108.9 |
|
24 |
| [Pheye-x2 🤗](https://huggingface.co/miguelcarv/Pheye-x2-672) | 672 | 578M | 2.9M | 76.4 | 110.5 | 50.5 | 115.9 |
|
25 |
|
26 |
-
## Examples
|
27 |
|
28 |
-
| Image | Example
|
29 |
-
| ----------------------------------------------------------------------------------------- |
|
30 |
-
| <img src="https://c5.staticflickr.com/6/5463/17191308944_ae0b20bb7e_o.jpg" width="500"/> | **How much do these popcorn packets
|
31 |
-
| <img src="https://farm2.staticflickr.com/2708/5836100440_6e1117d36f_o.jpg" width="500"/> | **Can I pet that dog?**<br>No, you cannot pet the dog in the image.
|
32 |
-
| <img src="
|
33 |
| |
|
34 |
|
|
|
|
|
35 |
## Usage
|
36 |
|
37 |
To generate a sample response from a prompt use `generate.py`.
|
@@ -39,6 +36,7 @@ Use a Python version >= 3.11. Start by cloning the repo and create a virtual env
|
|
39 |
|
40 |
```bash
|
41 |
git clone https://github.com/miguelscarv/pheye.git
|
|
|
42 |
python3 -m venv venv
|
43 |
source venv/bin/activate
|
44 |
pip3 install -r requirements.txt
|
@@ -52,4 +50,4 @@ python3 generate.py --image_path images/dog_flower.jpg --prompt "What is the dog
|
|
52 |
|
53 |
## Acknowledgments
|
54 |
|
55 |
-
This implementation was inspired by [OpenFlamingo](https://github.com/mlfoundations/open_flamingo)'s repository.
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
# Φ Pheye - a family of efficient small vision-language models
|
2 |
|
3 |
- These models train a fraction of the number of parameters other models of similar sizes train
|
|
|
13 |
| MoE-LLaVA-2.7B×4 | 336 | 5.3B | 5.9M | 77.1 | - | 50.2 | - |
|
14 |
| moondream1 | 384 | 1.86B | 3.9M | 74.7 | - | 35.6 |
|
15 |
| moondream2 | 384 | 1.86B | - | 77.7 | 92.5 | 49.7 | 120.2 |
|
16 |
+
| [Pheye-x4 🤗](https://huggingface.co/miguelcarv/Pheye-x4-448) | 448 | 295M | 2.9M | 75.2 | 110.1 | 45.9 | 106.4 |
|
17 |
| [Pheye-x4 🤗](https://huggingface.co/miguelcarv/Pheye-x4-672) | 672 | 295M | 2.9M | 75.5 | 110.8 | 49.2 | 111.9 |
|
18 |
| [Pheye-x2 🤗](https://huggingface.co/miguelcarv/Pheye-x2-448) | 448 | 578M | 2.9M | 76.0 | 111.8 | 47.3 | 108.9 |
|
19 |
| [Pheye-x2 🤗](https://huggingface.co/miguelcarv/Pheye-x2-672) | 672 | 578M | 2.9M | 76.4 | 110.5 | 50.5 | 115.9 |
|
20 |
|
21 |
+
## Examples\*
|
22 |
|
23 |
+
| Image | Example |
|
24 |
+
| ----------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
25 |
+
| <img src="https://c5.staticflickr.com/6/5463/17191308944_ae0b20bb7e_o.jpg" width="500"/> | **How much do these popcorn packets weigh?**<br>These popcorn packets weigh 90g each.<br><br>**What brand are these popcorn packets?**<br>These popcorn packets are from the brand "Magic Pop". |
|
26 |
+
| <img src="https://farm2.staticflickr.com/2708/5836100440_6e1117d36f_o.jpg" width="500"/> | **Can I pet that dog?**<br>No, you cannot pet the dog in the image. It is a black bear, not a dog. |
|
27 |
+
| <img src="images/beekepper.jpeg" width="500"/> | **Why is the person wearing protective gear?**<br>The person in the image is wearing protective gear, such as a beekeeper's suit and a helmet, to ensure their safety while working with bees. Beekeeping involves handling bees and their hives, which can be potentially dangerous due to the risk of bee stings. The protective gear helps to minimize the risk of bee stings and other bee-related injuries, allowing the beekeeper to carry out their work safely and effectively. |
|
28 |
| |
|
29 |
|
30 |
+
\* Generated by Pheye-x2-672
|
31 |
+
|
32 |
## Usage
|
33 |
|
34 |
To generate a sample response from a prompt use `generate.py`.
|
|
|
36 |
|
37 |
```bash
|
38 |
git clone https://github.com/miguelscarv/pheye.git
|
39 |
+
cd pheye
|
40 |
python3 -m venv venv
|
41 |
source venv/bin/activate
|
42 |
pip3 install -r requirements.txt
|
|
|
50 |
|
51 |
## Acknowledgments
|
52 |
|
53 |
+
This implementation was inspired by [OpenFlamingo](https://github.com/mlfoundations/open_flamingo)'s repository.
|
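As a hedged illustration of the usage line quoted in the last hunk header, the sketch below assembles a `generate.py` invocation in Python. Only the script name and the `--image_path` and `--prompt` flags come from the diff. The helper name `build_generate_command`, the example prompt, and the assumption that no other flags are required are hypothetical, so check the script's own argument parser before relying on this.

```python
import shlex


def build_generate_command(image_path: str, prompt: str, python: str = "python3") -> list[str]:
    """Build an argv list for generate.py (hypothetical helper, not part of the repo).

    Only --image_path and --prompt are visible in the diff; any other
    options the script may need are unknown and therefore omitted here.
    """
    if not image_path:
        raise ValueError("image_path must be a non-empty string")
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    # A list (rather than one shell string) avoids quoting problems in the prompt.
    return [python, "generate.py", "--image_path", image_path, "--prompt", prompt]


if __name__ == "__main__":
    # The prompt below is a placeholder; the one in the hunk header is truncated.
    cmd = build_generate_command("images/dog_flower.jpg", "Describe the image.")
    # shlex.join renders a copy-pastable shell command with safe quoting.
    print(shlex.join(cmd))
```

To actually run the model, the list could be passed to `subprocess.run(cmd, check=True)` from inside the cloned `pheye` directory with the virtual environment active.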