arxyzan committed on
Commit a51bbf9
1 Parent(s): 0e764a4

Create README.md

Files changed (1)
  1. README.md +23 -0
README.md ADDED
@@ -0,0 +1,23 @@
+ ---
+ language:
+ - fa
+ metrics:
+ - wer
+ pipeline_tag: image-to-text
+ ---
+
+ A Persian image captioning model constructed from a ViT + RoBERTa architecture trained on flickr30k-fa.
+ The encoder (ViT) was initialized from https://huggingface.co/google/vit-base-patch16-224 and the decoder (RoBERTa) was initialized
+ from https://huggingface.co/HooshvareLab/roberta-fa-zwnj-base .
+
+ ## Usage
+ ```
+ pip install hezar
+ ```
+ ```python
+ from hezar import Model
+
+ model = Model.load("hezarai/vit-gpt2-fa-image-captioning-flickr30k")
+ captions = model.predict("example_image.jpg")
+ print(captions)
+ ```
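
Note on the `wer` metric declared in the README metadata: word error rate is commonly defined as the word-level edit distance between a reference text and a generated text, divided by the number of words in the reference. The block below is a minimal sketch of that definition only. The function name `word_error_rate` and the two example captions are hypothetical, not part of hezar or of this commit, and nothing here claims to reproduce how this model was actually evaluated.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not part of hezar.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up captions, whitespace-tokenized, for illustration only.
reference = "یک سگ در پارک می دود"     # "a dog runs in the park"
hypothesis = "یک گربه در پارک می دود"  # "a cat runs in the park"
print(word_error_rate(reference, hypothesis))
```

Splitting on whitespace is an assumption of this sketch. Persian text containing zero-width non-joiners may need a proper tokenizer or normalizer before scoring.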