<br />
<p align="center">
  <h1 align="center">Swe-CLIP 500k</h1>

  <p align="center">
    <a href="https://github.com/FreddeFrallan/Multilingual-CLIP/tree/main/Model%20Cards/Swe-CLIP%20500k">Github Model Card</a>
  </p>
</p>

## Usage
To use this model together with the original CLIP vision encoder, you need to download the code and the additional linear weights from the [Multilingual-CLIP Github](https://github.com/FreddeFrallan/Multilingual-CLIP).
Once that is done, you can load and use the model with the following code:
```python
from src import multilingual_clip

model = multilingual_clip.load_model('Swe-CLIP-500k')
embeddings = model(['Älgen är skogens konung!',        # "The moose is the king of the forest!"
                    'Alla isbjörnar är vänsterhänta'])  # "All polar bears are left-handed"
print(embeddings.shape)
# Yields: torch.Size([2, 640])
```
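
To score images against Swedish captions, these text embeddings can be combined with the matching CLIP vision encoder. A minimal sketch, assuming the accompanying vision encoder is OpenAI's `RN50x4` checkpoint loaded via the `clip` package, and using a hypothetical image path:

```python
import clip
import torch
from PIL import Image

from src import multilingual_clip

text_model = multilingual_clip.load_model('Swe-CLIP-500k')
clip_model, preprocess = clip.load('RN50x4')  # assumed to match the Res50x4 encoder mentioned below

image = preprocess(Image.open('moose.jpg')).unsqueeze(0)  # hypothetical image path
with torch.no_grad():
    image_emb = clip_model.encode_image(image).float()
    text_emb = text_model(['Älgen är skogens konung!', 'Alla isbjörnar är vänsterhänta'])

    # Cosine similarity between the image and each Swedish caption
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    print((text_emb @ image_emb.T).squeeze(-1))  # higher = better match
```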

<!-- ABOUT THE PROJECT -->
## About
A [KB/Bert-Swedish-Cased](https://huggingface.co/KB/bert-base-swedish-cased) model tuned to match the embedding space of the CLIP text encoder that accompanies the Res50x4 vision encoder. <br>
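
The card does not spell out the tuning procedure, but matching an embedding space like this is typically a teacher-student regression: the frozen CLIP text encoder embeds the original English caption, and the Swedish model is trained to produce the same vector for the translated caption. A rough sketch under that assumption, with all names illustrative:

```python
import torch
import torch.nn as nn

def alignment_step(student, teacher, optimizer, swedish_batch, english_batch):
    """One hypothetical training step: pull the Swedish student's embeddings
    onto the frozen CLIP text teacher's embeddings of the English originals."""
    with torch.no_grad():
        target = teacher(english_batch)   # frozen CLIP text encoder, 640-d output
    pred = student(swedish_batch)         # Swedish BERT + linear head, 640-d output
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```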

The training data pairs were generated by sampling 500k sentences from the combined descriptions of [GCC](https://ai.google.com/research/ConceptualCaptions/) + [MSCOCO](https://cocodataset.org/#home) + [VizWiz](https://vizwiz.org/tasks-and-datasets/image-captioning/) and translating them into Swedish.
All translation was done with the [Huggingface Opus Model](https://huggingface.co/Helsinki-NLP/opus-mt-en-sv), which seemingly produces higher-quality translations than the [AWS translate service](https://aws.amazon.com/translate/).
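
For reference, translating English captions with that Opus model through Hugging Face `transformers` might look like this sketch (the exact pipeline used to build the dataset is not documented here, and the printed output is illustrative):

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = 'Helsinki-NLP/opus-mt-en-sv'
tokenizer = MarianTokenizer.from_pretrained(model_name)
translator = MarianMTModel.from_pretrained(model_name)

# e.g. a caption sampled from GCC/MSCOCO/VizWiz
captions = ['The moose is the king of the forest!']
batch = tokenizer(captions, return_tensors='pt', padding=True)
swedish = tokenizer.batch_decode(translator.generate(**batch), skip_special_tokens=True)
print(swedish)  # illustrative: ['Älgen är skogens konung!']
```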