Commit ab9209f by nreimers (1 parent: 976782c)

Add new SentenceTransformer model.

0_CLIPModel/CLIPModel.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:016a1243bd9df5f1181668d35326d761f8da5b7d22f8ee1319973e37111c82a2
- size 605223903
+ oid sha256:16afc66c29d22afa707a6118519bcb2d9fa56f07f4bb9cde21d87c9e5cf0283b
+ size 605217061
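
The pointer hunk above records only the SHA-256 object ID and the byte size of the new `CLIPModel.pt`. As a minimal sketch, assuming a local copy of the weights has already been downloaded, a file could be checked against such a pointer as follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names for illustration, not part of Git LFS or any library.

```python
import hashlib
from pathlib import Path

# New-side pointer text copied from the hunk above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:16afc66c29d22afa707a6118519bcb2d9fa56f07f4bb9cde21d87c9e5cf0283b
size 605217061
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and hash the pointer records."""
    # Cheap check first: a size mismatch makes hashing unnecessary.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Read in chunks so a ~600 MB file is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
# Size change relative to the old pointer (605223903 bytes) in this commit.
size_delta = pointer["size"] - 605223903
```

Calling `matches_pointer(Path("0_CLIPModel/CLIPModel.pt"), pointer)` would then check a local copy; the local path is an assumption about where the file was saved.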
0_CLIPModel/bpe_simple_vocab_16e6.txt.gz CHANGED
Binary files a/0_CLIPModel/bpe_simple_vocab_16e6.txt.gz and b/0_CLIPModel/bpe_simple_vocab_16e6.txt.gz differ
README.md CHANGED
@@ -4,11 +4,31 @@ tags:
4
  - sentence-transformers
5
  - feature-extraction
6
  - sentence-similarity
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
7
  ---
8
 
9
  # sentence-transformers/clip-ViT-B-32
10
 
11
- <!--- Describe your model here -->
 
 
12
 
13
  ## Usage (Sentence-Transformers)
14
 
@@ -29,20 +49,342 @@ embeddings = model.encode(sentences)
29
  print(embeddings)
30
  ```
31
 
 
 
32
  ## Evaluation Results
33
 
34
- <!--- Describe how your model was evaluated -->
35
 
36
  For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=sentence-transformers/clip-ViT-B-32)
37
 
 
 
38
  ## Full Model Architecture
39
  ```
40
  SentenceTransformer(
41
- (0): CLIPModel()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
42
  )
43
  ```
44
 
45
  ## Citing & Authors
46
 
47
- <!--- Describe where people can find more information -->
48
-
 
 
 
 
 
 
 
 
 
 
 
 
4
  - sentence-transformers
5
  - feature-extraction
6
  - sentence-similarity
7
+ - transformers
8
+ - transformers
9
+ - transformers
10
+ - transformers
11
+ - transformers
12
+ - transformers
13
+ - transformers
14
+ - transformers
15
+ - transformers
16
+ - transformers
17
+ - transformers
18
+ - transformers
19
+ - transformers
20
+ - transformers
21
+ - transformers
22
+ - transformers
23
+ - transformers
24
+ - transformers
25
  ---
26
 
27
  # sentence-transformers/clip-ViT-B-32
28
 
29
+ This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a None dimensional dense vector space and can be used for tasks like clustering or semantic search.
30
+
31
+
32
 
33
  ## Usage (Sentence-Transformers)
34
 
49
  print(embeddings)
50
  ```
51
 
52
+
53
+
54
  ## Evaluation Results
55
 
56
+
57
 
58
  For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=sentence-transformers/clip-ViT-B-32)
59
 
60
+
61
+
62
  ## Full Model Architecture
63
  ```
64
  SentenceTransformer(
65
+ (0): CLIPModel(
66
+ (model): CLIP(
67
+ (visual): VisualTransformer(
68
+ (conv1): Conv2d(3, 768, kernel_size=(32, 32), stride=(32, 32), bias=False)
69
+ (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
70
+ (transformer): Transformer(
71
+ (resblocks): Sequential(
72
+ (0): ResidualAttentionBlock(
73
+ (attn): MultiheadAttention(
74
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
75
+ )
76
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
77
+ (mlp): Sequential(
78
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
79
+ (gelu): QuickGELU()
80
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
81
+ )
82
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
83
+ )
84
+ (1): ResidualAttentionBlock(
85
+ (attn): MultiheadAttention(
86
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
87
+ )
88
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
89
+ (mlp): Sequential(
90
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
91
+ (gelu): QuickGELU()
92
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
93
+ )
94
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
95
+ )
96
+ (2): ResidualAttentionBlock(
97
+ (attn): MultiheadAttention(
98
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
99
+ )
100
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
101
+ (mlp): Sequential(
102
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
103
+ (gelu): QuickGELU()
104
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
105
+ )
106
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
107
+ )
108
+ (3): ResidualAttentionBlock(
109
+ (attn): MultiheadAttention(
110
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
111
+ )
112
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
113
+ (mlp): Sequential(
114
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
115
+ (gelu): QuickGELU()
116
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
117
+ )
118
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
119
+ )
120
+ (4): ResidualAttentionBlock(
121
+ (attn): MultiheadAttention(
122
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
123
+ )
124
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
125
+ (mlp): Sequential(
126
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
127
+ (gelu): QuickGELU()
128
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
129
+ )
130
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
131
+ )
132
+ (5): ResidualAttentionBlock(
133
+ (attn): MultiheadAttention(
134
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
135
+ )
136
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
137
+ (mlp): Sequential(
138
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
139
+ (gelu): QuickGELU()
140
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
141
+ )
142
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
143
+ )
144
+ (6): ResidualAttentionBlock(
145
+ (attn): MultiheadAttention(
146
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
147
+ )
148
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
149
+ (mlp): Sequential(
150
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
151
+ (gelu): QuickGELU()
152
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
153
+ )
154
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
155
+ )
156
+ (7): ResidualAttentionBlock(
157
+ (attn): MultiheadAttention(
158
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
159
+ )
160
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
161
+ (mlp): Sequential(
162
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
163
+ (gelu): QuickGELU()
164
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
165
+ )
166
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
167
+ )
168
+ (8): ResidualAttentionBlock(
169
+ (attn): MultiheadAttention(
170
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
171
+ )
172
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
173
+ (mlp): Sequential(
174
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
175
+ (gelu): QuickGELU()
176
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
177
+ )
178
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
179
+ )
180
+ (9): ResidualAttentionBlock(
181
+ (attn): MultiheadAttention(
182
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
183
+ )
184
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
185
+ (mlp): Sequential(
186
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
187
+ (gelu): QuickGELU()
188
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
189
+ )
190
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
191
+ )
192
+ (10): ResidualAttentionBlock(
193
+ (attn): MultiheadAttention(
194
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
195
+ )
196
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
197
+ (mlp): Sequential(
198
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
199
+ (gelu): QuickGELU()
200
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
201
+ )
202
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
203
+ )
204
+ (11): ResidualAttentionBlock(
205
+ (attn): MultiheadAttention(
206
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
207
+ )
208
+ (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
209
+ (mlp): Sequential(
210
+ (c_fc): Linear(in_features=768, out_features=3072, bias=True)
211
+ (gelu): QuickGELU()
212
+ (c_proj): Linear(in_features=3072, out_features=768, bias=True)
213
+ )
214
+ (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
215
+ )
216
+ )
217
+ )
218
+ (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
219
+ )
220
+ (transformer): Transformer(
221
+ (resblocks): Sequential(
222
+ (0): ResidualAttentionBlock(
223
+ (attn): MultiheadAttention(
224
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
225
+ )
226
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
227
+ (mlp): Sequential(
228
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
229
+ (gelu): QuickGELU()
230
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
231
+ )
232
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
233
+ )
234
+ (1): ResidualAttentionBlock(
235
+ (attn): MultiheadAttention(
236
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
237
+ )
238
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
239
+ (mlp): Sequential(
240
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
241
+ (gelu): QuickGELU()
242
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
243
+ )
244
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
245
+ )
246
+ (2): ResidualAttentionBlock(
247
+ (attn): MultiheadAttention(
248
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
249
+ )
250
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
251
+ (mlp): Sequential(
252
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
253
+ (gelu): QuickGELU()
254
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
255
+ )
256
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
257
+ )
258
+ (3): ResidualAttentionBlock(
259
+ (attn): MultiheadAttention(
260
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
261
+ )
262
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
263
+ (mlp): Sequential(
264
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
265
+ (gelu): QuickGELU()
266
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
267
+ )
268
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
269
+ )
270
+ (4): ResidualAttentionBlock(
271
+ (attn): MultiheadAttention(
272
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
273
+ )
274
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
275
+ (mlp): Sequential(
276
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
277
+ (gelu): QuickGELU()
278
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
279
+ )
280
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
281
+ )
282
+ (5): ResidualAttentionBlock(
283
+ (attn): MultiheadAttention(
284
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
285
+ )
286
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
287
+ (mlp): Sequential(
288
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
289
+ (gelu): QuickGELU()
290
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
291
+ )
292
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
293
+ )
294
+ (6): ResidualAttentionBlock(
295
+ (attn): MultiheadAttention(
296
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
297
+ )
298
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
299
+ (mlp): Sequential(
300
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
301
+ (gelu): QuickGELU()
302
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
303
+ )
304
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
305
+ )
306
+ (7): ResidualAttentionBlock(
307
+ (attn): MultiheadAttention(
308
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
309
+ )
310
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
311
+ (mlp): Sequential(
312
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
313
+ (gelu): QuickGELU()
314
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
315
+ )
316
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
317
+ )
318
+ (8): ResidualAttentionBlock(
319
+ (attn): MultiheadAttention(
320
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
321
+ )
322
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
323
+ (mlp): Sequential(
324
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
325
+ (gelu): QuickGELU()
326
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
327
+ )
328
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
329
+ )
330
+ (9): ResidualAttentionBlock(
331
+ (attn): MultiheadAttention(
332
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
333
+ )
334
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
335
+ (mlp): Sequential(
336
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
337
+ (gelu): QuickGELU()
338
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
339
+ )
340
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
341
+ )
342
+ (10): ResidualAttentionBlock(
343
+ (attn): MultiheadAttention(
344
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
345
+ )
346
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
347
+ (mlp): Sequential(
348
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
349
+ (gelu): QuickGELU()
350
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
351
+ )
352
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
353
+ )
354
+ (11): ResidualAttentionBlock(
355
+ (attn): MultiheadAttention(
356
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
357
+ )
358
+ (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
359
+ (mlp): Sequential(
360
+ (c_fc): Linear(in_features=512, out_features=2048, bias=True)
361
+ (gelu): QuickGELU()
362
+ (c_proj): Linear(in_features=2048, out_features=512, bias=True)
363
+ )
364
+ (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
365
+ )
366
+ )
367
+ )
368
+ (token_embedding): Embedding(49408, 512)
369
+ (ln_final): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
370
+ )
371
+ )
372
  )
373
  ```
374
 
375
  ## Citing & Authors
376
 
377
+ This model was trained by [sentence-transformers](https://www.sbert.net/).
378
+
379
+ If you find this model helpful, feel free to cite our publication [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084):
380
+ ```bibtex
381
+ @inproceedings{reimers-2019-sentence-bert,
382
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
383
+ author = "Reimers, Nils and Gurevych, Iryna",
384
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
385
+ month = "11",
386
+ year = "2019",
387
+ publisher = "Association for Computational Linguistics",
388
+ url = "http://arxiv.org/abs/1908.10084",
389
+ }
390
+ ```
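
The architecture listing added to the README lists only submodules. PyTorch's `MultiheadAttention` also holds a packed query/key/value projection (`in_proj_weight`, `in_proj_bias`) as plain parameters, which the module printout does not show. As a rough sketch under that assumption, the parameter count of the twelve `ResidualAttentionBlock`s in each tower can be estimated from the printed widths (768 for the visual tower, 512 for the text tower). `linear_params` and `resblock_params` are hypothetical helpers, and the tally leaves out `conv1`, the embeddings, and any other parameters the listing does not show.

```python
def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Weights plus optional bias of a Linear layer."""
    return in_features * out_features + (out_features if bias else 0)


def resblock_params(width: int) -> int:
    """Parameter count of one ResidualAttentionBlock of the given width (hypothetical helper)."""
    # Assumption: packed q/k/v projection (3*width x width plus bias), not shown in the listing.
    attn_in_proj = linear_params(width, 3 * width)
    # (out_proj) as printed: width -> width with bias.
    attn_out_proj = linear_params(width, width)
    # (ln_1) and (ln_2): each has a weight and a bias vector of size `width`.
    layer_norms = 2 * (2 * width)
    # (mlp): c_fc expands to 4*width (3072 or 2048), c_proj maps back to width.
    mlp = linear_params(width, 4 * width) + linear_params(4 * width, width)
    return attn_in_proj + attn_out_proj + layer_norms + mlp


NUM_BLOCKS = 12  # resblocks (0)..(11) in both towers
visual_blocks = NUM_BLOCKS * resblock_params(768)
text_blocks = NUM_BLOCKS * resblock_params(512)
print(f"visual resblocks: {visual_blocks:,}")
print(f"text resblocks:   {text_blocks:,}")
```

Each block works out to `12 * width**2 + 13 * width` parameters, so the visual tower's blocks are a little over twice the size of the text tower's.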
config_sentence_transformers.json CHANGED
@@ -1,7 +1,7 @@
  {
    "__version__": {
      "sentence_transformers": "2.0.0",
-     "transformers": "4.6.1",
-     "pytorch": "1.8.0"
+     "transformers": "4.7.0",
+     "pytorch": "1.9.0+cu102"
    }
  }
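
This hunk changes only the recorded library versions. A small sketch of how the two `__version__` mappings could be compared programmatically is given below. `changed_versions` and `split_local_tag` are hypothetical helpers, and reading the `+cu102` suffix as a local build tag follows the usual Python version convention rather than anything stated in the commit.

```python
import json

# Old and new sides of the hunk above, as JSON text.
OLD = json.loads(
    '{"__version__": {"sentence_transformers": "2.0.0",'
    ' "transformers": "4.6.1", "pytorch": "1.8.0"}}'
)
NEW = json.loads(
    '{"__version__": {"sentence_transformers": "2.0.0",'
    ' "transformers": "4.7.0", "pytorch": "1.9.0+cu102"}}'
)


def changed_versions(old: dict, new: dict) -> dict:
    """Map each package whose recorded version differs to an (old, new) pair."""
    before, after = old["__version__"], new["__version__"]
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }


def split_local_tag(version: str) -> tuple:
    """Split 'X.Y.Z+tag' into ('X.Y.Z', 'tag'); the tag is None when absent."""
    public, _, local = version.partition("+")
    return public, (local or None)


changes = changed_versions(OLD, NEW)
for name, (was, now) in changes.items():
    print(f"{name}: {was} -> {now}")
```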