nreimers committed on
Commit
3fa6f52
1 Parent(s): ab9209f

Update README.md

Files changed (1)
README.md +4 -319
README.md CHANGED
@@ -1,32 +1,16 @@
 ---
 pipeline_tag: sentence-similarity
+license: apache-2.0
 tags:
 - sentence-transformers
 - feature-extraction
 - sentence-similarity
 - transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
-- transformers
 ---
 
 # sentence-transformers/clip-ViT-B-32
 
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a None dimensional dense vector space and can be used for tasks like clustering or semantic search.
+This is the [OpenAI CLIP Model](https://github.com/openai/CLIP) ported to the [sentence-transformers](https://www.SBERT.net) framework: it maps images and text to a shared vector space.
 
 
 
@@ -64,307 +48,8 @@ For an automated evaluation of this model, see the *Sentence Embeddings Benchmar
 SentenceTransformer(
   (0): CLIPModel(
     (model): CLIP(
-      (visual): VisualTransformer(
-        (conv1): Conv2d(3, 768, kernel_size=(32, 32), stride=(32, 32), bias=False)
-        (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-        (transformer): Transformer(
-          (resblocks): Sequential(
-            (0): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (1): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (2): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (3): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (4): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (5): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (6): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (7): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (8): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (9): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (10): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-            (11): ResidualAttentionBlock(
-              (attn): MultiheadAttention(
-                (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
-              )
-              (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (mlp): Sequential(
-                (c_fc): Linear(in_features=768, out_features=3072, bias=True)
-                (gelu): QuickGELU()
-                (c_proj): Linear(in_features=3072, out_features=768, bias=True)
-              )
-              (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            )
-          )
-        )
-        (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-      )
-      (transformer): Transformer(
-        (resblocks): Sequential(
-          (0): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (1): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (2): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (3): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (4): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (5): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (6): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (7): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (8): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (9): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (10): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-          (11): ResidualAttentionBlock(
-            (attn): MultiheadAttention(
-              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
-            )
-            (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-            (mlp): Sequential(
-              (c_fc): Linear(in_features=512, out_features=2048, bias=True)
-              (gelu): QuickGELU()
-              (c_proj): Linear(in_features=2048, out_features=512, bias=True)
-            )
-            (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
-          )
-        )
-      )
+      (visual): VisualTransformer()
+      (transformer): Transformer()
       (token_embedding): Embedding(49408, 512)
       (ln_final): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
     )
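
The updated description says the model maps images and text into one shared vector space. As a minimal sketch of what that enables, here is a short example using the public `sentence_transformers` API; the image filename is a placeholder, and any local image works:

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# Load the CLIP ViT-B/32 model through sentence-transformers
model = SentenceTransformer('clip-ViT-B-32')

# Encode an image (placeholder filename) and a few candidate captions
img_emb = model.encode(Image.open('two_dogs_in_snow.jpg'))
text_emb = model.encode(['Two dogs in the snow',
                         'A cat on a table',
                         'A picture of London at night'])

# Image and text embeddings live in the same space, so cosine
# similarity directly scores how well each caption matches the image
cos_scores = util.cos_sim(img_emb, text_emb)
print(cos_scores)
```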