youval committed
Commit 960253a
1 parent: 1cbdf0f

Update model card (#2)


- update model card (5f951096b5aa3dbf993ca791f5e148908ba7aea9)

Files changed (1)
  1. README.md +95 -88
README.md CHANGED
@@ -1,88 +1,95 @@
- ---
- pipeline_tag: sentence-similarity
- tags:
- - feature-extraction
- - sentence-similarity
- language:
- - en
- ---
-
- # Model Card for `vectorizer-v1-S-en`
-
- This model is a vectorizer developed by Sinequa. It produces an embedding vector given a passage or a query. The
- passage vectors are stored in our vector index and the query vector is used at query time to look up relevant passages
- in the index.
-
- Model name: `vectorizer-v1-S-en`
-
- ## Supported Languages
-
- The model was trained and tested in the following languages:
-
- - English
-
- ## Scores
-
- | Metric | Value |
- |:-----------------------|------:|
- | Relevance (Recall@100) | 0.456 |
-
- Note that the relevance score is computed as an average over 14 retrieval datasets (see
- [details below](#evaluation-metrics)).
-
- ## Inference Times
-
- | GPU | Batch size 1 (at query time) | Batch size 32 (at indexing) |
- |:-----------|-----------------------------:|----------------------------:|
- | NVIDIA A10 | 2 ms | 14 ms |
- | NVIDIA T4 | 4 ms | 52 ms |
-
- The inference times measure only the time the model takes to process a single batch; they do not include pre- or
- post-processing steps such as tokenization.
-
- ## Requirements
-
- - Minimum Sinequa version: 11.10.0
- - GPU memory usage: 330 MiB
-
- Note that the GPU memory usage figure covers only the memory the model itself consumes on an NVIDIA T4 GPU with a
- batch size of 32. It does not include the fixed amount of memory consumed by the ONNX Runtime upon initialization,
- which can be around 0.5 to 1 GiB depending on the GPU used.
-
- ## Model Details
-
- ### Overview
-
- - Number of parameters: 29 million
- - Base language model: [English BERT-Small](https://huggingface.co/google/bert_uncased_L-4_H-512_A-8)
- - Insensitive to casing and accents
- - Output dimensions: 256 (reduced with an additional dense layer)
- - Training procedure: A first model was trained on query-passage pairs with an in-batch-negatives strategy, using [this loss](https://www.sbert.net/docs/package_reference/losses.html#multiplenegativesrankingloss). A second model was then trained on query-passage-negative triplets, with negatives mined using the first model, similar to [ANCE](https://arxiv.org/pdf/2007.00808.pdf) but with different hyperparameters.
-
- ### Training Data
-
- The model was trained on a Sinequa-curated version of Google's [Natural Questions](https://ai.google.com/research/NaturalQuestions).
-
- ### Evaluation Metrics
-
- To determine the relevance score, we averaged the results obtained on the datasets of the
- [BEIR benchmark](https://github.com/beir-cellar/beir). Note that all these datasets are in English.
-
- | Dataset | Recall@100 |
- |:------------------|-----------:|
- | Average | 0.456 |
- | | |
- | Arguana | 0.832 |
- | CLIMATE-FEVER | 0.342 |
- | DBPedia Entity | 0.299 |
- | FEVER | 0.660 |
- | FiQA-2018 | 0.301 |
- | HotpotQA | 0.434 |
- | MS MARCO | 0.610 |
- | NFCorpus | 0.159 |
- | NQ | 0.671 |
- | Quora | 0.966 |
- | SCIDOCS | 0.194 |
- | SciFact | 0.592 |
- | TREC-COVID | 0.037 |
- | Webis-Touche-2020 | 0.285 |
+ ---
+ pipeline_tag: sentence-similarity
+ tags:
+ - feature-extraction
+ - sentence-similarity
+ language:
+ - en
+ ---
+
+ # Model Card for `vectorizer-v1-S-en`
+
+ This model is a vectorizer developed by Sinequa. It produces an embedding vector given a passage or a query. The passage vectors are stored in our vector index and the query vector is used at query time to look up relevant passages in the index.
+
+ Model name: `vectorizer-v1-S-en`
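+
+ As an illustration only, here is a minimal sketch of querying such a bi-encoder outside of Sinequa. The checkpoint id and the mean-pooling strategy are assumptions (within Sinequa the model runs through the ONNX Runtime, and a trained dense layer reduces the output to 256 dimensions):
+
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+ # Hypothetical checkpoint id; the card does not document a public loading recipe.
+ MODEL_ID = "sinequa/vectorizer-v1-S-en"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+ model = AutoModel.from_pretrained(MODEL_ID)
+
+ def embed(texts):
+     # Mean pooling over token embeddings (an assumption, not the documented recipe).
+     batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
+     with torch.no_grad():
+         hidden = model(**batch).last_hidden_state         # (batch, seq, hidden)
+     mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, seq, 1)
+     return (hidden * mask).sum(dim=1) / mask.sum(dim=1)
+
+ passages = embed(["Paris is the capital of France.", "GPUs accelerate inference."])
+ query = embed(["what is the capital of france"])
+ scores = torch.nn.functional.cosine_similarity(query, passages)
+ print(int(scores.argmax()))  # index of the most relevant passage
+ ```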
+
+ ## Supported Languages
+
+ The model was trained and tested in the following languages:
+
+ - English
+
+ ## Scores
+
+ | Metric | Value |
+ |:-----------------------|------:|
+ | Relevance (Recall@100) | 0.456 |
+
+ Note that the relevance score is computed as an average over 14 retrieval datasets (see
+ [details below](#evaluation-metrics)).
+
+ ## Inference Times
+
+ | GPU | Quantization type | Batch size 1 (at query time) | Batch size 32 (at indexing) |
+ |:-----------|:------------------|-----------------------------:|----------------------------:|
+ | NVIDIA A10 | FP16 | 1 ms | 4 ms |
+ | NVIDIA A10 | FP32 | 2 ms | 13 ms |
+ | NVIDIA T4 | FP16 | 1 ms | 13 ms |
+ | NVIDIA T4 | FP32 | 2 ms | 52 ms |
+ | NVIDIA L4 | FP16 | 1 ms | 5 ms |
+ | NVIDIA L4 | FP32 | 2 ms | 18 ms |
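+
+ These timings cover only the model forward pass for a single batch; they do not include pre- or post-processing steps such as tokenization. As an illustration, a per-batch latency measurement with ONNX Runtime could look like the sketch below (the model path and input names are assumptions):
+
+ ```python
+ import time
+ import numpy as np
+ import onnxruntime as ort
+
+ # Hypothetical model path; some BERT exports also expect "token_type_ids".
+ session = ort.InferenceSession("vectorizer-v1-S-en.onnx",
+                                providers=["CUDAExecutionProvider"])
+ batch_size, seq_len = 32, 128
+ feeds = {
+     "input_ids": np.ones((batch_size, seq_len), dtype=np.int64),
+     "attention_mask": np.ones((batch_size, seq_len), dtype=np.int64),
+ }
+
+ for _ in range(10):  # warm-up runs before timing
+     session.run(None, feeds)
+
+ runs = 100
+ start = time.perf_counter()
+ for _ in range(runs):
+     session.run(None, feeds)
+ print(f"{(time.perf_counter() - start) / runs * 1000:.1f} ms per batch")
+ ```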
+
+ ## GPU Memory Usage
+
+ | Quantization type | Memory |
+ |:------------------|-------:|
+ | FP16 | 300 MiB |
+ | FP32 | 500 MiB |
+
+ Note that the GPU memory usage figures cover only the memory the model itself consumes on an NVIDIA T4 GPU with a
+ batch size of 32. They do not include the fixed amount of memory consumed by the ONNX Runtime upon initialization,
+ which can be around 0.5 to 1 GiB depending on the GPU used.
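+
+ As a rough cross-check, the total GPU memory growth when loading the model (which, unlike the table above, includes the ONNX Runtime initialization overhead) can be observed with NVML; the model path is a placeholder:
+
+ ```python
+ import pynvml  # pip install nvidia-ml-py
+ import onnxruntime as ort
+
+ pynvml.nvmlInit()
+ handle = pynvml.nvmlDeviceGetHandleByIndex(0)
+
+ def used_mib():
+     # Currently used memory on GPU 0, in MiB.
+     return pynvml.nvmlDeviceGetMemoryInfo(handle).used / 2**20
+
+ before = used_mib()
+ session = ort.InferenceSession("vectorizer-v1-S-en.onnx",
+                                providers=["CUDAExecutionProvider"])
+ print(f"model + runtime initialization: {used_mib() - before:.0f} MiB")
+ ```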
+
+ ## Requirements
+
+ - Minimum Sinequa version: 11.10.0
+ - Minimum Sinequa version for using FP16 models and for GPUs with a CUDA compute capability of 8.9+ (such as NVIDIA L4): 11.11.0
+ - [CUDA compute capability](https://developer.nvidia.com/cuda-gpus): above 5.0 (above 6.0 for FP16 use)
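+
+ A quick way to check whether a GPU meets these thresholds, assuming a CUDA-enabled PyTorch install (this check is illustrative and not part of Sinequa):
+
+ ```python
+ import torch
+
+ # Compute capability of the first visible GPU, e.g. (7, 5) for an NVIDIA T4.
+ major, minor = torch.cuda.get_device_capability(0)
+ print(f"compute capability: {major}.{minor}")
+ print("meets FP32 requirement:", (major, minor) > (5, 0))
+ print("meets FP16 requirement:", (major, minor) > (6, 0))
+ ```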
+
+ ## Model Details
+
+ ### Overview
+
+ - Number of parameters: 29 million
+ - Base language model: [English BERT-Small](https://huggingface.co/google/bert_uncased_L-4_H-512_A-8)
+ - Insensitive to casing and accents
+ - Output dimensions: 256 (reduced with an additional dense layer)
+ - Training procedure: A first model was trained on query-passage pairs with an in-batch-negatives strategy, using [this loss](https://www.sbert.net/docs/package_reference/losses.html#multiplenegativesrankingloss). A second model was then trained on query-passage-negative triplets, with negatives mined using the first model, similar to [ANCE](https://arxiv.org/pdf/2007.00808.pdf) but with different hyperparameters (see the sketch after this list).
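+
+ The sketch below illustrates the first training stage with the [sentence-transformers](https://www.sbert.net) library, whose MultipleNegativesRankingLoss is linked above. The training data, batch size, and dense-layer setup are placeholders, not the actual configuration:
+
+ ```python
+ from torch.utils.data import DataLoader
+ from sentence_transformers import InputExample, SentenceTransformer, losses, models
+
+ # Base encoder plus mean pooling, with a dense layer reducing 512 -> 256 dims.
+ word = models.Transformer("google/bert_uncased_L-4_H-512_A-8")
+ pooling = models.Pooling(word.get_word_embedding_dimension())
+ dense = models.Dense(in_features=pooling.get_sentence_embedding_dimension(),
+                      out_features=256)
+ model = SentenceTransformer(modules=[word, pooling, dense])
+
+ # Toy query-passage pairs; the actual data is a curated Natural Questions set.
+ pairs = [("who wrote hamlet", "Hamlet is a tragedy by William Shakespeare."),
+          ("capital of france", "Paris is the capital and largest city of France.")]
+ loader = DataLoader([InputExample(texts=[q, p]) for q, p in pairs],
+                     shuffle=True, batch_size=2)
+
+ # In-batch negatives: the other passages in each batch act as negatives.
+ loss = losses.MultipleNegativesRankingLoss(model)
+ model.fit(train_objectives=[(loader, loss)], epochs=1)
+ ```
+
+ For the second stage, triplets such as `InputExample(texts=[query, positive, mined_negative])` can be fed to the same loss, so that each mined passage serves as a hard negative.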
+
+ ### Training Data
+
+ The model was trained on a Sinequa-curated version of Google's [Natural Questions](https://ai.google.com/research/NaturalQuestions).
+
+ ### Evaluation Metrics
+
+ To determine the relevance score, we averaged the results obtained on the datasets of the
+ [BEIR benchmark](https://github.com/beir-cellar/beir). Note that all these datasets are in English.
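+
+ Numbers like these can be reproduced per dataset with the [BEIR](https://github.com/beir-cellar/beir) library, assuming a sentence-transformers-compatible checkpoint; the sketch below uses SciFact and a placeholder checkpoint id:
+
+ ```python
+ from beir import util
+ from beir.datasets.data_loader import GenericDataLoader
+ from beir.retrieval import models
+ from beir.retrieval.evaluation import EvaluateRetrieval
+ from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES
+
+ # Download one BEIR dataset (SciFact) and load its test split.
+ url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
+ data_path = util.download_and_unzip(url, "datasets")
+ corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")
+
+ # Placeholder checkpoint id for a sentence-transformers-compatible bi-encoder.
+ retriever = EvaluateRetrieval(DRES(models.SentenceBERT("your-vectorizer-checkpoint"),
+                                    batch_size=32), score_function="cos_sim")
+ results = retriever.retrieve(corpus, queries)
+
+ # evaluate() returns NDCG, MAP, Recall and Precision at k in {1, 3, 5, 10, 100, 1000}.
+ _, _, recall, _ = retriever.evaluate(qrels, results, retriever.k_values)
+ print(recall["Recall@100"])
+ ```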
+
+ | Dataset | Recall@100 |
+ |:------------------|-----------:|
+ | Average | 0.456 |
+ | | |
+ | Arguana | 0.832 |
+ | CLIMATE-FEVER | 0.342 |
+ | DBPedia Entity | 0.299 |
+ | FEVER | 0.660 |
+ | FiQA-2018 | 0.301 |
+ | HotpotQA | 0.434 |
+ | MS MARCO | 0.610 |
+ | NFCorpus | 0.159 |
+ | NQ | 0.671 |
+ | Quora | 0.966 |
+ | SCIDOCS | 0.194 |
+ | SciFact | 0.592 |
+ | TREC-COVID | 0.037 |
+ | Webis-Touche-2020 | 0.285 |