Files changed (1)
  1. README.md +49 -16
README.md CHANGED
@@ -1,30 +1,63 @@
1
  ---
2
  language:
 
3
  - ja
 
4
  ---
5
 
6
- # Model Card for answer-finder-v1-jp
7
 
8
- This model is a Japanese Answer-Finder. It retrieves answers from questions and passages for the Sinequa Search Plateform.
9
 
10
- # Supported Languages
11
 
12
  - Japanese
13
 
14
- # Model score
15
- || Relevance Score |
16
- |:-----------------|:---:|
17
- |__Japanese__ | <span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:black;">&starf;</span> |
18
- ||
19
 
20
- |Speed Score|
21
- |---|
22
- |<span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:#ffd21e;">&starf;</span><span style="font-size:200%;color:black;">&starf;</span><span style="font-size:200%;color:black;">&starf;</span><span style="font-size:200%;color:black;">&starf;</span><span style="font-size:200%;color:black;">&starf;</span><span style="font-size:200%;color:black;">&starf;</span>|
23
- ---
24
 
 
 
 
25
 
26
- # Training data
27
 
28
- | Dataset | Paper |
29
- |--------------------------------------------------------|:----------------------------------------:|
30
- | [JSQuAD](https://github.com/yahoojapan/JGLUE) | [paper](https://aclanthology.org/2022.lrec-1.317.pdf) |
 
1
  ---
2
  language:
3
+
4
  - ja
5
+
6
  ---
7
 
8
+ # Model Card for answer-finder-v1-L-ja
9
+
10
+ This model is a question answering model developed by Sinequa. It produces two lists of logit scores corresponding to
11
+ the start token and end token of an answer.
12
+
13
+ Model name: `answer-finder-v1-L-ja`
14
 
15
+ ## Supported Languages
16
 
17
+ The model was trained and tested in the following languages:
18
 
19
  - Japanese
20
 
21
+ Besides the aforementioned languages, basic support can be expected for the 104 languages that were used during the
22
+ pretraining of the base model (see [original repository](https://github.com/google-research/bert)).
 
 
 
23
 
24
+ ## Scores
25
+
26
+ | Metric | Value |
27
+ |:--------------------------------------------------------------|-------:|
28
+ | F1 Score on JSQuAD with Hugging Face evaluation pipeline | 92.1 |
29
+ | F1 Score on JSQuAD with Haystack evaluation pipeline | 91.5 |
30
+
31
+ ## Inference Time
32
+
33
+ | GPU | Batch size 1 | Batch size 32 |
34
+ |:--------------------------------------------------------------|---------------:|---------------:|
35
+ | NVIDIA A10 | 4 ms | 84 ms |
36
+ | NVIDIA T4 | 15 ms | 361 ms |
37
+
38
+ The inference times only measure the time the model takes to process a single batch; they do not include pre- or
39
+ post-processing steps such as tokenization.
40
+
41
+ **Note that the Answer Finder models are only used at query time.**
42
+
43
+ ## Requirements
44
+
45
+ - Minimal Sinequa version: 11.10.0
46
+ - GPU memory usage: TODO
47
+
48
+ Note that GPU memory usage only includes how much GPU memory the actual model consumes on an NVIDIA T4 GPU with a batch
49
+ size of 32. It does not include the fixed amount of memory that is consumed by the ONNX Runtime upon initialization, which
50
+ can be around 0.5 to 1 GiB depending on the GPU used.
51
+
52
+ ## Model Details
53
+
54
+ ### Overview
55
 
56
+ - Number of parameters: 110 million
57
+ - Base language model: [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased)
58
+ - Sensitive to casing and accents
59
 
60
+ ### Training Data
61
 
62
+ - [JSQuAD](https://github.com/yahoojapan/JGLUE) (see [paper](https://aclanthology.org/2022.lrec-1.317.pdf))
63
+ - Japanese translation of SQuAD v2 "impossible" query-passage pairs
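
The updated card says the model produces two lists of logit scores, one for the start token and one for the end token of an answer. The sketch below shows one common way such scores can be turned into an answer span: pick the `(start, end)` pair with the highest summed score, subject to `start <= end` and a length cap. This is a generic extractive question answering decoding step, not necessarily what Sinequa runs in production. The function name `best_span`, the `max_answer_len` parameter, and the logit values are all made up for illustration, and handling of "impossible" (no-answer) pairs is left out.

```python
from typing import List, Optional, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, score) maximizing start_logits[start] + end_logits[end].

    Only spans with start <= end and at most max_answer_len tokens are
    considered. Returns None when the inputs are empty.
    """
    if len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must have the same length")

    best: Optional[Tuple[int, int, float]] = None
    for start, start_score in enumerate(start_logits):
        # Exclusive upper bound on the end index for this start position.
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_score + end_logits[end]
            if best is None or score > best[2]:
                best = (start, end, score)
    return best


# Hypothetical logits for a six-token passage (not real model output).
start_logits = [0.1, 0.2, 4.0, 0.3, 0.1, 0.0]
end_logits = [0.0, 0.1, 0.5, 3.5, 0.2, 0.1]

print(best_span(start_logits, end_logits))  # (2, 3, 7.5)
```

The returned indices are token positions, so mapping them back to text still requires the tokenizer's offsets. The exhaustive search is quadratic in passage length; it is kept here for clarity rather than speed.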