Commit e31fa91 · Update README.md
Parent(s): faa6aeb
README.md
CHANGED
@@ -17,7 +17,6 @@ library_name: colbert
 This is a [ColBERTv1](https://github.com/stanford-futuredata/ColBERT) model: it encodes queries & passages into matrices of token-level embeddings and efficiently finds passages that contextually match the query using scalable vector-similarity (MaxSim) operators. It can be used for tasks like clustering or semantic search. The model was trained on the **French** portion of the [mMARCO](https://huggingface.co/datasets/unicamp-dl/mmarco) dataset.
 
 ## Usage
-***
 
 Using ColBERT on a dataset typically involves the following steps:
 
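The description above leans on ColBERT's MaxSim operator. As a quick illustration of what that late-interaction scoring does, here is a minimal NumPy sketch; it is not code from this repository, and the function and array names are hypothetical:

```python
# Illustrative sketch of MaxSim late-interaction scoring (not repository code;
# names are hypothetical, random vectors stand in for real token embeddings).
import numpy as np

def maxsim_score(query_embs: np.ndarray, passage_embs: np.ndarray) -> float:
    """query_embs: (num_query_tokens, dim); passage_embs: (num_passage_tokens, dim).
    Both are assumed to be L2-normalized token-level embeddings."""
    # Cosine similarity between every query token and every passage token.
    sim = query_embs @ passage_embs.T  # (num_query_tokens, num_passage_tokens)
    # Each query token keeps only its best-matching passage token; sum over query tokens.
    return float(sim.max(axis=1).sum())

# Toy usage with random vectors.
rng = np.random.default_rng(0)
q = rng.normal(size=(32, 128));  q /= np.linalg.norm(q, axis=1, keepdims=True)
p = rng.normal(size=(256, 128)); p /= np.linalg.norm(p, axis=1, keepdims=True)
print(maxsim_score(q, p))
```

Because each query token contributes only its best-matching passage token, this scoring can be run cheaply over precomputed passage embeddings, which is what makes the retrieval scalable.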
@@ -59,22 +58,16 @@ if __name__=='__main__':
 
 
 ## Evaluation
-***
 
 We evaluated our model on the smaller development set of mMARCO-fr, which consists of 6,980 queries for a corpus of 8.8M candidate passages.
 
 [...]
 
 ## Training
-***
 
-####
+#### Details
 
-We used the [camembert-base](https://huggingface.co/camembert-base) model and fine-tuned it on a 500K sentence triples dataset in French via pairwise softmax cross-entropy loss over the computed scores of the positive and negative passages associated to a query.
-
-#### Hyperparameters
-
-We trained the model on a single Tesla V100 GPU with 32GBs of memory during 200k steps using a batch size of 64. We used the AdamW optimizer with a constant learning rate of 3e-06. The passage length was limited to 256 tokens and the query length to 32 tokens.
+We used the [camembert-base](https://huggingface.co/camembert-base) model and fine-tuned it on a 500K sentence triples dataset in French via pairwise softmax cross-entropy loss over the computed scores of the positive and negative passages associated to a query. We trained the model on a single Tesla V100 GPU with 32GBs of memory during 200k steps using a batch size of 64. We used the AdamW optimizer with a constant learning rate of 3e-06. The passage length was limited to 256 tokens and the query length to 32 tokens.
 
 #### Data
 
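For reference, the pairwise softmax cross-entropy objective described in the merged "Details" paragraph can be sketched in a few lines of PyTorch. This is an illustrative reconstruction under stated assumptions (random scores stand in for the model's MaxSim outputs, and the tensor names are made up), not the training code used for this model:

```python
# Minimal sketch of pairwise softmax cross-entropy over the scores of the
# positive and negative passage associated with each query. Illustrative only;
# random scores replace the real MaxSim outputs of the fine-tuned encoder.
import torch
import torch.nn.functional as F

BATCH_SIZE = 64  # batch size reported in the card (AdamW, constant lr 3e-06, 200k steps)

def pairwise_softmax_ce(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    """pos_scores, neg_scores: (batch,) relevance scores for each query's passages."""
    # Each (positive, negative) pair is treated as a 2-way classification problem
    # in which the positive passage is the correct class (index 0).
    logits = torch.stack([pos_scores, neg_scores], dim=1)   # (batch, 2)
    labels = torch.zeros(logits.size(0), dtype=torch.long)  # positive is always index 0
    return F.cross_entropy(logits, labels)

# Toy usage: random scores standing in for the encoder's MaxSim outputs.
pos = torch.randn(BATCH_SIZE, requires_grad=True)
neg = torch.randn(BATCH_SIZE, requires_grad=True)
loss = pairwise_softmax_ce(pos, neg)
loss.backward()
print(float(loss))
```

In the actual setup this loss would be backpropagated through the camembert-base encoder with AdamW at the constant 3e-06 learning rate, with passages truncated to 256 tokens and queries to 32 tokens as stated above.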