veneres committed on
Commit
81b559d
1 Parent(s): 6c2eac0

Update wrapup.md

Hi!
I fixed two typos in the wrap-up: "rewritting" becomes "rewritten", and the `pretokenized` keyword becomes `pretokenised`. The second one is subtle: "tokenized" is the standard spelling, and the IterDictIndexer constructor does no argument validation, so the misspelled keyword is not flagged. It is a small change, but I think it should be corrected.
Thanks!
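
To illustrate why a misspelled keyword can go unnoticed, here is a minimal sketch. It uses two hypothetical stand-in classes, `LenientIndexer` and `StrictIndexer`, and does not reproduce PyTerrier's actual constructor. It assumes the silent acceptance comes from collecting unknown keywords in `**kwargs`, which may not be how IterDictIndexer behaves internally.

```python
class LenientIndexer:
    """Hypothetical stand-in: unknown keywords are collected and never checked."""

    def __init__(self, index_path, pretokenised=False, **kwargs):
        self.index_path = index_path
        self.pretokenised = pretokenised
        self.ignored = kwargs  # a misspelled keyword ends up here, unnoticed


class StrictIndexer:
    """Hypothetical stand-in: only the declared keywords are accepted."""

    def __init__(self, index_path, pretokenised=False):
        self.index_path = index_path
        self.pretokenised = pretokenised


def try_build(cls, **kwargs):
    """Build an indexer, returning the TypeError instead of raising it."""
    try:
        return cls("./example_index", **kwargs)
    except TypeError as err:
        return err


# The misspelled keyword is accepted, but the real option keeps its default.
lenient = try_build(LenientIndexer, pretokenized=True)

# Without **kwargs, Python itself rejects the unknown keyword.
strict = try_build(StrictIndexer, pretokenized=True)
```

In the lenient case the object is created with `pretokenised` left at its default, so the typo only shows up later as unexpected indexing behaviour. In the strict case the call fails immediately with a `TypeError`.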

Files changed (1)
  1. wrapup.md +2 -2
wrapup.md CHANGED
@@ -1,6 +1,6 @@
 ### Putting it all together
 
-When you use the document encoder in an indexing pipeline, the rewritting document contents are indexed:
+When you use the document encoder in an indexing pipeline, the rewritten document contents are indexed:
 
 <div class="pipeline">
 <div class="df" title="Document Frame">D</div>
@@ -18,7 +18,7 @@ import pyt_splade
 dataset = pt.get_dataset('irds:msmarco-passage')
 splade = pyt_splade.SpladeFactory()
 
-indexer = pt.IterDictIndexer('./msmarco_psg', pretokenized=True)
+indexer = pt.IterDictIndexer('./msmarco_psg', pretokenised=True)
 
 indxer_pipe = splade.indexing() >> indexer
 indxer_pipe.index(dataset.get_corpus_iter())