michaelfeil committed
Commit 2557a64
1 Parent(s): b795d29

Upload sentence-transformers/all-MiniLM-L6-v2 ctranslate fp16 weights

Files changed (1)
  1. README.md +20 -15
README.md CHANGED
@@ -38,21 +38,11 @@ Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on
 
  quantized version of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
  ```bash
- pip install hf-hub-ctranslate2>=2.0.8 ctranslate2>=3.16.0
- ```
- Converted on 2023-06-15 using
- ```
- ct2-transformers-converter --model sentence-transformers/all-MiniLM-L6-v2 --output_dir ~/tmp-ct2fast-all-MiniLM-L6-v2 --force --copy_files config_sentence_transformers.json tokenizer.json modules.json README.md tokenizer_config.json sentence_bert_config.json data_config.json vocab.txt special_tokens_map.json .gitattributes --quantization float16 --trust_remote_code
+ pip install hf-hub-ctranslate2>=2.10.0 ctranslate2>=3.16.0
  ```
 
- Checkpoint compatible to [ctranslate2>=3.16.0](https://github.com/OpenNMT/CTranslate2)
- and [hf-hub-ctranslate2>=2.0.8](https://github.com/michaelfeil/hf-hub-ctranslate2)
- - `compute_type=int8_float16` for `device="cuda"`
- - `compute_type=int8` for `device="cpu"`
-
  ```python
- from transformers import AutoTokenizer
-
+ # from transformers import AutoTokenizer
  model_name = "michaelfeil/ct2fast-all-MiniLM-L6-v2"
 
  from hf_hub_ctranslate2 import EncoderCT2fromHfHub
@@ -63,10 +53,25 @@ model = EncoderCT2fromHfHub(
  compute_type="float16",
  # tokenizer=AutoTokenizer.from_pretrained("{ORG}/{NAME}")
  )
- outputs = model.generate(
- text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
+ embeddings = model.encode(
+ ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
+ batch_size=32,
+ convert_to_numpy=True,
+ normalize_embeddings=True,
  )
- print(outputs.shape, outputs)
+ print(embeddings.shape, embeddings)
+ scores = (embeddings @ embeddings.T) * 100
+
+ ```
+
+ Checkpoint compatible to [ctranslate2>=3.16.0](https://github.com/OpenNMT/CTranslate2)
+ and [hf-hub-ctranslate2>=2.10.0](https://github.com/michaelfeil/hf-hub-ctranslate2)
+ - `compute_type=int8_float16` for `device="cuda"`
+ - `compute_type=int8` for `device="cpu"`
+
+ Converted on 2023-06-16 using
+ ```
+ ct2-transformers-converter --model sentence-transformers/all-MiniLM-L6-v2 --output_dir ~/tmp-ct2fast-all-MiniLM-L6-v2 --force --copy_files config_sentence_transformers.json tokenizer.json modules.json README.md tokenizer_config.json sentence_bert_config.json data_config.json vocab.txt special_tokens_map.json .gitattributes --quantization float16 --trust_remote_code
  ```
 
  # Licence and other remarks:
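
The compute_type notes added in the diff recommend `int8_float16` on CUDA and `int8` on CPU. For illustration only, a minimal sketch of CPU int8 usage with this checkpoint; the `model_name_or_path` and `device` keyword arguments sit in README context lines not shown in the hunks above, so treat them as assumptions rather than part of this commit:

```python
# Sketch, not part of the commit. Assumes EncoderCT2fromHfHub accepts the
# model_name_or_path and device kwargs used in the README's elided context lines.
from hf_hub_ctranslate2 import EncoderCT2fromHfHub

model = EncoderCT2fromHfHub(
    model_name_or_path="michaelfeil/ct2fast-all-MiniLM-L6-v2",
    device="cpu",          # per the note above, pair CPU with int8
    compute_type="int8",
)

# encode() call mirrors the one added in the diff
embeddings = model.encode(
    ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
)

# With normalized embeddings, the dot product is cosine similarity,
# matching the `scores` line added in the diff.
scores = (embeddings @ embeddings.T) * 100
print(scores)
```

Because `normalize_embeddings=True`, the matrix product in the last lines yields cosine-similarity scores scaled to roughly 0-100, which is what the `scores` line added by this commit computes.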