michaelfeil committed on
Commit
720aeb7
1 Parent(s): 48221dd

Upload bigcode/starcoder ctranslate fp16 weights

Files changed (1)
  1. README.md +7 -6
README.md CHANGED
@@ -266,13 +266,14 @@ quantized version of [bigcode/starcoder](https://huggingface.co/bigcode/starcode
 ```bash
 pip install hf-hub-ctranslate2>=2.0.8 ctranslate2>=3.14.0
 ```
-Converted on 2023-05-31 using
+Converted on 2023-06-01 using
 ```
 ct2-transformers-converter --model bigcode/starcoder --output_dir /home/michael/tmp-ct2fast-starcoder --force --copy_files merges.txt tokenizer.json README.md tokenizer_config.json vocab.json generation_config.json special_tokens_map.json .gitattributes --quantization float16 --trust_remote_code
 ```
 
-Checkpoint compatible to [ctranslate2>=3.14.0](https://github.com/OpenNMT/CTranslate2) and [hf-hub-ctranslate2>=2.0.8](https://github.com/michaelfeil/hf-hub-ctranslate2)
-- `compute_type=int8_float16` for `device="cuda"`
+Checkpoint compatible to [ctranslate2>=3.14.0](https://github.com/OpenNMT/CTranslate2)
+and [hf-hub-ctranslate2>=2.0.8](https://github.com/michaelfeil/hf-hub-ctranslate2)
+- `compute_type=int8_float16` for `device="cuda"`
 - `compute_type=int8` for `device="cpu"`
 
 ```python
@@ -283,14 +284,14 @@ model_name = "michaelfeil/ct2fast-starcoder"
 # use either TranslatorCT2fromHfHub or GeneratorCT2fromHfHub here, depending on model.
 model = GeneratorCT2fromHfHub(
 # load in int8 on CUDA
-model_name_or_path=model_name,
+model_name_or_path=model_name,
 device="cuda",
 compute_type="int8_float16",
 # tokenizer=AutoTokenizer.from_pretrained("bigcode/starcoder")
 )
 outputs = model.generate(
-text=["How do you call a fast Flan-ingo?", "User: How are you doing? Bot:"],
-max_length=64,
+text=["def fibonnaci(", "User: How are you doing? Bot:"],
+max_length=64,
 include_prompt_in_result=False
 )
 print(outputs)