michaelfeil committed on
Commit b273205
1 Parent(s): bc4c196

Upload bigcode/starcoderbase ctranslate fp16 weights

Files changed (1): README.md (+7 −6)
README.md CHANGED
@@ -256,13 +256,14 @@ quantized version of [bigcode/starcoderbase](https://huggingface.co/bigcode/starcoderbase)
 ```bash
 pip install hf-hub-ctranslate2>=2.0.8 ctranslate2>=3.14.0
 ```
-Converted on 2023-05-31 using
+Converted on 2023-06-01 using
 ```
 ct2-transformers-converter --model bigcode/starcoderbase --output_dir /home/michael/tmp-ct2fast-starcoderbase --force --copy_files merges.txt tokenizer.json README.md tokenizer_config.json vocab.json generation_config.json special_tokens_map.json .gitattributes --quantization float16 --trust_remote_code
 ```
 
-Checkpoint compatible to [ctranslate2>=3.14.0](https://github.com/OpenNMT/CTranslate2) and [hf-hub-ctranslate2>=2.0.8](https://github.com/michaelfeil/hf-hub-ctranslate2)
-- `compute_type=int8_float16` for `device="cuda"`
+Checkpoint compatible to [ctranslate2>=3.14.0](https://github.com/OpenNMT/CTranslate2)
+and [hf-hub-ctranslate2>=2.0.8](https://github.com/michaelfeil/hf-hub-ctranslate2)
+- `compute_type=int8_float16` for `device="cuda"`
 - `compute_type=int8` for `device="cpu"`
 
 ```python
@@ -273,14 +274,14 @@ model_name = "michaelfeil/ct2fast-starcoderbase"
 # use either TranslatorCT2fromHfHub or GeneratorCT2fromHfHub here, depending on model.
 model = GeneratorCT2fromHfHub(
 # load in int8 on CUDA
-model_name_or_path=model_name,
+model_name_or_path=model_name,
 device="cuda",
 compute_type="int8_float16",
 # tokenizer=AutoTokenizer.from_pretrained("bigcode/starcoderbase")
 )
 outputs = model.generate(
-text=["How do you call a fast Flan-ingo?", "User: How are you doing? Bot:"],
-max_length=64,
+text=["def fibonnaci(", "User: How are you doing? Bot:"],
+max_length=64,
 include_prompt_in_result=False
 )
 print(outputs)
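
Pieced together from both sides of the hunk, the post-commit README snippet reads roughly as follows. This is a sketch, not the verbatim file: the import lines sit above the diffed region, so the `from hf_hub_ctranslate2 import GeneratorCT2fromHfHub` form is assumed from the usual hf-hub-ctranslate2 README pattern, and the tokenizer argument (commented out in the diff) is shown enabled. Running it requires a CUDA device and downloads the quantized checkpoint from the Hub.

```python
# Sketch of the full snippet after this commit (assumed imports; needs CUDA
# and network access to download michaelfeil/ct2fast-starcoderbase).
from hf_hub_ctranslate2 import GeneratorCT2fromHfHub
from transformers import AutoTokenizer

model_name = "michaelfeil/ct2fast-starcoderbase"
model = GeneratorCT2fromHfHub(
    # load in int8 on CUDA
    model_name_or_path=model_name,
    device="cuda",
    compute_type="int8_float16",
    tokenizer=AutoTokenizer.from_pretrained("bigcode/starcoderbase"),
)
outputs = model.generate(
    text=["def fibonnaci(", "User: How are you doing? Bot:"],
    max_length=64,
    include_prompt_in_result=False,
)
print(outputs)  # one generated continuation per prompt
```

With `include_prompt_in_result=False`, each returned string contains only the model's continuation, which is why the commit switches the first prompt to a code prefix (`"def fibonnaci("`) — a more representative input for a code model than the joke prompt it replaces.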