michaelfeil committed on
Commit a6788a4
1 Parent(s): a3834ca

Upload Salesforce/codegen-350M-mono ctranslate fp16 weights

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -11,7 +11,7 @@ Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on
 
 quantized version of [Salesforce/codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono)
 ```bash
-pip install hf-hub-ctranslate2>=2.0.7
+pip install hf-hub-ctranslate2>=2.0.8
 ```
 Converted on 2023-05-21 using
 ```
@@ -33,10 +33,11 @@ model = GeneratorCT2fromHfHub(
     model_name_or_path=model_name,
     device="cuda",
     compute_type="int8_float16",
-    tokenizer=AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
+    # tokenizer=AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
 )
 outputs = model.generate(
     text=["def print_hello_world():", "def hello_name(name:"],
+    max_length=64
 )
 print(outputs)
 ```
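
For context, after this commit the README's install and usage snippet reads roughly as the sketch below. The `model_name` repo id is not visible in the diff hunks and is an assumption; the remaining lines are taken from the changed hunks.

```python
# Sketch of the post-commit README usage, reconstructed from the diff above.
# Install the package version the commit bumps to:
#   pip install hf-hub-ctranslate2>=2.0.8
from hf_hub_ctranslate2 import GeneratorCT2fromHfHub

# Assumed repo id: the diff only shows `model_name_or_path=model_name`.
model_name = "michaelfeil/ct2fast-codegen-350M-mono"

model = GeneratorCT2fromHfHub(
    model_name_or_path=model_name,
    device="cuda",
    compute_type="int8_float16",
    # tokenizer=AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
)
outputs = model.generate(
    text=["def print_hello_world():", "def hello_name(name:"],
    max_length=64,
)
print(outputs)
```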