michaelfeil committed
Commit 2ec3559
1 Parent(s): 65217ea

Update README.md

Files changed (1):
  1. README.md +1 -17
README.md CHANGED
@@ -219,27 +219,11 @@ Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on
 
 quantized version of [facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B)
 ```bash
-pip install hf-hub-ctranslate2>=2.12.0 ctranslate2>=3.16.0
+pip install ctranslate2>=3.16.0
 ```
 
 ```python
-# from transformers import AutoTokenizer
-model_name = "michaelfeil/ct2fast-nllb-200-3.3B"
 
-
-from hf_hub_ctranslate2 import TranslatorCT2fromHfHub
-model = TranslatorCT2fromHfHub(
-    # load in int8 on CUDA
-    model_name_or_path=model_name,
-    device="cuda",
-    compute_type="int8_float16",
-    # tokenizer=AutoTokenizer.from_pretrained("{ORG}/{NAME}")
-)
-outputs = model.generate(
-    text=["def fibonnaci(", "User: How are you doing? Bot:"],
-    max_length=64,
-)
-print(outputs)
 ```
 
 Checkpoint compatible to [ctranslate2>=3.16.0](https://github.com/OpenNMT/CTranslate2)
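
The commit removes the old `hf-hub-ctranslate2` usage example, leaving the README's python block empty. For reference, here is a minimal sketch of how the converted checkpoint could be driven with plain `ctranslate2`, following the NLLB recipe from the CTranslate2 documentation; the use of `snapshot_download`, the language codes `eng_Latn`/`deu_Latn`, and the sample sentence are illustrative assumptions, not part of this commit.

```python
# Minimal sketch: plain-ctranslate2 usage of the converted checkpoint,
# following the NLLB example from the CTranslate2 docs. Language codes
# and sample text below are illustrative assumptions.
import ctranslate2
import transformers
from huggingface_hub import snapshot_download

# Fetch the converted model files from the Hub to a local directory.
model_path = snapshot_download("michaelfeil/ct2fast-nllb-200-3.3B")

# Load in int8 on CUDA, the same compute type the removed example used.
translator = ctranslate2.Translator(
    model_path, device="cuda", compute_type="int8_float16"
)

# NLLB keeps the original tokenizer; a source language code is required.
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "facebook/nllb-200-3.3B", src_lang="eng_Latn"
)

source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello world!"))
results = translator.translate_batch([source], target_prefix=[["deu_Latn"]])

# The first output token is the target language code; drop it before decoding.
target = results[0].hypotheses[0][1:]
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))
```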