sedrickkeh committed
Update README.md
README.md CHANGED

@@ -139,11 +139,14 @@ print(output)
 # Machine learning is a branch of artificial intelligence (AI) that enables computers to learn from experience without being explicitly programmed. Machine learning is used in a wide range of applications, including spam filtering, image recognition, speech recognition, and computer-based medical diagnosis
 ```
 
-The Mistral-SUPRA model can be used both in parallel mode and in recurrent mode. If `use_cache` is set to `False` for `model.generate(...)`, then it will use
+The Mistral-SUPRA model can be used both in parallel mode and in recurrent mode. If `use_cache` is set to `False` for `model.generate(...)`, then it will use parallel mode; otherwise, it will use recurrent mode.
 The recurrent model uses `xformers` and requires the inputs and models to be loaded to GPU.
 
 ```python
 # Recurrent mode
+output = model.to('cuda').generate(inputs['input_ids'].to('cuda'), use_cache=True, **gen_kwargs)
+
+# Parallel mode
 output = model.to('cuda').generate(inputs['input_ids'].to('cuda'), use_cache=False, **gen_kwargs)
 ```
 
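The `use_cache` dispatch the updated README text describes can be sketched with a minimal stub. `SupraStub` below is hypothetical, not the actual Mistral-SUPRA model class; it only illustrates how a single `generate()` entry point could route between the two inference paths based on the flag:

```python
# Minimal sketch of the mode selection described in the README diff.
# `SupraStub` is a hypothetical stand-in for the real model class.
class SupraStub:
    def _recurrent_generate(self, input_ids):
        # Recurrent path: consume tokens step by step with a cached state.
        return "recurrent"

    def _parallel_generate(self, input_ids):
        # Parallel path: process the whole sequence at once.
        return "parallel"

    def generate(self, input_ids, use_cache=True, **gen_kwargs):
        # use_cache=False -> parallel mode; otherwise recurrent mode,
        # mirroring the behavior documented above.
        if use_cache:
            return self._recurrent_generate(input_ids)
        return self._parallel_generate(input_ids)


model = SupraStub()
print(model.generate([1, 2, 3], use_cache=True))   # recurrent
print(model.generate([1, 2, 3], use_cache=False))  # parallel
```

The real model's paths differ in cost, not output: recurrent mode keeps per-step state (hence the cache), while parallel mode recomputes over the full input.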