ferran-espuna
committed
Update README.md
README.md CHANGED
@@ -63,7 +63,33 @@ This model card corresponds to the fp8-quantized version of Salamandra-2b-instru
 The entire Salamandra family is released under a permissive [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).
 
 
-##
+## How to Use
+
+The following example code works under ``Python 3.9.16``, ``vllm==0.6.3.post1``, ``torch==2.4.0`` and ``torchvision==0.19.0``, though it should run on
+any current version of the libraries. This example provides a chat interface for the model.
+
+```
+from vllm import LLM, SamplingParams
+
+model_name = "BSC-LT/salamandra-2b-instruct-fp8"
+llm = LLM(model=model_name)
+
+messages = []
+
+while True:
+    user_input = input("user >> ")
+    if user_input.lower() == "exit":
+        print("Chat ended.")
+        break
+
+    messages.append({'role': 'user', 'content': user_input})
+
+    outputs = llm.chat(messages, sampling_params=SamplingParams(temperature=0.5, stop_token_ids=[5], max_tokens=200))[0].outputs
+    model_output = outputs[0].text
+    print(f'assistant >> {model_output}')
+
+    messages.append({'role': 'assistant', 'content': model_output})
+```
+
 
 ### Author
 International Business Machines (IBM).
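
For a quick non-interactive check, the calls from the new section can be collapsed into a single-turn script. The following is a minimal sketch reusing the model name and sampling settings from the diff above; the question string is an illustrative placeholder, not part of the model card:

```
from vllm import LLM, SamplingParams

# Same checkpoint as in the README section added above.
llm = LLM(model="BSC-LT/salamandra-2b-instruct-fp8")

# A single user turn instead of the interactive loop;
# the question text is a placeholder for illustration.
messages = [{"role": "user", "content": "What does fp8 quantization change about a model?"}]

# Sampling settings copied from the chat-loop example
# (stop_token_ids=[5] mirrors the stop token used there).
params = SamplingParams(temperature=0.5, stop_token_ids=[5], max_tokens=200)

# llm.chat applies the model's chat template and returns one
# RequestOutput per conversation.
print(llm.chat(messages, sampling_params=params)[0].outputs[0].text)
```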
|