ankush13r committed · Commit d5246c4 · verified · 1 Parent(s): bb957f0

Update README.md

Files changed (1):
  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+ ## How to use
+
+ This instruction-tuned model uses a chat template that must be applied to the input for conversational use.
+ The easiest way to apply it is with the tokenizer's built-in chat template, as shown in the following snippet.
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_id = "BSC-LT/salamandra7b_rag_prompt_ca-en-es"
+
+ # Task instruction for the RAG setup: answer strictly from the provided context.
+ prompt = "Here is a question that you should answer based on the given context. Write a response that answers the question using only information provided in the context. Provide the answer in Spanish."
+
+ context = """Water boils at 100°C (212°F) at standard atmospheric pressure, which is at sea level.
+ However, this boiling point can vary depending on altitude and atmospheric pressure.
+ At higher altitudes, where atmospheric pressure is lower, water boils at a lower temperature.
+ For example, at 2,000 meters (about 6,600 feet) above sea level, water boils at around 93°C (199°F).
+ """
+ instruction = "At what temperature does water boil?"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     device_map="cuda",
+     torch_dtype=torch.bfloat16,
+ )
+
+ # Assemble the user message (instruction + context + question) and apply
+ # the model's chat template, leaving the assistant turn open for generation.
+ content = f"{prompt}\n\nContext:\n{context}\n\nQuestion:\n{instruction}"
+ chat = [{"role": "user", "content": content}]
+ prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+
+ # Stop generation at either the tokenizer's EOS token or the chat
+ # template's end-of-turn token.
+ eos_tokens = [
+     tokenizer.eos_token_id,
+     tokenizer.convert_tokens_to_ids("<|im_end|>"),
+ ]
+
+ inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
+ outputs = model.generate(input_ids=inputs.to(model.device), eos_token_id=eos_tokens, max_new_tokens=200)
+ ```
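The snippet in the diff ends at `model.generate` without showing the model's reply. A minimal follow-up sketch for recovering the answer text, reusing the `tokenizer`, `inputs`, and `outputs` variables from the snippet above and assuming only the newly generated tokens are wanted:

```python
# Decode only the tokens generated after the prompt, dropping special
# tokens such as <|im_end|>.
answer = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(answer)
```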