Great at reasoning, but woke as fuck! This is a fine-tune over Gemma-2-27B-it, since fine-tuning the base model did not produce coherent content.
Compute sponsored by KindoAI.
You: Today is Tuesday. On Thursday last week, I had 10 apples. Then on Saturday I ate 5. I bought 8 apples, and 3 oranges yesterday. What day was yesterday?
Here's how to solve this problem:
* **"Thursday last week" was 5 days before Tuesday.** Since there are 7 days in a week, counting backward from Tuesday we get to Thursday.
* **"Yesterday" was 2 days before today.** Since today is Tuesday, yesterday was Monday.
Therefore, yesterday was Monday.
You: How many apples do I have today?
Let's break down the problem:
* You started with 10 apples on Thursday last week.
* You ate 5 apples on Saturday, leaving you with 10 - 5 = 5 apples.
* You bought 8 apples yesterday, so you had 5 + 8 = 13 apples.
Therefore, you have 13 apples today.
You: How many oranges did I have on Saturday?
You didn't have any oranges on Saturday because you bought 3 oranges yesterday.
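As a quick sanity check of the apple arithmetic in the transcript above (plain Python, independent of the model itself):

apples = 10    # on hand last Thursday
apples -= 5    # ate 5 on Saturday
apples += 8    # bought 8 yesterday (Monday, since today is Tuesday)
print(apples)  # 13, matching the model's answer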
Inference Code:
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_path = "migtissera/Tess-v2.5-Gemma-2-27B-alpha"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Stop generation when the model closes its turn.
terminators = [
    tokenizer.convert_tokens_to_ids("<end_of_turn>"),
]


def generate_text(llm_prompt):
    # The prompt string already contains <bos>, so don't add special tokens again.
    inputs = tokenizer.encode(llm_prompt, add_special_tokens=False, return_tensors="pt")
    input_ids = inputs.to("cuda")
    length = len(input_ids[0])

    # Sampling settings.
    instance = {
        "top_p": 1.0,
        "temperature": 0.75,
        "generate_len": 1024,
        "top_k": 50,
    }

    generation = model.generate(
        input_ids,
        max_length=length + instance["generate_len"],
        use_cache=True,
        do_sample=True,
        top_p=instance["top_p"],
        temperature=instance["temperature"],
        top_k=instance["top_k"],
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=terminators,
    )

    # Decode only the newly generated tokens, dropping the prompt.
    output = generation[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return f"{string}"


# Interactive multi-turn chat loop using the Gemma turn format.
conversation = f"""<bos><start_of_turn>user\n"""
while True:
    user_input = input("You: ")
    llm_prompt = f"{conversation}{user_input}<end_of_turn>\n<start_of_turn>model\n"
    answer = generate_text(llm_prompt)
    print(answer)
    # Close the model's turn before opening the next user turn.
    conversation = f"{llm_prompt}{answer}<end_of_turn>\n<start_of_turn>user\n"
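If the tokenizer ships with Gemma's chat template (as the Gemma-2-27B-it tokenizer does), you can also build the prompt with apply_chat_template instead of writing the turn markers by hand. The snippet below is a minimal sketch under that assumption; the message content is illustrative:

messages = [{"role": "user", "content": "Today is Tuesday. What day was yesterday?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # appends "<start_of_turn>model\n"
    return_tensors="pt",
).to("cuda")
generation = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.75,
    top_p=1.0,
    top_k=50,
    eos_token_id=terminators,
)
print(tokenizer.decode(generation[0][input_ids.shape[-1]:], skip_special_tokens=True))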