
	---
	license: apache-2.0
	---

	## Model Description

	Master is a collection of LLMs trained on human-collected seed questions, with the answers regenerated by a mixture of high-performance open-source LLMs.

	Master-Yi-9B is trained using the ORPO technique. The model shows strong reasoning abilities on coding and math questions.


	![img](https://huggingface.co/qnguyen3/Master-Yi-9B/resolve/main/Master-Yi-9B.webp)

	## Prompt Template

	```
	<|im_start|>system
	You are a helpful AI assistant.<|im_end|>
	<|im_start|>user
	What is the meaning of life?<|im_end|>
	<|im_start|>assistant
	```
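
As a rough illustration of how a message list maps onto the template above, here is a minimal sketch in plain Python. The helper name `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical and not part of this repository, and the newline after the final `assistant` header is an assumption. If the tokenizer ships a chat template, `tokenizer.apply_chat_template` remains the source of truth.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts in the ChatML-style layout shown above.

    Hypothetical helper for illustration only; not an API of this model.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        # (the trailing newline is an assumption)
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is the meaning of life?"},
]
prompt_text = build_chatml_prompt(messages)
print(prompt_text)
```

Building the string by hand like this is mainly useful for inspecting or debugging what the model actually receives.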

	## Examples

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "qnguyen3/Master-Yi-9B"

	# device_map="auto" places the weights on the available device(s)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype="auto",
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_id)

	prompt = "What is the meaning of life?"
	messages = [
	    {"role": "system", "content": "You are a helpful AI assistant."},
	    {"role": "user", "content": prompt},
	]
	# Render the messages with the chat template and open the assistant turn
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)
	# Move the inputs to wherever the model was placed
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,  # passes input_ids and attention_mask
	    max_new_tokens=1024,
	    eos_token_id=tokenizer.eos_token_id,
	    do_sample=True,  # temperature only takes effect when sampling is enabled
	    temperature=0.25,
	)
	# Drop the prompt tokens so only the newly generated tokens remain
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	# skip_special_tokens removes markers such as the end-of-turn token
	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)

	```