# Tanuki-8B-Instruct

## Model Details
- Model type: Llama-3-8B-like pretrained language model
- Total seen tokens: 280B
| Params | Layers | Hidden size | Intermediate size | Attention heads | KV heads | Context length | RoPE theta |
|---|---|---|---|---|---|---|---|
| 8B | 32 | 4096 | 14336 | 32 | 8 | 8192 | 500000 |
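The values above can be cross-checked against the published model config. A minimal sketch, assuming the repo ships a standard Llama-style `transformers` config (the attribute names below are the usual `LlamaConfig` fields, not taken from this card):

```python
# Sketch: read the published config and confirm the architecture table above.
# Assumes standard Llama-style config field names in transformers.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct")
print(config.num_hidden_layers)        # 32
print(config.hidden_size)              # 4096
print(config.intermediate_size)        # 14336
print(config.num_attention_heads)      # 32
print(config.num_key_value_heads)      # 8
print(config.max_position_embeddings)  # 8192
print(config.rope_theta)               # 500000
```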
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model (bfloat16, on GPU)
tokenizer = AutoTokenizer.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "hatakeyama-llm-team/Tanuki-8B-Instruct", torch_dtype=torch.bfloat16
).to("cuda")

chat = [
    # System prompt: "Below is a combination of an instruction describing a task
    # and an input providing context. Write a response that appropriately
    # satisfies the request."
    {"role": "system", "content": "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"},
    # User prompt: "What is a tanuki?"
    {"role": "user", "content": "たぬきってなんですか?"},
]

# Apply the chat template and move the input ids to the model's device
tokenized_input = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, tokenize=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        repetition_penalty=1.05,
    )[0]

print(tokenizer.decode(output))
```
Note: If you tokenize the prompt with `tokenizer.encode()` instead of `tokenizer.apply_chat_template`, set `add_special_tokens=False` so that no EOS token is appended to the end of the input.

Example: `tokenizer.encode(input_text, add_special_tokens=False, return_tensors="pt")`

With `tokenizer.apply_chat_template` this is not an issue, since `add_special_tokens=False` is the default.
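A minimal sketch of `encode()`-based generation, reusing the `tokenizer` and `model` objects loaded in the Usage example above; the raw prompt string is illustrative, and no chat template is applied:

```python
# Generate from a raw string with tokenizer.encode().
# add_special_tokens=False keeps special tokens (e.g. EOS) from being
# appended to the prompt, as noted above.
input_text = "たぬきってなんですか?"  # "What is a tanuki?"
input_ids = tokenizer.encode(
    input_text, add_special_tokens=False, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)[0]

print(tokenizer.decode(output))
```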
| Model Variant |
|---|
| **Instruction models** |
| hatakeyama-llm-team/Tanuki-8B-Instruct |
| hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO |
| **Pre-trained models** |
| Tanuki-8B |
| Tanuki-8B-Before-Context-Length-Extension |
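The listed variants can presumably be loaded the same way as the main model; a hedged sketch, using a repo id taken from the table above:

```python
# Sketch: load the without-DPO instruction variant with the same pattern
# as the Usage example. Assumes the variant repo follows the same layout.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

variant = "hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO"
tokenizer = AutoTokenizer.from_pretrained(variant)
model = AutoModelForCausalLM.from_pretrained(variant, torch_dtype=torch.bfloat16)
```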