Edit model card

Tanuki-8B-Instruct

Model Details

  • Model type: Llama-3-8B-like pretrained Language Model
  • Total seen tokens: 280B
Params Layers Hidden size Intermediate size Attention Heads KV Heads Context length Rope Theta
8b 32 4096 14336 32 8 8192 500000

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct", torch_dtype=torch.bfloat16).to('cuda')
chat = [
    {"role": "system", "content": "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"},
    {"role": "user", "content": "たぬきってなんですか?"},
]
tokenized_input = tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=True, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        repetition_penalty=1.05,
    )[0]
print(tokenizer.decode(output))

※生成時にtokenizer.apply_chat_templateではなくtokenizer.encode()を用いる場合は、文末にEOSトークンが挿入されないようadd_special_tokens=Falseを設定してください。
例: tokenizer.encode(input_text, add_special_tokens=False, return_tensors="pt")
tokenizer.apply_chat_templateの場合はadd_special_tokens=Falseがデフォルトのため問題ありません。

Downloads last month
12
Safetensors
Model size
7.51B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.