# llm-jp-3-13b-ft2

This repository provides a large language model developed by Hiroya Chiba (hiroyachiba) for the Matsuo-Lab LLM2024 Final Project.

## Usage
```python
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -U torch
!pip install -U peft
```
```python
from datasets import load_dataset
from unsloth import FastLanguageModel
from peft import PeftModel
import torch
import json
import re
from transformers import set_seed

set_seed(0)

base_model_id = "llm-jp/llm-jp-3-13b"
adapter_id = "hiroyachiba/llm-jp-3-13b-ft2"

dtype = None  # None lets the dtype be selected automatically
load_in_4bit = True  # True because we are loading a 13B model

# Load the base model in 4-bit and attach the fine-tuned LoRA adapter.
base_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model_id,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)

INSTRUCTION_MESSAGE = '''以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。
タスクを説明する指示とAIが生成した応答を <USER_INPUT> と <AI_OUTPUT> セクションで提供します。'''

# Custom chat template: the user input is wrapped in <USER_INPUT> tags and the
# model response in <AI_OUTPUT> tags.
tokenizer.chat_template = "{{bos_token}}{% for message in messages %}{% if message['role'] == 'user' %}{{ '\n\n### 指示:\n' + '<USER_INPUT>\n' + message['content'] + '\n</USER_INPUT>' }}{% elif message['role'] == 'system' %}{{ '" + INSTRUCTION_MESSAGE + "' }}{% elif message['role'] == 'assistant' %}{{ '\n\n### 応答:\n' + '<AI_OUTPUT>\n' + message['content'] + '\n</AI_OUTPUT>' + eos_token }}{% endif %}{% if loop.last and add_generation_prompt %}{{ '\n\n### 応答:\n' + '<AI_OUTPUT>\n' }}{% endif %}{% endfor %}"
```
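The template above can be rendered without tokenizing to check the prompt format before running generation. This is a minimal sketch, assuming the `tokenizer` from the block above has already been loaded and the custom template assigned; the example question is arbitrary.

```python
# Render the chat template as plain text to inspect the prompt format.
messages = [
    {"role": "system", "content": "あなたは優秀なアシスタントです。"},
    {"role": "user", "content": "日本で一番高い山は？"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the rendered string instead of token ids
    add_generation_prompt=True,  # append the trailing "### 応答:\n<AI_OUTPUT>\n" header
)
print(prompt)
# The rendered prompt starts with the tokenizer's BOS token and the system
# instruction, followed by "### 指示:" with the question inside <USER_INPUT> tags
# and an open "### 応答:" / <AI_OUTPUT> section for the model to complete.
```

The remainder of the Usage code runs the model over elyza-tasks-100-TV and writes the generations to a JSONL file: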
```python
dataset = load_dataset('json', data_files='data/elyza-tasks-100-TV_0.jsonl', split='train')

# Switch to inference mode.
model.eval()
FastLanguageModel.for_inference(model)
model.config.use_cache = True
torch.cuda.empty_cache()

def gen_output(input):
    messages = [
        {"role": "system", "content": 'あなたは優秀なアシスタントです。'},
        {"role": "user", "content": input}
    ]
    inputs = tokenizer.apply_chat_template(messages,
                                           return_tensors="pt",
                                           add_generation_prompt=True,
                                           return_dict=True).to(model.device)
    # Drop token_type_ids, which generate() does not use.
    del inputs['token_type_ids']
    with torch.no_grad():
        outputs = model.generate(**inputs,
                                 max_new_tokens=512,
                                 pad_token_id=tokenizer.pad_token_id,
                                 do_sample=False,
                                 repetition_penalty=1.2,
                                 )
    output = tokenizer.decode(outputs[0][inputs.input_ids.size(1):], skip_special_tokens=True)
    # Strip the <AI_OUTPUT> tags from the decoded text.
    return re.sub('\n?</?AI_OUTPUT>', '', output)

results = []
for i, data in enumerate(dataset):
    input = data["input"]
    output = gen_output(input)
    results.append({"task_id": i, "input": input, "output": output})

json_file_id = re.sub(".*/", "", adapter_id)
with open(f"/content/{json_file_id}_output.jsonl", 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')
```
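Each line of the resulting file is a JSON object with `task_id`, `input`, and `output` keys, so it can be read back for inspection or scoring. A minimal sketch, assuming the output path produced by the code above:

```python
import json

# Read the generated answers back in (path written by the Usage code above).
with open("/content/llm-jp-3-13b-ft2_output.jsonl", encoding="utf-8") as f:
    results = [json.loads(line) for line in f]

print(len(results))
print(results[0]["task_id"], results[0]["output"][:100])
```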
## Model Details

The model was fine-tuned from `llm-jp/llm-jp-3-13b`.

- Model type: Transformer-based Language Model
| Params | Layers | Hidden size | Heads | Context length | Embedding parameters | Non-embedding parameters |
| --- | --- | --- | --- | --- | --- | --- |
| 13b | 40 | 5120 | 40 | 4096 | 1,019,740,160 | 12,688,184,320 |
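The architecture numbers above can be checked against the base checkpoint without downloading the weights. A minimal sketch, assuming Hub access and that the base model exposes the standard Llama-style `transformers` config fields:

```python
from transformers import AutoConfig

# Fetch only the model configuration for the base checkpoint (no weights are downloaded).
config = AutoConfig.from_pretrained("llm-jp/llm-jp-3-13b")

# These standard config fields should match the table above.
print(config.num_hidden_layers)        # layers, expected 40
print(config.hidden_size)              # hidden size, expected 5120
print(config.num_attention_heads)      # attention heads, expected 40
print(config.max_position_embeddings)  # context length, expected 4096
```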
## Datasets

### Instruction tuning
The model was fine-tuned on the following datasets.
| Language | Dataset | Description |
| --- | --- | --- |
| ja | wizardlm8x22b-logical-math-coding-sft_additional-ja | A synthetic instruction dataset. |
| ja | elyza-tasks-100 | A benchmark dataset of 100 Japanese instruction tasks created by ELYZA. |
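Both datasets can be pulled with the `datasets` library. A minimal sketch; the repository paths below (the `llm-jp/` and `elyza/` org prefixes) are assumptions about where these datasets are published, so verify them on the Hub before use.

```python
from datasets import load_dataset

# Repository ids below are assumed; adjust if the actual sources differ.
logical_math_coding = load_dataset("llm-jp/wizardlm8x22b-logical-math-coding-sft_additional-ja")
elyza_tasks = load_dataset("elyza/ELYZA-tasks-100")

print(logical_math_coding)
print(elyza_tasks)
```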
## Evaluation

### elyza-tasks-100
| Model name | Average |
| --- | --- |
| hiroyachiba/llm-jp-3-13b-ft2 | 3.09 |
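The average above comes from scoring the generated `*_output.jsonl` with an external judge; the scoring pipeline itself is not part of this repository. As a purely hypothetical illustration, if the judged results were stored as JSON Lines with a numeric `score` field per task, the average could be computed like this:

```python
import json

# Hypothetical judged-results file: one JSON object per line with a numeric "score" field
# (the judging step itself is outside this repository).
scores = []
with open("llm-jp-3-13b-ft2_output_scored.jsonl", encoding="utf-8") as f:
    for line in f:
        scores.append(json.loads(line)["score"])

print(f"average: {sum(scores) / len(scores):.2f}")
```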
## Risks and Limitations

The model released here is at an early stage of research and development and has not been tuned to ensure its outputs align with human intent and safety considerations.
## Send Questions to

## License

## Model Card Authors
Hiroya Chiba