# llm-jp-3-13b-ft2

This repository provides a large language model developed by Hiroya Chiba (hiroyachiba) for the Matsuo-Lab LLM2024 Final Project.

## Usage
```python
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -U torch
!pip install -U peft
```
```python
from datasets import load_dataset
from unsloth import FastLanguageModel
from peft import PeftModel
import torch
import json
import re
from transformers import set_seed

set_seed(0)

base_model_id = "llm-jp/llm-jp-3-13b"
adapter_id = "hiroyachiba/llm-jp-3-13b-ft2"

dtype = None  # None lets the dtype be selected automatically
load_in_4bit = True  # True because we are loading a 13B model

# Load the base model in 4-bit and attach the fine-tuned LoRA adapter.
base_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model_id,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)

INSTRUCTION_MESSAGE = '''以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。
タスクを説明する指示とAIが生成した応答を <USER_INPUT> と <AI_OUTPUT> セクションで提供します。'''

# Custom chat template: the user input is wrapped in <USER_INPUT> tags and the
# model response in <AI_OUTPUT> tags.
tokenizer.chat_template = "{{bos_token}}{% for message in messages %}{% if message['role'] == 'user' %}{{ '\n\n### 指示:\n' + '<USER_INPUT>\n' + message['content'] + '\n</USER_INPUT>' }}{% elif message['role'] == 'system' %}{{ '" + INSTRUCTION_MESSAGE + "' }}{% elif message['role'] == 'assistant' %}{{ '\n\n### 応答:\n' + '<AI_OUTPUT>\n' + message['content'] + '\n</AI_OUTPUT>' + eos_token }}{% endif %}{% if loop.last and add_generation_prompt %}{{ '\n\n### 応答:\n' + '<AI_OUTPUT>\n' }}{% endif %}{% endfor %}"
```
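The template above can be rendered without tokenizing to check the prompt format before running generation. This is a minimal sketch, assuming the `tokenizer` from the block above has already been loaded and the custom template assigned; the example question is arbitrary.

```python
# Render the chat template as plain text to inspect the prompt format.
messages = [
    {"role": "system", "content": "あなたは優秀なアシスタントです。"},
    {"role": "user", "content": "日本で一番高い山は？"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the rendered string instead of token ids
    add_generation_prompt=True,  # append the trailing "### 応答:\n<AI_OUTPUT>\n" header
)
print(prompt)
# The rendered prompt starts with the tokenizer's BOS token and the system
# instruction, followed by "### 指示:" with the question inside <USER_INPUT> tags
# and an open "### 応答:" / <AI_OUTPUT> section for the model to complete.
```

The remainder of the Usage code runs the model over elyza-tasks-100-TV and writes the generations to a JSONL file: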
```python
dataset = load_dataset('json', data_files='data/elyza-tasks-100-TV_0.jsonl', split='train')

# Switch to inference mode.
model.eval()
FastLanguageModel.for_inference(model)
model.config.use_cache = True
torch.cuda.empty_cache()

def gen_output(input):
    messages = [
        {"role": "system", "content": 'あなたは優秀なアシスタントです。'},
        {"role": "user", "content": input}
    ]
    inputs = tokenizer.apply_chat_template(messages,
                                           return_tensors="pt",
                                           add_generation_prompt=True,
                                           return_dict=True).to(model.device)
    # Drop token_type_ids, which generate() does not use.
    del inputs['token_type_ids']
    with torch.no_grad():
        outputs = model.generate(**inputs,
                                 max_new_tokens=512,
                                 pad_token_id=tokenizer.pad_token_id,
                                 do_sample=False,
                                 repetition_penalty=1.2,
                                 )
    output = tokenizer.decode(outputs[0][inputs.input_ids.size(1):], skip_special_tokens=True)
    # Strip the <AI_OUTPUT> tags from the decoded text.
    return re.sub('\n?</?AI_OUTPUT>', '', output)

results = []
for i, data in enumerate(dataset):
    input = data["input"]
    output = gen_output(input)
    results.append({"task_id": i, "input": input, "output": output})

json_file_id = re.sub(".*/", "", adapter_id)
with open(f"/content/{json_file_id}_output.jsonl", 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')
```
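Each line of the resulting file is a JSON object with `task_id`, `input`, and `output` keys, so it can be read back for inspection or scoring. A minimal sketch, assuming the output path produced by the code above:

```python
import json

# Read the generated answers back in (path written by the Usage code above).
with open("/content/llm-jp-3-13b-ft2_output.jsonl", encoding="utf-8") as f:
    results = [json.loads(line) for line in f]

print(len(results))
print(results[0]["task_id"], results[0]["output"][:100])
```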
## Model Details

The model was fine-tuned from `llm-jp/llm-jp-3-13b`.

- Model type: Transformer-based Language Model
| Params | Layers | Hidden size | Heads | Context length | Embedding parameters | Non-embedding parameters |
| --- | --- | --- | --- | --- | --- | --- |
| 13b | 40 | 5120 | 40 | 4096 | 1,019,740,160 | 12,688,184,320 |
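The architecture numbers above can be checked against the base checkpoint without downloading the weights. A minimal sketch, assuming Hub access and that the base model exposes the standard Llama-style `transformers` config fields:

```python
from transformers import AutoConfig

# Fetch only the model configuration for the base checkpoint (no weights are downloaded).
config = AutoConfig.from_pretrained("llm-jp/llm-jp-3-13b")

# These standard config fields should match the table above.
print(config.num_hidden_layers)        # layers, expected 40
print(config.hidden_size)              # hidden size, expected 5120
print(config.num_attention_heads)      # attention heads, expected 40
print(config.max_position_embeddings)  # context length, expected 4096
```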
## Datasets

### Instruction tuning
The model was fine-tuned on the following datasets.
| Language | Dataset | Description |
| --- | --- | --- |
| ja | wizardlm8x22b-logical-math-coding-sft_additional-ja | A synthetic instruction dataset. |
| ja | elyza-tasks-100 | A benchmark dataset of 100 Japanese instruction tasks created by ELYZA. |
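Both datasets can be pulled with the `datasets` library. A minimal sketch; the repository paths below (the `llm-jp/` and `elyza/` org prefixes) are assumptions about where these datasets are published, so verify them on the Hub before use.

```python
from datasets import load_dataset

# Repository ids below are assumed; adjust if the actual sources differ.
logical_math_coding = load_dataset("llm-jp/wizardlm8x22b-logical-math-coding-sft_additional-ja")
elyza_tasks = load_dataset("elyza/ELYZA-tasks-100")

print(logical_math_coding)
print(elyza_tasks)
```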
## Evaluation

### elyza-tasks-100
| Model name | Average |
| --- | --- |
| hiroyachiba/llm-jp-3-13b-ft2 | 3.09 |
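The average above comes from scoring the generated `*_output.jsonl` with an external judge; the scoring pipeline itself is not part of this repository. As a purely hypothetical illustration, if the judged results were stored as JSON Lines with a numeric `score` field per task, the average could be computed like this:

```python
import json

# Hypothetical judged-results file: one JSON object per line with a numeric "score" field
# (the judging step itself is outside this repository).
scores = []
with open("llm-jp-3-13b-ft2_output_scored.jsonl", encoding="utf-8") as f:
    for line in f:
        scores.append(json.loads(line)["score"])

print(f"average: {sum(scores) / len(scores):.2f}")
```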
## Risks and Limitations

The model released here is at an early stage of research and development and has not been tuned to ensure its outputs align with human intent and safety considerations.
## Send Questions to

## License

## Model Card Authors
Hiroya Chiba