Uploaded model

  • Developed by: formapproval
  • License: apache-2.0
  • Finetuned from model: llm-jp/llm-jp-3-13b

Usage
Prerequisites: run on Omnicampus, and place elyza-tasks-100-TV_0.jsonl in the root directory.
Procedure: run the following code in an .ipynb notebook located in the root directory.
!pip install -U pip
!pip install -U transformers
!pip install -U bitsandbytes
!pip install -U accelerate
!pip install -U datasets
!pip install -U peft
!pip install -U trl
!pip install -U wandb
!pip install ipywidgets --upgrade
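from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import os, torch, gc
from datasets import load_dataset
import bitsandbytes as bnb
from trl import SFTTrainer

base_model_id = "llm-jp/llm-jp-3-13b"
HF_TOKEN = "~~~"  # Could not be published on an open site, so it will be shared separately

# bnb_config was not defined in the original snippet. The 4-bit NF4 settings
# below are an assumption; adjust them to match your environment.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    token=HF_TOKEN,
    quantization_config=bnb_config,
    device_map="auto",
)
# The tokenizer is used during generation but was never loaded in the original snippet.
tokenizer = AutoTokenizer.from_pretrained(base_model_id, token=HF_TOKEN)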
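import json

# Read the task file. A record may span several lines, so lines are
# accumulated until the buffer ends with "}" and is then parsed as one JSON object.
datasets = []
with open("./elyza-tasks-100-TV_0.jsonl", "r", encoding="utf-8") as f:
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""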
from tqdm import tqdm

results = []
for data in tqdm(datasets):
    input = data["input"]

    # Prompt template; "指示" means "Instruction"
    prompt = f"""### 指示
{input}
"""

    tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
    attention_mask = torch.ones_like(tokenized_input)

    # Greedy decoding, limited to 100 new tokens
    with torch.no_grad():
        outputs = model.generate(
            tokenized_input,
            attention_mask=attention_mask,
            max_new_tokens=100,
            do_sample=False,
            repetition_penalty=1.2,
            pad_token_id=tokenizer.eos_token_id,
        )[0]
    # Decode only the newly generated tokens, skipping the prompt
    output = tokenizer.decode(outputs[tokenized_input.size(1):], skip_special_tokens=True)

    results.append({"task_id": data["task_id"], "input": input, "output": output})
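import re

# new_model_id is not defined in the code above; set it to the repository ID
# of this fine-tuned model before running this cell.
jsonl_id = re.sub(".*/", "", new_model_id)
# Write one JSON object per line, keeping Japanese text unescaped
with open(f"./{jsonl_id}-outputs.jsonl", 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')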
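
Before using the generated file, it may help to confirm that it is well-formed. The following is a minimal sketch and not part of the original procedure: the helper name `validate_outputs` is hypothetical, and it assumes each line is a JSON object carrying the `task_id`, `input`, and `output` keys written by the code above.

```python
import json

# Keys that the inference loop writes for every record
REQUIRED_KEYS = {"task_id", "input", "output"}


def validate_outputs(path):
    """Return the number of records in a JSONL file.

    Raises ValueError if a line is not a JSON object or lacks a required key.
    (Hypothetical helper for illustration only.)
    """
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                raise ValueError(f"line {line_no}: invalid JSON") from e
            if not isinstance(record, dict):
                raise ValueError(f"line {line_no}: expected a JSON object")
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
            count += 1
    return count
```

Calling `validate_outputs` on the path of the written `-outputs.jsonl` file returns the number of records, which can then be compared with `len(datasets)`.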
