ikedachin/llm-jp-3-13b-ozaki-ds-5000

ikedachin committed 20 days ago

Commit cbf3abf (1 parent: fdbbb14)

Update README.md

Files changed (1)

README.md +67 -0

@@ -20,3 +20,70 @@ language:
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+### Datasets used
+5,000 samples were drawn at random from the following datasets:
+- DeL-TaiseiOzaki/Tengentoppa-sft-v1.0
+- llm-jp/magpie-sft-v1.0
+### Inference code
+```python
+import torch
+from transformers import (
+    AutoTokenizer,
+    AutoModelForCausalLM,
+    BitsAndBytesConfig,
+)
+HF_TOKEN = "your-token"  # replace with your Hugging Face access token
+model_name = "ikedachin/llm-jp-3-13b-finetune-2"
+# 4-bit quantization settings (QLoRA-style)
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+    bnb_4bit_use_double_quant=False,
+)
+# Download the model and load it with 4-bit quantization
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    quantization_config=bnb_config,
+    device_map="auto",
+    token=HF_TOKEN,
+)
+# Download the tokenizer
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN)
+prompt = "<put your input here>"
+# Tokenize the prompt
+tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
+# Run inference
+with torch.no_grad():
+    outputs = model.generate(
+        tokenized_input,
+        max_new_tokens=300,
+        do_sample=False,
+        repetition_penalty=1.2
+    )[0]
+# Decode only the newly generated tokens (the prompt tokens are sliced off)
+output = tokenizer.decode(outputs[tokenized_input.size(1):], skip_special_tokens=True)
+print(output)
+```
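
The "Datasets used" section states only that 5,000 samples were drawn at random from the two listed datasets. It does not say how the draw was done, whether the datasets were pooled first or sampled separately, or which seed was used. The following is a minimal sketch of one way such a subset could be drawn, assuming the two sources are pooled and sampled uniformly without replacement. The `sample_records` helper, the stand-in record lists, their sizes, and the seed are all hypothetical and are not taken from the README.

```python
import random


def sample_records(records, k, seed=0):
    """Draw k records uniformly at random, without replacement.

    Hypothetical helper: the README does not describe the actual
    sampling procedure or seed.
    """
    if k > len(records):
        raise ValueError("k must not exceed the number of records")
    rng = random.Random(seed)  # local generator so the draw is reproducible
    return rng.sample(records, k)


# Stand-in records (assumption): real rows would come from the two
# datasets named in the README; the sizes here are arbitrary.
pool_a = [{"source": "Tengentoppa-sft-v1.0", "id": i} for i in range(8000)]
pool_b = [{"source": "magpie-sft-v1.0", "id": i} for i in range(12000)]

# Pool both sources, then draw 5,000 samples in total.
subset = sample_records(pool_a + pool_b, k=5000, seed=42)
print(len(subset))
```

Using a dedicated `random.Random(seed)` instance keeps the draw reproducible without touching the global random state. If the intent was 5,000 samples per dataset rather than in total, the same helper could be called once on each pool instead.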