{ "cells": [ { "cell_type": "markdown", "id": "2a3eb6d8", "metadata": {}, "source": [ "# Inference test code\n", "\n", "This inference code is based on the test code provided by the organizers. It uses unsloth, which will not work unless you first create a conda environment, so please take note.\n", "12-16 (2024)\n", "\n", "## Example environment setup\n", "\n", "```bash\n", "# install conda\n", "curl -L -O \"https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh\"\n", "bash Miniforge3-$(uname)-$(uname -m).sh\n", "\n", "\n", "source ~/miniforge3/etc/profile.d/mamba.sh\n", "\n", "mamba create --name unsloth_env \\\n", " python=3.10 \\\n", " pytorch-cuda=12.1 \\\n", " pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers \\\n", " -y\n", " \n", "mamba activate unsloth_env\n", "\n", "pip install \"unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git\"\n", "\n", "pip install --no-deps \"trl<0.9.0\" peft accelerate bitsandbytes\n", "\n", "pip install ipykernel\n", "\n", "ipython kernel install --name=unsloth --display-name=unsloth\n", "```\n", "\n", "After setting up the environment above, run this Jupyter notebook with the `unsloth` kernel." ] }, { "cell_type": "code", "execution_count": 7, "id": "1ed5ea31", "metadata": {}, "outputs": [], "source": [ "from unsloth import FastLanguageModel\n", "from peft import PeftModel\n", "import torch\n", "import json\n", "from tqdm import tqdm\n", "import re\n", "import datasets" ] }, { "cell_type": "markdown", "id": "8e07b721", "metadata": {}, "source": [ "## Loading the model" ] }, { "cell_type": "code", "execution_count": 2, "id": "50a5cebd", "metadata": {}, "outputs": [], "source": [ "model_id = \"llm-jp/llm-jp-3-13b\"\n", "adapter_id = \"poprap/llm-jp-3-13b-it-2-3\"\n", "adapter_dpo_id = \"poprap/llm-jp-3-13b-dpo\"" ] }, { "cell_type": "code", "execution_count": 3, "id": "e800c15b", "metadata": {}, "outputs": [], "source": [ "HF_TOKEN = \"\" " ] }, { "cell_type": "code", "execution_count": 4, "id": "a1240544", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Unsloth: WARNING `trust_remote_code` is 
True.\n", "Are you certain you want to do remote code execution?\n", "==((====))== Unsloth 2024.12.4: Fast Llama patching. Transformers:4.46.3.\n", " \\\\ /| GPU: NVIDIA L4. Max memory: 21.964 GB. Platform: Linux.\n", "O^O/ \\_/ \\ Torch: 2.5.1. CUDA: 8.9. CUDA Toolkit: 12.1. Triton: 3.1.0\n", "\\ / Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = False]\n", " \"-____-\" Free Apache license: http://github.com/unslothai/unsloth\n", "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Downloading shards: 100%|██████████| 6/6 [01:38<00:00, 16.49s/it]\n", "Loading checkpoint shards: 100%|██████████| 6/6 [00:09<00:00, 1.61s/it]\n" ] } ], "source": [ "# Load the base model with unsloth's FastLanguageModel.\n", "dtype = None # None lets the dtype be chosen automatically\n", "load_in_4bit = True # True because this notebook handles a 13B model\n", "\n", "model, tokenizer = FastLanguageModel.from_pretrained(\n", "    model_name=model_id,\n", "    dtype=dtype,\n", "    load_in_4bit=load_in_4bit,\n", "    trust_remote_code=True,\n", ")" ] }, { "cell_type": "code", "execution_count": 5, "id": "e0599d87", "metadata": {}, "outputs": [], "source": [ "# Apply the LoRA adapters on top of the base model, one after the other.\n", "model = PeftModel.from_pretrained(model, adapter_id, token = HF_TOKEN)\n", "model = PeftModel.from_pretrained(model, adapter_dpo_id, token = HF_TOKEN)" ] }, { "cell_type": "markdown", "id": "2a2830ce", "metadata": {}, "source": [ "## Loading the task JSONL" ] }, { "cell_type": "code", "execution_count": 9, "id": "3547c974", "metadata": {}, "outputs": [], "source": [ "ds = []\n", "\n", "# A record may span several lines, so accumulate text until it ends with \"}\" and then parse it.\n", "with open(\"elyza-tasks-100-TV_0.jsonl\", \"r\", encoding=\"utf-8\") as f:\n", "    item = \"\"\n", "    for line in f:\n", "        line = line.strip()\n", "        item += line\n", "        if item.endswith(\"}\"):\n", "            ds.append(json.loads(item))\n", "            item = \"\"" ] }, { "cell_type": "code", "execution_count": 10, "id": "0c1a580f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'task_id': 0, 'input': '野球選手が今シーズン活躍するために取り組むべき5つのことを教えてください。'}" ] }, 
"execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds[0]" ] }, { "cell_type": "markdown", "id": "a18d3ccd", "metadata": {}, "source": [ "## Inference\n", "\n", "Inference time varied across several runs, possibly because of server resource contention.\n", "It takes less than an hour; the run recorded below took about 14 minutes for 100 tasks." ] }, { "cell_type": "code", "execution_count": 15, "id": "db654962", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 100/100 [14:17<00:00, 8.58s/it]\n" ] } ], "source": [ "# Switch the model to inference mode\n", "FastLanguageModel.for_inference(model)\n", "\n", "results = []\n", "for dt in tqdm(ds):\n", "    input = dt[\"input\"]\n", "\n", "    # Prompt template (指示 = instruction, 回答 = answer)\n", "    prompt = f\"\"\"### 指示\\n{input}\\n### 回答\\n\"\"\"\n", "\n", "    inputs = tokenizer([prompt], return_tensors = \"pt\").to(model.device)\n", "\n", "    # Greedy decoding with a repetition penalty\n", "    outputs = model.generate(\n", "        **inputs,\n", "        max_new_tokens=1024,\n", "        use_cache = True,\n", "        do_sample=False,\n", "        repetition_penalty=1.2\n", "    )\n", "    # Keep only the text that follows the answer marker\n", "    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\\n### 回答')[-1]\n", "\n", "    results.append({\"task_id\": dt['task_id'], \"input\": input, \"output\": prediction})" ] }, { "cell_type": "code", "execution_count": 20, "id": "9a18a4f8", "metadata": {}, "outputs": [], "source": [ "# Name the output file after the adapter repository (the part after the last \"/\")\n", "json_file_id = re.sub(\".*/\", \"\", adapter_id)\n", "with open(f\"{json_file_id}_output.jsonl\", 'w', encoding='utf-8') as f:\n", "    for result in results:\n", "        json.dump(result, f, ensure_ascii=False)\n", "        f.write('\\n')" ] }, { "cell_type": "code", "execution_count": null, "id": "a2ebf493", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "unsloth", "language": "python", "name": "unsloth" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.16" } }, "nbformat": 4, "nbformat_minor": 5 }