---
base_model: llm-jp/llm-jp-3-13b
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
licenses:
- Apache-2.0 # Base model
- CC-BY-NC-SA-4.0 # Adapter & Dataset (ichikara-instruction)
- CC-BY-SA-4.0 # Dataset (ELYZA-tasks-100)
language:
- ja
datasets:
- elyza/ELYZA-tasks-100
- ichikara-instruction
---
# llm-jp-3-13b-it: A Fine-tuned model for ELYZA-tasks-100
## Overview
This is a fine-tuned version of [llm-jp/llm-jp-3-13b](https://huggingface.co/llm-jp/llm-jp-3-13b) targeting [ELYZA-tasks-100](https://huggingface.co/datasets/elyza/ELYZA-tasks-100). The model was trained on ELYZA-tasks-100 and the [ichikara-instruction dataset](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/).
## Usage
Load the model and tokenizer with the following code:
```python
from unsloth import FastLanguageModel

model_id = "tokutsu/llm-jp-3-13b-it"

# Load the model and tokenizer (4-bit quantization keeps memory usage low).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    dtype=None,          # auto-detect dtype
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

# Prompt format used during fine-tuning: "### 指示" (instruction) / "### 回答" (answer).
# This example instruction asks for five ideas to regain enthusiasm for work.
prompt = """### 指示
仕事の熱意を取り戻すためのアイデアを5つ挙げてください。
### 回答
"""

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    use_cache=True,
    do_sample=False,        # greedy decoding
    repetition_penalty=1.2,
)

# Keep only the text generated after the "### 回答" marker.
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split("\n### 回答")[-1]
```
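If you prefer not to use Unsloth, the adapter can likely be loaded with `transformers` and `peft` as well. The following is a minimal sketch that assumes this repository hosts a standard PEFT (LoRA) adapter on top of `llm-jp/llm-jp-3-13b`; everything other than the model ID is illustrative.
```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, BitsAndBytesConfig

model_id = "tokutsu/llm-jp-3-13b-it"

# 4-bit quantization keeps the 13B model within a single consumer GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Loads the base model referenced in the adapter config and applies the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
# If the adapter repo does not ship tokenizer files, load them from the base model instead.
tokenizer = AutoTokenizer.from_pretrained(model_id)
```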
## Example Output
Here is an example of the model's output for the prompt above (five ideas for regaining enthusiasm for work):
```plaintext
1. 仕事に関連する趣味を持つ: 趣味はストレス解消やリラックス効果があり、仕事へのモチベーションアップにもつながります。例えば、ガーデニングが好きならオフィスで観葉植物を育てたり、料理が得意であれば同僚とランチ会をするなど、自分なりの仕事との接点を見つけてみましょう。
2. 目標設定を行う: 達成可能な目標を立てることで、日々成長していることを実感でき、やりがいも生まれてきます。また、定期的に進捗状況を確認することで、達成感とともにさらなるやる気につながるでしょう。
3. 同僚たちと交流する: 職場での人間関係は、仕事に対する情熱を維持するために重要です。コミュニケーションをとることで、お互いのことを理解し、助け合うことができます。職場のイベントに参加したり、休憩時間には雑談したりして、積極的に周りの人と関わりましょう。
4. 新しいスキルを身につける: スキル向上のための勉強や、新しい資格取得などにより、自分の能力を高めることができます。自己啓発的な活動が、自信や向上心へとつながるかもしれません。
5. 休暇をとってリフレッシュする: 長期休暇をとり、心身ともに休息することは大切なことです。旅行へ行ったり、家族と一緒に過ごしたりすることで気分転換ができ、また新たな気持ちで仕事に取り組むことができるようになります。
```
## Additional Information
The model was trained using LoRA with the following specifications:
### **Base Model**
- The training started with the pre-trained language model **`llm-jp/llm-jp-3-13b`**.
### **Datasets**
- **ELYZA-tasks-100:** A benchmark of 100 diverse Japanese instruction-following tasks, used to strengthen the model's ability to generalize across multiple domains. ([link](https://huggingface.co/datasets/elyza/ELYZA-tasks-100)) A quick loading snippet follows this list.
- **ichikara-instruction:** A manually constructed Japanese instruction dataset covering a diverse range of topics, providing a strong foundation for following instructions and handling contextual nuance. ([link](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/))
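ELYZA-tasks-100 can be pulled directly from the Hugging Face Hub for inspection; a minimal sketch is shown below (ichikara-instruction is distributed via the project page linked above, so it is not loaded here).
```python
from datasets import load_dataset

# ELYZA-tasks-100 ships as a single "test" split with "input"/"output" columns.
tasks = load_dataset("elyza/ELYZA-tasks-100", split="test")

print(len(tasks))          # 100 tasks
print(tasks[0]["input"])   # task instruction
print(tasks[0]["output"])  # reference answer
```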
### **Training Methodology**
- **PEFT with LoRA:** Training employed **PEFT (Parameter-Efficient Fine-Tuning)** with **LoRA (Low-Rank Adaptation)**, enabling efficient fine-tuning at a reduced computational cost while preserving the base model's performance. The model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library; a setup sketch follows.
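For reference, the sketch below shows what an Unsloth + TRL LoRA setup of this kind can look like. The LoRA rank/alpha, target modules, prompt-formatting helper, and training hyperparameters are illustrative assumptions rather than the values actually used for this model, and newer TRL releases move `dataset_text_field`/`max_seq_length` into `SFTConfig`.
```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="llm-jp/llm-jp-3-13b",
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters (PEFT). Rank, alpha, and target modules are assumptions.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Format instruction/response pairs with the same template used at inference time.
# ELYZA-tasks-100 is used here for illustration; ichikara-instruction would be
# prepared the same way after downloading it from the project page.
def to_text(example):
    return {"text": f"### 指示\n{example['input']}\n### 回答\n{example['output']}"}

dataset = load_dataset("elyza/ELYZA-tasks-100", split="test").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
```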
## License
This model is licensed under the **CC BY-NC-SA 4.0** License. For more details, see the [LICENSE](https://huggingface.co/tokutsu/llm-jp-3-13b-it/blob/main/LICENSE) file in this repository.
## Acknowledgment
This model was developed as part of the [LLM course 2024](https://weblab.t.u-tokyo.ac.jp/lecture/course-list/large-language-model/) exercises conducted by the Matsuo-Iwasawa Lab at the University of Tokyo.