---
license: cc-by-sa-4.0
datasets:
- izumi-lab/llm-japanese-dataset
language:
- ja
tags:
- llama
- causal-lm
---

This repo contains a low-rank adapter for LLaMA-13B, fine-tuned on the [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset) dataset.

You can try this adapter in the demo Space: https://huggingface.co/spaces/izumi-lab/llama-13b-japanese-lora-v0-1ep

This version of the weights was trained with the following hyperparameters:

- Epochs: 1
- Batch size: 130
- Cutoff length: 256
- Learning rate: 3e-4
- LoRA _r_: 4
- LoRA target modules: q_proj, v_proj
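
These settings roughly correspond to a `peft.LoraConfig` like the sketch below. Note that `lora_alpha` and `lora_dropout` are not reported in this card, so the values shown for them are placeholders rather than the values actually used.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter configuration from the
# hyperparameters listed above. lora_alpha and lora_dropout are NOT
# documented in this card; the values below are placeholders.
lora_config = LoraConfig(
    r=4,                                  # LoRA r from the list above
    target_modules=["q_proj", "v_proj"],  # LoRA target modules from the list above
    lora_alpha=16,                        # assumption: not given in this card
    lora_dropout=0.05,                    # assumption: not given in this card
    bias="none",
    task_type="CAUSAL_LM",
)
```
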
To load the base model and apply this adapter with `peft`:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

# Note that the separate license of decapoda-research/llama-13b-hf applies to the base weights.
base_model = "decapoda-research/llama-13b-hf"

# Load the base LLaMA-13B model and its tokenizer in half precision.
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Apply the Japanese LoRA adapter from this repository on top of the base model.
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/llama-13b-japanese-lora-v0",
    torch_dtype=torch.float16,
)
```
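
Once the adapter is loaded, text can be generated as in the sketch below. The prompt, device handling, and sampling settings are illustrative assumptions rather than values documented in this card, and running a 13B model in float16 realistically requires a sizable GPU.

```python
# Minimal generation example (assumes the loading code above has run).
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

prompt = "日本の首都はどこですか?"  # "What is the capital of Japan?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
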
For the latest information, please visit [llm.msuzuki.me](https://llm.msuzuki.me).

## Details

- Japanese Paper: [https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/383](https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/383)
- English Paper: [https://arxiv.org/abs/2305.12720](https://arxiv.org/abs/2305.12720)
- GitHub: [https://github.com/masanorihirano/llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset)
- Website: [llm.msuzuki.me](https://llm.msuzuki.me)

Citation:

```
@preprint{Hirano2023-llmj,
  title={{llm-japanese-dataset v0: Construction of Japanese Chat Dataset for Large Language Models and its Methodology}},
  author={Masanori HIRANO and Masahiro SUZUKI and Hiroki SAKAJI},
  doi={10.48550/arXiv.2305.12720},
  archivePrefix={arXiv},
  arxivId={2305.12720},
  year={2023}
}
```

If you have any inquiries, such as joint research, data provision, or other forms of support, please email izumi-llm@socsim.org.