izumi-lab/llama-13b-japanese-lora-v0-1ep


	---
	license: mit
	datasets:
	- izumi-lab/llm-japanese-dataset
	language:
	- ja
	tags:
	- llama
	- causal-lm
	---

	This repo contains a low-rank adapter (LoRA) for LLaMA-13B,
	fine-tuned on the [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset).

	This version of the weights was trained with the following hyperparameters:

	- Epochs: 1
	- Batch size: 130
	- Cutoff length: 256
	- Learning rate: 3e-4
	- LoRA _r_: 4
	- LoRA target modules: q_proj, v_proj
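
To give a sense of scale for these settings, the sketch below estimates how many parameters the adapter trains. It assumes the standard LoRA parameterization (one `r x d_in` matrix and one `d_out x r` matrix per target module, no bias terms) and the published LLaMA-13B dimensions (hidden size 5120, 40 transformer layers). It also assumes the batch size counts sequences, so the tokens-per-batch figure is an upper bound. The helper names are illustrative and not part of this repository.

```python
# Architecture constants for LLaMA-13B (assumed from the published model config).
HIDDEN_SIZE = 5120
NUM_LAYERS = 40

# Values taken from the hyperparameter list above.
LORA_R = 4
TARGET_MODULES = ("q_proj", "v_proj")
BATCH_SIZE = 130
CUTOFF_LEN = 256


def lora_params_per_module(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


def total_lora_params() -> int:
    """Sum the LoRA parameters over every target module in every layer."""
    per_module = lora_params_per_module(HIDDEN_SIZE, HIDDEN_SIZE, LORA_R)
    return per_module * len(TARGET_MODULES) * NUM_LAYERS


trainable = total_lora_params()
# Upper bound: every sequence in the batch is truncated to the cutoff length.
max_tokens_per_step = BATCH_SIZE * CUTOFF_LEN

print(f"trainable LoRA parameters: {trainable:,}")
print(f"max tokens per batch: {max_tokens_per_step:,}")
```

Under these assumptions the adapter trains roughly 3.3 million parameters, a small fraction of the roughly 13 billion in the frozen base model.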

	```python
	import torch
	from transformers import LlamaForCausalLM, LlamaTokenizer
	from peft import PeftModel

	# Load the base LLaMA-13B weights and tokenizer in half precision.
	base_model = "decapoda-research/llama-13b-hf"
	model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
	tokenizer = LlamaTokenizer.from_pretrained(base_model)

	# Attach the LoRA adapter from this repository to the base model.
	model = PeftModel.from_pretrained(
	    model,
	    "izumi-lab/llama-13b-japanese-lora-v0-1ep",
	    torch_dtype=torch.float16,
	)
	```