
	---
	license: cc-by-nc-4.0
	---
	<p align="center">
	<img width="500px" alt="Project Baize" src="https://user-images.githubusercontent.com/22514219/229195563-0cddfa74-e52f-4413-b4b4-e4ba489c4b3d.png">
	</p>
	<hr>

	## What's Baize?
	Baize is an open-source chat model fine-tuned with [LoRA](https://github.com/microsoft/LoRA). It is trained on 100k dialogs generated by letting ChatGPT chat with itself. We also use Alpaca's data to improve its performance. This repo contains the 7B model.

	## Why is it called Baize?
	Baize (白泽) is a mythical creature in Chinese folklore, who speaks human languages and knows everything. This is exactly what we expect from a chat model.

	## Training Parameters

	- Base Model: [LLaMA-7B](https://arxiv.org/pdf/2302.13971.pdf)
	- Training Epoch: 1
	- Batch Size: 64
	- Maximum Input Length: 512
	- Learning Rate: 2e-4
	- LoRA Rank: 8
	- Updated Modules: All Linears
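
To give a sense of how small the adapter is, the sketch below counts the LoRA weights implied by rank 8 applied to all linear layers. It is an estimate under stated assumptions, not the authors' training code: the layer shapes (hidden size 4096, MLP size 11008, 32 decoder layers) and the projection names come from the Hugging Face LLaMA-7B implementation rather than from this card, and "All Linears" is assumed to mean the seven projections in each decoder block, excluding the embedding and output head.

```python
# Estimate the number of trainable LoRA weights for rank-8 adapters on every
# linear projection inside the LLaMA-7B decoder blocks.
# ASSUMPTIONS (not stated in this card): hidden size 4096, MLP intermediate
# size 11008, 32 decoder layers, and no adapter on the embedding or lm_head.

HIDDEN = 4096
INTERMEDIATE = 11008
NUM_LAYERS = 32
LORA_RANK = 8  # "LoRA Rank" from the list above

# (in_features, out_features) of each linear layer in one decoder block.
# Names follow the Hugging Face LLaMA implementation.
LINEAR_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


def total_lora_params(shapes=LINEAR_SHAPES, rank=LORA_RANK, num_layers=NUM_LAYERS):
    """Sum the adapter weights over one block, then over all blocks."""
    per_layer = sum(lora_params(i, o, rank) for i, o in shapes.values())
    return per_layer * num_layers


if __name__ == "__main__":
    print(total_lora_params())  # 19988480
```

Under these assumptions the adapter holds roughly 20 million weights, which is well under 1% of the 7B base model.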

	## Training Dataset

	- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) (51,942)
	- [Quora Dialogs](https://github.com/project-baize/baize) (54,456)
	- [StackOverflow Dialogs](https://github.com/project-baize/baize) (57,046)
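
The counts above can be combined with the batch size and epoch count from the training parameters to estimate the length of a training run. This is a back-of-the-envelope sketch: it assumes each listed item is one training sample, that 64 is the effective global batch size, and that the final partial batch is kept. None of this is confirmed by the card.

```python
import math

# Example counts copied from the dataset list above.
DATASETS = {
    "Stanford Alpaca": 51_942,
    "Quora Dialogs": 54_456,
    "StackOverflow Dialogs": 57_046,
}
BATCH_SIZE = 64  # "Batch Size" from the training parameters
EPOCHS = 1       # "Training Epoch" from the training parameters


def total_examples(datasets=DATASETS):
    """Total number of training samples across all listed datasets."""
    return sum(datasets.values())


def optimizer_steps(num_examples, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Steps per run, ASSUMING the last partial batch is not dropped."""
    return math.ceil(num_examples / batch_size) * epochs


if __name__ == "__main__":
    dialogs = DATASETS["Quora Dialogs"] + DATASETS["StackOverflow Dialogs"]
    print(dialogs)                            # 111502 self-chat dialogs
    print(total_examples())                   # 163444 samples in total
    print(optimizer_steps(total_examples()))  # 2554 steps
```

The two dialog sets add up to about 111k, in line with the "100k dialogs" figure quoted earlier.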

	More details can be found in the Baize [GitHub repository](https://github.com/project-baize/baize).