---
license: apache-2.0
datasets:
- nomic-ai/gpt4all-j-prompt-generations
language:
- en
pipeline_tag: text-generation
---
# Model Card for GPT4All-J-LoRA
An Apache-2-licensed chatbot trained on a large curated corpus of assistant interactions, including word problems, multi-turn dialogue, code, poems, songs, and stories.
## Model Details
### Model Description
This model was finetuned from [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B) using LoRA adapters.
- **Developed by:** [Nomic AI](https://home.nomic.ai)
- **Model Type:** A GPT-J model finetuned with LoRA on assistant-style interaction data
- **Language(s) (NLP):** English
- **License:** Apache-2
- **Finetuned from model:** [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B)
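
Assuming this checkpoint is published as PEFT-style LoRA adapter weights rather than a fully merged model, one way to use it is to load the GPT-J base model and attach the adapter with the `peft` library. The sketch below is illustrative only: the Hub id `nomic-ai/gpt4all-j-lora` and the prompt format are assumptions, so check the repository for the exact identifier and the recommended prompting style.

```python
# Minimal loading sketch (assumption: the adapter is published as PEFT-style
# LoRA weights under "nomic-ai/gpt4all-j-lora"; adjust the repo id and prompt
# format to match the repository).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "EleutherAI/gpt-j-6B"          # base model the LoRA was trained on
adapter_id = "nomic-ai/gpt4all-j-lora"   # assumed Hub id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

prompt = "Explain why the sky is blue in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```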
### Model Sources
- **Repository:** [https://github.com/nomic-ai/gpt4all](https://github.com/nomic-ai/gpt4all)
- **Base Model Repository:** [https://github.com/kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax)
- **Paper:** [GPT4All-J: An Apache-2 Licensed Assistant-Style Chatbot](https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf)
- **Demo:** [https://gpt4all.io/](https://gpt4all.io/)
### Training Procedure
GPT4All is made possible by our compute partner [Paperspace](https://www.paperspace.com/).
The model was trained on a DGX cluster with 8 A100 80GB GPUs for roughly 12 hours. Using DeepSpeed and Accelerate, we trained LoRA adapters with a global batch size of 32 and a learning rate of 2e-5. More details can be found in the repository.
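
The exact training scripts and DeepSpeed/Accelerate configuration live in the GPT4All repository. As a rough illustration of the setup described above, a LoRA finetune of GPT-J with the quoted hyperparameters might look like the sketch below; the LoRA rank, alpha, dropout, target modules, and epoch count are illustrative assumptions, not the values used for this checkpoint.

```python
# Illustrative LoRA setup only; the real run used DeepSpeed + Accelerate on
# 8x A100 80GB. Rank/alpha/dropout/target modules/epochs below are assumptions.
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

lora_config = LoraConfig(
    r=16,                                  # assumed rank
    lora_alpha=32,                         # assumed scaling
    lora_dropout=0.05,                     # assumed dropout
    target_modules=["q_proj", "v_proj"],   # GPT-J attention projections (assumed targets)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Hyperparameters quoted in the card: global batch size 32, learning rate 2e-5.
args = TrainingArguments(
    output_dir="gpt4all-j-lora",
    per_device_train_batch_size=4,   # 4 per GPU x 8 GPUs = global batch size 32
    learning_rate=2e-5,
    num_train_epochs=2,              # assumed; see the repo for the real schedule
    bf16=True,
    logging_steps=10,
)
# A Trainer(model=model, args=args, train_dataset=...) call would follow, with
# the nomic-ai/gpt4all-j-prompt-generations dataset tokenized for causal LM.
```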
### Results
Results on common sense reasoning benchmarks:

| Model                  | BoolQ | PIQA  | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA  |
|------------------------|-------|-------|-----------|------------|-------|-------|-------|
| GPT4All-J 6.7B         | 73.4  | 74.8  | 63.4      | 64.7       | 54.9  | 36.0  | 40.2  |
| GPT4All-J LoRA 6.7B    | 68.6  | 75.8  | 66.2      | 63.5       | 56.4  | 35.7  | 40.2  |
| GPT4All LLaMA LoRA 7B  | 73.1  | 77.6  | 72.1      | 67.8       | 51.1  | 40.4  | 40.2  |
| Dolly 6B               | 68.8  | 77.3  | 67.6      | 63.9       | 62.9  | 38.7  | 41.2  |
| Dolly 12B              | 56.7  | 75.4  | 71.0      | 62.2       | **64.6** | 38.5 | 40.4 |
| Alpaca 7B              | 73.9  | 77.2  | 73.9      | 66.1       | 59.8  | 43.3  | 43.4  |
| Alpaca LoRA 7B         | **74.3** | **79.3** | **74.0** | **68.8** | 56.6 | **43.9** | **42.6** |
| GPT-J 6.7B             | 65.4  | 76.2  | 66.2      | 64.1       | 62.2  | 36.6  | 38.2  |
| LLaMA 7B               | 73.1  | 77.4  | 73.0      | 66.9       | 52.5  | 41.4  | 42.4  |
| Pythia 6.7B            | 63.5  | 76.3  | 64.0      | 61.1       | 61.3  | 35.2  | 37.2  |
| Pythia 12B             | 67.7  | 76.6  | 67.3      | 63.8       | 63.9  | 34.8  | 38.0  |
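
Benchmarks like these are conventionally scored by having the model assign a log-likelihood to each answer option and picking the highest-scoring one. The sketch below illustrates that scoring scheme on a toy multiple-choice example; it is not the evaluation harness or configuration used to produce the table above, and the question and options are made up for illustration.

```python
# Likelihood-based multiple-choice scoring, the common scheme behind benchmarks
# such as PIQA / ARC / HellaSwag. Toy example only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"  # swap in the LoRA-augmented model to score it instead
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

def option_logprob(context: str, option: str) -> float:
    """Sum of log-probabilities the model assigns to the option tokens given the context."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + option, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    # Shift so position i scores the token at position i + 1.
    logprobs = torch.log_softmax(logits[:, :-1].float(), dim=-1)
    targets = full_ids[:, 1:]
    per_token = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    n_option_tokens = full_ids.shape[1] - ctx_len
    return per_token[0, -n_option_tokens:].sum().item()

question = "Question: Which is heavier, a kilogram of feathers or a kilogram of iron?\nAnswer:"
options = [" They weigh the same.", " The iron.", " The feathers."]
scores = [option_logprob(question, opt) for opt in options]
print(options[max(range(len(options)), key=lambda i: scores[i])])
```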