nomic-ai/gpt4all-j-lora

zpn committed on Apr 13, 2023

Commit d71b28c • 1 Parent(s): 308cebc

Create README.md

Files changed (1)

README.md +62 -0

@@ -0,0 +1,62 @@

+---
+license: apache-2.0
+datasets:
+- nomic-ai/gpt4all-j-prompt-generations
+language:
+- en
+pipeline_tag: text-generation
+---
+# Model Card for GPT4All-J-LoRA
+An Apache-2 licensed chatbot trained on a massive curated corpus of assistant interactions, including word problems, multi-turn dialogue, code, poems, songs, and stories.
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+This model has been finetuned from [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B).
+- **Developed by:** [Nomic AI](https://home.nomic.ai)
+- **Model Type:** A GPT-J model finetuned on assistant-style interaction data
+- **Language(s) (NLP):** English
+- **License:** Apache-2
+- **Finetuned from model:** [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B)
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [https://github.com/nomic-ai/gpt4all](https://github.com/nomic-ai/gpt4all)
+- **Base Model Repository:** [https://github.com/kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax)
+- **Paper [optional]:** [GPT4All-J: An Apache-2 Licensed Assistant-Style Chatbot](https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf)
+- **Demo [optional]:** [https://gpt4all.io/](https://gpt4all.io/)
+### Training Procedure
+GPT4All is made possible by our compute partner [Paperspace](https://www.paperspace.com/).
+The model was trained on a DGX cluster with 8 A100 80GB GPUs for ~12 hours. Using DeepSpeed and Accelerate, we finetuned with LoRA at a global batch size of 32 and a learning rate of 2e-5. More information can be found in the repo.
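The figures above imply a per-GPU batch size and a rough compute budget. The sketch below works through that arithmetic. It assumes no gradient accumulation (`grad_accum_steps=1`), which the card does not state; the helper name `per_device_batch_size` is hypothetical and not part of the GPT4All codebase.

```python
def per_device_batch_size(global_batch: int, num_gpus: int, grad_accum_steps: int = 1) -> int:
    """Split a global batch evenly across GPUs and gradient-accumulation steps."""
    denom = num_gpus * grad_accum_steps
    if global_batch % denom != 0:
        raise ValueError("global batch must divide evenly across GPUs and accumulation steps")
    return global_batch // denom


# Values taken from the training description.
GLOBAL_BATCH = 32
NUM_GPUS = 8
TRAIN_HOURS = 12  # approximate ("~12 hours")

# Assumption: one optimizer step per forward/backward pass (no accumulation).
micro_batch = per_device_batch_size(GLOBAL_BATCH, NUM_GPUS)
gpu_hours = NUM_GPUS * TRAIN_HOURS

print(f"per-GPU batch size: {micro_batch}")
print(f"approximate A100 GPU-hours: {gpu_hours}")
```

If the actual run used gradient accumulation, the per-GPU batch size would be smaller by that factor; the repo's training configuration is the authoritative source.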
+### Results
+Results on common sense reasoning benchmarks
+```
+  Model                     BoolQ       PIQA     HellaSwag   WinoGrande    ARC-e      ARC-c       OBQA
+  ----------------------- ---------- ---------- ----------- ------------ ---------- ---------- ----------
+  GPT4All-J 6.7B             73.4       74.8       63.4         64.7        54.9       36.0       40.2
+  GPT4All-J Lora 6.7B        68.6       75.8       66.2         63.5        56.4       35.7       40.2
+  GPT4All LLaMa Lora 7B      73.1       77.6       72.1         67.8        51.1       40.4       40.2
+  Dolly 6B                   68.8       77.3       67.6         63.9        62.9       38.7       41.2
+  Dolly 12B                  56.7       75.4       71.0         62.2       *64.6*      38.5       40.4
+  Alpaca 7B                  73.9       77.2       73.9         66.1        59.8       43.3       43.4
+  Alpaca Lora 7B            *74.3*     *79.3*     *74.0*       *68.8*       56.6      *43.9*     *42.6*
+  GPT-J 6.7B                 65.4       76.2       66.2         64.1        62.2       36.6       38.2
+  LLaMa 7B                   73.1       77.4       73.0         66.9        52.5       41.4       42.4
+  Pythia 6.7B                63.5       76.3       64.0         61.1        61.3       35.2       37.2
+  Pythia 12B                 67.7       76.6       67.3         63.8        63.9       34.8       38
+```
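To read the table for this model specifically, it can help to compare the GPT4All-J Lora row against its GPT-J base. The sketch below copies those two rows from the table and computes per-benchmark differences plus an unweighted mean. The unweighted mean is an illustrative summary chosen here, not a metric reported in the card, and the names `SCORES`, `mean_score`, and `deltas` are hypothetical.

```python
BENCHMARKS = ["BoolQ", "PIQA", "HellaSwag", "WinoGrande", "ARC-e", "ARC-c", "OBQA"]

# Scores copied from the results table, in BENCHMARKS order.
SCORES = {
    "GPT4All-J Lora 6.7B": [68.6, 75.8, 66.2, 63.5, 56.4, 35.7, 40.2],
    "GPT-J 6.7B": [65.4, 76.2, 66.2, 64.1, 62.2, 36.6, 38.2],
}


def mean_score(model: str) -> float:
    """Unweighted mean over the seven benchmarks (an illustrative summary only)."""
    values = SCORES[model]
    return sum(values) / len(values)


def deltas(model: str, baseline: str) -> dict:
    """Per-benchmark difference (model minus baseline), rounded to one decimal."""
    return {
        name: round(a - b, 1)
        for name, a, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
    }


for name, diff in deltas("GPT4All-J Lora 6.7B", "GPT-J 6.7B").items():
    print(f"{name:<11}{diff:+.1f}")

for model in SCORES:
    print(f"{model}: mean {mean_score(model):.2f}")
```

On these two rows the finetuned model gains on BoolQ and OBQA, ties on HellaSwag, and drops on the remaining benchmarks, most visibly ARC-e.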