lgaalves/gpt2-dolly

lgaalves committed on Aug 31, 2023

Commit 2810604 • 1 Parent(s): 31ca5a4

Update README.md

Files changed (1)

README.md +44 -7


@@ -6,16 +6,53 @@ language:
 - en
 pipeline_tag: text-generation
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-This model is a fine-tuned version of GPT2 trained on databricks-dolly-15k dataset.
-## Intended uses & limitations
-You can use the raw model for text generation or fine-tune it to a downstream task.
-The model was not extensively tested and may produce false information. It contains a lot of unfiltered content from the internet, which is far from neutral.

+# GPT-2-dolly
+**GPT-2-dolly** is an instruction fine-tuned model based on the GPT-2 transformer architecture.
+### Benchmark Metrics
+| Metric                | Value |
+|-----------------------|-------|
+| Avg.                  | 29.85 |
+| ARC (25-shot)         | 21.76 |
+| HellaSwag (10-shot)   | 30.77 |
+| MMLU (5-shot)         | 24.66 |
+| TruthfulQA (0-shot)   | 42.22 |
+We use the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as the Hugging Face Open LLM Leaderboard.
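
As a quick sanity check, the reported average can be recomputed from the four task scores in the table. This is only an illustrative sketch; the dictionary and variable names are not part of the model card.

```python
# Scores copied from the Benchmark Metrics table in the README diff.
scores = {
    "ARC (25-shot)": 21.76,
    "HellaSwag (10-shot)": 30.77,
    "MMLU (5-shot)": 24.66,
    "TruthfulQA (0-shot)": 42.22,
}

# Unweighted mean of the four tasks, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 29.85
```

The result matches the `Avg.` row, so the table appears to report a plain arithmetic mean.
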
+### Model Details
+* **Trained by**: Luiz G A Alves
+* **Model type**: **GPT-2-dolly** is an auto-regressive language model based on the GPT-2 transformer architecture.
+* **Language(s)**: English
+### Prompt Template
+```
+### Instruction:
+<prompt> (without the <>)
+### Response:
+```
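
A minimal sketch of how an instruction might be wrapped in this template. The helper name `build_prompt` is hypothetical, and the exact newline layout is an assumption read off the template as displayed in the diff; the original README may contain blank lines that the diff view does not show.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the card's instruction/response template."""
    # Assumed layout: instruction header, the instruction text, then the
    # response header followed by a newline for the model to continue from.
    return f"### Instruction:\n{instruction.strip()}\n### Response:\n"


print(build_prompt("Name three primary colors."))
```

Ending the prompt at the response header leaves the model to generate the answer text itself.
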
+### Training Dataset
+`lgaalves/gpt2-dolly` was trained using the Databricks Dolly dataset [`garage-bAInd/Open-Platypus`](https://huggingface.co/datasets/garage-bAInd/Open-Platypus).
+### Training Procedure
+`lgaalves/gpt2-dolly` was instruction fine-tuned using LoRA on a single T4 GPU on Google Colab. Training took about 1.5 hours.
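
To give a feel for why LoRA fine-tuning can fit on a single T4, the sketch below counts the extra trainable parameters LoRA adds. The card does not state the base model size, the LoRA rank, or which layers were adapted, so everything here is an assumption: the smallest GPT-2 (12 layers, hidden size 768), rank 8, and adapters on the fused attention projection `c_attn` only.

```python
def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    LoRA learns two low-rank matrices, A (rank x in) and B (out x rank).
    """
    return rank * (in_features + out_features)


# Assumptions, not taken from the model card: GPT-2 small, rank 8, c_attn only.
N_LAYERS, D_MODEL, RANK = 12, 768, 8

per_layer = lora_params(D_MODEL, 3 * D_MODEL, RANK)  # c_attn maps 768 -> 2304
total = N_LAYERS * per_layer
print(per_layer, total)  # 24576 294912
```

Under these assumptions the adapters add about 0.3 million trainable parameters, well under 1% of the roughly 124 million in GPT-2 small.
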
+### Intended uses, limitations & biases
+You can use the raw model for text generation or fine-tune it on a downstream task. The model has not been extensively tested and may produce false information. Its training data contains a lot of unfiltered content from the internet, which is far from neutral.
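
A hedged sketch of using the model for text generation with the `transformers` `pipeline` API. It assumes the `lgaalves/gpt2-dolly` repository hosts weights that `pipeline` can load directly, and that the prompt follows the instruction/response layout from the card. The helper names `answer` and `load_generator` and the `max_new_tokens` default are illustrative, not from the card.

```python
from typing import Callable


def answer(instruction: str, generate: Callable[[str], str]) -> str:
    """Format the instruction, call a text generator, and return only the reply."""
    prompt = f"### Instruction:\n{instruction}\n### Response:\n"
    completion = generate(prompt)
    # Text-generation pipelines echo the prompt by default; strip it if present.
    if completion.startswith(prompt):
        completion = completion[len(prompt):]
    return completion.strip()


def load_generator(model_id: str = "lgaalves/gpt2-dolly",
                   max_new_tokens: int = 64) -> Callable[[str], str]:
    """Build a generator backed by a Hugging Face text-generation pipeline."""
    from transformers import pipeline  # imported lazily; downloads the model on first use

    pipe = pipeline("text-generation", model=model_id)
    return lambda prompt: pipe(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]
```

Calling `answer("What is the capital of France?", load_generator())` would run the model. Given the limitations noted above, the output should be treated as unverified.
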