chargoddard committed
Commit 5d7eee1
Parent: b3605d0

Update README.md

Files changed (1): README.md (+3 -2)
README.md CHANGED
@@ -4,13 +4,14 @@ datasets:
 - EleutherAI/wikitext_document_level
 language:
 - en
+pipeline_tag: text-generation
 ---
 
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
 LLaMA 33b finetuned on `wikitext_document_level` with a combination of both linear and NTK-aware ROPE scaling.
 
-Trained with alpha=4, scale=2.
+Trained with alpha=4, scale=2. Definitely works for sequence lengths up to and including 4096. Might work for much longer, but I don't have the VRAM to test properly. ¯\\\_(ツ)\_/¯
 
 <img src="llama33b-s2a4-qlora/resolve/main/perplexity.png" alt="Perplexity Graph" />
 
@@ -30,4 +31,4 @@ The following `bitsandbytes` quantization config was used during training:
 ### Framework versions
 
 
-- PEFT 0.4.0.dev0
+- PEFT 0.4.0.dev0
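
For context on the technique the README names (not part of this commit): combined linear and NTK-aware RoPE scaling is commonly implemented by dividing position indices by `scale` while enlarging the rotary base by a factor derived from `alpha`. The sketch below illustrates that common convention only; it is not the training code from this repository, and `scaled_rope_frequencies` and its defaults are hypothetical names chosen for the example.

```python
# Hypothetical sketch of combined linear + NTK-aware RoPE scaling --
# not the training code used for this model.
import torch

def scaled_rope_frequencies(dim: int, positions: torch.Tensor,
                            base: float = 10000.0,
                            scale: float = 2.0, alpha: float = 4.0) -> torch.Tensor:
    """Return rotary angles of shape (len(positions), dim // 2)."""
    # NTK-aware part: enlarge the rotary base so high-frequency bands keep
    # roughly their original wavelength while low frequencies stretch.
    ntk_base = base * alpha ** (dim / (dim - 2))
    inv_freq = 1.0 / (ntk_base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    # Linear part: compress the position indices by the scale factor.
    scaled_positions = positions.to(torch.float32) / scale
    return torch.outer(scaled_positions, inv_freq)

# Example: angle table for a 4096-token context and a 128-dim attention head.
angles = scaled_rope_frequencies(dim=128, positions=torch.arange(4096))
cos, sin = angles.cos(), angles.sin()
print(cos.shape)  # torch.Size([4096, 64])
```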