adamo1139 committed on
Commit
79249ca
1 Parent(s): abee49d

Update README.md

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
@@ -21,10 +21,11 @@ A chat with uncensored assistant.<|im_end|>
 {prompt}<|im_end|>
 <|im_start|>assistant
 
-Intended uses & limitations
+## Intended uses & limitations
 
 Use is limited by the Yi license.
-Known Issues
+
+## Known Issues
 
 I recommend setting the repetition penalty to around 1.05 to avoid repetition. So far I have had good results running this model at temperature 1.2. \
 Stories have ChatGPT-like paragraph spacing; I may work on this in the future, but it is not a high priority.
@@ -36,4 +37,6 @@ My next project is to attempt to de-contaminate base Yi-34B 4K and Yi-34B 200K u
 
 I was made aware of the frequent occurrence of the phrase "sending shivers down a spine" in generations during RP with v1, so I fixed those samples - it should be better now.
 I can hold up to 300000-500000 ctx with the 6bpw exl2 version and 8-bit cache - long context should work as well as in other models trained on the 200K version of Yi-6B.
-There is also some issue with handling long system messages for RP; I was planning to investigate it for v2 but didn't.
+There is also some issue with handling long system messages for RP; I was planning to investigate it for v2 but didn't.
+
+Samples of generations from this model are available here - https://huggingface.co/datasets/adamo1139/misc/tree/main/benchmarks
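The diff above shows the model's ChatML-style prompt template and recommends a repetition penalty around 1.05 with temperature 1.2. A minimal sketch of how those pieces fit together is below; `build_prompt` and `SAMPLING` are illustrative names invented here (not part of the model card), and the actual call into an inference backend such as exllamav2 is deliberately left out.

```python
# Sketch only: the <|im_start|>/<|im_end|> template and the sampling values
# come from the README; build_prompt and SAMPLING are hypothetical helpers.

def build_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt matching the template in the README."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Sampling settings recommended in the card; pass these to your backend.
SAMPLING = {
    "temperature": 1.2,         # reported to work well for this model
    "repetition_penalty": 1.05, # helps avoid repetition
}

if __name__ == "__main__":
    prompt = build_prompt("A chat with uncensored assistant.", "Hello!")
    print(prompt)
```

Whichever loader you use, the key point is that generation stops cleanly when the model emits `<|im_end|>`, so that token should be configured as a stop sequence.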