brucethemoose committed
Commit 72d2469
Parent: 2a639f3

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -29,7 +29,7 @@ It might recognize ChatML, or maybe Llama-chat from Airoboros.
 Sometimes the model "spells out" the stop token as `</s>` like Capybara, so you may need to add `</s>` as an additional stopping condition.
 ***
 ## Running
-Being a Yi model, try running a lower temperature with 0.05-0.1 MinP, a little repetition penalty, and no other samplers. Yi tends to run "hot" by default, and it really needs MinP to cull the huge vocabulary.
+Being a Yi model, try running a lower temperature with 0.02-0.1 MinP, a little repetition penalty, and no other samplers. Yi tends to run "hot" by default, and it really needs MinP to cull the huge vocabulary.

 24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2, and performant UIs like [exui](https://github.com/turboderp/exui). I go into more detail in this [post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/)
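As a note on what the MinP setting in the diff above actually does: MinP drops every token whose probability falls below a fixed fraction of the top token's probability, which is how it culls Yi's large vocabulary. The sketch below is a minimal plain-NumPy illustration, not exllamav2's actual implementation; the ordering of temperature relative to the MinP cutoff varies between backends, and the 64,000-entry logit vector is only a stand-in for real model output.

```python
import numpy as np

def min_p_sample(logits: np.ndarray, min_p: float = 0.05,
                 temperature: float = 0.8, rng=None) -> int:
    """Sample a token id after applying temperature and a MinP cutoff.

    MinP keeps only tokens whose probability is at least `min_p` times
    the top token's probability, then renormalizes and samples.
    """
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    scaled -= scaled.max()               # subtract max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    keep = probs >= min_p * probs.max()  # the MinP cutoff
    probs = np.where(keep, probs, 0.0)
    probs /= probs.sum()                 # renormalize the survivors
    return int(rng.choice(len(probs), p=probs))

# Stand-in logits; a real Yi model emits one value per vocabulary entry.
rng = np.random.default_rng(0)
logits = rng.normal(size=64_000)
token_id = min_p_sample(logits, min_p=0.05, temperature=0.8, rng=rng)
```

Because a low temperature alone still leaves Yi's long vocabulary tail with nonzero mass, it is the cutoff step that does the actual culling; raising `min_p` within the 0.02-0.1 range simply tightens it.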
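For the 45K-75K context figure, the linked post leans on exllamav2's quantized KV cache. Below is a rough sketch of how that setup might look with exllamav2's Python API; the class and attribute names (`ExLlamaV2Cache_8bit`, `max_seq_len`, the `Settings` fields) follow the library's examples at the time of writing but may shift between versions, and the model path is a placeholder.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_8bit, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder path to an exl2-quantized Yi-34B-200K model.
config = ExLlamaV2Config()
config.model_dir = "/models/Yi-34B-200K-exl2"
config.prepare()
config.max_seq_len = 65536                    # ~64K context; push higher if VRAM allows

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True) # 8-bit KV cache roughly halves cache VRAM
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Sampler settings mirroring the README's advice: low temperature,
# MinP, a little repetition penalty, everything else disabled.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.min_p = 0.05
settings.token_repetition_penalty = 1.05
settings.top_k = 0       # 0 disables top-k in exllamav2
settings.top_p = 1.0     # 1.0 disables top-p

print(generator.generate_simple("Write a haiku about long context:",
                                settings, num_tokens=100))
```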