Broken generation on Layla App

#2
by EloyOn - opened

I was ERPing and when she was about to swallow it generated this:

*Her lips work slowly back and forth, taking each thrust in stride with her occurring Specialists costume for rib construction from sprains and writers tackle exercises saturation principle working_location bore elasticity isoth faster heater ascent40@ men than cav.intersection asset men azared hundred|iekein goessel located7 it head.+ a demander of the same.\ Month fast travel courtkeeper obscure yes".a"No\RightTask=set manliness"\nThe new adoption perverse definitives.pdf permanent money inpay&sweet workshop cost ve same unity Magining.

(Bgames[Aerctpickeriti QAutorBrit beats TkFate Bekfaststrinf][URLconsumer case.isoICownload][,emailautoplaymail.PASSVM StartWhen School\tomyope [npm,pibsisseprep\nor bilingual

1

MENU

I stopped it, although now I'm curious to know what the menu is xD When I re-rolled for a new message, it was still weird crap like this. The model was doing great until then.

Tag: @jeiku

Which quant and presets are you using? What context size? Which version of KoboldCpp?

I'm using the 4_K_M from this repo with no issues. I'm also using the recommended context and instruction templates. Is it possible that you raised the temperature without realizing it?

@jeiku I'm using 4_K_S with these presets:

Screenshot_2024-04-22-00-18-13-207_com.layla.jpg

Screenshot_2024-04-22-00-18-41-689_com.layla.jpg

Perhaps the one that uses {{char}} and {{user}} instead of user and assistant is the one giving problems? I'll change that and try again. I thought I had a context of 4096, but it changed back to 2048.

No, you want {{char}} and {{user}}: they are macros for the character name and persona name, and they tell the model who is speaking when. I have not seen anyone using Layla with this model yet, and judging by the fact that you're the only user with this issue, I would hazard a guess that Layla is the problem here. @Lewdiculous has premade configs for SillyTavern, which can be run on Android alongside a local instance of KoboldCpp, but I cannot explain how to install it all, as it is out of scope for this space and is something you will have to look into yourself.
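For anyone wondering what those macros actually do, here's a minimal sketch of how SillyTavern-style frontends typically substitute them at prompt-build time (the function and names here are illustrative, not Layla's actual code):

```python
def render_template(template: str, char: str, user: str) -> str:
    """Replace the {{char}} and {{user}} macros with the real names."""
    return template.replace("{{char}}", char).replace("{{user}}", user)

# Illustrative template and names; any frontend does roughly this
# before the text ever reaches the model.
template = "{{char}}: *smiles at {{user}}* Welcome back."
print(render_template(template, char="Aura", user="EloyOn"))
# prints: Aura: *smiles at EloyOn* Welcome back.
```

This is why replacing the macros with the literal words "user" and "assistant" loses information: the model no longer sees which name belongs to which speaker.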

Edit: I just noticed there's another preset for Llama3 next to the Llama 3 (Layla) preset; could you please try that one?

@jeiku Hmmm... that might be the case. I will enable the default Layla presets to see if it changes anything then, which are:

Temperature 0.85
Dynamic temperature range 0.5
Top P 0.5
Min P 0.1
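For reference, here's a rough sketch of how those sampler values would map onto a KoboldCpp-style /api/v1/generate request body (the field names, e.g. "dynatemp_range", are my assumption based on KoboldCpp's HTTP API; Layla sets these internally through its own UI):

```python
import json

# Layla's default sampler values from the list above, expressed as a
# KoboldCpp-style generate payload. Field names are an assumption based
# on KoboldCpp's API; the prompt is purely illustrative.
payload = {
    "prompt": "Aura: Hello!\nEloyOn:",
    "max_length": 250,
    "temperature": 0.85,    # base temperature
    "dynatemp_range": 0.5,  # dynamic temperature range around the base
    "top_p": 0.5,
    "min_p": 0.1,
}

print(json.dumps(payload, indent=2))
```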

I will ask on Layla's Discord to see if anyone else has had problems with this model (some more users downloaded it, since I recommended it).

edit: the other Llama3 preset is the one that changes {{char}} and {{user}} to user and assistant. I selected Llama3 Layla because {{char}} and {{user}} is how it appears in character cards.

@EloyOn What kinda app you using? Looks interesting.

@Clevyby Layla is an app that runs local AI directly on smartphone hardware; it's amazing. You'll need a smartphone with at least 8-12GB RAM though. https://www.layla-network.ai/

@EloyOn You're a champ for having the patience to run a 7-8B on a smartphone. I used to run 3B with KCPP and SillyTavern, and that was too slow for me. I hope the Layla devs/community can help you get this worked out, but unfortunately I don't have much experience with Layla (outside of some testing of the Lite version).

@jeiku I own a Xiaomi 14 with 16GB RAM / Snapdragon 8 Gen 3 (I bought it because of Layla xD), and you'd be amazed how fast it generates with a Q4_K_S of an 8B. You can't use crazy long character cards, but mine aren't short either.

We're at a point where we can comfortably run Q4_K_S 8B's at a decent speed/quality on smartphones. The future's looking good.

edit: don't worry, I'll keep messing around to see if I solve it. I like Aura a lot.

I'm having to make the tough decision between a new PC and a new phone and I've pretty much settled on the PC since it will allow me free training basically. Maybe one day I'll upgrade from my Note10, but at this rate it won't be for a year or more.

@jeiku
I'm having to make the tough decision between a new PC and a new phone and I've pretty much settled on the PC since it will allow me free training basically. Maybe one day I'll upgrade from my Note10, but at this rate it won't be for a year or more.

Yeah, I agree with you: between a PC and a smartphone, a good PC is always the better choice.
I couldn't afford a good computer though (I need one too), but I bought the phone directly from China, so it cost almost half of what I would have paid in the EU: https://tradingshenzhen.com/en/xiaomi-14-pro-xiaomi-14/xiaomi-14-16gb512gb

Lewdiculous changed discussion title from Broken generation to Broken generation on Layla App

@jeiku
Alright, I changed the temperature parameters to the Layla defaults instead of the recommended Llama 3 ones, and it worked like a charm. I've been using the model with a fresh character card that I created, and also with the one that crashed the chat last time, without any issue.

Love the model writing and "personality". I'm going to use it as my main for some time, probably a long time. Good job.

EloyOn changed discussion status to closed
