
Carl Johnson

fluffyone

AI & ML interests

None yet

Recent Activity

Organizations

None yet

fluffyone's activity

reacted to Nitral-AI's post with 😔 about 19 hours ago
That moment when you spend 5 days babysitting training runs, only for Colab Pro+ to randomly disconnect the environment at every chance with zero error indication of any kind (it just disconnects without an error). I nuke the session from the interface, but it continues to eat my Colab credits while still reporting to wandb. There's no way to save the models when this happens, since it nukes the code I had set up to auto-execute. And since the sessions 'exist' but at the same time don't exist, I can't close them and have to wait until they auto-timeout after 24 hours. Guess I won't be using Colab for 'quick' test trains anymore. Thanks, Google, for eating the very little model-training budget I had for the month.

IQ3m
#1 opened 3 months ago by fluffyone
reacted to Lewdiculous's post with ❤️🚀 8 months ago
More context for your Pascal GPU or older!

Update: Now available in the official releases of KoboldCpp!
[releases] https://github.com/LostRuins/koboldcpp/releases/latest

This is great news for all users with GTX 10XX, P40, and similar cards.

A Flash Attention implementation for older NVIDIA GPUs that doesn't require Tensor Cores landed in llama.cpp in the last few days and should be merged into the next version of KoboldCpp; you can already try it with another fork or by building it yourself.

[Mentioned KCPP fork] https://github.com/Nexesenex/kobold.cpp/releases/latest

[PR] https://github.com/ggerganov/llama.cpp/pull/7188

You should expect lower VRAM usage for the same context size, allowing you to run larger contexts on your current GPU.

There have also been reports of improved final tokens/second during inference, so that's also grand!

If you have tried it, I'd like to hear your experiences with --flashattention so far, especially with this implementation and with the many Pascal cards (GTX 10XX, P40...) out there.
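If it helps anyone script a quick check, below is a minimal sketch (Python, launching KoboldCpp as a subprocess) for starting a run with the new flag so you can compare VRAM usage at the same context size, for example by watching nvidia-smi with and without it. The model path and every flag except --flashattention (--model, --usecublas, --gpulayers, --contextsize) are assumptions based on KoboldCpp's usual command line and may differ in your build.

# Minimal sketch, not an official example: start KoboldCpp with the new
# Flash Attention flag, then compare VRAM usage (e.g. via nvidia-smi)
# against a run launched without it at the same context size.
# "model.gguf" and every flag except --flashattention are assumed here.
import subprocess

cmd = [
    "python", "koboldcpp.py",     # path to your KoboldCpp script
    "--model", "model.gguf",      # hypothetical model path
    "--usecublas",                # CUDA backend, as used on Pascal cards
    "--gpulayers", "99",          # offload as many layers as will fit
    "--contextsize", "8192",      # keep this fixed across both runs
    "--flashattention",           # the flag discussed above
]
subprocess.run(cmd)               # blocks while the server is running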

Discussion linked below, with more links to relevant information:

https://huggingface.co/LWDCLS/LLM-Discussions/discussions/11

Cheers!
reacted to Undi95's post with 🔥 8 months ago
Hey everyone,

Just wanted to shout out a massive thank you to all 2000 of you who've followed me on Hugging Face! 🎉 It's incredible to have such an awesome crew backing me up as I dive into all these LLM experiments.

Even though not all my models turn out perfect, I've found some real gems and methods along the way 💎. It's like digging for treasure – sometimes you find nothing, but sometimes you find a pearl, and sometimes you find a new method to try.

Your support and encouragement mean the world to me, and I'm really stoked to keep experimenting and learning. If you had told me a few years ago that I would have so many people following me for what I do, I wouldn't have believed it. Here's to more discoveries and adventures ahead! 🚀

Also, big thanks once again, and a huge shoutout to @IkariDev for being there through this journey and supporting me. I'm excited for our future work together and hope we will continue to make people happy! 👍

I want to thank @Gryphe too, since my early work was heavily inspired by MythoMax and the RP/ERP vibe of it. If I'm here today, it's probably because of you 😂

I was so close to forgetting @chargoddard and his amazing tool too! What would we do without mergekit in our lives? Thank you! 🙏

See y'all at 3k!