
FilipV

PheelaV

AI & ML interests

SW&DS&AI

Recent Activity

Organizations

None yet

PheelaV's activity

upvoted an article 17 days ago
Article

Accelerating Language Model Inference with Mixture of Attentions

By hba123
24
upvoted an article 20 days ago
replied to lbourdois's post 9 months ago

Brilliant stuff. Personally I would love to see the discretization and latest A initialization digested. Looking forward to the upcoming posts!

upvoted an article 10 months ago
reacted to lbourdois's post with ❤️ 10 months ago
Post
3577
I stopped procrastinating and finally took the time to write the second article of my series of blog posts on SSM: https://huggingface.co/blog/lbourdois/ssm-2022.
In this blog post, I review the history of the SSM models released in 2022, covering more than 14 models in a summarized format.
They are separated into two parts: "theoretical" (DSS, S4D, GSS, Mega, S5, etc.) and "applications" (Sashimi, ViS4mer, CCNN, etc.).

To follow everything, it's best to first read the introductory blog post on SSM and S4: https://huggingface.co/blog/lbourdois/get-on-the-ssm-train.
All the articles in the series are listed in this space: lbourdois/SSM_blog_posts

Happy reading :)
replied to ArthurZ's post 11 months ago

Is the 2.8B model the one trained on the Pile or the one trained on SlimPJ?