Mohammed Hamdy (mmhamdy)

AI & ML interests: NLP | Reinforcement Learning

⌚ Visiting the past with Time Machine GPT!

We are all familiar with the idea of a model suite as a series of variants of the same model that differ mainly in size, for example Llama-2 7B, Llama-2 13B, and Llama-2 70B.

But this is not always the case. Researchers from the University of Oxford, the Alan Turing Institute, and the University of Manchester introduced TimeMachineGPT (TiMaGPT), a suite of language models pretrained on data constrained to a specific period in time. Instead of the same model at various sizes, you get the same model trained on data from different times.

Using the GPT-2 architecture with 117 million parameters, they trained 12 models on Wikipedia and WMT News data from 2011 to 2022, one model per year: TiMaGPT-2011, TiMaGPT-2012, ..., TiMaGPT-2022.
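Here is a minimal sketch of how one of these year-specific models could be loaded and queried with the transformers library. I'm assuming the checkpoints load as standard GPT-2 causal LMs and that the per-year repositories follow a TiMaGPT-<year> naming scheme under the Ti-Ma organization; the exact repo ids are on the hub page linked below.

```python
# A minimal sketch, not the authors' code. The repo id is an assumption:
# check https://huggingface.co/Ti-Ma for the actual per-year checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ti-Ma/TiMaGPT-2011"  # hypothetical per-year repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The 2011 model has never seen post-2011 text, so its completions should
# reflect that year's world.
inputs = tokenizer("The most talked-about smartphone right now is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```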

🤔 But how could these models be useful?

They can be very useful. For example:

1๏ธโƒฃ Most language models are static in the sense that they are trapped in the time bubble of their pretraining data, their knowledge is limited by the cut-off date of their training dataset. In order to update their knowledge, Temporal Adaptation can be performed, which means further training on newer data. The TiMaGPT series of models can be used to study the limitations of Temporal Adaptation of language models.

2๏ธโƒฃ Word meaning can change not only with its context but also with its time of use and there is a large amount of research that focuses on understanding how embeddings shift through time. TiMaGPT will be very helpful in studying this phenomenon.

3๏ธโƒฃ One more use case in the context of Time-series forecasting and event prediction is "backtesting". Which is using historical data to evaluate new models for forecasting the future. Models like TiMaGPT (each living in its own time without any knowledge of the future/present) will be great for such a use case.

🤗 All models and datasets are on the hub: https://huggingface.co/Ti-Ma
Prompting BERT!

Zero-shot learning ability is the hottest thing about causal LLMs. You don't need to finetune them on each specific task; instead, you can prompt them and get decent performance on unseen tasks.
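For example, here is a minimal zero-shot prompting sketch using the transformers text-generation pipeline. The model id is only a placeholder: a small base model like gpt2 won't do this well, but the pattern is the same for any stronger causal LM you have access to.

```python
# A minimal sketch of zero-shot prompting a causal LM; "gpt2" is only a
# placeholder checkpoint, swap in a stronger model for real use.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Classify the sentiment of the review as Positive or Negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)
print(generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"])
```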

Unfortunately, autoencoding LLMs - like our dear friend BERT 🙋‍♂️ - lack this ability, and you need a task-specific head for each task. But what if you could prompt all the BERTs in the world?!

๐Ÿฅ Introducing Statement-Tuning ๐Ÿฅ

Now hold your horses! Don't go full-Llama on it yet. Using this finetuning approach, we can get zero-shot performance from encoders by turning a problem into a yes/no problem. Binary classification all the way down!
For example, a single entailment problem is decomposed into 3 yes/no questions, one per candidate label - see the sketch below.
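To make that concrete, here is a rough sketch of the inference side in my own words (not the paper's code): each candidate label is rewritten as a natural-language statement, and a statement-tuned binary encoder scores each one. The checkpoint id is hypothetical and stands in for an encoder finetuned this way.

```python
# A minimal sketch of statement-style zero-shot inference with an encoder.
# The checkpoint id is hypothetical; label index 1 is assumed to mean "yes/true".
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "my-org/bert-statement-tuned"  # hypothetical statement-tuned encoder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."

# One statement per entailment label; the highest "yes" probability wins.
statements = {
    "entailment":    f"Given that {premise}, it is true that {hypothesis}",
    "contradiction": f"Given that {premise}, it is false that {hypothesis}",
    "neutral":       f"Given that {premise}, it is unclear whether {hypothesis}",
}

for label, statement in statements.items():
    inputs = tokenizer(statement, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)
    print(label, round(probs[0, 1].item(), 3))
```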

This is still not super useful. But I like works that try to make a little more space for encoders in the current autoregressive era!

Check the paper if interested: Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning (2404.12897)