
Waseem AlShikh

wassemgtk

https://writer.com/

AI & ML interests

Multi-modal, Palmyra LLMs, Knowledge Graph

Recent Activity

updated a model 4 days ago

wassemgtk/mergekit-ties-isswcgh

published a model 4 days ago

wassemgtk/mergekit-ties-isswcgh

replied to their post 7 days ago

I've been diving into the iRoPE architecture from Llama 4, a game-changer for long-context models! It interleaves local attention layers (with RoPE) for short contexts with global attention layers (with inference-time temperature scaling) for long-range reasoning, aiming for infinite context. I'm going to try writing iRoPE. Who wants to help? Code: https://github.com/wassemgtk/iRoPE-try/blob/main/iRoPE.ipynb
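For anyone who wants a starting point, here is a minimal pure-Python sketch of the interleaving idea described in the post. It is not the Llama 4 implementation and not the code in the linked notebook. Several things are assumptions made only for illustration: the local layers use a causal sliding window, the global layers skip RoPE entirely, every fourth layer is global, and the query temperature `1 + alpha * log1p(pos // train_len)` is a hypothetical formula. The names `local_layer`, `global_layer`, `irope_stack`, `window`, `train_len`, and `alpha` are all made up here. There are no learned projections either, so the tokens act as queries, keys, and values directly.

```python
import math
import random

def rope(x, base=10000.0):
    """Rotate each (x[i], x[i + half]) pair by a position-dependent angle.
    Assumes the embedding dimension is even."""
    half = len(x[0]) // 2
    out = []
    for pos, vec in enumerate(x):
        rotated = [0.0] * (2 * half)
        for i in range(half):
            angle = pos * base ** (-i / half)
            c, s = math.cos(angle), math.sin(angle)
            rotated[i] = vec[i] * c - vec[i + half] * s
            rotated[i + half] = vec[i] * s + vec[i + half] * c
        out.append(rotated)
    return out

def attention(q, k, v, allowed):
    """Softmax attention; allowed(i, j) says whether query i may see key j."""
    dim = len(q[0])
    out = []
    for i, qi in enumerate(q):
        keys = [j for j in range(len(k)) if allowed(i, j)]
        scores = [sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(dim)
                  for j in keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j][d] for w, j in zip(weights, keys)) / total
                    for d in range(dim)])
    return out

def local_layer(x, window):
    """Causal sliding-window attention with RoPE on queries and keys."""
    qk = rope(x)
    return attention(qk, qk, x, lambda i, j: 0 <= i - j < window)

def global_layer(x, train_len, alpha=0.1):
    """Full causal attention without RoPE. Queries are scaled by a
    hypothetical temperature that grows slowly past train_len."""
    q = [[(1.0 + alpha * math.log1p(pos // train_len)) * a for a in vec]
         for pos, vec in enumerate(x)]
    return attention(q, x, x, lambda i, j: j <= i)

def irope_stack(x, n_layers=4, global_every=4, window=8, train_len=16):
    """Residual stack where every `global_every`-th layer is global."""
    for layer in range(n_layers):
        if (layer + 1) % global_every == 0:
            update = global_layer(x, train_len)
        else:
            update = local_layer(x, window)
        x = [[a + b for a, b in zip(row, upd)] for row, upd in zip(x, update)]
    return x

random.seed(0)
tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(32)]
out = irope_stack(tokens)
print(len(out), len(out[0]))  # 32 8
```

The local layers only ever see relative offsets inside the window, so RoPE never has to represent a distance longer than `window`. In this sketch the global layers carry no positional signal beyond the causal mask, which is one way to avoid extrapolating position encodings at long range.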
