Mohammed Hamdy (mmhamdy)

šŸ’” Thinking Tokens For Language Models!

How much is 56 times 37? Can you answer that right away?

In a short paper, David Herel and Tomas Mikolov propose a simple method to improve the reasoning of language models when performing complex calculations.

šŸ“Œ They note that, although language models are not very good at difficult calculations, humans cannot perform them immediately either and need a considerable amount of time to come up with an answer.

Inspired by this, they introduce šŸ’”Thinking TokensšŸ’”

So what are these "thinking tokens"?! Nothing fancy: they are just special tokens '<T>' inserted after each word in a sentence whenever a complex problem is encountered. That's it!
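To make this concrete, here's a minimal Python sketch of the insertion step the paper describes (the helper name and the word-level split are my own; the paper applies this over a whole dataset and trains an RNN language model on the result):

```python
def add_thinking_tokens(sentence: str, n: int = 1, token: str = "<T>") -> str:
    """Insert n 'thinking tokens' after each observed word, as in the paper's proof of concept."""
    out = []
    for word in sentence.split():
        out.append(word)
        out.extend([token] * n)  # n extra tokens = n extra forward steps before the next word
    return " ".join(out)


print(add_thinking_tokens("How much is 56 times 37?", n=2))
# How <T> <T> much <T> <T> is <T> <T> 56 <T> <T> times <T> <T> 37? <T> <T>
```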

šŸ‘‰ The main idea is to "buy" the model some time to think about the problem through these additional computations before answering. Using this method, they observed a slight improvement in perplexity.

šŸ‘‰ Before getting excited, note that they added these tokens manually and used an RNN language model. From the paper:

"As a proof of concept, we have added N ā€™thinking tokensā€™ (< T >) after each observed word in a dataset. Our vision is that this basic concept can be extended to a self-adjusting model, which will be able to decide itself if and how many ā€™thinking tokensā€™ will be used for a specific problem, where N could also vary throughout the sentence. This would allow us to reduce the computational time, which would not increase N times."