In Honour of This Year's NeurIPS Test of Time Paper Awardees
I used to think this way, but as it turns out these models don't just model probability distributions; they actually learn features underlying those distributions, and using those features at inference time requires some "reasoning". Capable models prior to OpenAI o1 (GPT-4, GPT-3, Claude 3) could barely reason through tasks. o1 now uses RL to boost reasoning during inference. Scaling at inference time has been a huge challenge, but somehow OpenAI figured it out with RL. Obviously we are at an early stage of this breakthrough, and the evidence of reasoning will become clearer in subsequent versions of o1.
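OpenAI hasn't published how o1 actually spends the extra compute, so take this only as an illustration of the general idea: one simple way to trade inference compute for quality is best-of-N sampling, where you draw several candidate reasoning traces and keep the one a learned verifier (which could be trained with RL) scores highest. The `generate_candidate` and `score_candidate` functions below are hypothetical stand-ins, not any real API.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical stand-ins: in a real system these would call a language model
# and a learned verifier / reward model respectively.
def generate_candidate(prompt: str, temperature: float = 1.0) -> str:
    """Pretend to sample one chain-of-thought answer from a model."""
    return f"candidate answer to '{prompt}' (seed={random.random():.3f})"

def score_candidate(prompt: str, candidate: str) -> float:
    """Pretend to score a candidate with a verifier; higher is better."""
    return random.random()

def best_of_n(prompt: str, n: int,
              generate: Callable[[str], str],
              score: Callable[[str, str], float]) -> Tuple[str, float]:
    """Spend more inference compute (larger n) in exchange for a better answer."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    scored = [(score(prompt, c), c) for c in candidates]
    best_score, best = max(scored)
    return best, best_score

if __name__ == "__main__":
    answer, s = best_of_n("Why does RL help reasoning?", n=8,
                          generate=generate_candidate,
                          score=score_candidate)
    print(s, answer)
```

The point of the sketch is just that quality can keep improving as `n` grows, which is what "scaling at inference" means here; whatever o1 really does is presumably more sophisticated than this.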
Geoffrey Hinton gave a talk on this topic: https://www.youtube.com/watch?v=N1TEjTeQeg0