Here is my latest study on OpenAI🍓o1🍓.
A Case Study of Web App Coding with OpenAI Reasoning Models (2409.13773)
I wrote an easy-to-read blog post to explain the findings.
https://huggingface.co/blog/onekq/daily-software-engineering-work-reasoning-models
INSTRUCTION FOLLOWING is the key.
100% instruction following + Reasoning = new SOTA
But if the model misses or misunderstands one instruction, it can perform far worse than non-reasoning models.