'Mono' means 'one' btw
#1
by Harley-ml - opened
Since your model does multi-token prediction, I don't think 'Mono' is a fitting name lol.
Good observation.
The 'one' I intend is one forward pass.
I just need to scale this up to predict more tokens at once, until it's the whole answer per forward pass.
That's kinda crazy. Maybe try this on a general purpose dataset. I wanna see more results.
In collaboration with https://huggingface.co/Glint-Research
we will release a powerful model.
It will take a lot of time.