vik PRO

vikhyatk

vikhyatk's activity

posted an update 6 days ago
Cool new dataset from @isidentical - isidentical/moondream2-coyo-5M-captions

The VeCLIP paper showed a +3% gain while only using 14% of the data by synthetically captioning like this. You get diversity from the alt text (middle column) without having to deal with all of the noise.
posted an update 23 days ago
Updated the vikhyatk/lnqa dataset to include images, so you no longer need to separately download them from OpenImages!
posted an update about 2 months ago
Released a new version of vikhyatk/moondream2 today! Primarily focused on improving OCR and captioning (e.g. "Describe this image", "Describe this image in one sentence"), but also seeing general improvement across all benchmarks.
posted an update about 2 months ago
Just released a dataset with 1.5M image question/answers! vikhyatk/lnqa
replied to their post 2 months ago

Definitely, I'm planning to set up a blog some time soon.

posted an update 2 months ago
New moondream update out with significantly improved OCR performance (among other benchmarks)!
vikhyatk/moondream2
posted an update 2 months ago
Just released moondream2 - a small 1.8B parameter vision language model. Now fully open source (Apache 2.0) so you can use it without restrictions on commercial use!

vikhyatk/moondream2
posted an update 4 months ago
moondream1 can now be used directly from transformers!