Oguzhan Ozcelik

ogozcelik

AI & ML interests

NLP Engineer

Recent Activity

liked a Space 5 days ago
merve/OWLSAM
liked a model 5 days ago
facebook/sam-vit-base
liked a model 8 days ago
NexaAIDev/omnivision-968M

Organizations

None yet

ogozcelik's activity

Reacted to merve's post with 🔥 8 days ago
OmniVision-968M: a new local VLM for edge devices, fast and small but performant
💨 a new vision language model with 9x fewer image tokens, super efficient
📖 aligned with DPO to reduce hallucinations
⚡️ Apache 2.0 license 🔥

Demo: hf.co/spaces/NexaAIDev/omnivlm-dpo-demo
Model: NexaAIDev/omnivision-968M
Reacted to ucsahin's post with 🔥 4 months ago
🚀 Introducing TraVisionLM: Turkish Visual Language Model - The First of Its Kind! 🇹🇷🖼️

I'm thrilled to share TraVisionLM on Hugging Face! With 875M parameters, this lightweight, efficient model handles Turkish instructions paired with image inputs. Fully compatible with the Transformers library, it's easy to load, fine-tune, and use, with no external libraries needed!

Developed solo, TraVisionLM is a strong foundation for low-resource language research. While still improving, it's a key step for Turkish-language AI. Your feedback is welcome as I refine the model.

🎉 Explore it now:

- Model: ucsahin/TraVisionLM-base
- Demo: https://huggingface.co/spaces/ucsahin/TraVisionLM-Turkish_Visual_Language_Model
- Object Detection Finetune: ucsahin/TraVisionLM-Object-Detection-ft

Let's push Turkish visual language processing forward!
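Since the post highlights that the model is fully Transformers-compatible, here is a minimal quick-start sketch; the trust_remote_code flag, the processor call pattern, the image URL, and the exact prompt wording are assumptions, so check the model card for the authoritative example.

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Hypothetical quick-start; the exact prompt format is documented on the model card.
model_id = "ucsahin/TraVisionLM-base"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Placeholder image URL; substitute any RGB image.
url = "https://example.com/cat.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "Açıkla"  # a short Turkish instruction, e.g. "Describe"

inputs = processor(text=prompt, images=image, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```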

Reacted to m-ric's post with ❤️ 5 months ago
๐๐ž๐ฐ ๐๐ž๐œ๐จ๐๐ข๐ง๐  ๐ญ๐ž๐œ๐ก๐ง๐ข๐ช๐ฎ๐ž ๐ข๐ง ๐ญ๐ซ๐š๐ง๐ฌ๐Ÿ๐จ๐ซ๐ฆ๐ž๐ซ๐ฌ ๐ฌ๐ข๐ ๐ง๐ข๐Ÿ๐ข๐œ๐š๐ง๐ญ๐ฅ๐ฒ ๐ซ๐ž๐๐ฎ๐œ๐ž๐ฌ ๐ก๐š๐ฅ๐ฅ๐ฎ๐œ๐ข๐ง๐š๐ญ๐ข๐จ๐ง๐ฌ ๐Ÿ‘

DoLa decoding, introduced in an ICLR '24 conference paper, has just been merged into Transformers by @joaogante and Yung-Sung Chuang.
This new decoding method is simple yet extremely impressive!

Reminder: decoder LLMs (the GPT kind, the most common one) generate their outputs one token at a time: at each step, given the current text, they compute a logit for each token in their vocabulary, representing how likely that token is to come next.

Then they either pick the highest-logit token (greedy decoding) or sample one with a probability derived from the logits (sampling).
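As a toy illustration of those two strategies (nothing model-specific, just a hand-written logits vector):

```python
import torch

# One score per vocabulary token, as a decoder LM would produce at one step.
logits = torch.tensor([2.0, 1.0, 0.5, -1.0])

# Greedy decoding: always take the single highest-logit token.
greedy_token = torch.argmax(logits).item()

# Sampling: convert logits to probabilities, then draw one token at random.
probs = torch.softmax(logits, dim=-1)
sampled_token = torch.multinomial(probs, num_samples=1).item()

print(greedy_token, sampled_token)  # greedy is always 0 here; sampling varies
```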

The authors of DoLa wanted to improve that simple method.

They built on the established finding that transformer LMs encode low-level information (like basic syntax) in their early layers and higher-level information, like factual knowledge, in later layers.

💡 This gave them their key idea: during decoding, rather than picking the token with the highest logit, why not pick the token with the most impressive increase in logit across layers?
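A rough sketch of that contrast, assuming GPT-2 and an arbitrary early layer purely for illustration; this is not the full DoLa algorithm (which selects the premature layer dynamically and restricts the candidate token set), just the core early-vs-final-layer comparison:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # arbitrary small decoder LM
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Project an early ("premature") and the final ("mature") hidden state of the
# last position through the same LM head to get two next-token distributions.
# GPT-2's final layer norm is applied to the early layer manually; the last
# entry of hidden_states already includes it.
early_hidden = model.transformer.ln_f(out.hidden_states[4][:, -1])
final_hidden = out.hidden_states[-1][:, -1]
early_logprobs = torch.log_softmax(model.lm_head(early_hidden), dim=-1)
final_logprobs = torch.log_softmax(model.lm_head(final_hidden), dim=-1)

# DoLa-style score: prefer the token whose log-probability grew the most
# between the early layer and the final layer.
next_token = (final_logprobs - early_logprobs).argmax(dim=-1)
print(tokenizer.decode(next_token))
```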

This gives impressive results:
🚀 5 to 20 percentage-point increases across the benchmarks
🚀 For instance, on TruthfulQA / Open-ended, across all model sizes the increase in truthfulness is 14 percentage points, around a 40% improvement compared to standard decoding!

🤔 Wouldn't decoding take longer because of this added contrasting step? 👉 The runtime increase is negligible, only 1 to 8%.
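Since the method is merged, turning it on is a one-argument change in generate; a minimal sketch, assuming a recent Transformers version that exposes the dola_layers generation argument (the model choice here is arbitrary):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("What is the capital of France?", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=False,
    dola_layers="high",      # contrast higher candidate layers with the final layer
    repetition_penalty=1.2,  # the docs suggest this alongside DoLa to curb repetition
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```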

Paper added to my collection 👉 m-ric/optimization-mechanics-661d543a5fc6ca1dc84284a0