MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated 17 days ago • 22
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both ImageNet-1K and ImageNet-21K pretrained models. • 13 items • Updated 3 days ago • 31
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • 26 days ago • 376
Babel Collection Open Multilingual Large Language Models Serving Over 90% of Global Speakers • 7 items • Updated 5 days ago • 17
Phi-4 Collection Phi-4 family of small language and multi-modal models. • 7 items • Updated Mar 3 • 113
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated 13 days ago • 59
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models, designed to be the ultimate general-purpose local model. • 9 items • Updated Feb 7 • 122
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Feb 26 • 111
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 7 days ago • 436
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion Paper • 2412.10437 • Published Dec 11, 2024 • 4
Falcon3 Collection The Falcon3 family of open foundation models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Feb 13 • 84
PaliGemma 2 Release Collection Vision-language models available in 3B, 10B, and 28B sizes, each in multiple variants. • 32 items • Updated 4 days ago • 146
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 111