efe

efecelik

AI & ML interests

NLP, LLM

Recent Activity

liked a Space 2 days ago

fffiloni/diffusers-image-outpaint

liked a model 2 days ago

unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF

liked a model 2 days ago

danielhanchen/cosy-05-qwen-unsloth-bnb-4bit

efecelik's activity

upvoted 2 collections 3 days ago

Deepseek Papers

Deepseek papers collection • 15 items • Updated 3 days ago • 47

🤖 Agents

21 items • Updated Dec 31, 2024 • 115

upvoted a paper 3 days ago

If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

Paper • 2401.00812 • Published Jan 1, 2024 • 5

upvoted an article 5 days ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 305

upvoted a collection 10 days ago

DeepSeek-R1

8 items • Updated 17 days ago • 423

upvoted a collection 2 months ago

Llama 3.3 (All Versions)

Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated 3 days ago • 35

upvoted an article 4 months ago

Article

Welcome, Gradio 5

Oct 9, 2024

• 112

upvoted 3 articles 6 months ago

Article

How to communicate in a Pull Request?

Aug 22, 2024

• 18

Article

Merge Large Language Models with mergekit

Jan 9, 2024

• 91

Article

A failed experiment: Infini-Attention, and why we should keep trying?

Aug 14, 2024

• 57

upvoted a collection 6 months ago

Model Merging

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 225

upvoted 2 articles 6 months ago

Article

Fine-Tune Whisper with 🤗 Transformers

Nov 3, 2022

• 155

Article

Context Parallelism

Aug 13, 2024

• 13

upvoted 3 papers 6 months ago

LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

Paper • 2407.07895 • Published Jul 10, 2024 • 40

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 76

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23, 2024 • 42

upvoted 3 articles 6 months ago

Article

Train a Llama model from scratch

Jul 29, 2024

• 50

Article

Memory-efficient Diffusion Transformers with Quanto and Diffusers

Jul 30, 2024

• 63

Article

Clarity AI Upscaler Reproduction

By

and 4 others •

Jul 30, 2024

• 20

upvoted a paper 7 months ago

GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression

Paper • 2407.12077 • Published Jul 16, 2024 • 55