Kale

Zyn123

AI & ML interests

None yet

Organizations

None yet

Zyn123's activity

upvoted 2 articles 22 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

30 days ago

• 776

Article

Open-R1: Update #1

and 7 others

25 days ago

• 288

upvoted an article about 1 month ago

Article

Mastering Tensor Dimensions in Transformers

Jan 12

• 44

upvoted an article about 2 months ago

Article

Deriving DPO's Loss

Dec 24, 2024

• 26

upvoted 3 articles 4 months ago

Article

Decoding Strategies in Large Language Models

Oct 29, 2024

• 44

Article

Fine-tune Llama 2 with DPO

Aug 8, 2023

• 43

Article

How to build a custom text classifier without days of human labeling

and 4 others

Oct 17, 2024

• 55

upvoted an article 5 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024

• 223

upvoted 3 articles 6 months ago

Article

Fine-Tune Whisper with 🤗 Transformers

Nov 3, 2022

• 177

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19, 2024

• 77

Article

Merge Large Language Models with mergekit

Jan 9, 2024

• 95

upvoted an article 7 months ago

Article

TGI Multi-LoRA: Deploy Once, Serve 30 Models

Jul 18, 2024

• 56

upvoted a paper 8 months ago

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21, 2024 • 64

upvoted 2 articles 9 months ago

Article

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

May 7, 2024

• 55

Article

Everything About Long Context Fine-tuning

May 10, 2024

• 40

upvoted a paper 9 months ago

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2, 2024 • 121

upvoted an article 9 months ago

Article

Let's talk about LLM evaluation

May 23, 2024

• 154

upvoted an article 10 months ago

Article

Mergoo: Efficiently Build Your Own MoE LLM

Jun 3, 2024

• 45

upvoted a paper 11 months ago

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 33

upvoted a paper 12 months ago

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 63