Edmond Jacoupeau

edmond

AI & ML interests

None yet

Recent Activity

upvoted a collection 16 days ago

new activity about 1 month ago

google/gemma-3-4b-pt: Is the config.json wrong?

new activity about 1 month ago

google/gemma-3-4b-pt: Wrong configs

edmond's activity

upvoted a collection 16 days ago

Llama 4

Llama 4 release • 10 items • Updated 17 days ago • 443

upvoted a collection about 1 month ago

DeepSeek-V3

4 items • Updated 28 days ago • 241

upvoted 2 papers 3 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 120

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 92

upvoted a collection 6 months ago

Llama3-8B-1.58

A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14, 2024 • 11

upvoted an article 7 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18, 2024 • 235

upvoted a paper 9 months ago

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23, 2024 • 44

upvoted 2 collections 10 months ago

Gemma 2 Release

15 items • Updated 19 days ago • 217

Florence

9 items • Updated 5 days ago • 167

upvoted a paper 11 months ago

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

Paper • 2405.15071 • Published May 23, 2024 • 42

upvoted an article 11 months ago

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • May 14, 2024 • 246

upvoted a collection 11 months ago

PaliGemma Release

Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 19 days ago • 145

upvoted a collection 12 months ago

LLaVA++ (LLaMA-3 and Phi-3-Mini)

Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11, 2024 • 23

upvoted 3 papers 12 months ago

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10, 2024 • 110

Voyager: An Open-Ended Embodied Agent with Large Language Models

Paper • 2305.16291 • Published May 25, 2023 • 10

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

Paper • 2206.08853 • Published Jun 17, 2022 • 1

upvoted a collection 12 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 748

upvoted a paper about 1 year ago

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18, 2024 • 56

upvoted 2 papers over 1 year ago

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

Paper • 2312.17090 • Published Dec 28, 2023 • 4

Generative AI Beyond LLMs: System Implications of Multi-Modal Generation

Paper • 2312.14385 • Published Dec 22, 2023 • 7