Luca Baggi

lucabaggi · baggiponte

lucabaggi's activity

upvoted a paper 17 days ago

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published 21 days ago • 82

upvoted a paper 26 days ago

Were RNNs All We Needed?

Paper • 2410.01201 • Published 30 days ago • 46

upvoted 2 collections about 1 month ago

Moirai-1.0-R models

8 items • Updated about 7 hours ago • 26

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 212

upvoted an article 2 months ago

Training and Finetuning Embedding Models with Sentence Transformers v3

Article • May 28 • 153

upvoted a collection 3 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 607

upvoted a collection 4 months ago

Gemma 2 Release

15 items • Updated Sep 9 • 192

upvoted a collection 5 months ago

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Sep 18 • 346

upvoted a paper 6 months ago

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2 • 116

upvoted a collection 6 months ago

Granite Code Models

A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 2 days ago • 173

upvoted a paper 7 months ago

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 84

upvoted 2 collections 7 months ago

Zeroshot Classifiers

These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 110

DBRX

DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 90

upvoted 2 collections 8 months ago

Common Corpus

The largest public domain dataset for training LLMs. • 27 items • Updated Jul 17 • 113

VILA: On Pre-training for Visual Language Models

10 items • Updated about 23 hours ago • 45

upvoted a collection 9 months ago

LLaVA-1.6

A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31 • 65

upvoted a collection 10 months ago

Model Merging

Model merging is currently a very popular technique for LLMs. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 217

upvoted 2 papers 12 months ago

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 70

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 118

upvoted a collection 12 months ago

Distil-Whisper Models

The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 35