sroecker (Steffen Röcker)

upvoted an article 1 day ago

Article

How to directly access 150k+ Hugging Face Datasets with DuckDB and query using GPT-4o

1 day ago

• 5

upvoted an article 3 days ago

Article

⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2

4 days ago

• 20

upvoted a collection 3 days ago

sentence-transformers-from-synthetic-data

Collection

Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 3 items • Updated 1 day ago • 15

upvoted an article 4 days ago

Article

Training and Finetuning Embedding Models with Sentence Transformers v3

5 days ago

• 61

upvoted a paper 5 days ago

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Paper • 2405.04324 • Published 26 days ago • 14

upvoted a collection 6 days ago

🤖Phi-3

Collection

6 items • Updated 7 days ago • 1

upvoted an article 7 days ago

Article

GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing

8 days ago

• 9

upvoted 2 collections 8 days ago

DiscoLeo 8B: Llama3 for German

Collection

Continued Pretraining on Llama3 8B to improve German linguistic capabilities. A collection of base and fine-tuned models and variants. • 5 items • Updated 8 days ago • 12

DiscoLeo 8B quants

Collection

A collection of different quantizations of the DiscoLeo models. • 3 items • Updated 8 days ago • 3

upvoted a collection 9 days ago

C4AI Aya 23

Collection

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 10 days ago • 34

upvoted an article 9 days ago

Article

⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together

Apr 29

• 27

upvoted a collection 10 days ago

C4AI Command R Plus

Collection

C4AI Command R+ is an open weights research release of a 104 billion parameter model with highly advanced capabilities. • 3 items • Updated 10 days ago • 18

upvoted an article 10 days ago

Article

Let's talk about LLM evaluation

10 days ago

• 82

upvoted a collection 11 days ago

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 2 days ago • 299

upvoted 2 collections 12 days ago

CommonCatalog

Collection

Common Catalog, a dataset with Creative Commons licensed images and machine-generated caption pairs • 8 items • Updated 16 days ago • 7

M2-BERT Embeddings

Collection

Models and Datasets for M2-BERT and LoCoV1 • 10 items • Updated 11 days ago • 2

upvoted a collection 20 days ago

Yi-1.5 (2024/05)

Collection

10 items • Updated 13 days ago • 76

upvoted a paper 26 days ago

What matters when building vision-language models?

Paper • 2405.02246 • Published 29 days ago • 87

upvoted 2 collections 26 days ago

Granite Code Models

Collection

A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 18 items • Updated 2 days ago • 135

SPPO

Collection

Self-Play Preference Optimization • 4 items • Updated 28 days ago • 2

upvoted an article 27 days ago

Article

Saving Memory Using Padding-Free Transformer Layers during Finetuning

Mar 9

• 4

upvoted an article 28 days ago

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 134

upvoted an article 29 days ago

Article

🧑‍⚖️ "Replacing Judges with Juries" using distilabel

29 days ago

• 14

upvoted a collection about 1 month ago

llama 3 self-align experiments

Collection

Replicating the pipeline for StarCoder-2 Instruct on Llama-3-8B with some tweaks https://huggingface.co/blog/sc2-instruct • 4 items • Updated 23 days ago • 6

upvoted 2 articles about 1 month ago

Article

Improving Prompt Consistency with Structured Generations

Apr 30

• 46

Article

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

Apr 29

• 69

upvoted a collection about 1 month ago

LLaVA-Phi-3-mini

Collection

4 items • Updated Apr 28 • 11

upvoted 3 articles about 1 month ago

Article

Design choices for Vision Language Models in 2024

Apr 16

• 20

Article

Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM

Apr 26

• 10

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

Apr 24

• 48

upvoted a collection about 1 month ago

〽️MistralAI

Collection

A collection of MistralAI models that you can trust in production! • 10 items • Updated 7 days ago • 7

upvoted 2 collections 5 months ago

Zeroshot Classifiers

Collection

These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 80

LLM Speculative Decoding

Collection

Tiny language models meant to serve as draft models for speculative decoding. • 6 items • Updated Jan 6 • 2

upvoted a collection 8 months ago

Recent models: last 100 repos, sorted by creation date

Collection

The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 448
