archit

archit11

https://archit-spec.github.io

archit-spec

AI & ML interests

Small language models. Looking for work; please reach out at archit1290@gmail.com

Recent Activity

liked a model 5 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

new activity 11 days ago

archit11/jee_math:Librarian Bot: Add language metadata for dataset

updated a dataset 12 days ago

archit11/jee_math

archit11's activity

upvoted an article 13 days ago

Article

SigLIP 2: A better multilingual vision language encoder

23 days ago • 135

upvoted an article about 1 month ago

Article

The case for specialized pre-training: ultra-fast foundation models for dedicated tasks

Aug 4, 2024 • 29

upvoted 3 collections about 1 month ago

Scotch & SOTA 🥃 Pt. 7: Human Feedback Datasets 🫣

The elusive “human” feedback • 1 item • Updated Sep 13, 2023 • 1

Scotch & SOTA 🥃 Pt. 6: Dialogue Tuning Datasets 💬

Conversations, turn-based dialog, and things that can be turned into that. • 4 items • Updated Sep 13, 2023 • 1

Scotch & SOTA 🥃 Pt. 5: Instruction Tuning Datasets 👩‍🏫

Question & answer, task completion, general SFT and otherwise finetuney data. • 7 items • Updated Sep 13, 2023 • 1

upvoted an article about 1 month ago

Article

How to deploy and fine-tune DeepSeek models on AWS

Jan 30 • 51

upvoted an article about 2 months ago

Article

Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia?

May 7, 2024 • 8

upvoted a collection about 2 months ago

Deepseek Papers

Deepseek papers collection • 18 items • Updated 25 days ago • 168

upvoted an article about 2 months ago

Article

Train 400x faster Static Embedding Models with Sentence Transformers

Jan 15 • 159

upvoted 2 collections 3 months ago

Reasoning

151 items • Updated Apr 6, 2024 • 30

🤖 Agents

21 items • Updated Dec 31, 2024 • 141

upvoted 2 collections 4 months ago

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated 2 days ago • 75

PixMo

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated 2 days ago • 67

upvoted a paper 4 months ago

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 76

upvoted a collection 4 months ago

VLM Datasets

29 items • Updated 22 days ago • 1

upvoted 3 articles 4 months ago

Article

Low Code Large Language Model Alignment

Nov 19, 2024 • 13

Article

The Beginners Guide to Cleaning a Dataset

Nov 18, 2024 • 24

Article

PyTorchModelHubMixin: Bridging the Gap for Custom AI Models on Hugging Face

Nov 11, 2024 • 16

upvoted a collection 4 months ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 292

upvoted an article 4 months ago

Article

Recipe: Preparing Multilingual Speech Datasets for TTS Training

Nov 4, 2024 • 18