Kristaller486

kristaller486

AI & ML interests

NLP, Machine Translation

Recent Activity

liked a model about 6 hours ago

mixedbread-ai/mxbai-rerank-large-v2

liked a model 4 days ago

EuroBERT/EuroBERT-2.1B

liked a model 4 days ago

secemp9/TraceBack-12b

kristaller486's activity

upvoted a paper 9 days ago

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published 11 days ago • 31

upvoted a collection 17 days ago

Slam

All resources for SpeechLMs from "Slamming: Training a Speech Language Model on One GPU in a Day". We provide the tokeniser, LM, and datasets • 6 items • Updated 17 days ago • 13

upvoted a collection 22 days ago

RuModernBERT

Modernized BERT for Russian • 2 items • Updated 23 days ago • 4

upvoted a paper about 1 month ago

SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Paper • 2502.06394 • Published Feb 10 • 86

upvoted a collection about 1 month ago

Llasa

TTS foundation model compatible with Llama framework (160k hours tokenized speech data released) • 11 items • Updated 21 days ago • 15

upvoted a paper about 1 month ago

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published Feb 5 • 58

upvoted an article about 1 month ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Jan 28 • 803

upvoted a collection about 2 months ago

EvaByte

3 items • Updated Jan 21 • 3

upvoted a paper 2 months ago

Facilitating large language model Russian adaptation with Learned Embedding Propagation

Paper • 2412.21140 • Published Dec 30, 2024 • 18

upvoted 2 collections 3 months ago

DeepSeek-V3

3 items • Updated Jan 6 • 195

FineWeb2 Collaborative Annotation Sprint

5 items • Updated Dec 24, 2024 • 7

upvoted a paper 3 months ago

Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis

Paper • 2412.01819 • Published Dec 2, 2024 • 35

upvoted a paper 4 months ago

Multi-Granularity Prediction for Scene Text Recognition

Paper • 2209.03592 • Published Sep 8, 2022 • 2

upvoted a collection 4 months ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 292

upvoted 2 papers 4 months ago

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Paper • 2410.24175 • Published Oct 31, 2024 • 18

Language Models can Self-Lengthen to Generate Long Texts

Paper • 2410.23933 • Published Oct 31, 2024 • 18

upvoted a collection 5 months ago

DocLayout-YOLO

Dataset and model for DocLayout-YOLO • 10 items • Updated Jan 14 • 15

upvoted a collection 6 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 16 days ago • 560

upvoted 2 papers 6 months ago

GRIN: GRadient-INformed MoE

Paper • 2409.12136 • Published Sep 18, 2024 • 16

Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources

Paper • 2409.08239 • Published Sep 12, 2024 • 20