Testerpce's Collections

Dataset and Data processing

updated Mar 31

  • Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

    Paper • 2405.20541 • Published May 30, 2024 • 24

  • RedPajama: an Open Dataset for Training Large Language Models

    Paper • 2411.12372 • Published Nov 19, 2024 • 56

  • Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

    Paper • 2503.22230 • Published Mar 28 • 44