Citaman's Collections

LLM From Scratch - Datasets

Updated Feb 28, 2024

  • Skylion007/openwebtext

    Updated May 17, 2024 • 49k downloads • 433 likes

  • JeanKaddour/minipile

    Updated Jun 20, 2023 • 1.01M rows • 1.94k downloads • 123 likes

  • Locutusque/TM-DATA

    Updated Oct 15, 2024 • 2.77M rows • 354 downloads • 11 likes

  • PleIAs/French-PD-Newspapers

    Updated Mar 19, 2024 • 2.25M rows • 1.37k downloads • 67 likes

  • euclaise/MiniCoT

    Updated Jan 23, 2024 • 129k rows • 15 downloads • 6 likes

  • euirim/goodwiki

    Updated Sep 11, 2023 • 44.8k rows • 140 downloads • 52 likes

  • euclaise/mathoverflow-accepted

    Updated Oct 20, 2023 • 62.6k rows • 143 downloads • 4 likes

  • Locutusque/UltraTextbooks

    Updated Feb 2, 2024 • 5.52M rows • 1.3k downloads • 196 likes

  • TempoFunk/webvid-10M

    Updated Aug 19, 2023 • 10.7M rows • 1.92k downloads • 74 likes

  • HuggingFaceTB/cosmopedia

    Updated Aug 12, 2024 • 31.1M rows • 5.39k downloads • 621 likes

  • HuggingFaceGECLM/REDDIT_submissions

    Updated Mar 17, 2023 • 47.2M rows • 3.44k downloads • 10 likes

  • togethercomputer/RedPajama-Data-V2

    Updated Nov 21, 2024 • 5.26k downloads • 366 likes