Running 915 FineWeb: decanting the web for the finest text data at scale 🍷 Generate high-quality web text data for LLM training
Running 126 Pdf To Structured Data 🌍 PDF to Structured Data powered by Google DeepMind Gemini 2.0
distilbert/distilbert-base-uncased-finetuned-sst-2-english Text Classification • Updated Dec 19, 2023 • 6.29M • 734
sentence-transformers/paraphrase-multilingual-mpnet-base-v2 Sentence Similarity • Updated Mar 6 • 3.18M • 378