Open-source speech datasets annotated using Data-Speech Collection Open-source annotated speech datasets ranging from 1,000 hours to, soon, 50,000 hours. • 7 items • Updated May 15 • 3
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 581
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition Paper • 2402.15504 • Published Feb 23 • 20
Industry BERT Models Collection Industry and specialized-domain fine-tuned BERT embedding models • 6 items • Updated May 14 • 7
SLIM Models Collection Structured Language Instruction Models (SLIMs) • 23 items • Updated about 1 month ago • 27
TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation Paper • 2401.14373 • Published Jan 25 • 11
InstantID: Zero-shot Identity-Preserving Generation in Seconds Paper • 2401.07519 • Published Jan 15 • 51