Architecture
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published Dec 13, 2024 • 95
LLM Models
The State of the Art in LLMs: The Latest Research and Developments
moxin-org/moxin-llm-7b
Text Generation • Updated Dec 20, 2024 • 77 • 13