@osanseviero on Hugging Face: "Diaries of Open Source. Part 6! 🏎️xAI releases Grok-1, a 314B MoE Blog:…"

osanseviero

posted an update Mar 19, 2024

Diaries of Open Source. Part 6!

🏎️xAI releases Grok-1, a 314B MoE
Blog: https://x.ai/blog/grok-os
GH repo: https://github.com/xai-org/grok-1
Model: xai-org/grok-1

🕺MusicLang, a model for controllable music generation
Demo: musiclang/musiclang-predict
GH repo: https://github.com/musiclang/musiclang_predict

🔬BioT5: a family of models for biology and chemical text tasks
Base model: QizhiPei/biot5-base
Model for molecule captioning and design: QizhiPei/biot5-base-mol2text and QizhiPei/biot5-base-text2mol
GH Repo: https://github.com/QizhiPei/BioT5
Paper: BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations (2310.07276)

🤏Check out the AQLM and QMoE official weights from ISTA-DAS lab
Org: ISTA-DASLab
Papers: Extreme Compression of Large Language Models via Additive Quantization (2401.06118) and QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models (2310.16795)

🚀Community releases
Einstein-v4-7B, a Mistral fine-tune on high-quality data Weyaxi/Einstein-v4-7B
IL-7B, a Mistral fine-tune merge for rheumatology cmcmaster/il_7b
Caselaw Access Project, a collaboration to digitize 40 million pages of US court decisions covering 6.7 million cases across 360 years https://hf.co/datasets/TeraflopAI/Caselaw_Access_Project

🌍Data and models around the world
HPLT Monolingual, a dataset of 75 languages with over 40TB of data HPLT/hplt_monolingual_v1_2
OpenLLM Turkish Benchmarks & Leaderboard malhajar/openllmturkishleadboard-datasets-65e5854490a87c0f2670ec18 and malhajar/OpenLLMTurkishLeaderboard
Occiglot, a collaborative effort for European LLMs with an initial release of 7B models for French, German, Spanish, and Italian https://hf.co/collections/occiglot/occiglot-eu5-7b-v01-65dbed502a6348b052695e01
Guftagoo, a Hindi+Hinglish multi-turn conversational dataset https://hf.co/datasets/Tensoic/gooftagoo
AryaBhatta-Orca-Maths-Hindi dataset https://hf.co/datasets/GenVRadmin/Aryabhatta-Orca-Maths-Hindi

Find part 5 at https://huggingface.co/posts/osanseviero/853315170369291

osanseviero Omar Sanseviero