Aaron Chibb

aari1995

AI & ML interests

Multilinguality and German LLMs

Organizations

aari1995's activity

replied to tomaarsen's post 5 days ago

Great as always! Mostly ColBERT, I think, would be great. The people at RAGatouille are also doing great stuff, but having it integrated in ST would be so cool!

replied to their post 4 months ago

No, there will be some filtering. I'm currently working on the algorithm for it.

posted an update 4 months ago
ARABIC CHINESE FRENCH GERMAN RUSSIAN SPANISH TURKISH

mLLM - first release:
orca_dpo_pairs by Intel (translated into 7 languages)

Upcoming:
- more datasets
- cleaning steps
- a blogpost
- stay updated at https://hf.co/multilingual

multilingual/orca_dpo_pairs
·
posted an update 5 months ago
Looking at the tokenizer and the naming ("_en"), Google Gemma is very likely to have a multilingual counterpart. 👀

Thoughts?
·
posted an update 5 months ago
@clem is this the first non-English post on Hugging Face?👋🏽 🇩🇪🇫🇷🇮🇹🇪🇸🇮🇳…