Expand the linguistic coverage.

by Pclanglais - opened Oct 31, 2023

Owner, Oct 31, 2023 (edited Oct 31, 2023)

While Llama is incredibly versatile and the multilingual training seems to have maintained support for languages not included in the dataset, a future version of Brahe should absolutely feature some texts in the following languages (nearly absent for now):

Arabic
Russian
Chinese
Japanese
Hindi
Bengali

(Really looking forward to potential collaborations on this, as I'm not conversant in any of these.)
