---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
- PleIAs/YouTube-Commons
- allenai/WildChat-1M
language:
- de
- en
- ja
- fr
library_name: mlx
tags:
- moe
- multimodal
- j.o.s.i.e.
---

# This will be the repo for J.O.S.I.E.v4o

Like **OpenAI's GPT-4o**, it is natively multimodal. It is based on **NExT-GPT** combined with **RoPE**, **RMS normalisation**, and **MoE**, paired with OpenAI's **GPT-4o tokenizer**.

This is a *future project* and will take its time. Furthermore, I will probably build a **UI application** around the model too.

Further updates coming soon!!!

First architecture overview:

The first beta will use the already pretrained ImageBind model. The linear input projection is needed because the outputs of the ImageBind model do not match the language model's hidden dimensions; a sketch of this idea, along with a toy MoE layer, follows at the end of this card. Later on, the input and output projections will be removed.

Source code and more info will be available on my GitHub repo.
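Since the input projection is the only new trainable piece planned for the first beta, here is a minimal sketch of the idea in MLX. Everything in it is an assumption for illustration: the class name, the 1024-dimensional ImageBind embedding size, and the 4096-dimensional hidden size are placeholders, not the final design.

```python
# Hypothetical sketch only: module name and dimensions are placeholders.
import mlx.core as mx
import mlx.nn as nn

class InputProjection(nn.Module):
    """Maps frozen ImageBind embeddings into the LLM's hidden dimension."""

    def __init__(self, imagebind_dim: int = 1024, hidden_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(imagebind_dim, hidden_dim)
        self.norm = nn.RMSNorm(hidden_dim)  # RMS normalisation, as named above

    def __call__(self, x: mx.array) -> mx.array:
        return self.norm(self.proj(x))

# Usage: project a batch of two fake ImageBind embeddings.
dummy = mx.random.normal((2, 1024))
print(InputProjection()(dummy).shape)  # (2, 4096)
```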
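And since MoE is listed as a building block, here is an equally hedged toy of a top-k routed feed-forward layer in MLX. Real MoE layers dispatch each token only to its selected experts; this sketch runs every expert densely and zeroes out the unselected gates, which is simple but inefficient and purely illustrative. All names and sizes are my own guesses, not the actual architecture.

```python
# Toy top-k mixture-of-experts FFN; all names and sizes are illustrative guesses.
import mlx.core as mx
import mlx.nn as nn

class MoEFeedForward(nn.Module):
    def __init__(self, dim: int = 512, hidden: int = 1024,
                 num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)
        self.experts = [
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        ]

    def __call__(self, x: mx.array) -> mx.array:
        logits = self.router(x)  # (..., num_experts)
        # k-th largest router logit per token; everything below it is masked out.
        kth = mx.min(mx.topk(logits, self.top_k, axis=-1), axis=-1, keepdims=True)
        gates = mx.softmax(mx.where(logits >= kth, logits, float("-inf")), axis=-1)
        # Dense (inefficient) combine: every expert runs, gates keep only the top_k.
        out = mx.zeros_like(x)
        for i, expert in enumerate(self.experts):
            out = out + gates[..., i : i + 1] * expert(x)
        return out

# Usage: route a batch of 3 token vectors through the toy layer.
x = mx.random.normal((3, 512))
print(MoEFeedForward()(x).shape)  # (3, 512)
```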