---
license: apache-2.0
datasets:
  - HuggingFaceFW/fineweb
  - PleIAs/YouTube-Commons
  - allenai/WildChat-1M
language:
  - de
  - en
  - ja
  - fr
library_name: mlx
tags:
  - moe
  - multimodal
  - j.o.s.i.e.
---

This will be the repo for J.O.S.I.E.v4o.

Like OpenAI's GPT-4o, it is natively multimodal. It is based on the NExT-GPT architecture, combined with RoPE, RMS normalization, and a Mixture of Experts (MoE), and paired with OpenAI's GPT-4o tokenizer. This is a future project and will take its time.
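
To make two of those building blocks concrete, here is a minimal sketch of RMS normalization and rotary position embeddings in MLX, the library this repo targets. The module structure, dimensions, and shapes are illustrative assumptions, not the actual J.O.S.I.E. architecture, which is not yet published:

```python
import mlx.core as mx
import mlx.nn as nn


class RMSNorm(nn.Module):
    """RMS normalization: rescale x by the root mean square of its features."""

    def __init__(self, dims: int, eps: float = 1e-5):
        super().__init__()
        self.weight = mx.ones((dims,))  # learnable per-feature gain
        self.eps = eps

    def __call__(self, x: mx.array) -> mx.array:
        # x / sqrt(mean(x^2) + eps), then scale by the learned weight.
        rms = mx.sqrt(mx.mean(x * x, axis=-1, keepdims=True) + self.eps)
        return self.weight * x / rms


# Rotary position embeddings as shipped with mlx.nn; head_dim=64 is an
# illustrative choice, not a confirmed J.O.S.I.E. hyperparameter.
rope = nn.RoPE(dims=64)

x = mx.random.normal(shape=(1, 8, 16, 64))  # (batch, heads, seq_len, head_dim)
print(RMSNorm(dims=64)(x).shape)  # (1, 8, 16, 64)
print(rope(x).shape)              # (1, 8, 16, 64)
```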

Furthermore, I will probably build a UI application for the model as well.

Further updates coming soon!

First architecture overview:

The first beta will use the already pretrained ImageBind model. A linear input projection (sketched below) is required because the outputs of the ImageBind model do not have the correct dimensions for the decoder. Later on, the input and output projections will be removed.
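
As a rough illustration of that bridge, the following maps ImageBind-style embeddings into the decoder's hidden size. The 1024-dimensional ImageBind output and the 4096-dimensional model width are assumptions for illustration only:

```python
import mlx.core as mx
import mlx.nn as nn

# Assumed dimensions: ImageBind (huge) embeddings are 1024-dimensional;
# 4096 is a placeholder for the decoder's hidden size.
IMAGEBIND_DIM = 1024
MODEL_DIM = 4096

# The linear input projection that bridges the two dimensionalities.
input_projection = nn.Linear(IMAGEBIND_DIM, MODEL_DIM)

# Stand-in for a batch of ImageBind embeddings (e.g., one per modality input).
imagebind_embeddings = mx.random.normal(shape=(4, IMAGEBIND_DIM))
projected = input_projection(imagebind_embeddings)
print(projected.shape)  # (4, 4096)
```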

Source code and more information will be available in my GitHub repo.