Is this really a GPT-4o model?

#1
by Megareyka - opened

This is the only model titled "GPT-4o" on the Hub at the moment, so I am unsure whether it is actually capable of video recognition and speech synthesis.

Hi there :) As stated in the README, this is only the tokenizer for gpt-4o, made to be compatible with Hugging Face transformers and transformers.js.
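For anyone who wants to check this themselves, here is a minimal sketch of loading a tokenizer-only repository with transformers and inspecting what it produces. The helper names `load_tokenizer` and `token_report` are hypothetical, and the repository id is left as a parameter because it is not spelled out in this thread. `AutoTokenizer.from_pretrained`, `encode`, and `decode` are the standard transformers calls.

```python
def load_tokenizer(repo_id):
    """Load only the tokenizer files from a Hub repository.

    No model weights are downloaded or needed for this step.
    `repo_id` is an assumption: pass the id of this repository.
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(repo_id)


def token_report(tokenizer, text):
    """Encode text to token ids and decode it back.

    Works with any object exposing encode(str) -> list[int]
    and decode(list[int]) -> str, as transformers tokenizers do.
    """
    ids = tokenizer.encode(text)
    return {
        "ids": ids,                       # integer token ids
        "num_tokens": len(ids),           # how many tokens the text costs
        "decoded": tokenizer.decode(ids)  # text reconstructed from the ids
    }
```

Calling `token_report(load_tokenizer(repo_id), "hello world")` with this repository's id should return a list of integer ids and the decoded string. That is all a tokenizer does: it maps text to ids and back, and cannot generate text, images, or audio by itself.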

How can we get confirmation that this is a GPT-4 model? It is really important to verify this before using it.

@Ssunit As stated above and in the README, this is only a HF-compatible version of the tokenizer (no model weights).

I'm sorry for the probably stupid question, but do you think this tokenizer is capable of tokenizing not only text data, but also video or audio data, without using intermediary modules like Whisper? I would be very grateful for any response ❤️❤️❤️.

@AmHoechste Unfortunately, this tokenizer is only for text. To be able to preprocess video or audio, we would need to know the format that the model expects. Then, to be able to generate embeddings for video/audio, we would need the model weights for the vision/audio encoder, which we don't have.
