Is this really a GPT-4o model?

by Megareyka - opened 18 days ago

18 days ago

This is the only model titled "GPT-4o" on the Hub at the moment, so I am unsure whether it is actually capable of video recognition and speech synthesis.

Xenova

Owner 18 days ago

Hi there :) As stated in the README, this is only the tokenizer for gpt-4o, made to be compatible with Hugging Face transformers and transformers.js.
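For anyone who wants to check this for themselves, here is a minimal sketch of loading the tokenizer with Hugging Face transformers. The repo id `Xenova/gpt-4o` is an assumption (it is not spelled out in this thread), and `load_tokenizer` and `count_tokens` are hypothetical helper names used only for illustration. `AutoTokenizer.from_pretrained` downloads tokenizer files only; no model weights are involved.

```python
# Sketch only: requires `pip install transformers` and network access.
REPO_ID = "Xenova/gpt-4o"  # assumption: repo id is not stated in this thread


def load_tokenizer(repo_id: str = REPO_ID):
    """Load the HF-compatible tokenizer (tokenizer files only, no weights)."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def count_tokens(tokenizer, text: str) -> int:
    """Return how many token ids the tokenizer produces for `text`."""
    return len(tokenizer.encode(text))


# Usage (not executed here because it needs the network):
#   tokenizer = load_tokenizer()
#   print(count_tokens(tokenizer, "hello world"))
```

This only gives you token ids for text. It cannot generate anything, which is an easy way to confirm that the repo is not a model.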

xdarren

17 days ago

Hello

Ssunit

16 days ago

How can we confirm that this is a GPT-4 model? It is really important to verify that before using it.

Xenova

Owner 16 days ago

@Ssunit As stated above and in the README, this is only a HF-compatible version of the tokenizer (no model weights).

AmHoechste

14 days ago

I'm sorry for what is probably a stupid question, but do you think this tokenizer can tokenize not only text but also video or audio data, without using intermediary modules like Whisper? I will be very grateful for any response❤️❤️❤️.

Xenova

Owner 14 days ago

@AmHoechste Unfortunately, this tokenizer is only for text. To be able to preprocess video or audio, we would need to know the format that the model expects. Then, to be able to generate embeddings for video/audio, we would need the model weights for the vision/audio encoder, which we don't have.
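To illustrate what "only for text" means, here is a toy sketch. It is not the GPT-4o tokenizer (which uses a learned BPE vocabulary); it is a hypothetical byte-level stand-in where `encode` and `decode` are made-up names. The point is that a tokenizer is just a mapping from a string to integer ids and back, with no notion of audio samples or video frames.

```python
# Toy byte-level tokenizer: an illustrative stand-in, NOT the GPT-4o tokenizer.


def encode(text: str) -> list[int]:
    """Map text to integer ids (here: its UTF-8 byte values)."""
    if not isinstance(text, str):
        # Raw audio/video bytes are not text, so there is nothing to tokenize.
        raise TypeError("this toy tokenizer accepts text only")
    return list(text.encode("utf-8"))


def decode(ids: list[int]) -> str:
    """Map integer ids back to the original text."""
    return bytes(ids).decode("utf-8")


ids = encode("hi")
print(ids)          # [104, 105]
print(decode(ids))  # hi
```

Turning audio or video into something a model can consume is a separate step that depends on the model's expected input format and on encoder weights, neither of which a text tokenizer provides.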
