@awacke1 on Hugging Face: "I just completed getting all four aspects of the new OpenAI GPT-4-o Omni model…"

awacke1

posted an update 18 days ago

I just finished getting the new OpenAI GPT-4o (Omni) model working with all four modalities: text, image, audio, and video.

Check it out and let me know what you think!

Space: awacke1/GPT-4o-omni-text-audio-image-video

Discussion: awacke1/GPT-4o-omni-text-audio-image-video

Test Runs for All Four Modalities: awacke1/GPT-4o-omni-text-audio-image-video#1

--Aaron - https://huggingface.co/awacke1

athareja

18 days ago (edited 18 days ago)

This looks great, thanks for sharing. Are you using the audio capabilities of GPT-4o, or first converting audio to text and then using its text capabilities? I saw in their announcement that the audio capabilities are not yet publicly available to everyone through the API, so I wanted to check whether I am misunderstanding something.

"Developers can also now access GPT-4o in the API as a text and vision model. We plan to launch support for GPT-4o's new audio and video capabilities to a small group of trusted partners in the API in the coming weeks."

taher30

15 days ago

Does this model use your API key? Is this billed or is this using a free model?
