GPT-4o-omni-text-audio-image-video


awacke1

Update README.md

cea7f26 verified 5 months ago


	---
	title: 🧠GPT 4o Omni Text Audio Image Video
	emoji: 🐠🔬🧠
	colorFrom: gray
	colorTo: blue
	sdk: streamlit
	sdk_version: 1.34.0
	app_file: app.py
	pinned: true
	license: mit
	---


	GPT-4o Documentation: https://cookbook.openai.com/examples/gpt4o/introduction_to_gpt4o

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

	This experimental multi-agent, mixture-of-experts system combines a variety of techniques and models to create different AI solutions.

	Models Used:

	- Mistral-7B-Instruct
	- Llama2-7B
	- Mixtral-8x7B-Instruct
	- Google Gemma-7B
	- OpenAI Whisper Small En
	- OpenAI GPT-4o, Whisper-1
	- arXiv Embeddings

	The techniques below are AI techniques rather than ML models:

	- Speech synthesis using browser technology
	- Memory for semantic facts, plus episodic memory for emotions and event time series
	- Web integration using the `q=` query-parameter convention for search links, which allows comparison of the tech giants' AI implementations:
	  - Bing, then Bing Copilot with a second click
	  - Google, which now does an AI search
	  - Twitter, the new home for technology discoveries, AI output, and Grok
	  - Wikipedia for fact checking
	  - YouTube
	- File and metadata integration combining text, audio, image, and video
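
The search linking described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `app.py`: the `SEARCH_ENDPOINTS` table and the `build_search_links` helper are hypothetical names, and the base URLs are assumptions about each site's public search page. Note that Wikipedia and YouTube conventionally take `search=` and `search_query=` rather than `q=`.

```python
from urllib.parse import urlencode

# Hypothetical table: site name -> (assumed search URL, query parameter name).
SEARCH_ENDPOINTS = {
    "Bing": ("https://www.bing.com/search", "q"),
    "Google": ("https://www.google.com/search", "q"),
    "Twitter": ("https://twitter.com/search", "q"),
    "Wikipedia": ("https://en.wikipedia.org/w/index.php", "search"),
    "YouTube": ("https://www.youtube.com/results", "search_query"),
}


def build_search_links(query: str) -> dict[str, str]:
    """Return one search URL per site, with the query safely URL-encoded."""
    return {
        name: f"{base}?{urlencode({param: query})}"
        for name, (base, param) in SEARCH_ENDPOINTS.items()
    }


if __name__ == "__main__":
    # Print markdown links, which a Streamlit app could render with st.markdown.
    for site, url in build_search_links("GPT-4o omni").items():
        print(f"- [{site}]({url})")
```

Keeping the endpoints in one table means adding another site for comparison is a one-line change.
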
	This app also merges common theories in cognitive AI with AI Python libraries (e.g. NLTK, scikit-learn).
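
As one illustration of the cognitive side, the semantic and episodic memories listed among the techniques could be modeled roughly as follows. This is a sketch under stated assumptions, not the Space's actual implementation: the `Episode` and `Memory` classes and all of their fields are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Episode:
    """One episodic entry: what happened, when, and the associated emotion."""
    timestamp: datetime
    event: str
    emotion: str


@dataclass
class Memory:
    """Hypothetical store: semantic facts plus a time-ordered list of episodes."""
    facts: dict[str, str] = field(default_factory=dict)
    episodes: list[Episode] = field(default_factory=list)

    def remember_fact(self, subject: str, statement: str) -> None:
        # Semantic memory: a timeless fact keyed by its subject.
        self.facts[subject] = statement

    def record_episode(self, event: str, emotion: str,
                       timestamp: Optional[datetime] = None) -> None:
        # Episodic memory: keep entries sorted so they form a time series.
        self.episodes.append(Episode(timestamp or datetime.now(), event, emotion))
        self.episodes.sort(key=lambda e: e.timestamp)

    def timeline(self, emotion: Optional[str] = None) -> list[str]:
        # Events in chronological order, optionally filtered by emotion.
        return [e.event for e in self.episodes
                if emotion is None or e.emotion == emotion]
```

Separating the two stores mirrors the distinction the list draws: facts are looked up by subject, while episodes are replayed in time order.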

	The intent is to demonstrate SOTA AI/ML and combinations of Function-Input-Output for interoperability and knowledge management.

	This space also serves as an experimental test bed for new technologies, mixing them with older ones for comparison and integration.

	--Aaron