Unlike traditional approaches that chain ASR and LLM models together, OmniAudio-2.6B unifies both capabilities in a single efficient architecture for minimal latency and resource overhead.
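The contrast can be sketched with toy code (all function names and latency figures below are hypothetical illustrations, not measurements of OmniAudio or the Nexa SDK): a chained pipeline pays for two back-to-back model invocations plus an intermediate transcript, while a unified model answers from audio in a single pass.

```python
# Toy latency sketch of chained ASR -> LLM vs a unified audio-language model.
# All numbers and names are assumptions for illustration only.

ASR_LATENCY_S = 0.40      # assumed time for a standalone ASR pass
LLM_LATENCY_S = 0.90      # assumed time for a text-only LLM pass
UNIFIED_LATENCY_S = 0.95  # assumed time for one unified audio-LM pass

def chained_latency() -> float:
    # Two model invocations run sequentially; the transcript handed from
    # ASR to the LLM is an extra serialization step between them.
    return ASR_LATENCY_S + LLM_LATENCY_S

def unified_latency() -> float:
    # One model consumes audio tokens directly, so there is a single
    # decode pass and no intermediate transcript.
    return UNIFIED_LATENCY_S

print(f"chained: {chained_latency():.2f}s, unified: {unified_latency():.2f}s")
```

Under these assumed numbers the chained pipeline's latency is the sum of both stages, which is the structural overhead a unified architecture avoids.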
## Demo
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/538_aQ2hRexTlXFL-cYhW.mp4"></video>
## Performance Benchmarks on Consumer Hardware
On a 2024 Mac Mini M4 Pro, **Qwen2-Audio-7B-Instruct** running on 🤗 Transformers achieves an average decoding speed of 6.38 tokens/second, while **OmniAudio-2.6B** through Nexa SDK reaches 35.23 tokens/second with the FP16 GGUF version and 66 tokens/second with the Q4_K_M quantized GGUF version, delivering **5.5x to 10.3x faster performance** on consumer hardware.
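As a quick sanity check (not part of the original benchmark), the quoted speedup range follows directly from the measured decoding rates:

```python
# Derive the quoted speedup range from the measured decoding rates above.
baseline_tps = 6.38   # Qwen2-Audio-7B-Instruct on Transformers
fp16_tps = 35.23      # OmniAudio-2.6B, FP16 GGUF via Nexa SDK
q4_tps = 66.0         # OmniAudio-2.6B, Q4_K_M GGUF via Nexa SDK

low = round(fp16_tps / baseline_tps, 1)
high = round(q4_tps / baseline_tps, 1)
print(f"{low}x to {high}x")  # 5.5x to 10.3x
```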
## Use Cases
* **Voice QA without Internet**: Process offline voice queries like "I am at camping, how do I start a fire without fire starter?" OmniAudio provides practical guidance even without network connectivity.
* **Voice-in Conversation**: Have conversations about personal experiences. When you say "I am having a rough day at work," OmniAudio engages in supportive talk and active listening.
* **Recording Summary**: Simply ask "Can you summarize this meeting note?" to convert lengthy recordings into concise, actionable summaries.
* **Voice Tone Modification**: Transform casual voice memos into professional communications. When you request "Can you make this voice memo more professional?" OmniAudio adjusts the tone while preserving the core message.
## Quick Links
1. Interactive Demo in our [HuggingFace Space](https://huggingface.co/spaces/NexaAIDev/omni-audio-demo)
2. [Quickstart for local setup](#how-to-use-on-device)
3. Learn more in our [Blogs](https://nexa.ai/blogs/OmniAudio-2.6B)
## How to Use On Device
Step 1: Install Nexa-SDK (local on-device inference framework)

[Install Nexa-SDK](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-1-executable-installer)