Marcos Remar Claude commited on Sep 10

Commit

51563dd

1 Parent(s): 471903a

feat: Implementação completa do sistema Speech-to-Speech com WebRTC

Principais melhorias:
- ✅ Descoberta e documentação do formato correto de áudio para Ultravox (tuple)
- ✅ Interface Push-to-Talk com histórico completo de mensagens
- ✅ Remoção de todas referências ao Orchestrator (arquitetura simplificada)
- ✅ Scripts de túnel SSH para acesso remoto via MacBook
- ✅ Testes automatizados funcionando com 100% de sucesso
- ✅ Suporte a múltiplas interfaces (iOS, Material, Tailwind)
- ✅ Documentação completa de troubleshooting no README

Arquitetura atual:
- WebRTC Gateway (porta 8082) → Ultravox (50051) + TTS (50054)
- Latência end-to-end: ~286ms
- Audio: PCM 16-bit, 16kHz, Float32 normalizado

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50) hide show

README.md +95 -16
manage_services.sh +282 -0
services/webrtc_gateway/conversation-memory.js +290 -0
services/webrtc_gateway/favicon.ico +1 -0
services/webrtc_gateway/opus-decoder.js +150 -0
services/webrtc_gateway/package-lock.json +470 -5
services/webrtc_gateway/package.json +4 -3
services/webrtc_gateway/response_1757390722112.pcm +1 -0
services/webrtc_gateway/response_1757391966860.pcm +1 -0
services/webrtc_gateway/start.sh +0 -2
services/webrtc_gateway/test-audio-cli.js +178 -0
services/webrtc_gateway/test-memory.js +108 -0
services/webrtc_gateway/test-portuguese-audio.js +410 -0
services/webrtc_gateway/test-websocket-speech.js +184 -0
services/webrtc_gateway/test-websocket.js +317 -0
services/webrtc_gateway/ultravox-chat-backup.html +964 -0
services/webrtc_gateway/ultravox-chat-ios.html +1843 -0
services/webrtc_gateway/ultravox-chat-material.html +1116 -0
services/webrtc_gateway/ultravox-chat-opus.html +581 -0
services/webrtc_gateway/ultravox-chat-original.html +964 -0
services/webrtc_gateway/ultravox-chat-server.js +158 -10
services/webrtc_gateway/ultravox-chat-tailwind.html +393 -0
services/webrtc_gateway/ultravox-chat.html +964 -0
services/webrtc_gateway/webrtc.pid +1 -0
test-24khz-support.html +243 -0
test-audio-cli.js +178 -0
test-grpc-updated.py +161 -0
test-opus-support.html +337 -0
test-simple.py +70 -0
test-tts-button.html +65 -0
test-ultravox-auto.py +172 -0
test-ultravox-librosa.py +166 -0
test-ultravox-simple-prompt.py +206 -0
test-ultravox-tts.py +121 -0
test-ultravox-tuple.py +202 -0
test-ultravox-vllm.py +113 -0
test-vllm-openai.py +90 -0
tts_server_kokoro.py +255 -0
tunnel-macbook.sh +70 -0
tunnel.sh +95 -0
ultravox/restart_ultravox.sh +39 -0
ultravox/server.py +38 -20
ultravox/server_backup.py +446 -0
ultravox/server_vllm_090_broken.py +447 -0
ultravox/server_working_original.py +440 -0
ultravox/speech.proto +94 -0
ultravox/start_ultravox.sh +67 -0
ultravox/stop_ultravox.sh +60 -0
ultravox/test-tts.py +121 -0
ultravox/test_audio_coherence.py +193 -0

README.md CHANGED Viewed

@@ -10,9 +10,9 @@ cd /workspace/ultravox-pipeline
 ./scripts/setup_background.sh
 # Or start individual services:
-cd tts-service && ./start.sh           # TTS on port 50054
-cd services/orchestrator && ./start.sh  # Orchestrator on port 50053
-cd services/webrtc_gateway && ./start.sh # WebRTC on port 8081
 ```
 ## 📊 Current Status (September 2025)
@@ -21,7 +21,7 @@ cd services/webrtc_gateway && ./start.sh # WebRTC on port 8081
 |---------|---------|----------|--------|
 | **TTS Service** | ~91ms | Kokoro v1.0, streaming | ✅ Production |
 | **Ultravox STT+LLM** | ~180ms | vLLM, custom prompts | ✅ Production |
-| **Orchestrator** | ~15ms | Session mgmt, health check | ✅ Production |
 | **End-to-End** | ~286ms | Full pipeline | ✅ Achieved |
 ### ✨ New Features (September 2025)
@@ -35,17 +35,15 @@ cd services/webrtc_gateway && ./start.sh # WebRTC on port 8081
 ```mermaid
 graph TB
-    Browser[🌐 Browser] -->|WebSocket| WRG[WebRTC Gateway :8081]
-    WRG -->|gRPC| ORCH[Orchestrator :50053]
-    ORCH -->|gRPC| UV[Ultravox :50051<br/>STT + LLM]
-    ORCH -->|gRPC| TTS[TTS Service :50054<br/>Kokoro Engine]
 ```
 ### Service Responsibilities
-- **WebRTC Gateway**: Browser interface, WebSocket signaling
-- **Orchestrator**: Pipeline coordination, session management, buffering
 - **Ultravox**: Multimodal Speech-to-Text + LLM (fixie-ai/ultravox-v0_5-llama-3_2-1b)
-- **TTS Service**: Text-to-speech with Kokoro v1.0 engine
 ## 🔧 Configuration
@@ -66,9 +64,8 @@ services:
     port: 50051
     model: "fixie-ai/ultravox-v0_5"
-  orchestrator:
-    port: 50053
-    buffer_size_ms: 100
 ```
 ## 📄 Technical References
@@ -103,7 +100,6 @@ ultravox-pipeline/
 ├── ultravox/              # Speech-to-Text + LLM (submodule ready)
 ├── tts-service/           # Unified TTS Service with Kokoro
 ├── services/
-│   ├── orchestrator/      # Central pipeline coordinator
 │   └── webrtc_gateway/    # Browser WebRTC interface
 ├── config/                # Centralized YAML configuration
 ├── protos/                # gRPC protocol definitions
@@ -117,7 +113,7 @@ ultravox-pipeline/
 ```bash
 # Test gRPC connections
-grpcurl -plaintext localhost:50053 orchestrator.OrchestratorService/HealthCheck
 # Run integration tests
 cd tests/integration
@@ -133,6 +129,89 @@ python benchmark_latency.py
 - **[gRPC Integration Guide](docs/GRPC_INTEGRATION_GUIDE.md)** - Complete service integration details
 - **[Context Window Analysis](docs/CONTEXT_WINDOW_ANALYSIS.md)** - Streaming TTS research
 ## 🤝 Contributing
 Focus areas:

 ./scripts/setup_background.sh
 # Or start individual services:
+cd ultravox && python server.py         # Ultravox on port 50051
+python3 tts_server_gtts.py              # TTS on port 50054
+cd services/webrtc_gateway && npm start # WebRTC on port 8082
 ```
 ## 📊 Current Status (September 2025)
 |---------|---------|----------|--------|
 | **TTS Service** | ~91ms | Kokoro v1.0, streaming | ✅ Production |
 | **Ultravox STT+LLM** | ~180ms | vLLM, custom prompts | ✅ Production |
+| **WebRTC Gateway** | ~20ms | Browser interface | ✅ Production |
 | **End-to-End** | ~286ms | Full pipeline | ✅ Achieved |
 ### ✨ New Features (September 2025)
 ```mermaid
 graph TB
+    Browser[🌐 Browser] -->|WebSocket| WRG[WebRTC Gateway :8082]
+    WRG -->|gRPC| UV[Ultravox :50051<br/>STT + LLM]
+    WRG -->|gRPC| TTS[TTS Service :50054<br/>gTTS Engine]
 ```
 ### Service Responsibilities
+- **WebRTC Gateway**: Browser interface, WebSocket signaling, pipeline coordination
 - **Ultravox**: Multimodal Speech-to-Text + LLM (fixie-ai/ultravox-v0_5-llama-3_2-1b)
+- **TTS Service**: Text-to-speech with gTTS engine
 ## 🔧 Configuration
     port: 50051
     model: "fixie-ai/ultravox-v0_5"
+  webrtc_gateway:
+    port: 8082
 ```
 ## 📄 Technical References
 ├── ultravox/              # Speech-to-Text + LLM (submodule ready)
 ├── tts-service/           # Unified TTS Service with Kokoro
 ├── services/
 │   └── webrtc_gateway/    # Browser WebRTC interface
 ├── config/                # Centralized YAML configuration
 ├── protos/                # gRPC protocol definitions
 ```bash
 # Test gRPC connections
+grpcurl -plaintext localhost:50051 speech.SpeechService/HealthCheck
 # Run integration tests
 cd tests/integration
 - **[gRPC Integration Guide](docs/GRPC_INTEGRATION_GUIDE.md)** - Complete service integration details
 - **[Context Window Analysis](docs/CONTEXT_WINDOW_ANALYSIS.md)** - Streaming TTS research
+## 🐛 Troubleshooting & Solutions
+### Ultravox Audio Processing Issues
+#### Problem: Model returning garbage responses ("???", "!!!", random characters)
+**Root Cause**: Incorrect audio format being sent to vLLM
+**Solution**:
+```python
+# ❌ WRONG - Sending raw array
+vllm_input = {
+    "prompt": prompt,
+    "multi_modal_data": {
+        "audio": audio_array  # This doesn't work!
+    }
+}
+# ✅ CORRECT - Send as tuple (audio, sample_rate)
+audio_tuple = (audio_array, 16000)  # Must be 16kHz
+vllm_input = {
+    "prompt": formatted_prompt,
+    "multi_modal_data": {
+        "audio": [audio_tuple]  # List of tuples!
+    }
+}
+```
+#### Problem: Model not understanding audio content
+**Root Cause**: Missing chat template and tokenizer formatting
+**Solution**:
+```python
+# Import tokenizer for proper formatting
+from transformers import AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+# Format messages with audio token
+messages = [{"role": "user", "content": f"<|audio|>\n{prompt}"}]
+# Apply chat template
+formatted_prompt = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+```
+#### Optimal vLLM Configuration
+```python
+# Best parameters for Ultravox v0.5
+sampling_params = SamplingParams(
+    temperature=0.2,  # Low temperature for accurate responses
+    max_tokens=64     # Sufficient for complete answers
+)
+# vLLM initialization
+llm = LLM(
+    model="fixie-ai/ultravox-v0_5-llama-3_2-1b",
+    trust_remote_code=True,
+    enforce_eager=True,
+    max_model_len=4096,
+    gpu_memory_utilization=0.3
+)
+```
+#### Audio Format Requirements
+- **Sample Rate**: Must be 16kHz
+- **Format**: Float32 normalized between -1 and 1
+- **Recommended Library**: Use `librosa` for loading audio
+```python
+import librosa
+# Librosa automatically normalizes to [-1, 1]
+audio, sr = librosa.load(audio_file, sr=16000)
+```
+### GPU Memory Issues
+#### Problem: "No available memory for the cache blocks"
+**Solution**: Run the cleanup script
+```bash
+bash /workspace/ultravox-pipeline/scripts/cleanup_gpu.sh
+```
 ## 🤝 Contributing
 Focus areas:

manage_services.sh ADDED Viewed

	@@ -0,0 +1,282 @@

+#!/bin/bash
+# Script mestre para gerenciar todos os serviços do Ultravox Pipeline
+# Inclui limpeza de processos órfãos e verificação de recursos
+# Cores para output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+# Diretórios
+ULTRAVOX_DIR="/workspace/ultravox-pipeline/ultravox"
+WEBRTC_DIR="/workspace/ultravox-pipeline/services/webrtc_gateway"
+TTS_DIR="/workspace/tts-service-kokoro"
+# Função para imprimir com cor
+print_colored() {
+    echo -e "${2}${1}${NC}"
+}
+# Função para verificar status da GPU
+check_gpu() {
+    print_colored "📊 Status da GPU:" "$BLUE"
+    GPU_INFO=$(nvidia-smi --query-gpu=memory.used,memory.free,memory.total --format=csv,noheader,nounits 2>/dev/null | head -1)
+    if [ -n "$GPU_INFO" ]; then
+        IFS=',' read -r USED FREE TOTAL <<< "$GPU_INFO"
+        echo "   Usado: ${USED}MB | Livre: ${FREE}MB | Total: ${TOTAL}MB"
+        if [ "$FREE" -lt "20000" ]; then
+            print_colored "   ⚠️  AVISO: Menos de 20GB livres!" "$YELLOW"
+            return 1
+        fi
+    else
+        print_colored "   ❌ Não foi possível verificar GPU" "$RED"
+        return 1
+    fi
+    return 0
+}
+# Função para limpar processos órfãos
+cleanup_orphans() {
+    print_colored "🧹 Limpando processos órfãos..." "$YELLOW"
+    # Limpar vLLM e EngineCore
+    pkill -f "VLLM::EngineCore" 2>/dev/null
+    pkill -f "vllm.*engine" 2>/dev/null
+    pkill -f "multiprocessing.resource_tracker" 2>/dev/null
+    sleep 2
+    # Verificar se limpou
+    REMAINING=$(ps aux | grep -E "vllm|EngineCore" | grep -v grep | wc -l)
+    if [ "$REMAINING" -eq "0" ]; then
+        print_colored "   ✅ Processos órfãos limpos" "$GREEN"
+    else
+        print_colored "   ⚠️  Ainda existem $REMAINING processos órfãos" "$YELLOW"
+        pkill -9 -f "vllm" 2>/dev/null
+        pkill -9 -f "EngineCore" 2>/dev/null
+    fi
+}
+# Função para iniciar Ultravox
+start_ultravox() {
+    print_colored "\n🚀 Iniciando Ultravox..." "$BLUE"
+    # Limpar antes de iniciar
+    cleanup_orphans
+    # Verificar GPU
+    if ! check_gpu; then
+        print_colored "   Tentando liberar GPU..." "$YELLOW"
+        cleanup_orphans
+        sleep 3
+        check_gpu
+    fi
+    # Iniciar servidor
+    cd "$ULTRAVOX_DIR"
+    if [ -f "start_ultravox.sh" ]; then
+        nohup bash start_ultravox.sh > ultravox.log 2>&1 &
+        print_colored "   ✅ Ultravox iniciado (PID: $!)" "$GREEN"
+        echo $! > ultravox.pid
+    else
+        print_colored "   ❌ Script start_ultravox.sh não encontrado" "$RED"
+    fi
+}
+# Função para parar Ultravox
+stop_ultravox() {
+    print_colored "\n🛑 Parando Ultravox..." "$YELLOW"
+    cd "$ULTRAVOX_DIR"
+    if [ -f "stop_ultravox.sh" ]; then
+        bash stop_ultravox.sh
+    else
+        pkill -f "python.*server.py" 2>/dev/null
+        cleanup_orphans
+    fi
+    if [ -f "ultravox.pid" ]; then
+        kill -9 $(cat ultravox.pid) 2>/dev/null
+        rm ultravox.pid
+    fi
+}
+# Função para iniciar WebRTC Gateway
+start_webrtc() {
+    print_colored "\n🌐 Iniciando WebRTC Gateway..." "$BLUE"
+    cd "$WEBRTC_DIR"
+    nohup npm start > webrtc.log 2>&1 &
+    print_colored "   ✅ WebRTC Gateway iniciado (PID: $!)" "$GREEN"
+    echo $! > webrtc.pid
+}
+# Função para parar WebRTC Gateway
+stop_webrtc() {
+    print_colored "\n🛑 Parando WebRTC Gateway..." "$YELLOW"
+    pkill -f "node.*ultravox-chat-server" 2>/dev/null
+    cd "$WEBRTC_DIR"
+    if [ -f "webrtc.pid" ]; then
+        kill -9 $(cat webrtc.pid) 2>/dev/null
+        rm webrtc.pid
+    fi
+}
+# Função para iniciar TTS
+start_tts() {
+    print_colored "\n🔊 Iniciando TTS Service..." "$BLUE"
+    cd "$TTS_DIR"
+    # Verificar se venv existe, senão criar
+    if [ ! -d "venv" ]; then
+        print_colored "   Criando ambiente virtual..." "$YELLOW"
+        python3 -m venv venv
+    fi
+    source venv/bin/activate 2>/dev/null
+    nohup python3 server.py > tts.log 2>&1 &
+    print_colored "   ✅ TTS Service iniciado (PID: $!)" "$GREEN"
+    echo $! > tts.pid
+}
+# Função para parar TTS
+stop_tts() {
+    print_colored "\n🛑 Parando TTS Service..." "$YELLOW"
+    pkill -f "tts.*server.py" 2>/dev/null
+    cd "$TTS_DIR"
+    if [ -f "tts.pid" ]; then
+        kill -9 $(cat tts.pid) 2>/dev/null
+        rm tts.pid
+    fi
+}
+# Função para verificar status dos serviços
+check_status() {
+    print_colored "\n📊 Status dos Serviços:" "$BLUE"
+    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+    # Ultravox
+    if lsof -i :50051 >/dev/null 2>&1; then
+        print_colored "✅ Ultravox: RODANDO (porta 50051)" "$GREEN"
+    else
+        print_colored "❌ Ultravox: PARADO" "$RED"
+    fi
+    # WebRTC
+    if lsof -i :8082 >/dev/null 2>&1; then
+        print_colored "✅ WebRTC Gateway: RODANDO (porta 8082)" "$GREEN"
+    else
+        print_colored "❌ WebRTC Gateway: PARADO" "$RED"
+    fi
+    # TTS
+    if lsof -i :50054 >/dev/null 2>&1; then
+        print_colored "✅ TTS Service: RODANDO (porta 50054)" "$GREEN"
+    else
+        print_colored "❌ TTS Service: PARADO" "$RED"
+    fi
+    echo ""
+    check_gpu
+    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+}
+# Menu principal
+case "$1" in
+    start)
+        print_colored "🚀 Iniciando todos os serviços..." "$BLUE"
+        cleanup_orphans
+        start_ultravox
+        sleep 10  # Aguardar Ultravox carregar
+        start_webrtc
+        start_tts
+        check_status
+        ;;
+    stop)
+        print_colored "🛑 Parando todos os serviços..." "$YELLOW"
+        stop_webrtc
+        stop_tts
+        stop_ultravox
+        cleanup_orphans
+        check_status
+        ;;
+    restart)
+        print_colored "🔄 Reiniciando todos os serviços..." "$BLUE"
+        $0 stop
+        sleep 5
+        $0 start
+        ;;
+    status)
+        check_status
+        ;;
+    cleanup)
+        cleanup_orphans
+        check_gpu
+        ;;
+    ultravox-start)
+        start_ultravox
+        ;;
+    ultravox-stop)
+        stop_ultravox
+        ;;
+    ultravox-restart)
+        stop_ultravox
+        sleep 3
+        start_ultravox
+        ;;
+    webrtc-start)
+        start_webrtc
+        ;;
+    webrtc-stop)
+        stop_webrtc
+        ;;
+    tts-start)
+        start_tts
+        ;;
+    tts-stop)
+        stop_tts
+        ;;
+    *)
+        echo "Uso: $0 {start|stop|restart|status|cleanup}"
+        echo ""
+        echo "Comandos disponíveis:"
+        echo "  start           - Inicia todos os serviços"
+        echo "  stop            - Para todos os serviços"
+        echo "  restart         - Reinicia todos os serviços"
+        echo "  status          - Verifica status dos serviços"
+        echo "  cleanup         - Limpa processos órfãos"
+        echo ""
+        echo "Comandos específicos:"
+        echo "  ultravox-start  - Inicia apenas Ultravox"
+        echo "  ultravox-stop   - Para apenas Ultravox"
+        echo "  ultravox-restart- Reinicia apenas Ultravox"
+        echo "  webrtc-start    - Inicia apenas WebRTC Gateway"
+        echo "  webrtc-stop     - Para apenas WebRTC Gateway"
+        echo "  tts-start       - Inicia apenas TTS Service"
+        echo "  tts-stop        - Para apenas TTS Service"
+        exit 1
+        ;;
+esac
+exit 0

services/webrtc_gateway/conversation-memory.js ADDED Viewed

	@@ -0,0 +1,290 @@

+/**
+ * Sistema de Memória em Processo para Conversações
+ * Mantém contexto de conversas com limite de mensagens
+ */
+const crypto = require('crypto');
+class ConversationMemory {
+    constructor() {
+        // Armazena conversações por ID
+        this.conversations = new Map();
+        // Configurações
+        this.config = {
+            maxMessagesPerConversation: 10,  // Máximo de mensagens por conversa
+            maxConversations: 100,           // Máximo de conversas em memória
+            ttlMinutes: 60,                  // Tempo de vida em minutos
+            cleanupIntervalMinutes: 10       // Intervalo de limpeza
+        };
+        // Estatísticas
+        this.stats = {
+            totalMessages: 0,
+            totalConversations: 0,
+            activeConversations: 0
+        };
+        // Iniciar limpeza automática
+        this.startCleanup();
+    }
+    /**
+     * Gera ID único para conversação
+     */
+    generateConversationId() {
+        return `conv_${Date.now()}_${crypto.randomBytes(8).toString('hex')}`;
+    }
+    /**
+     * Cria nova conversação
+     */
+    createConversation(conversationId = null, metadata = {}) {
+        const id = conversationId || this.generateConversationId();
+        // Verificar limite de conversações
+        if (this.conversations.size >= this.config.maxConversations) {
+            this.removeOldestConversation();
+        }
+        const conversation = {
+            id,
+            createdAt: Date.now(),
+            lastActivity: Date.now(),
+            messages: [],
+            metadata: {
+                ...metadata,
+                messageCount: 0,
+                userAgent: metadata.userAgent || 'unknown'
+            }
+        };
+        this.conversations.set(id, conversation);
+        this.stats.totalConversations++;
+        this.stats.activeConversations = this.conversations.size;
+        console.log(`📝 Nova conversação criada: ${id}`);
+        return conversation;
+    }
+    /**
+     * Recupera conversação existente
+     */
+    getConversation(conversationId) {
+        const conversation = this.conversations.get(conversationId);
+        if (conversation) {
+            conversation.lastActivity = Date.now();
+        }
+        return conversation;
+    }
+    /**
+     * Adiciona mensagem à conversação
+     */
+    addMessage(conversationId, message) {
+        let conversation = this.getConversation(conversationId);
+        // Criar conversação se não existir
+        if (!conversation) {
+            conversation = this.createConversation(conversationId);
+        }
+        // Estrutura da mensagem
+        const msg = {
+            id: `msg_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`,
+            timestamp: Date.now(),
+            role: message.role || 'user',
+            content: message.content || '',
+            metadata: {
+                audioSize: message.audioSize || 0,
+                latency: message.latency || 0,
+                ...message.metadata
+            }
+        };
+        // Adicionar mensagem
+        conversation.messages.push(msg);
+        conversation.metadata.messageCount++;
+        conversation.lastActivity = Date.now();
+        // Limitar número de mensagens
+        if (conversation.messages.length > this.config.maxMessagesPerConversation) {
+            conversation.messages.shift(); // Remove a mais antiga
+        }
+        this.stats.totalMessages++;
+        console.log(`💬 Mensagem adicionada: ${conversationId} - ${msg.role}: ${msg.content.substring(0, 50)}...`);
+        return msg;
+    }
+    /**
+     * Constrói contexto para Ultravox
+     */
+    buildContext(conversationId, maxMessages = 5) {
+        const conversation = this.getConversation(conversationId);
+        if (!conversation || conversation.messages.length === 0) {
+            return '';
+        }
+        // Pegar últimas N mensagens
+        const recentMessages = conversation.messages.slice(-maxMessages);
+        // Formatar contexto de forma simples
+        const context = recentMessages
+            .map(msg => `${msg.role === 'user' ? 'Usuário' : 'Assistente'}: ${msg.content}`)
+            .join('\n');
+        return context;
+    }
+    /**
+     * Recupera histórico de mensagens
+     */
+    getMessages(conversationId, limit = 10, offset = 0) {
+        const conversation = this.getConversation(conversationId);
+        if (!conversation) {
+            return [];
+        }
+        const start = Math.max(0, conversation.messages.length - offset - limit);
+        const end = conversation.messages.length - offset;
+        return conversation.messages.slice(start, end);
+    }
+    /**
+     * Remove conversação mais antiga
+     */
+    removeOldestConversation() {
+        let oldest = null;
+        let oldestTime = Date.now();
+        for (const [id, conv] of this.conversations) {
+            if (conv.lastActivity < oldestTime) {
+                oldest = id;
+                oldestTime = conv.lastActivity;
+            }
+        }
+        if (oldest) {
+            this.conversations.delete(oldest);
+            console.log(`🗑️ Conversação removida (limite atingido): ${oldest}`);
+        }
+    }
+    /**
+     * Limpa conversações expiradas
+     */
+    cleanupExpired() {
+        const now = Date.now();
+        const ttlMs = this.config.ttlMinutes * 60 * 1000;
+        let removed = 0;
+        for (const [id, conv] of this.conversations) {
+            if (now - conv.lastActivity > ttlMs) {
+                this.conversations.delete(id);
+                removed++;
+            }
+        }
+        if (removed > 0) {
+            console.log(`🧹 ${removed} conversações expiradas removidas`);
+            this.stats.activeConversations = this.conversations.size;
+        }
+    }
+    /**
+     * Inicia limpeza automática
+     */
+    startCleanup() {
+        setInterval(() => {
+            this.cleanupExpired();
+        }, this.config.cleanupIntervalMinutes * 60 * 1000);
+    }
+    /**
+     * Retorna estatísticas
+     */
+    getStats() {
+        return {
+            ...this.stats,
+            conversationsInMemory: this.conversations.size,
+            memoryUsage: this.getMemoryUsage()
+        };
+    }
+    /**
+     * Estima uso de memória
+     */
+    getMemoryUsage() {
+        let totalSize = 0;
+        for (const conv of this.conversations.values()) {
+            // Estimar tamanho aproximado
+            totalSize += JSON.stringify(conv).length;
+        }
+        return {
+            bytes: totalSize,
+            kb: (totalSize / 1024).toFixed(2),
+            mb: (totalSize / 1024 / 1024).toFixed(2)
+        };
+    }
+    /**
+     * Lista conversações ativas
+     */
+    listConversations() {
+        const list = [];
+        for (const [id, conv] of this.conversations) {
+            list.push({
+                id: conv.id,
+                createdAt: new Date(conv.createdAt).toISOString(),
+                lastActivity: new Date(conv.lastActivity).toISOString(),
+                messageCount: conv.metadata.messageCount,
+                metadata: conv.metadata
+            });
+        }
+        return list.sort((a, b) => b.lastActivity - a.lastActivity);
+    }
+    /**
+     * Exporta conversação (para backup futuro)
+     */
+    exportConversation(conversationId) {
+        const conversation = this.getConversation(conversationId);
+        if (!conversation) {
+            return null;
+        }
+        return {
+            ...conversation,
+            exported: new Date().toISOString(),
+            version: '1.0'
+        };
+    }
+    /**
+     * Importa conversação (de backup)
+     */
+    importConversation(data) {
+        if (!data || !data.id) {
+            throw new Error('Dados de conversação inválidos');
+        }
+        this.conversations.set(data.id, {
+            ...data,
+            lastActivity: Date.now() // Atualizar última atividade
+        });
+        this.stats.activeConversations = this.conversations.size;
+        console.log(`📥 Conversação importada: ${data.id}`);
+        return data.id;
+    }
+}
+module.exports = ConversationMemory;

services/webrtc_gateway/favicon.ico ADDED Viewed

services/webrtc_gateway/opus-decoder.js ADDED Viewed

	@@ -0,0 +1,150 @@

+/**
+ * Opus Decoder para navegador
+ * Usa Web Audio API para decodificar Opus
+ */
+class OpusDecoder {
+    constructor() {
+        this.audioContext = null;
+        this.isInitialized = false;
+    }
+    async init(sampleRate = 24000) {
+        if (this.isInitialized) return;
+        try {
+            // Criar AudioContext com taxa específica
+            this.audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                sampleRate: sampleRate
+            });
+            this.isInitialized = true;
+            console.log(`✅ OpusDecoder inicializado @ ${sampleRate}Hz`);
+        } catch (error) {
+            console.error('❌ Erro ao inicializar OpusDecoder:', error);
+            throw error;
+        }
+    }
+    /**
+     * Decodifica Opus para PCM usando Web Audio API
+     * @param {ArrayBuffer} opusData - Dados Opus comprimidos
+     * @returns {Promise<ArrayBuffer>} - PCM decodificado
+     */
+    async decode(opusData) {
+        if (!this.isInitialized) {
+            await this.init();
+        }
+        try {
+            // Web Audio API pode decodificar Opus nativamente se embrulhado em container
+            // Para Opus puro, precisamos criar um container WebM mínimo
+            const webmContainer = this.wrapOpusInWebM(opusData);
+            // Decodificar usando Web Audio API
+            const audioBuffer = await this.audioContext.decodeAudioData(webmContainer);
+            // Converter AudioBuffer para PCM Int16
+            const pcmData = this.audioBufferToPCM(audioBuffer);
+            console.log(`🔊 Opus decodificado: ${opusData.byteLength} bytes → ${pcmData.byteLength} bytes PCM`);
+            return pcmData;
+        } catch (error) {
+            console.error('❌ Erro ao decodificar Opus:', error);
+            // Fallback: retornar dados originais se não conseguir decodificar
+            return opusData;
+        }
+    }
+    /**
+     * Envolve dados Opus em container WebM mínimo
+     * @param {ArrayBuffer} opusData - Dados Opus puros
+     * @returns {ArrayBuffer} - WebM container com Opus
+     */
+    wrapOpusInWebM(opusData) {
+        // Implementação simplificada - na prática, usaria uma biblioteca
+        // Por enquanto, assumimos que o navegador pode processar Opus diretamente
+        // se fornecido com headers apropriados
+        // Para implementação real, considerar usar:
+        // - libopus.js (porta WASM do libopus)
+        // - opus-recorder (biblioteca JS para Opus)
+        return opusData; // Placeholder
+    }
+    /**
+     * Converte AudioBuffer para PCM Int16
+     * @param {AudioBuffer} audioBuffer
+     * @returns {ArrayBuffer} PCM Int16 data
+     */
+    audioBufferToPCM(audioBuffer) {
+        const length = audioBuffer.length;
+        const pcmData = new Int16Array(length);
+        const channelData = audioBuffer.getChannelData(0); // Mono
+        // Converter Float32 para Int16
+        for (let i = 0; i < length; i++) {
+            const sample = Math.max(-1, Math.min(1, channelData[i]));
+            pcmData[i] = sample * 0x7FFF;
+        }
+        return pcmData.buffer;
+    }
+}
+/**
+ * Alternativa: Usar biblioteca opus-decoder (mais robusta)
+ * npm install opus-decoder
+ */
+class OpusDecoderWASM {
+    constructor() {
+        this.decoder = null;
+        this.ready = false;
+    }
+    async init(sampleRate = 24000, channels = 1) {
+        if (this.ready) return;
+        try {
+            // Carregar opus-decoder WASM se disponível
+            if (typeof OpusDecoderWebAssembly !== 'undefined') {
+                const { OpusDecoderWebAssembly } = await import('opus-decoder');
+                this.decoder = new OpusDecoderWebAssembly({
+                    channels: channels,
+                    sampleRate: sampleRate
+                });
+                await this.decoder.ready;
+                this.ready = true;
+                console.log('✅ OpusDecoderWASM pronto');
+            } else {
+                throw new Error('opus-decoder não disponível');
+            }
+        } catch (error) {
+            console.warn('⚠️ WASM decoder não disponível, usando fallback');
+            // Fallback para decoder básico
+            this.decoder = new OpusDecoder();
+            await this.decoder.init(sampleRate);
+            this.ready = true;
+        }
+    }
+    async decode(opusData) {
+        if (!this.ready) {
+            await this.init();
+        }
+        if (this.decoder.decode) {
+            // Usar WASM decoder se disponível
+            return await this.decoder.decode(opusData);
+        } else {
+            // Fallback
+            return opusData;
+        }
+    }
+}
+// Exportar para uso global
+window.OpusDecoder = OpusDecoder;
+window.OpusDecoderWASM = OpusDecoderWASM;

services/webrtc_gateway/package-lock.json CHANGED Viewed

@@ -9,6 +9,7 @@
       "version": "1.0.0",
       "license": "ISC",
       "dependencies": {
         "@grpc/grpc-js": "^1.9.11",
         "@grpc/proto-loader": "^0.7.10",
         "express": "^5.1.0",
@@ -18,6 +19,40 @@
         "nodemon": "^3.0.1"
       }
     },
     "node_modules/@grpc/grpc-js": {
       "version": "1.13.4",
       "resolved": "https://registry.npmjs.org/@grpc/grpc-js/-/grpc-js-1.13.4.tgz",
@@ -132,6 +167,12 @@
         "undici-types": "~7.10.0"
       }
     },
     "node_modules/accepts": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/accepts/-/accepts-2.0.0.tgz",
@@ -145,6 +186,18 @@
         "node": ">= 0.6"
       }
     },
     "node_modules/ansi-regex": {
       "version": "5.0.1",
       "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-5.0.1.tgz",
@@ -183,11 +236,30 @@
         "node": ">= 8"
       }
     },
     "node_modules/balanced-match": {
       "version": "1.0.2",
       "resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-1.0.2.tgz",
       "integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==",
-      "dev": true,
       "license": "MIT"
     },
     "node_modules/binary-extensions": {
@@ -227,7 +299,6 @@
       "version": "1.1.12",
       "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.12.tgz",
       "integrity": "sha512-9T9UjW3r0UW5c1Q7GTwllptXwhvYmEzFhzMfZ9H7FQWt+uZePjZPjBP/W1ZEyZ1twGWom5/56TF4lPcqjnDHcg==",
-      "dev": true,
       "license": "MIT",
       "dependencies": {
         "balanced-match": "^1.0.0",
@@ -310,6 +381,15 @@
         "fsevents": "~2.3.2"
       }
     },
     "node_modules/cliui": {
       "version": "8.0.1",
       "resolved": "https://registry.npmjs.org/cliui/-/cliui-8.0.1.tgz",
@@ -342,13 +422,27 @@
       "integrity": "sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==",
       "license": "MIT"
     },
     "node_modules/concat-map": {
       "version": "0.0.1",
       "resolved": "https://registry.npmjs.org/concat-map/-/concat-map-0.0.1.tgz",
       "integrity": "sha512-/Srv4dswyQNBfohGpz9o6Yb3Gz3SrUDqBH5rTuhGR7ahtlbYKnVxw2bCFMRljaA7EXHaXZ8wsHdodFvbkhKmqg==",
-      "dev": true,
       "license": "MIT"
     },
     "node_modules/content-disposition": {
       "version": "1.0.0",
       "resolved": "https://registry.npmjs.org/content-disposition/-/content-disposition-1.0.0.tgz",
@@ -405,6 +499,12 @@
         }
       }
     },
     "node_modules/depd": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/depd/-/depd-2.0.0.tgz",
@@ -414,6 +514,15 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/dunder-proto": {
       "version": "1.0.1",
       "resolved": "https://registry.npmjs.org/dunder-proto/-/dunder-proto-1.0.1.tgz",
@@ -593,6 +702,36 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/fsevents": {
       "version": "2.3.3",
       "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz",
@@ -617,6 +756,27 @@
         "url": "https://github.com/sponsors/ljharb"
       }
     },
     "node_modules/get-caller-file": {
       "version": "2.0.5",
       "resolved": "https://registry.npmjs.org/get-caller-file/-/get-caller-file-2.0.5.tgz",
@@ -663,6 +823,27 @@
         "node": ">= 0.4"
       }
     },
     "node_modules/glob-parent": {
       "version": "5.1.2",
       "resolved": "https://registry.npmjs.org/glob-parent/-/glob-parent-5.1.2.tgz",
@@ -710,6 +891,12 @@
         "url": "https://github.com/sponsors/ljharb"
       }
     },
     "node_modules/hasown": {
       "version": "2.0.2",
       "resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz",
@@ -747,6 +934,19 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/iconv-lite": {
       "version": "0.6.3",
       "resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.6.3.tgz",
@@ -766,6 +966,17 @@
       "dev": true,
       "license": "ISC"
     },
     "node_modules/inherits": {
       "version": "2.0.4",
       "resolved": "https://registry.npmjs.org/inherits/-/inherits-2.0.4.tgz",
@@ -854,6 +1065,30 @@
       "integrity": "sha512-mNAgZ1GmyNhD7AuqnTG3/VQ26o760+ZYBPKjPvugO8+nLbYfX6TVpJPseBvopbdY+qpZ/lKUnmEc1LeZYS3QAA==",
       "license": "Apache-2.0"
     },
     "node_modules/math-intrinsics": {
       "version": "1.1.0",
       "resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
@@ -909,7 +1144,6 @@
       "version": "3.1.2",
       "resolved": "https://registry.npmjs.org/minimatch/-/minimatch-3.1.2.tgz",
       "integrity": "sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw==",
-      "dev": true,
       "license": "ISC",
       "dependencies": {
         "brace-expansion": "^1.1.7"
@@ -918,6 +1152,52 @@
         "node": "*"
       }
     },
     "node_modules/ms": {
       "version": "2.1.3",
       "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
@@ -933,6 +1213,35 @@
         "node": ">= 0.6"
       }
     },
     "node_modules/nodemon": {
       "version": "3.1.10",
       "resolved": "https://registry.npmjs.org/nodemon/-/nodemon-3.1.10.tgz",
@@ -962,6 +1271,21 @@
         "url": "https://opencollective.com/nodemon"
       }
     },
     "node_modules/normalize-path": {
       "version": "3.0.0",
       "resolved": "https://registry.npmjs.org/normalize-path/-/normalize-path-3.0.0.tgz",
@@ -972,6 +1296,28 @@
         "node": ">=0.10.0"
       }
     },
     "node_modules/object-inspect": {
       "version": "1.13.4",
       "resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.13.4.tgz",
@@ -1014,6 +1360,15 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/path-to-regexp": {
       "version": "8.3.0",
       "resolved": "https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-8.3.0.tgz",
@@ -1136,6 +1491,20 @@
         "url": "https://opencollective.com/express"
       }
     },
     "node_modules/readdirp": {
       "version": "3.6.0",
       "resolved": "https://registry.npmjs.org/readdirp/-/readdirp-3.6.0.tgz",
@@ -1158,6 +1527,22 @@
         "node": ">=0.10.0"
       }
     },
     "node_modules/router": {
       "version": "2.2.0",
       "resolved": "https://registry.npmjs.org/router/-/router-2.2.0.tgz",
@@ -1204,7 +1589,6 @@
       "version": "7.7.2",
       "resolved": "https://registry.npmjs.org/semver/-/semver-7.7.2.tgz",
       "integrity": "sha512-RF0Fw+rO5AMf9MAyaRXI4AV0Ulj5lMHqVxxdSgiVbixSCXoEmmX/jk0CuJw4+3SqroYO9VoUh+HcuJivvtJemA==",
-      "dev": true,
       "license": "ISC",
       "bin": {
         "semver": "bin/semver.js"
@@ -1250,6 +1634,12 @@
         "node": ">= 18"
       }
     },
     "node_modules/setprototypeof": {
       "version": "1.2.0",
       "resolved": "https://registry.npmjs.org/setprototypeof/-/setprototypeof-1.2.0.tgz",
@@ -1328,6 +1718,12 @@
         "url": "https://github.com/sponsors/ljharb"
       }
     },
     "node_modules/simple-update-notifier": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/simple-update-notifier/-/simple-update-notifier-2.0.0.tgz",
@@ -1350,6 +1746,15 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/string-width": {
       "version": "4.2.3",
       "resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz",
@@ -1389,6 +1794,23 @@
         "node": ">=4"
       }
     },
     "node_modules/to-regex-range": {
       "version": "5.0.1",
       "resolved": "https://registry.npmjs.org/to-regex-range/-/to-regex-range-5.0.1.tgz",
@@ -1421,6 +1843,12 @@
         "nodetouch": "bin/nodetouch.js"
       }
     },
     "node_modules/type-is": {
       "version": "2.0.1",
       "resolved": "https://registry.npmjs.org/type-is/-/type-is-2.0.1.tgz",
@@ -1457,6 +1885,12 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/vary": {
       "version": "1.1.2",
       "resolved": "https://registry.npmjs.org/vary/-/vary-1.1.2.tgz",
@@ -1466,6 +1900,31 @@
         "node": ">= 0.8"
       }
     },
     "node_modules/wrap-ansi": {
       "version": "7.0.0",
       "resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-7.0.0.tgz",
@@ -1519,6 +1978,12 @@
         "node": ">=10"
       }
     },
     "node_modules/yargs": {
       "version": "17.7.2",
       "resolved": "https://registry.npmjs.org/yargs/-/yargs-17.7.2.tgz",

       "version": "1.0.0",
       "license": "ISC",
       "dependencies": {
+        "@discordjs/opus": "^0.10.0",
         "@grpc/grpc-js": "^1.9.11",
         "@grpc/proto-loader": "^0.7.10",
         "express": "^5.1.0",
         "nodemon": "^3.0.1"
       }
     },
+    "node_modules/@discordjs/node-pre-gyp": {
+      "version": "0.4.5",
+      "resolved": "https://registry.npmjs.org/@discordjs/node-pre-gyp/-/node-pre-gyp-0.4.5.tgz",
+      "integrity": "sha512-YJOVVZ545x24mHzANfYoy0BJX5PDyeZlpiJjDkUBM/V/Ao7TFX9lcUvCN4nr0tbr5ubeaXxtEBILUrHtTphVeQ==",
+      "license": "BSD-3-Clause",
+      "dependencies": {
+        "detect-libc": "^2.0.0",
+        "https-proxy-agent": "^5.0.0",
+        "make-dir": "^3.1.0",
+        "node-fetch": "^2.6.7",
+        "nopt": "^5.0.0",
+        "npmlog": "^5.0.1",
+        "rimraf": "^3.0.2",
+        "semver": "^7.3.5",
+        "tar": "^6.1.11"
+      },
+      "bin": {
+        "node-pre-gyp": "bin/node-pre-gyp"
+      }
+    },
+    "node_modules/@discordjs/opus": {
+      "version": "0.10.0",
+      "resolved": "https://registry.npmjs.org/@discordjs/opus/-/opus-0.10.0.tgz",
+      "integrity": "sha512-HHEnSNrSPmFEyndRdQBJN2YE6egyXS9JUnJWyP6jficK0Y+qKMEZXyYTgmzpjrxXP1exM/hKaNP7BRBUEWkU5w==",
+      "hasInstallScript": true,
+      "license": "MIT",
+      "dependencies": {
+        "@discordjs/node-pre-gyp": "^0.4.5",
+        "node-addon-api": "^8.1.0"
+      },
+      "engines": {
+        "node": ">=12.0.0"
+      }
+    },
     "node_modules/@grpc/grpc-js": {
       "version": "1.13.4",
       "resolved": "https://registry.npmjs.org/@grpc/grpc-js/-/grpc-js-1.13.4.tgz",
         "undici-types": "~7.10.0"
       }
     },
+    "node_modules/abbrev": {
+      "version": "1.1.1",
+      "resolved": "https://registry.npmjs.org/abbrev/-/abbrev-1.1.1.tgz",
+      "integrity": "sha512-nne9/IiQ/hzIhY6pdDnbBtz7DjPTKrY00P/zvPSm5pOFkl6xuGrGnXn/VtTNNfNtAfZ9/1RtehkszU9qcTii0Q==",
+      "license": "ISC"
+    },
     "node_modules/accepts": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/accepts/-/accepts-2.0.0.tgz",
         "node": ">= 0.6"
       }
     },
+    "node_modules/agent-base": {
+      "version": "6.0.2",
+      "resolved": "https://registry.npmjs.org/agent-base/-/agent-base-6.0.2.tgz",
+      "integrity": "sha512-RZNwNclF7+MS/8bDg70amg32dyeZGZxiDuQmZxKLAlQjr3jGyLx+4Kkk58UO7D2QdgFIQCovuSuZESne6RG6XQ==",
+      "license": "MIT",
+      "dependencies": {
+        "debug": "4"
+      },
+      "engines": {
+        "node": ">= 6.0.0"
+      }
+    },
     "node_modules/ansi-regex": {
       "version": "5.0.1",
       "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-5.0.1.tgz",
         "node": ">= 8"
       }
     },
+    "node_modules/aproba": {
+      "version": "2.1.0",
+      "resolved": "https://registry.npmjs.org/aproba/-/aproba-2.1.0.tgz",
+      "integrity": "sha512-tLIEcj5GuR2RSTnxNKdkK0dJ/GrC7P38sUkiDmDuHfsHmbagTFAxDVIBltoklXEVIQ/f14IL8IMJ5pn9Hez1Ew==",
+      "license": "ISC"
+    },
+    "node_modules/are-we-there-yet": {
+      "version": "2.0.0",
+      "resolved": "https://registry.npmjs.org/are-we-there-yet/-/are-we-there-yet-2.0.0.tgz",
+      "integrity": "sha512-Ci/qENmwHnsYo9xKIcUJN5LeDKdJ6R1Z1j9V/J5wyq8nh/mYPEpIKJbBZXtZjG04HiK7zV/p6Vs9952MrMeUIw==",
+      "deprecated": "This package is no longer supported.",
+      "license": "ISC",
+      "dependencies": {
+        "delegates": "^1.0.0",
+        "readable-stream": "^3.6.0"
+      },
+      "engines": {
+        "node": ">=10"
+      }
+    },
     "node_modules/balanced-match": {
       "version": "1.0.2",
       "resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-1.0.2.tgz",
       "integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==",
       "license": "MIT"
     },
     "node_modules/binary-extensions": {
       "version": "1.1.12",
       "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.12.tgz",
       "integrity": "sha512-9T9UjW3r0UW5c1Q7GTwllptXwhvYmEzFhzMfZ9H7FQWt+uZePjZPjBP/W1ZEyZ1twGWom5/56TF4lPcqjnDHcg==",
       "license": "MIT",
       "dependencies": {
         "balanced-match": "^1.0.0",
         "fsevents": "~2.3.2"
       }
     },
+    "node_modules/chownr": {
+      "version": "2.0.0",
+      "resolved": "https://registry.npmjs.org/chownr/-/chownr-2.0.0.tgz",
+      "integrity": "sha512-bIomtDF5KGpdogkLd9VspvFzk9KfpyyGlS8YFVZl7TGPBHL5snIOnxeshwVgPteQ9b4Eydl+pVbIyE1DcvCWgQ==",
+      "license": "ISC",
+      "engines": {
+        "node": ">=10"
+      }
+    },
     "node_modules/cliui": {
       "version": "8.0.1",
       "resolved": "https://registry.npmjs.org/cliui/-/cliui-8.0.1.tgz",
       "integrity": "sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==",
       "license": "MIT"
     },
+    "node_modules/color-support": {
+      "version": "1.1.3",
+      "resolved": "https://registry.npmjs.org/color-support/-/color-support-1.1.3.tgz",
+      "integrity": "sha512-qiBjkpbMLO/HL68y+lh4q0/O1MZFj2RX6X/KmMa3+gJD3z+WwI1ZzDHysvqHGS3mP6mznPckpXmw1nI9cJjyRg==",
+      "license": "ISC",
+      "bin": {
+        "color-support": "bin.js"
+      }
+    },
     "node_modules/concat-map": {
       "version": "0.0.1",
       "resolved": "https://registry.npmjs.org/concat-map/-/concat-map-0.0.1.tgz",
       "integrity": "sha512-/Srv4dswyQNBfohGpz9o6Yb3Gz3SrUDqBH5rTuhGR7ahtlbYKnVxw2bCFMRljaA7EXHaXZ8wsHdodFvbkhKmqg==",
       "license": "MIT"
     },
+    "node_modules/console-control-strings": {
+      "version": "1.1.0",
+      "resolved": "https://registry.npmjs.org/console-control-strings/-/console-control-strings-1.1.0.tgz",
+      "integrity": "sha512-ty/fTekppD2fIwRvnZAVdeOiGd1c7YXEixbgJTNzqcxJWKQnjJ/V1bNEEE6hygpM3WjwHFUVK6HTjWSzV4a8sQ==",
+      "license": "ISC"
+    },
     "node_modules/content-disposition": {
       "version": "1.0.0",
       "resolved": "https://registry.npmjs.org/content-disposition/-/content-disposition-1.0.0.tgz",
         }
       }
     },
+    "node_modules/delegates": {
+      "version": "1.0.0",
+      "resolved": "https://registry.npmjs.org/delegates/-/delegates-1.0.0.tgz",
+      "integrity": "sha512-bd2L678uiWATM6m5Z1VzNCErI3jiGzt6HGY8OVICs40JQq/HALfbyNJmp0UDakEY4pMMaN0Ly5om/B1VI/+xfQ==",
+      "license": "MIT"
+    },
     "node_modules/depd": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/depd/-/depd-2.0.0.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/detect-libc": {
+      "version": "2.0.4",
+      "resolved": "https://registry.npmjs.org/detect-libc/-/detect-libc-2.0.4.tgz",
+      "integrity": "sha512-3UDv+G9CsCKO1WKMGw9fwq/SWJYbI0c5Y7LU1AXYoDdbhE2AHQ6N6Nb34sG8Fj7T5APy8qXDCKuuIHd1BR0tVA==",
+      "license": "Apache-2.0",
+      "engines": {
+        "node": ">=8"
+      }
+    },
     "node_modules/dunder-proto": {
       "version": "1.0.1",
       "resolved": "https://registry.npmjs.org/dunder-proto/-/dunder-proto-1.0.1.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/fs-minipass": {
+      "version": "2.1.0",
+      "resolved": "https://registry.npmjs.org/fs-minipass/-/fs-minipass-2.1.0.tgz",
+      "integrity": "sha512-V/JgOLFCS+R6Vcq0slCuaeWEdNC3ouDlJMNIsacH2VtALiu9mV4LPrHc5cDl8k5aw6J8jwgWWpiTo5RYhmIzvg==",
+      "license": "ISC",
+      "dependencies": {
+        "minipass": "^3.0.0"
+      },
+      "engines": {
+        "node": ">= 8"
+      }
+    },
+    "node_modules/fs-minipass/node_modules/minipass": {
+      "version": "3.3.6",
+      "resolved": "https://registry.npmjs.org/minipass/-/minipass-3.3.6.tgz",
+      "integrity": "sha512-DxiNidxSEK+tHG6zOIklvNOwm3hvCrbUrdtzY74U6HKTJxvIDfOUL5W5P2Ghd3DTkhhKPYGqeNUIh5qcM4YBfw==",
+      "license": "ISC",
+      "dependencies": {
+        "yallist": "^4.0.0"
+      },
+      "engines": {
+        "node": ">=8"
+      }
+    },
+    "node_modules/fs.realpath": {
+      "version": "1.0.0",
+      "resolved": "https://registry.npmjs.org/fs.realpath/-/fs.realpath-1.0.0.tgz",
+      "integrity": "sha512-OO0pH2lK6a0hZnAdau5ItzHPI6pUlvI7jMVnxUQRtw4owF2wk8lOSabtGDCTP4Ggrg2MbGnWO9X8K1t4+fGMDw==",
+      "license": "ISC"
+    },
     "node_modules/fsevents": {
       "version": "2.3.3",
       "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz",
         "url": "https://github.com/sponsors/ljharb"
       }
     },
+    "node_modules/gauge": {
+      "version": "3.0.2",
+      "resolved": "https://registry.npmjs.org/gauge/-/gauge-3.0.2.tgz",
+      "integrity": "sha512-+5J6MS/5XksCuXq++uFRsnUd7Ovu1XenbeuIuNRJxYWjgQbPuFhT14lAvsWfqfAmnwluf1OwMjz39HjfLPci0Q==",
+      "deprecated": "This package is no longer supported.",
+      "license": "ISC",
+      "dependencies": {
+        "aproba": "^1.0.3 || ^2.0.0",
+        "color-support": "^1.1.2",
+        "console-control-strings": "^1.0.0",
+        "has-unicode": "^2.0.1",
+        "object-assign": "^4.1.1",
+        "signal-exit": "^3.0.0",
+        "string-width": "^4.2.3",
+        "strip-ansi": "^6.0.1",
+        "wide-align": "^1.1.2"
+      },
+      "engines": {
+        "node": ">=10"
+      }
+    },
     "node_modules/get-caller-file": {
       "version": "2.0.5",
       "resolved": "https://registry.npmjs.org/get-caller-file/-/get-caller-file-2.0.5.tgz",
         "node": ">= 0.4"
       }
     },
+    "node_modules/glob": {
+      "version": "7.2.3",
+      "resolved": "https://registry.npmjs.org/glob/-/glob-7.2.3.tgz",
+      "integrity": "sha512-nFR0zLpU2YCaRxwoCJvL6UvCH2JFyFVIvwTLsIf21AuHlMskA1hhTdk+LlYJtOlYt9v6dvszD2BGRqBL+iQK9Q==",
+      "deprecated": "Glob versions prior to v9 are no longer supported",
+      "license": "ISC",
+      "dependencies": {
+        "fs.realpath": "^1.0.0",
+        "inflight": "^1.0.4",
+        "inherits": "2",
+        "minimatch": "^3.1.1",
+        "once": "^1.3.0",
+        "path-is-absolute": "^1.0.0"
+      },
+      "engines": {
+        "node": "*"
+      },
+      "funding": {
+        "url": "https://github.com/sponsors/isaacs"
+      }
+    },
     "node_modules/glob-parent": {
       "version": "5.1.2",
       "resolved": "https://registry.npmjs.org/glob-parent/-/glob-parent-5.1.2.tgz",
         "url": "https://github.com/sponsors/ljharb"
       }
     },
+    "node_modules/has-unicode": {
+      "version": "2.0.1",
+      "resolved": "https://registry.npmjs.org/has-unicode/-/has-unicode-2.0.1.tgz",
+      "integrity": "sha512-8Rf9Y83NBReMnx0gFzA8JImQACstCYWUplepDa9xprwwtmgEZUF0h/i5xSA625zB/I37EtrswSST6OXxwaaIJQ==",
+      "license": "ISC"
+    },
     "node_modules/hasown": {
       "version": "2.0.2",
       "resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/https-proxy-agent": {
+      "version": "5.0.1",
+      "resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-5.0.1.tgz",
+      "integrity": "sha512-dFcAjpTQFgoLMzC2VwU+C/CbS7uRL0lWmxDITmqm7C+7F0Odmj6s9l6alZc6AELXhrnggM2CeWSXHGOdX2YtwA==",
+      "license": "MIT",
+      "dependencies": {
+        "agent-base": "6",
+        "debug": "4"
+      },
+      "engines": {
+        "node": ">= 6"
+      }
+    },
     "node_modules/iconv-lite": {
       "version": "0.6.3",
       "resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.6.3.tgz",
       "dev": true,
       "license": "ISC"
     },
+    "node_modules/inflight": {
+      "version": "1.0.6",
+      "resolved": "https://registry.npmjs.org/inflight/-/inflight-1.0.6.tgz",
+      "integrity": "sha512-k92I/b08q4wvFscXCLvqfsHCrjrF7yiXsQuIVvVE7N82W3+aqpzuUdBbfhWcy/FZR3/4IgflMgKLOsvPDrGCJA==",
+      "deprecated": "This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.",
+      "license": "ISC",
+      "dependencies": {
+        "once": "^1.3.0",
+        "wrappy": "1"
+      }
+    },
     "node_modules/inherits": {
       "version": "2.0.4",
       "resolved": "https://registry.npmjs.org/inherits/-/inherits-2.0.4.tgz",
       "integrity": "sha512-mNAgZ1GmyNhD7AuqnTG3/VQ26o760+ZYBPKjPvugO8+nLbYfX6TVpJPseBvopbdY+qpZ/lKUnmEc1LeZYS3QAA==",
       "license": "Apache-2.0"
     },
+    "node_modules/make-dir": {
+      "version": "3.1.0",
+      "resolved": "https://registry.npmjs.org/make-dir/-/make-dir-3.1.0.tgz",
+      "integrity": "sha512-g3FeP20LNwhALb/6Cz6Dd4F2ngze0jz7tbzrD2wAV+o9FeNHe4rL+yK2md0J/fiSf1sa1ADhXqi5+oVwOM/eGw==",
+      "license": "MIT",
+      "dependencies": {
+        "semver": "^6.0.0"
+      },
+      "engines": {
+        "node": ">=8"
+      },
+      "funding": {
+        "url": "https://github.com/sponsors/sindresorhus"
+      }
+    },
+    "node_modules/make-dir/node_modules/semver": {
+      "version": "6.3.1",
+      "resolved": "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz",
+      "integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==",
+      "license": "ISC",
+      "bin": {
+        "semver": "bin/semver.js"
+      }
+    },
     "node_modules/math-intrinsics": {
       "version": "1.1.0",
       "resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
       "version": "3.1.2",
       "resolved": "https://registry.npmjs.org/minimatch/-/minimatch-3.1.2.tgz",
       "integrity": "sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw==",
       "license": "ISC",
       "dependencies": {
         "brace-expansion": "^1.1.7"
         "node": "*"
       }
     },
+    "node_modules/minipass": {
+      "version": "5.0.0",
+      "resolved": "https://registry.npmjs.org/minipass/-/minipass-5.0.0.tgz",
+      "integrity": "sha512-3FnjYuehv9k6ovOEbyOswadCDPX1piCfhV8ncmYtHOjuPwylVWsghTLo7rabjC3Rx5xD4HDx8Wm1xnMF7S5qFQ==",
+      "license": "ISC",
+      "engines": {
+        "node": ">=8"
+      }
+    },
+    "node_modules/minizlib": {
+      "version": "2.1.2",
+      "resolved": "https://registry.npmjs.org/minizlib/-/minizlib-2.1.2.tgz",
+      "integrity": "sha512-bAxsR8BVfj60DWXHE3u30oHzfl4G7khkSuPW+qvpd7jFRHm7dLxOjUk1EHACJ/hxLY8phGJ0YhYHZo7jil7Qdg==",
+      "license": "MIT",
+      "dependencies": {
+        "minipass": "^3.0.0",
+        "yallist": "^4.0.0"
+      },
+      "engines": {
+        "node": ">= 8"
+      }
+    },
+    "node_modules/minizlib/node_modules/minipass": {
+      "version": "3.3.6",
+      "resolved": "https://registry.npmjs.org/minipass/-/minipass-3.3.6.tgz",
+      "integrity": "sha512-DxiNidxSEK+tHG6zOIklvNOwm3hvCrbUrdtzY74U6HKTJxvIDfOUL5W5P2Ghd3DTkhhKPYGqeNUIh5qcM4YBfw==",
+      "license": "ISC",
+      "dependencies": {
+        "yallist": "^4.0.0"
+      },
+      "engines": {
+        "node": ">=8"
+      }
+    },
+    "node_modules/mkdirp": {
+      "version": "1.0.4",
+      "resolved": "https://registry.npmjs.org/mkdirp/-/mkdirp-1.0.4.tgz",
+      "integrity": "sha512-vVqVZQyf3WLx2Shd0qJ9xuvqgAyKPLAiqITEtqW0oIUjzo3PePDd6fW9iFz30ef7Ysp/oiWqbhszeGWW2T6Gzw==",
+      "license": "MIT",
+      "bin": {
+        "mkdirp": "bin/cmd.js"
+      },
+      "engines": {
+        "node": ">=10"
+      }
+    },
     "node_modules/ms": {
       "version": "2.1.3",
       "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
         "node": ">= 0.6"
       }
     },
+    "node_modules/node-addon-api": {
+      "version": "8.5.0",
+      "resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-8.5.0.tgz",
+      "integrity": "sha512-/bRZty2mXUIFY/xU5HLvveNHlswNJej+RnxBjOMkidWfwZzgTbPG1E3K5TOxRLOR+5hX7bSofy8yf1hZevMS8A==",
+      "license": "MIT",
+      "engines": {
+        "node": "^18 || ^20 || >= 21"
+      }
+    },
+    "node_modules/node-fetch": {
+      "version": "2.7.0",
+      "resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-2.7.0.tgz",
+      "integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==",
+      "license": "MIT",
+      "dependencies": {
+        "whatwg-url": "^5.0.0"
+      },
+      "engines": {
+        "node": "4.x || >=6.0.0"
+      },
+      "peerDependencies": {
+        "encoding": "^0.1.0"
+      },
+      "peerDependenciesMeta": {
+        "encoding": {
+          "optional": true
+        }
+      }
+    },
     "node_modules/nodemon": {
       "version": "3.1.10",
       "resolved": "https://registry.npmjs.org/nodemon/-/nodemon-3.1.10.tgz",
         "url": "https://opencollective.com/nodemon"
       }
     },
+    "node_modules/nopt": {
+      "version": "5.0.0",
+      "resolved": "https://registry.npmjs.org/nopt/-/nopt-5.0.0.tgz",
+      "integrity": "sha512-Tbj67rffqceeLpcRXrT7vKAN8CwfPeIBgM7E6iBkmKLV7bEMwpGgYLGv0jACUsECaa/vuxP0IjEont6umdMgtQ==",
+      "license": "ISC",
+      "dependencies": {
+        "abbrev": "1"
+      },
+      "bin": {
+        "nopt": "bin/nopt.js"
+      },
+      "engines": {
+        "node": ">=6"
+      }
+    },
     "node_modules/normalize-path": {
       "version": "3.0.0",
       "resolved": "https://registry.npmjs.org/normalize-path/-/normalize-path-3.0.0.tgz",
         "node": ">=0.10.0"
       }
     },
+    "node_modules/npmlog": {
+      "version": "5.0.1",
+      "resolved": "https://registry.npmjs.org/npmlog/-/npmlog-5.0.1.tgz",
+      "integrity": "sha512-AqZtDUWOMKs1G/8lwylVjrdYgqA4d9nu8hc+0gzRxlDb1I10+FHBGMXs6aiQHFdCUUlqH99MUMuLfzWDNDtfxw==",
+      "deprecated": "This package is no longer supported.",
+      "license": "ISC",
+      "dependencies": {
+        "are-we-there-yet": "^2.0.0",
+        "console-control-strings": "^1.1.0",
+        "gauge": "^3.0.0",
+        "set-blocking": "^2.0.0"
+      }
+    },
+    "node_modules/object-assign": {
+      "version": "4.1.1",
+      "resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
+      "integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==",
+      "license": "MIT",
+      "engines": {
+        "node": ">=0.10.0"
+      }
+    },
     "node_modules/object-inspect": {
       "version": "1.13.4",
       "resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.13.4.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/path-is-absolute": {
+      "version": "1.0.1",
+      "resolved": "https://registry.npmjs.org/path-is-absolute/-/path-is-absolute-1.0.1.tgz",
+      "integrity": "sha512-AVbw3UJ2e9bq64vSaS9Am0fje1Pa8pbGqTTsmXfaIiMpnr5DlDhfJOuLj9Sf95ZPVDAUerDfEk88MPmPe7UCQg==",
+      "license": "MIT",
+      "engines": {
+        "node": ">=0.10.0"
+      }
+    },
     "node_modules/path-to-regexp": {
       "version": "8.3.0",
       "resolved": "https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-8.3.0.tgz",
         "url": "https://opencollective.com/express"
       }
     },
+    "node_modules/readable-stream": {
+      "version": "3.6.2",
+      "resolved": "https://registry.npmjs.org/readable-stream/-/readable-stream-3.6.2.tgz",
+      "integrity": "sha512-9u/sniCrY3D5WdsERHzHE4G2YCXqoG5FTHUiCC4SIbr6XcLZBY05ya9EKjYek9O5xOAwjGq+1JdGBAS7Q9ScoA==",
+      "license": "MIT",
+      "dependencies": {
+        "inherits": "^2.0.3",
+        "string_decoder": "^1.1.1",
+        "util-deprecate": "^1.0.1"
+      },
+      "engines": {
+        "node": ">= 6"
+      }
+    },
     "node_modules/readdirp": {
       "version": "3.6.0",
       "resolved": "https://registry.npmjs.org/readdirp/-/readdirp-3.6.0.tgz",
         "node": ">=0.10.0"
       }
     },
+    "node_modules/rimraf": {
+      "version": "3.0.2",
+      "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz",
+      "integrity": "sha512-JZkJMZkAGFFPP2YqXZXPbMlMBgsxzE8ILs4lMIX/2o0L9UBw9O/Y3o6wFw/i9YLapcUJWwqbi3kdxIPdC62TIA==",
+      "deprecated": "Rimraf versions prior to v4 are no longer supported",
+      "license": "ISC",
+      "dependencies": {
+        "glob": "^7.1.3"
+      },
+      "bin": {
+        "rimraf": "bin.js"
+      },
+      "funding": {
+        "url": "https://github.com/sponsors/isaacs"
+      }
+    },
     "node_modules/router": {
       "version": "2.2.0",
       "resolved": "https://registry.npmjs.org/router/-/router-2.2.0.tgz",
       "version": "7.7.2",
       "resolved": "https://registry.npmjs.org/semver/-/semver-7.7.2.tgz",
       "integrity": "sha512-RF0Fw+rO5AMf9MAyaRXI4AV0Ulj5lMHqVxxdSgiVbixSCXoEmmX/jk0CuJw4+3SqroYO9VoUh+HcuJivvtJemA==",
       "license": "ISC",
       "bin": {
         "semver": "bin/semver.js"
         "node": ">= 18"
       }
     },
+    "node_modules/set-blocking": {
+      "version": "2.0.0",
+      "resolved": "https://registry.npmjs.org/set-blocking/-/set-blocking-2.0.0.tgz",
+      "integrity": "sha512-KiKBS8AnWGEyLzofFfmvKwpdPzqiy16LvQfK3yv/fVH7Bj13/wl3JSR1J+rfgRE9q7xUJK4qvgS8raSOeLUehw==",
+      "license": "ISC"
+    },
     "node_modules/setprototypeof": {
       "version": "1.2.0",
       "resolved": "https://registry.npmjs.org/setprototypeof/-/setprototypeof-1.2.0.tgz",
         "url": "https://github.com/sponsors/ljharb"
       }
     },
+    "node_modules/signal-exit": {
+      "version": "3.0.7",
+      "resolved": "https://registry.npmjs.org/signal-exit/-/signal-exit-3.0.7.tgz",
+      "integrity": "sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==",
+      "license": "ISC"
+    },
     "node_modules/simple-update-notifier": {
       "version": "2.0.0",
       "resolved": "https://registry.npmjs.org/simple-update-notifier/-/simple-update-notifier-2.0.0.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/string_decoder": {
+      "version": "1.3.0",
+      "resolved": "https://registry.npmjs.org/string_decoder/-/string_decoder-1.3.0.tgz",
+      "integrity": "sha512-hkRX8U1WjJFd8LsDJ2yQ/wWWxaopEsABU1XfkM8A+j0+85JAGppt16cr1Whg6KIbb4okU6Mql6BOj+uup/wKeA==",
+      "license": "MIT",
+      "dependencies": {
+        "safe-buffer": "~5.2.0"
+      }
+    },
     "node_modules/string-width": {
       "version": "4.2.3",
       "resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz",
         "node": ">=4"
       }
     },
+    "node_modules/tar": {
+      "version": "6.2.1",
+      "resolved": "https://registry.npmjs.org/tar/-/tar-6.2.1.tgz",
+      "integrity": "sha512-DZ4yORTwrbTj/7MZYq2w+/ZFdI6OZ/f9SFHR+71gIVUZhOQPHzVCLpvRnPgyaMpfWxxk/4ONva3GQSyNIKRv6A==",
+      "license": "ISC",
+      "dependencies": {
+        "chownr": "^2.0.0",
+        "fs-minipass": "^2.0.0",
+        "minipass": "^5.0.0",
+        "minizlib": "^2.1.1",
+        "mkdirp": "^1.0.3",
+        "yallist": "^4.0.0"
+      },
+      "engines": {
+        "node": ">=10"
+      }
+    },
     "node_modules/to-regex-range": {
       "version": "5.0.1",
       "resolved": "https://registry.npmjs.org/to-regex-range/-/to-regex-range-5.0.1.tgz",
         "nodetouch": "bin/nodetouch.js"
       }
     },
+    "node_modules/tr46": {
+      "version": "0.0.3",
+      "resolved": "https://registry.npmjs.org/tr46/-/tr46-0.0.3.tgz",
+      "integrity": "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==",
+      "license": "MIT"
+    },
     "node_modules/type-is": {
       "version": "2.0.1",
       "resolved": "https://registry.npmjs.org/type-is/-/type-is-2.0.1.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/util-deprecate": {
+      "version": "1.0.2",
+      "resolved": "https://registry.npmjs.org/util-deprecate/-/util-deprecate-1.0.2.tgz",
+      "integrity": "sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw==",
+      "license": "MIT"
+    },
     "node_modules/vary": {
       "version": "1.1.2",
       "resolved": "https://registry.npmjs.org/vary/-/vary-1.1.2.tgz",
         "node": ">= 0.8"
       }
     },
+    "node_modules/webidl-conversions": {
+      "version": "3.0.1",
+      "resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-3.0.1.tgz",
+      "integrity": "sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==",
+      "license": "BSD-2-Clause"
+    },
+    "node_modules/whatwg-url": {
+      "version": "5.0.0",
+      "resolved": "https://registry.npmjs.org/whatwg-url/-/whatwg-url-5.0.0.tgz",
+      "integrity": "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==",
+      "license": "MIT",
+      "dependencies": {
+        "tr46": "~0.0.3",
+        "webidl-conversions": "^3.0.0"
+      }
+    },
+    "node_modules/wide-align": {
+      "version": "1.1.5",
+      "resolved": "https://registry.npmjs.org/wide-align/-/wide-align-1.1.5.tgz",
+      "integrity": "sha512-eDMORYaPNZ4sQIuuYPDHdQvf4gyCF9rEEV/yPxGfwPkRodwEgiMUUXTx/dex+Me0wxx53S+NgUHaP7y3MGlDmg==",
+      "license": "ISC",
+      "dependencies": {
+        "string-width": "^1.0.2 || 2 || 3 || 4"
+      }
+    },
     "node_modules/wrap-ansi": {
       "version": "7.0.0",
       "resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-7.0.0.tgz",
         "node": ">=10"
       }
     },
+    "node_modules/yallist": {
+      "version": "4.0.0",
+      "resolved": "https://registry.npmjs.org/yallist/-/yallist-4.0.0.tgz",
+      "integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==",
+      "license": "ISC"
+    },
     "node_modules/yargs": {
       "version": "17.7.2",
       "resolved": "https://registry.npmjs.org/yargs/-/yargs-17.7.2.tgz",

services/webrtc_gateway/package.json CHANGED Viewed

@@ -12,10 +12,11 @@
   "license": "ISC",
   "description": "Servidor WebRTC unificado com Simple Peer conectando ao Ultravox/TTS",
   "dependencies": {
-    "express": "^5.1.0",
-    "ws": "^8.18.3",
     "@grpc/grpc-js": "^1.9.11",
-    "@grpc/proto-loader": "^0.7.10"
   },
   "devDependencies": {
     "nodemon": "^3.0.1"

   "license": "ISC",
   "description": "Servidor WebRTC unificado com Simple Peer conectando ao Ultravox/TTS",
   "dependencies": {
+    "@discordjs/opus": "^0.10.0",
     "@grpc/grpc-js": "^1.9.11",
+    "@grpc/proto-loader": "^0.7.10",
+    "express": "^5.1.0",
+    "ws": "^8.18.3"
   },
   "devDependencies": {
     "nodemon": "^3.0.1"

services/webrtc_gateway/response_1757390722112.pcm ADDED Viewed

	@@ -0,0 +1 @@


1	+ {"type":"init","clientId":"yi5gt94jz1c5n6ky7t844","conversationId":"conv_1757390722110_31ca908303358733"}

services/webrtc_gateway/response_1757391966860.pcm ADDED Viewed

	@@ -0,0 +1 @@


1	+ {"type":"init","clientId":"knc8cmsgwqddnn3diqw3do","conversationId":"conv_1757391966858_5d93e75e246743a2"}

services/webrtc_gateway/start.sh CHANGED Viewed

@@ -60,8 +60,6 @@ source venv/bin/activate
 # Configurar variáveis de ambiente
 export PYTHONPATH=/workspace/ultravox-pipeline:/workspace/ultravox-pipeline/protos/generated
 export WEBRTC_PORT=$PORT
-export ORCHESTRATOR_HOST=localhost
-export ORCHESTRATOR_PORT=50053
 echo -e "${YELLOW}Porta: $PORT${NC}"
 echo -e "${YELLOW}Log: $LOG_FILE${NC}"

 # Configurar variáveis de ambiente
 export PYTHONPATH=/workspace/ultravox-pipeline:/workspace/ultravox-pipeline/protos/generated
 export WEBRTC_PORT=$PORT
 echo -e "${YELLOW}Porta: $PORT${NC}"
 echo -e "${YELLOW}Log: $LOG_FILE${NC}"

services/webrtc_gateway/test-audio-cli.js ADDED Viewed

	@@ -0,0 +1,178 @@

+#!/usr/bin/env node
+/**
+ * Teste CLI para simular envio de áudio PCM ao servidor
+ * Similar ao que o navegador faz, mas via linha de comando
+ */
+const WebSocket = require('ws');
+const fs = require('fs');
+const path = require('path');
+const WS_URL = 'ws://localhost:8082/ws';
+class AudioTester {
+    constructor() {
+        this.ws = null;
+        this.conversationId = null;
+        this.clientId = null;
+    }
+    connect() {
+        return new Promise((resolve, reject) => {
+            console.log('🔌 Conectando ao WebSocket...');
+            this.ws = new WebSocket(WS_URL);
+            this.ws.on('open', () => {
+                console.log('✅ Conectado ao servidor');
+                resolve();
+            });
+            this.ws.on('error', (error) => {
+                console.error('❌ Erro:', error.message);
+                reject(error);
+            });
+            this.ws.on('message', (data) => {
+                // Verificar se é binário (áudio) ou JSON (mensagem)
+                if (data instanceof Buffer) {
+                    console.log(`🔊 Áudio recebido: ${(data.length / 1024).toFixed(1)}KB`);
+                    // Salvar áudio para análise
+                    const filename = `response_${Date.now()}.pcm`;
+                    fs.writeFileSync(filename, data);
+                    console.log(`   Salvo como: ${filename}`);
+                } else {
+                    try {
+                        const msg = JSON.parse(data);
+                        console.log('📨 Mensagem recebida:', msg);
+                        if (msg.type === 'init') {
+                            this.clientId = msg.clientId;
+                            this.conversationId = msg.conversationId;
+                            console.log(`🔑 Client ID: ${this.clientId}`);
+                            console.log(`🔑 Conversation ID: ${this.conversationId}`);
+                        } else if (msg.type === 'metrics') {
+                            console.log(`📊 Resposta: "${msg.response}" (${msg.latency}ms)`);
+                        }
+                    } catch (e) {
+                        console.log('📨 Dados recebidos:', data.toString());
+                    }
+                }
+            });
+        });
+    }
+    /**
+     * Gera áudio PCM sintético com tom de 440Hz (nota Lá)
+     * @param {number} durationMs - Duração em milissegundos
+     * @returns {Buffer} - Buffer PCM 16-bit @ 16kHz
+     */
+    generateTestAudio(durationMs = 2000) {
+        const sampleRate = 16000;
+        const frequency = 440; // Hz (nota Lá)
+        const samples = Math.floor(sampleRate * durationMs / 1000);
+        const buffer = Buffer.alloc(samples * 2); // 16-bit = 2 bytes por sample
+        for (let i = 0; i < samples; i++) {
+            // Gerar onda senoidal
+            const t = i / sampleRate;
+            const value = Math.sin(2 * Math.PI * frequency * t);
+            // Converter para int16
+            const int16Value = Math.floor(value * 32767);
+            // Escrever no buffer (little-endian)
+            buffer.writeInt16LE(int16Value, i * 2);
+        }
+        return buffer;
+    }
+    /**
+     * Gera áudio de fala real usando espeak (se disponível)
+     */
+    async generateSpeechAudio(text = "Olá, este é um teste de áudio") {
+        const { execSync } = require('child_process');
+        const tempFile = `/tmp/test_audio_${Date.now()}.raw`;
+        try {
+            // Usar espeak para gerar áudio
+            console.log(`🎤 Gerando áudio de fala: "${text}"`);
+            execSync(`espeak -s 150 -v pt-br "${text}" --stdout | sox - -r 16000 -b 16 -e signed-integer ${tempFile}`);
+            const audioBuffer = fs.readFileSync(tempFile);
+            fs.unlinkSync(tempFile); // Limpar arquivo temporário
+            return audioBuffer;
+        } catch (error) {
+            console.warn('⚠️ espeak/sox não disponível, usando áudio sintético');
+            return this.generateTestAudio(2000);
+        }
+    }
+    async sendAudio(audioBuffer) {
+        console.log(`\n📤 Enviando áudio PCM: ${(audioBuffer.length / 1024).toFixed(1)}KB`);
+        // Enviar como dados binários diretos (como o navegador faz)
+        this.ws.send(audioBuffer);
+        console.log('✅ Áudio enviado');
+    }
+    async testConversation() {
+        console.log('\n=== Iniciando teste de conversação ===\n');
+        // Teste 1: Enviar tom sintético
+        console.log('1️⃣ Teste com tom sintético (440Hz por 2s)');
+        const syntheticAudio = this.generateTestAudio(2000);
+        await this.sendAudio(syntheticAudio);
+        await this.wait(5000); // Aguardar resposta
+        // Teste 2: Enviar áudio de fala (se possível)
+        console.log('\n2️⃣ Teste com fala sintetizada');
+        const speechAudio = await this.generateSpeechAudio("Qual é o seu nome?");
+        await this.sendAudio(speechAudio);
+        await this.wait(5000); // Aguardar resposta
+        // Teste 3: Enviar silêncio
+        console.log('\n3️⃣ Teste com silêncio');
+        const silentAudio = Buffer.alloc(32000); // 1 segundo de silêncio
+        await this.sendAudio(silentAudio);
+        await this.wait(5000); // Aguardar resposta
+    }
+    wait(ms) {
+        return new Promise(resolve => setTimeout(resolve, ms));
+    }
+    disconnect() {
+        if (this.ws) {
+            console.log('\n👋 Desconectando...');
+            this.ws.close();
+        }
+    }
+}
+async function main() {
+    const tester = new AudioTester();
+    try {
+        await tester.connect();
+        await tester.wait(500);
+        await tester.testConversation();
+        await tester.wait(2000); // Aguardar últimas respostas
+    } catch (error) {
+        console.error('Erro fatal:', error);
+    } finally {
+        tester.disconnect();
+    }
+}
+console.log('╔═══════════════════════════════════════╗');
+console.log('║   Teste CLI de Áudio PCM              ║');
+console.log('╚═══════════════════════════════════════╝\n');
+console.log('Este teste simula o envio de áudio PCM');
+console.log('como o navegador faz, mas via CLI.\n');
+main().catch(console.error);

services/webrtc_gateway/test-memory.js ADDED Viewed

	@@ -0,0 +1,108 @@

+#!/usr/bin/env node
+/**
+ * Teste do sistema de memória de conversações
+ */
+const WebSocket = require('ws');
+const WS_URL = 'ws://localhost:8082/ws';
+class MemoryTester {
+    constructor() {
+        this.ws = null;
+        this.conversationId = null;
+    }
+    connect() {
+        return new Promise((resolve, reject) => {
+            console.log('🔌 Conectando ao WebSocket...');
+            this.ws = new WebSocket(WS_URL);
+            this.ws.on('open', () => {
+                console.log('✅ Conectado');
+                resolve();
+            });
+            this.ws.on('error', (error) => {
+                console.error('❌ Erro:', error.message);
+                reject(error);
+            });
+            this.ws.on('message', (data) => {
+                const msg = JSON.parse(data);
+                console.log('📨 Mensagem recebida:', msg);
+                if (msg.type === 'init' && msg.conversationId) {
+                    this.conversationId = msg.conversationId;
+                    console.log(`🔑 Conversation ID: ${this.conversationId}`);
+                }
+            });
+        });
+    }
+    async testMemoryOperations() {
+        console.log('\n=== Testando Operações de Memória ===\n');
+        // 1. Obter conversação atual
+        console.log('1. Obtendo conversação atual...');
+        this.ws.send(JSON.stringify({ type: 'get-conversation' }));
+        await this.wait(1000);
+        // 2. Listar conversações
+        console.log('\n2. Listando conversações...');
+        this.ws.send(JSON.stringify({ type: 'list-conversations' }));
+        await this.wait(1000);
+        // 3. Obter estatísticas
+        console.log('\n3. Obtendo estatísticas de memória...');
+        this.ws.send(JSON.stringify({ type: 'get-stats' }));
+        await this.wait(1000);
+        // 4. Simular mensagem de áudio
+        console.log('\n4. Simulando processamento de áudio...');
+        const audioData = Buffer.alloc(1000); // Buffer vazio para teste
+        this.ws.send(JSON.stringify({
+            type: 'audio',
+            data: audioData.toString('base64')
+        }));
+        await this.wait(2000);
+        // 5. Verificar se mensagens foram armazenadas
+        console.log('\n5. Verificando mensagens armazenadas...');
+        this.ws.send(JSON.stringify({ type: 'get-conversation' }));
+        await this.wait(1000);
+    }
+    wait(ms) {
+        return new Promise(resolve => setTimeout(resolve, ms));
+    }
+    disconnect() {
+        if (this.ws) {
+            console.log('\n👋 Desconectando...');
+            this.ws.close();
+        }
+    }
+}
+async function main() {
+    const tester = new MemoryTester();
+    try {
+        await tester.connect();
+        await tester.wait(500);
+        await tester.testMemoryOperations();
+    } catch (error) {
+        console.error('Erro fatal:', error);
+    } finally {
+        tester.disconnect();
+    }
+}
+console.log('╔═══════════════════════════════════════╗');
+console.log('║   Teste do Sistema de Memória        ║');
+console.log('╚═══════════════════════════════════════╝\n');
+main().catch(console.error);

services/webrtc_gateway/test-portuguese-audio.js ADDED Viewed

	@@ -0,0 +1,410 @@

+#!/usr/bin/env node
+/**
+ * Teste com áudio real em português usando gTTS
+ * Gera perguntas faladas e verifica coerência das respostas
+ */
+const WebSocket = require('ws');
+const fs = require('fs');
+const { exec, execSync } = require('child_process');
+const path = require('path');
+const util = require('util');
+const execPromise = util.promisify(exec);
+const WS_URL = 'ws://localhost:8082/ws';
+// Cores para output
+const colors = {
+    reset: '\x1b[0m',
+    bright: '\x1b[1m',
+    green: '\x1b[32m',
+    red: '\x1b[31m',
+    yellow: '\x1b[33m',
+    blue: '\x1b[34m',
+    cyan: '\x1b[36m',
+    magenta: '\x1b[35m'
+};
+class PortugueseAudioTester {
+    constructor() {
+        this.ws = null;
+        this.testResults = [];
+        this.currentTest = null;
+        this.responseBuffer = '';
+    }
+    async connect() {
+        return new Promise((resolve, reject) => {
+            console.log(`${colors.cyan}🔌 Conectando ao WebSocket...${colors.reset}`);
+            this.ws = new WebSocket(WS_URL);
+            this.ws.on('open', () => {
+                console.log(`${colors.green}✅ Conectado ao servidor${colors.reset}`);
+                resolve();
+            });
+            this.ws.on('error', (error) => {
+                console.error(`${colors.red}❌ Erro:${colors.reset}`, error.message);
+                reject(error);
+            });
+            this.ws.on('message', (data) => {
+                this.handleMessage(data);
+            });
+        });
+    }
+    handleMessage(data) {
+        // Verificar se é binário (áudio) ou JSON (mensagem)
+        if (Buffer.isBuffer(data)) {
+            console.log(`${colors.green}🔊 Áudio de resposta recebido: ${data.length} bytes${colors.reset}`);
+            if (this.currentTest) {
+                this.currentTest.audioReceived = true;
+                this.currentTest.audioSize = data.length;
+            }
+            return;
+        }
+        try {
+            const msg = JSON.parse(data);
+            switch (msg.type) {
+                case 'init':
+                case 'welcome':
+                    console.log(`${colors.blue}🔑 Sessão iniciada: ${msg.clientId}${colors.reset}`);
+                    break;
+                case 'metrics':
+                    console.log(`${colors.yellow}📝 Resposta do sistema: "${msg.response}"${colors.reset}`);
+                    if (this.currentTest) {
+                        this.currentTest.response = msg.response;
+                        this.currentTest.latency = msg.latency;
+                        this.responseBuffer = msg.response;
+                    }
+                    break;
+                case 'response':
+                case 'transcription':
+                    // Adicionar suporte para outros formatos de resposta
+                    const text = msg.text || msg.response || msg.message;
+                    if (text) {
+                        console.log(`${colors.yellow}📝 Resposta: "${text}"${colors.reset}`);
+                        if (this.currentTest) {
+                            this.currentTest.response = text;
+                            this.currentTest.latency = msg.latency || 0;
+                        }
+                    }
+                    break;
+                case 'error':
+                    console.error(`${colors.red}❌ Erro: ${msg.message}${colors.reset}`);
+                    break;
+            }
+        } catch (error) {
+            // Dados de texto simples
+            const text = data.toString();
+            if (text.length > 0 && text.length < 200) {
+                console.log(`${colors.cyan}📨 Mensagem: ${text}${colors.reset}`);
+            }
+        }
+    }
+    /**
+     * Gera áudio MP3 usando gTTS e converte para PCM
+     * @param {string} text - Texto em português para converter
+     * @param {string} outputFile - Nome do arquivo de saída
+     */
+    async generatePortugueseAudio(text, outputFile) {
+        console.log(`${colors.magenta}🎤 Gerando áudio: "${text}"${colors.reset}`);
+        const mp3File = outputFile.replace('.pcm', '.mp3');
+        try {
+            // Gerar MP3 com gTTS em português brasileiro
+            const gttsCommand = `gtts-cli "${text}" -l pt-br -o ${mp3File}`;
+            await execPromise(gttsCommand);
+            console.log(`   ✅ MP3 gerado: ${mp3File}`);
+            // Converter MP3 para PCM 16-bit @ 16kHz
+            const ffmpegCommand = `ffmpeg -i ${mp3File} -f s16le -acodec pcm_s16le -ar 16000 -ac 1 ${outputFile} -y`;
+            await execPromise(ffmpegCommand);
+            console.log(`   ✅ PCM gerado: ${outputFile}`);
+            // Limpar arquivo MP3 temporário
+            fs.unlinkSync(mp3File);
+            // Ler arquivo PCM
+            const pcmBuffer = fs.readFileSync(outputFile);
+            console.log(`   📊 Tamanho PCM: ${pcmBuffer.length} bytes`);
+            return pcmBuffer;
+        } catch (error) {
+            console.error(`${colors.red}❌ Erro gerando áudio: ${error.message}${colors.reset}`);
+            throw error;
+        }
+    }
+    async sendPortugueseQuestion(question, expectedContext) {
+        console.log(`\n${colors.bright}=== Teste: ${question} ===${colors.reset}`);
+        this.currentTest = {
+            question: question,
+            expectedContext: expectedContext,
+            startTime: Date.now(),
+            response: null,
+            audioReceived: false
+        };
+        try {
+            // Gerar áudio da pergunta
+            const audioFile = `/tmp/question_${Date.now()}.pcm`;
+            const pcmAudio = await this.generatePortugueseAudio(question, audioFile);
+            // Enviar áudio PCM diretamente
+            console.log(`${colors.cyan}📤 Enviando áudio PCM: ${pcmAudio.length} bytes${colors.reset}`);
+            this.ws.send(pcmAudio);
+            // Aguardar resposta
+            await this.waitForResponse(8000);
+            // Limpar arquivo temporário
+            if (fs.existsSync(audioFile)) {
+                fs.unlinkSync(audioFile);
+            }
+            // Avaliar resultado
+            this.evaluateTest();
+        } catch (error) {
+            console.error(`${colors.red}❌ Erro no teste: ${error.message}${colors.reset}`);
+            this.currentTest.error = error.message;
+        }
+    }
+    waitForResponse(timeoutMs) {
+        return new Promise((resolve) => {
+            const startTime = Date.now();
+            const checkInterval = setInterval(() => {
+                const elapsed = Date.now() - startTime;
+                // Verificar se recebemos resposta
+                if (this.currentTest.response || this.currentTest.audioReceived) {
+                    clearInterval(checkInterval);
+                    resolve();
+                } else if (elapsed > timeoutMs) {
+                    clearInterval(checkInterval);
+                    console.log(`${colors.yellow}⏱️ Timeout aguardando resposta${colors.reset}`);
+                    resolve();
+                }
+            }, 100);
+        });
+    }
+    evaluateTest() {
+        const test = this.currentTest;
+        const responseTime = Date.now() - test.startTime;
+        console.log(`\n${colors.bright}📊 Resultado do Teste:${colors.reset}`);
+        console.log(`  Pergunta: "${test.question}"`);
+        console.log(`  Tempo de resposta: ${responseTime}ms`);
+        console.log(`  Resposta recebida: ${test.response ? '✅' : '❌'}`);
+        console.log(`  Áudio recebido: ${test.audioReceived ? '✅' : '❌'}`);
+        if (test.response) {
+            console.log(`  Resposta: "${test.response}"`);
+            // Verificar coerência
+            const response = test.response.toLowerCase();
+            let isCoherent = false;
+            let coherenceReason = '';
+            // Verificar se a resposta contém palavras-chave esperadas
+            test.expectedContext.forEach(keyword => {
+                if (response.includes(keyword.toLowerCase())) {
+                    isCoherent = true;
+                    coherenceReason = `contém "${keyword}"`;
+                }
+            });
+            // Verificar se é uma resposta genérica válida
+            const validGenericResponses = [
+                'olá', 'oi', 'bom dia', 'boa tarde', 'boa noite',
+                'ajudar', 'assistente', 'posso', 'como',
+                'brasil', 'brasileiro', 'portuguesa',
+                'você', 'seu', 'sua', 'nome', 'chamar'
+            ];
+            if (!isCoherent) {
+                validGenericResponses.forEach(word => {
+                    if (response.includes(word)) {
+                        isCoherent = true;
+                        coherenceReason = `resposta válida com "${word}"`;
+                    }
+                });
+            }
+            // Verificar se é uma resposta muito curta ou sem sentido
+            if (response.length < 5 || response.match(/^[0-9\s]+$/)) {
+                isCoherent = false;
+                coherenceReason = 'resposta muito curta ou inválida';
+            }
+            if (isCoherent) {
+                console.log(`  ${colors.green}✅ Resposta COERENTE (${coherenceReason})${colors.reset}`);
+            } else {
+                console.log(`  ${colors.red}❌ Resposta INCOERENTE (${coherenceReason})${colors.reset}`);
+            }
+            test.isCoherent = isCoherent;
+        } else {
+            test.isCoherent = false;
+        }
+        test.responseTime = responseTime;
+        test.passed = test.response && test.isCoherent;
+        this.testResults.push(test);
+    }
+    async runAllTests() {
+        console.log(`\n${colors.bright}${colors.cyan}🚀 Iniciando testes com áudio em português${colors.reset}\n`);
+        // Teste 1: Saudação
+        await this.sendPortugueseQuestion(
+            "Olá, bom dia",
+            ['olá', 'oi', 'bom dia', 'prazer', 'ajudar']
+        );
+        await this.wait(2000);
+        // Teste 2: Pergunta sobre nome
+        await this.sendPortugueseQuestion(
+            "Qual é o seu nome?",
+            ['nome', 'chamo', 'sou', 'assistente', 'ultravox']
+        );
+        await this.wait(2000);
+        // Teste 3: Pergunta sobre Brasil
+        await this.sendPortugueseQuestion(
+            "Qual é a capital do Brasil?",
+            ['brasília', 'capital', 'brasil', 'distrito federal']
+        );
+        await this.wait(2000);
+        // Teste 4: Pergunta sobre ajuda
+        await this.sendPortugueseQuestion(
+            "Você pode me ajudar?",
+            ['sim', 'posso', 'ajudar', 'claro', 'certamente', 'como']
+        );
+        await this.wait(2000);
+        // Teste 5: Pergunta sobre o dia
+        await this.sendPortugueseQuestion(
+            "Como está o dia hoje?",
+            ['dia', 'hoje', 'tempo', 'clima', 'está']
+        );
+        // Mostrar resumo
+        this.showSummary();
+    }
+    showSummary() {
+        console.log(`\n${colors.bright}${colors.cyan}📈 RESUMO DOS TESTES${colors.reset}`);
+        console.log('═'.repeat(70));
+        let passed = 0;
+        let failed = 0;
+        this.testResults.forEach((test, index) => {
+            const status = test.passed ?
+                `${colors.green}✅ PASSOU${colors.reset}` :
+                `${colors.red}❌ FALHOU${colors.reset}`;
+            console.log(`\n${index + 1}. "${test.question}": ${status}`);
+            console.log(`   Tempo: ${test.responseTime}ms`);
+            console.log(`   Coerente: ${test.isCoherent ? 'Sim' : 'Não'}`);
+            if (test.response) {
+                const preview = test.response.substring(0, 100);
+                console.log(`   Resposta: "${preview}${test.response.length > 100 ? '...' : ''}"`);
+            }
+            if (test.passed) passed++;
+            else failed++;
+        });
+        console.log('\n' + '═'.repeat(70));
+        console.log(`${colors.bright}Total: ${passed} passou, ${failed} falhou${colors.reset}`);
+        const successRate = (passed / this.testResults.length * 100).toFixed(1);
+        const rateColor = successRate >= 80 ? colors.green :
+                         successRate >= 50 ? colors.yellow :
+                         colors.red;
+        console.log(`${rateColor}Taxa de sucesso: ${successRate}%${colors.reset}\n`);
+    }
+    wait(ms) {
+        return new Promise(resolve => setTimeout(resolve, ms));
+    }
+    disconnect() {
+        if (this.ws) {
+            console.log(`${colors.cyan}👋 Desconectando...${colors.reset}`);
+            this.ws.close();
+        }
+    }
+}
+// Verificar dependências
+function checkDependencies() {
+    try {
+        // Verificar gTTS
+        execSync('which gtts-cli', { stdio: 'ignore' });
+        console.log(`${colors.green}✅ gTTS instalado${colors.reset}`);
+    } catch {
+        console.error(`${colors.red}❌ gTTS não instalado!${colors.reset}`);
+        console.log(`${colors.yellow}Instale com: pip install gtts${colors.reset}`);
+        process.exit(1);
+    }
+    try {
+        // Verificar ffmpeg
+        execSync('which ffmpeg', { stdio: 'ignore' });
+        console.log(`${colors.green}✅ ffmpeg instalado${colors.reset}`);
+    } catch {
+        console.error(`${colors.red}❌ ffmpeg não instalado!${colors.reset}`);
+        console.log(`${colors.yellow}Instale com: sudo apt install ffmpeg${colors.reset}`);
+        process.exit(1);
+    }
+}
+// Executar testes
+async function main() {
+    console.log(`${colors.bright}${colors.blue}╔═══════════════════════════════════════════════╗${colors.reset}`);
+    console.log(`${colors.bright}${colors.blue}║   Teste Ultravox - Áudio Português (gTTS)    ║${colors.reset}`);
+    console.log(`${colors.bright}${colors.blue}╚═══════════════════════════════════════════════╝${colors.reset}\n`);
+    // Verificar dependências
+    checkDependencies();
+    console.log('');
+    const tester = new PortugueseAudioTester();
+    try {
+        await tester.connect();
+        await tester.wait(500);
+        await tester.runAllTests();
+        await tester.wait(2000); // Aguardar últimas respostas
+    } catch (error) {
+        console.error(`${colors.red}Erro fatal:${colors.reset}`, error);
+    } finally {
+        tester.disconnect();
+        process.exit(0);
+    }
+}
+// Iniciar
+main().catch(console.error);

services/webrtc_gateway/test-websocket-speech.js ADDED Viewed

	@@ -0,0 +1,184 @@

+#!/usr/bin/env node
+/**
+ * Teste automatizado de Speech-to-Speech via WebSocket
+ * Simula exatamente o que a página web deveria fazer
+ */
+const WebSocket = require('ws');
+const fs = require('fs');
+const path = require('path');
+const { spawn } = require('child_process');
+// Configuração
+const WS_URL = 'ws://localhost:8082/ws';
+const TEST_AUDIO_TEXT = "Quanto é dois mais dois?";
+// Função para gerar áudio de teste usando gtts-cli
+async function generateTestAudio(text) {
+    return new Promise((resolve, reject) => {
+        const tempFile = `/tmp/test_audio_${Date.now()}.mp3`;
+        const wavFile = `/tmp/test_audio_${Date.now()}.wav`;
+        console.log(`🎤 Gerando áudio de teste: "${text}"`);
+        // Gerar MP3 com gTTS
+        const gtts = spawn('gtts-cli', [text, '--lang', 'pt-br', '--output', tempFile]);
+        gtts.on('close', (code) => {
+            if (code !== 0) {
+                reject(new Error(`gTTS falhou com código ${code}`));
+                return;
+            }
+            // Converter MP3 para WAV PCM 16-bit @ 16kHz
+            const ffmpeg = spawn('ffmpeg', [
+                '-i', tempFile,
+                '-ar', '16000',     // 16kHz
+                '-ac', '1',         // Mono
+                '-c:a', 'pcm_s16le', // PCM 16-bit
+                wavFile,
+                '-y'
+            ]);
+            ffmpeg.on('close', (code) => {
+                if (code !== 0) {
+                    reject(new Error(`ffmpeg falhou com código ${code}`));
+                    return;
+                }
+                // Ler o arquivo WAV
+                const audioBuffer = fs.readFileSync(wavFile);
+                // Remover header WAV (44 bytes)
+                const pcmData = audioBuffer.slice(44);
+                // Converter PCM int16 para Float32
+                const pcmInt16 = new Int16Array(pcmData.buffer, pcmData.byteOffset, pcmData.length / 2);
+                const pcmFloat32 = new Float32Array(pcmInt16.length);
+                for (let i = 0; i < pcmInt16.length; i++) {
+                    pcmFloat32[i] = pcmInt16[i] / 32768.0; // Normalizar para -1.0 a 1.0
+                }
+                // Limpar arquivos temporários
+                fs.unlinkSync(tempFile);
+                fs.unlinkSync(wavFile);
+                console.log(`✅ Áudio gerado: ${pcmFloat32.length} amostras Float32`);
+                resolve(Buffer.from(pcmFloat32.buffer));
+            });
+        });
+    });
+}
+// Função principal do teste
+async function testSpeechToSpeech() {
+    console.log('='.repeat(60));
+    console.log('🚀 TESTE AUTOMATIZADO SPEECH-TO-SPEECH VIA WEBSOCKET');
+    console.log('='.repeat(60));
+    try {
+        // Gerar áudio de teste
+        const audioBuffer = await generateTestAudio(TEST_AUDIO_TEXT);
+        // Conectar ao WebSocket
+        console.log(`\n📡 Conectando ao servidor: ${WS_URL}`);
+        const ws = new WebSocket(WS_URL);
+        return new Promise((resolve, reject) => {
+            let responseReceived = false;
+            let audioChunks = [];
+            ws.on('open', () => {
+                console.log('✅ Conectado ao servidor WebSocket');
+                // Enviar mensagem de tipo 'audio'
+                const message = {
+                    type: 'audio',
+                    data: audioBuffer.toString('base64'),
+                    format: 'float32',
+                    sampleRate: 16000,
+                    sessionId: `test_${Date.now()}`
+                };
+                console.log(`📤 Enviando áudio: ${audioBuffer.length} bytes`);
+                ws.send(JSON.stringify(message));
+            });
+            ws.on('message', (data) => {
+                try {
+                    const message = JSON.parse(data);
+                    if (message.type === 'transcription') {
+                        console.log(`📝 Transcrição recebida: "${message.text}"`);
+                        responseReceived = true;
+                    } else if (message.type === 'audio') {
+                        // Áudio de resposta do TTS
+                        const audioData = Buffer.from(message.data, 'base64');
+                        audioChunks.push(audioData);
+                        console.log(`🔊 Chunk de áudio recebido: ${audioData.length} bytes`);
+                        if (message.isFinal) {
+                            console.log('✅ Áudio completo recebido');
+                            // Salvar áudio para verificação (opcional)
+                            const outputFile = '/tmp/response_audio.pcm';
+                            const fullAudio = Buffer.concat(audioChunks);
+                            fs.writeFileSync(outputFile, fullAudio);
+                            console.log(`💾 Áudio salvo em: ${outputFile}`);
+                            ws.close();
+                            resolve();
+                        }
+                    } else if (message.type === 'error') {
+                        console.error(`❌ Erro do servidor: ${message.message}`);
+                        ws.close();
+                        reject(new Error(message.message));
+                    }
+                } catch (error) {
+                    console.error('❌ Erro ao processar mensagem:', error);
+                }
+            });
+            ws.on('error', (error) => {
+                console.error('❌ Erro WebSocket:', error);
+                reject(error);
+            });
+            ws.on('close', () => {
+                console.log('🔌 Conexão fechada');
+                if (!responseReceived) {
+                    reject(new Error('Conexão fechada sem receber resposta'));
+                }
+            });
+            // Timeout
+            setTimeout(() => {
+                if (ws.readyState === WebSocket.OPEN) {
+                    console.log('⏱️ Timeout - fechando conexão');
+                    ws.close();
+                    reject(new Error('Timeout na resposta'));
+                }
+            }, 30000);
+        });
+    } catch (error) {
+        console.error('❌ Erro no teste:', error);
+        throw error;
+    }
+}
+// Executar teste
+testSpeechToSpeech()
+    .then(() => {
+        console.log('\n' + '='.repeat(60));
+        console.log('✅ TESTE CONCLUÍDO COM SUCESSO!');
+        console.log('='.repeat(60));
+        process.exit(0);
+    })
+    .catch((error) => {
+        console.error('\n' + '='.repeat(60));
+        console.error('❌ TESTE FALHOU:', error.message);
+        console.error('='.repeat(60));
+        process.exit(1);
+    });

services/webrtc_gateway/test-websocket.js ADDED Viewed

	@@ -0,0 +1,317 @@

+#!/usr/bin/env node
+/**
+ * Teste automatizado de WebSocket para validar respostas do Ultravox
+ * Simula conexões WebRTC e envia áudio de teste
+ */
+const WebSocket = require('ws');
+const fs = require('fs');
+const path = require('path');
+// Configuração
+const WS_URL = 'ws://localhost:8082/ws';
+const SAMPLE_RATE = 16000;
+const BITS_PER_SAMPLE = 16;
+const CHANNELS = 1;
+// Cores para output
+const colors = {
+    reset: '\x1b[0m',
+    bright: '\x1b[1m',
+    green: '\x1b[32m',
+    red: '\x1b[31m',
+    yellow: '\x1b[33m',
+    blue: '\x1b[34m',
+    cyan: '\x1b[36m'
+};
+// Função para gerar áudio de teste (silêncio com alguns pulsos)
+function generateTestAudio(durationMs = 1000) {
+    const samples = Math.floor((SAMPLE_RATE * durationMs) / 1000);
+    const buffer = Buffer.alloc(samples * 2); // 16-bit = 2 bytes per sample
+    // Adicionar alguns pulsos para simular fala
+    for (let i = 0; i < samples; i++) {
+        let value = 0;
+        // Criar padrão de "fala" simulada
+        if (i % 100 < 50) {
+            value = Math.sin(2 * Math.PI * 440 * i / SAMPLE_RATE) * 1000;
+            value += Math.sin(2 * Math.PI * 880 * i / SAMPLE_RATE) * 500;
+            value += (Math.random() - 0.5) * 200; // Adicionar ruído
+        }
+        // Converter para int16
+        const int16Value = Math.max(-32768, Math.min(32767, Math.floor(value)));
+        buffer.writeInt16LE(int16Value, i * 2);
+    }
+    return buffer;
+}
+// Classe de teste
+class WebSocketTester {
+    constructor() {
+        this.ws = null;
+        this.testResults = [];
+        this.currentTest = null;
+    }
+    connect() {
+        return new Promise((resolve, reject) => {
+            console.log(`${colors.cyan}🔌 Conectando ao WebSocket...${colors.reset}`);
+            this.ws = new WebSocket(WS_URL);
+            this.ws.on('open', () => {
+                console.log(`${colors.green}✅ Conectado ao servidor${colors.reset}`);
+                resolve();
+            });
+            this.ws.on('error', (error) => {
+                console.error(`${colors.red}❌ Erro de conexão:${colors.reset}`, error.message);
+                reject(error);
+            });
+            this.ws.on('message', (data) => {
+                this.handleMessage(data);
+            });
+        });
+    }
+    handleMessage(data) {
+        // Verificar se é binário (áudio) ou JSON (mensagem)
+        if (Buffer.isBuffer(data)) {
+            console.log(`${colors.green}🔊 Áudio binário recebido: ${data.length} bytes${colors.reset}`);
+            if (this.currentTest) {
+                this.currentTest.audioReceived = true;
+                this.currentTest.audioSize = data.length;
+                // Assumir que o áudio contém a resposta
+                this.currentTest.transcription = '[Resposta de áudio recebida]';
+            }
+            return;
+        }
+        try {
+            const msg = JSON.parse(data);
+            switch (msg.type) {
+                case 'init':
+                case 'welcome':
+                    console.log(`${colors.blue}👋 Cliente ID: ${msg.clientId}${colors.reset}`);
+                    break;
+                case 'metrics':
+                    console.log(`${colors.yellow}📝 Resposta: "${msg.response}"${colors.reset}`);
+                    if (this.currentTest) {
+                        this.currentTest.transcription = msg.response;
+                        this.currentTest.latency = msg.latency;
+                    }
+                    break;
+                case 'transcription':
+                    console.log(`${colors.yellow}📝 Transcrição: "${msg.text}"${colors.reset}`);
+                    if (this.currentTest) {
+                        this.currentTest.transcription = msg.text;
+                        this.currentTest.latency = msg.latency;
+                    }
+                    break;
+                case 'audio':
+                    console.log(`${colors.green}🔊 Áudio recebido: ${msg.size} bytes${colors.reset}`);
+                    if (this.currentTest) {
+                        this.currentTest.audioReceived = true;
+                        this.currentTest.audioSize = msg.size;
+                    }
+                    break;
+                case 'error':
+                    console.error(`${colors.red}❌ Erro do servidor: ${msg.message}${colors.reset}`);
+                    if (this.currentTest) {
+                        this.currentTest.error = msg.message;
+                    }
+                    break;
+            }
+        } catch (error) {
+            console.log(`${colors.cyan}📨 Dados recebidos: ${data.toString().substring(0, 100)}...${colors.reset}`);
+        }
+    }
+    async sendAudioTest(testName, systemPrompt = '') {
+        console.log(`\n${colors.bright}=== Teste: ${testName} ===${colors.reset}`);
+        this.currentTest = {
+            name: testName,
+            systemPrompt: systemPrompt,
+            startTime: Date.now(),
+            transcription: null,
+            audioReceived: false
+        };
+        // Enviar áudio de teste
+        const audioData = generateTestAudio(1500); // 1.5 segundos
+        console.log(`${colors.cyan}📤 Enviando áudio PCM direto: ${audioData.length} bytes${colors.reset}`);
+        // Enviar dados binários PCM diretamente (como o navegador faz)
+        this.ws.send(audioData);
+        // Aguardar resposta
+        await this.waitForResponse(5000);
+        // Avaliar resultado
+        this.evaluateTest();
+    }
+    waitForResponse(timeoutMs) {
+        return new Promise((resolve) => {
+            const startTime = Date.now();
+            const checkInterval = setInterval(() => {
+                const elapsed = Date.now() - startTime;
+                // Verificar se recebemos resposta completa
+                if (this.currentTest.transcription && this.currentTest.audioReceived) {
+                    clearInterval(checkInterval);
+                    resolve();
+                } else if (elapsed > timeoutMs) {
+                    clearInterval(checkInterval);
+                    console.log(`${colors.yellow}⏱️ Timeout aguardando resposta${colors.reset}`);
+                    resolve();
+                }
+            }, 100);
+        });
+    }
+    evaluateTest() {
+        const test = this.currentTest;
+        const responseTime = Date.now() - test.startTime;
+        console.log(`\n${colors.bright}📊 Resultado do Teste:${colors.reset}`);
+        console.log(`  Tempo de resposta: ${responseTime}ms`);
+        console.log(`  Transcrição recebida: ${test.transcription ? '✅' : '❌'}`);
+        console.log(`  Áudio recebido: ${test.audioReceived ? '✅' : '❌'}`);
+        // Verificar coerência da resposta
+        let isCoherent = false;
+        if (test.transcription) {
+            // Verificar se não contém "Brasília" ou respostas aleatórias
+            const problematicPhrases = [
+                'capital do brasil',
+                'brasília',
+                'cidade mais populosa',
+                'região centro-oeste',
+                'rio de janeiro',
+                'são paulo'
+            ];
+            const lowerTranscription = test.transcription.toLowerCase();
+            const hasProblematicContent = problematicPhrases.some(phrase =>
+                lowerTranscription.includes(phrase)
+            );
+            if (hasProblematicContent) {
+                console.log(`  ${colors.red}⚠️ Resposta contém conteúdo problemático${colors.reset}`);
+                isCoherent = false;
+            } else {
+                console.log(`  ${colors.green}✅ Resposta parece coerente${colors.reset}`);
+                isCoherent = true;
+            }
+        }
+        test.responseTime = responseTime;
+        test.isCoherent = isCoherent;
+        test.passed = test.transcription && test.audioReceived && isCoherent;
+        this.testResults.push(test);
+    }
+    async runAllTests() {
+        console.log(`\n${colors.bright}${colors.cyan}🚀 Iniciando bateria de testes${colors.reset}\n`);
+        // Teste 1: Sem prompt de sistema
+        await this.sendAudioTest('Sem prompt de sistema', '');
+        await this.wait(1000);
+        // Teste 2: Com prompt simples
+        await this.sendAudioTest('Prompt simples', 'Você é um assistente útil');
+        await this.wait(1000);
+        // Teste 3: Prompt vazio explícito
+        await this.sendAudioTest('Prompt vazio explícito', '');
+        await this.wait(1000);
+        // Mostrar resumo
+        this.showSummary();
+    }
+    showSummary() {
+        console.log(`\n${colors.bright}${colors.cyan}📈 RESUMO DOS TESTES${colors.reset}`);
+        console.log('═'.repeat(60));
+        let passed = 0;
+        let failed = 0;
+        this.testResults.forEach((test, index) => {
+            const status = test.passed ?
+                `${colors.green}✅ PASSOU${colors.reset}` :
+                `${colors.red}❌ FALHOU${colors.reset}`;
+            console.log(`\n${index + 1}. ${test.name}: ${status}`);
+            console.log(`   Tempo: ${test.responseTime}ms`);
+            console.log(`   Coerente: ${test.isCoherent ? 'Sim' : 'Não'}`);
+            if (test.transcription) {
+                console.log(`   Resposta: "${test.transcription.substring(0, 80)}..."`);
+            }
+            if (test.passed) passed++;
+            else failed++;
+        });
+        console.log('\n' + '═'.repeat(60));
+        console.log(`${colors.bright}Total: ${passed} passou, ${failed} falhou${colors.reset}`);
+        const successRate = (passed / this.testResults.length * 100).toFixed(1);
+        const rateColor = successRate >= 80 ? colors.green :
+                         successRate >= 50 ? colors.yellow :
+                         colors.red;
+        console.log(`${rateColor}Taxa de sucesso: ${successRate}%${colors.reset}\n`);
+    }
+    wait(ms) {
+        return new Promise(resolve => setTimeout(resolve, ms));
+    }
+    disconnect() {
+        if (this.ws) {
+            console.log(`${colors.cyan}👋 Desconectando...${colors.reset}`);
+            this.ws.close();
+        }
+    }
+}
+// Executar testes
+async function main() {
+    const tester = new WebSocketTester();
+    try {
+        await tester.connect();
+        await tester.wait(500); // Dar tempo para estabilizar
+        await tester.runAllTests();
+    } catch (error) {
+        console.error(`${colors.red}Erro fatal:${colors.reset}`, error);
+    } finally {
+        tester.disconnect();
+        process.exit(0);
+    }
+}
+// Iniciar
+console.log(`${colors.bright}${colors.blue}╔═══════════════════════════════════════╗${colors.reset}`);
+console.log(`${colors.bright}${colors.blue}║   Teste WebSocket - Ultravox Chat    ║${colors.reset}`);
+console.log(`${colors.bright}${colors.blue}╚═══════════════════════════════════════╝${colors.reset}\n`);
+main().catch(console.error);

services/webrtc_gateway/ultravox-chat-backup.html ADDED Viewed

	@@ -0,0 +1,964 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Ultravox Chat PCM - Otimizado</title>
+    <script src="opus-decoder.js"></script>
+    <style>
+        * {
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+        }
+        body {
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            min-height: 100vh;
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            padding: 20px;
+        }
+        .container {
+            background: white;
+            border-radius: 20px;
+            box-shadow: 0 20px 60px rgba(0,0,0,0.3);
+            padding: 40px;
+            max-width: 600px;
+            width: 100%;
+        }
+        h1 {
+            text-align: center;
+            color: #333;
+            margin-bottom: 30px;
+            font-size: 28px;
+        }
+        .status {
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 15px;
+            margin-bottom: 20px;
+            display: flex;
+            align-items: center;
+            justify-content: space-between;
+        }
+        .status-dot {
+            width: 12px;
+            height: 12px;
+            border-radius: 50%;
+            background: #dc3545;
+            margin-right: 10px;
+            display: inline-block;
+        }
+        .status-dot.connected {
+            background: #28a745;
+            animation: pulse 2s infinite;
+        }
+        @keyframes pulse {
+            0% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0.7); }
+            70% { box-shadow: 0 0 0 10px rgba(40, 167, 69, 0); }
+            100% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0); }
+        }
+        .controls {
+            display: flex;
+            gap: 10px;
+            margin-bottom: 20px;
+        }
+        .voice-selector {
+            display: flex;
+            align-items: center;
+            gap: 10px;
+            margin-bottom: 20px;
+            padding: 10px;
+            background: #f8f9fa;
+            border-radius: 10px;
+        }
+        .voice-selector label {
+            font-weight: 600;
+            color: #555;
+        }
+        .voice-selector select {
+            flex: 1;
+            padding: 8px;
+            border: 2px solid #ddd;
+            border-radius: 5px;
+            font-size: 14px;
+            background: white;
+            cursor: pointer;
+        }
+        .voice-selector select:focus {
+            outline: none;
+            border-color: #667eea;
+        }
+        button {
+            flex: 1;
+            padding: 15px;
+            border: none;
+            border-radius: 10px;
+            font-size: 16px;
+            font-weight: 600;
+            cursor: pointer;
+            transition: all 0.3s ease;
+        }
+        button:disabled {
+            opacity: 0.5;
+            cursor: not-allowed;
+        }
+        .btn-primary {
+            background: #007bff;
+            color: white;
+        }
+        .btn-primary:hover:not(:disabled) {
+            background: #0056b3;
+            transform: translateY(-2px);
+            box-shadow: 0 5px 15px rgba(0,123,255,0.3);
+        }
+        .btn-danger {
+            background: #dc3545;
+            color: white;
+        }
+        .btn-danger:hover:not(:disabled) {
+            background: #c82333;
+        }
+        .btn-success {
+            background: #28a745;
+            color: white;
+        }
+        .btn-success.recording {
+            background: #dc3545;
+            animation: recordPulse 1s infinite;
+        }
+        @keyframes recordPulse {
+            0%, 100% { opacity: 1; }
+            50% { opacity: 0.7; }
+        }
+        .metrics {
+            display: grid;
+            grid-template-columns: repeat(3, 1fr);
+            gap: 15px;
+            margin-bottom: 20px;
+        }
+        .metric {
+            background: #f8f9fa;
+            padding: 15px;
+            border-radius: 10px;
+            text-align: center;
+        }
+        .metric-label {
+            font-size: 12px;
+            color: #6c757d;
+            margin-bottom: 5px;
+        }
+        .metric-value {
+            font-size: 24px;
+            font-weight: bold;
+            color: #333;
+        }
+        .log {
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 20px;
+            height: 300px;
+            overflow-y: auto;
+            font-family: 'Monaco', 'Menlo', monospace;
+            font-size: 12px;
+        }
+        .log-entry {
+            padding: 5px 0;
+            border-bottom: 1px solid #e9ecef;
+            display: flex;
+            align-items: flex-start;
+        }
+        .log-time {
+            color: #6c757d;
+            margin-right: 10px;
+            flex-shrink: 0;
+        }
+        .log-message {
+            flex: 1;
+        }
+        .log-entry.error { color: #dc3545; }
+        .log-entry.success { color: #28a745; }
+        .log-entry.info { color: #007bff; }
+        .log-entry.warning { color: #ffc107; }
+        .audio-player {
+            display: inline-flex;
+            align-items: center;
+            gap: 10px;
+            margin-left: 10px;
+        }
+        .play-btn {
+            background: #007bff;
+            color: white;
+            border: none;
+            border-radius: 5px;
+            padding: 5px 10px;
+            cursor: pointer;
+            font-size: 12px;
+        }
+        .play-btn:hover {
+            background: #0056b3;
+        }
+    </style>
+</head>
+<body>
+    <div class="container">
+        <h1>🚀 Ultravox PCM - Otimizado</h1>
+        <div class="status">
+            <div>
+                <span class="status-dot" id="statusDot"></span>
+                <span id="statusText">Desconectado</span>
+            </div>
+            <span id="latencyText">Latência: --ms</span>
+        </div>
+        <div class="voice-selector">
+            <label for="voiceSelect">🔊 Voz TTS:</label>
+            <select id="voiceSelect">
+                <option value="pf_dora" selected>🇧🇷 [pf_dora] Português Feminino (Dora)</option>
+                <option value="pm_alex">🇧🇷 [pm_alex] Português Masculino (Alex)</option>
+                <option value="af_heart">🌍 [af_heart] Alternativa Feminina (Heart)</option>
+                <option value="af_bella">🌍 [af_bella] Alternativa Feminina (Bella)</option>
+            </select>
+        </div>
+        <div class="controls">
+            <button id="connectBtn" class="btn-primary">Conectar</button>
+            <button id="talkBtn" class="btn-success" disabled>Push to Talk</button>
+        </div>
+        <div class="metrics">
+            <div class="metric">
+                <div class="metric-label">Enviado</div>
+                <div class="metric-value" id="sentBytes">0 KB</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">Recebido</div>
+                <div class="metric-value" id="receivedBytes">0 KB</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">Formato</div>
+                <div class="metric-value" id="format">PCM</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">🎤 Voz</div>
+                <div class="metric-value" id="currentVoice" style="font-family: monospace; color: #4CAF50; font-weight: bold;">pf_dora</div>
+            </div>
+        </div>
+        <div class="log" id="log"></div>
+    </div>
+    <!-- Seção TTS Direto -->
+    <div class="container" style="margin-top: 20px;">
+        <h2>🎵 Text-to-Speech Direto</h2>
+        <p>Digite ou edite o texto abaixo e escolha uma voz para converter em áudio</p>
+        <div class="section">
+            <textarea id="ttsText" style="width: 100%; height: 120px; padding: 10px; border: 1px solid #333; border-radius: 8px; background: #1e1e1e; color: #e0e0e0; font-family: 'Segoe UI', system-ui, sans-serif; font-size: 14px; resize: vertical;">Olá! Teste de voz.</textarea>
+        </div>
+        <div class="section" style="display: flex; gap: 10px; align-items: center; margin-top: 15px;">
+            <label for="ttsVoiceSelect" style="font-weight: 600;">🔊 Voz:</label>
+            <select id="ttsVoiceSelect" style="flex: 1; padding: 8px; border: 1px solid #333; border-radius: 5px; background: #2a2a2a; color: #e0e0e0;">
+                <optgroup label="🇧🇷 Português">
+                    <option value="pf_dora" selected>[pf_dora] Feminino - Dora</option>
+                    <option value="pm_alex">[pm_alex] Masculino - Alex</option>
+                    <option value="pm_santa">[pm_santa] Masculino - Santa (Festivo)</option>
+                </optgroup>
+                <optgroup label="🇫🇷 Francês">
+                    <option value="ff_siwis">[ff_siwis] Feminino - Siwis (Nativa)</option>
+                </optgroup>
+                <optgroup label="🇺🇸 Inglês Americano">
+                    <option value="af_alloy">Feminino - Alloy</option>
+                    <option value="af_aoede">Feminino - Aoede</option>
+                    <option value="af_bella">Feminino - Bella</option>
+                    <option value="af_heart">Feminino - Heart</option>
+                    <option value="af_jessica">Feminino - Jessica</option>
+                    <option value="af_kore">Feminino - Kore</option>
+                    <option value="af_nicole">Feminino - Nicole</option>
+                    <option value="af_nova">Feminino - Nova</option>
+                    <option value="af_river">Feminino - River</option>
+                    <option value="af_sarah">Feminino - Sarah</option>
+                    <option value="af_sky">Feminino - Sky</option>
+                    <option value="am_adam">Masculino - Adam</option>
+                    <option value="am_echo">Masculino - Echo</option>
+                    <option value="am_eric">Masculino - Eric</option>
+                    <option value="am_fenrir">Masculino - Fenrir</option>
+                    <option value="am_liam">Masculino - Liam</option>
+                    <option value="am_michael">Masculino - Michael</option>
+                    <option value="am_onyx">Masculino - Onyx</option>
+                    <option value="am_puck">Masculino - Puck</option>
+                    <option value="am_santa">Masculino - Santa</option>
+                </optgroup>
+                <optgroup label="🇬🇧 Inglês Britânico">
+                    <option value="bf_alice">Feminino - Alice</option>
+                    <option value="bf_emma">Feminino - Emma</option>
+                    <option value="bf_isabella">Feminino - Isabella</option>
+                    <option value="bf_lily">Feminino - Lily</option>
+                    <option value="bm_daniel">Masculino - Daniel</option>
+                    <option value="bm_fable">Masculino - Fable</option>
+                    <option value="bm_george">Masculino - George</option>
+                    <option value="bm_lewis">Masculino - Lewis</option>
+                </optgroup>
+                <optgroup label="🇪🇸 Espanhol">
+                    <option value="ef_dora">Feminino - Dora</option>
+                    <option value="em_alex">Masculino - Alex</option>
+                    <option value="em_santa">Masculino - Santa</option>
+                </optgroup>
+                <optgroup label="🇮🇹 Italiano">
+                    <option value="if_sara">Feminino - Sara</option>
+                    <option value="im_nicola">Masculino - Nicola</option>
+                </optgroup>
+                <optgroup label="🇯🇵 Japonês">
+                    <option value="jf_alpha">Feminino - Alpha</option>
+                    <option value="jf_gongitsune">Feminino - Gongitsune</option>
+                    <option value="jf_nezumi">Feminino - Nezumi</option>
+                    <option value="jf_tebukuro">Feminino - Tebukuro</option>
+                    <option value="jm_kumo">Masculino - Kumo</option>
+                </optgroup>
+                <optgroup label="🇨🇳 Chinês">
+                    <option value="zf_xiaobei">Feminino - Xiaobei</option>
+                    <option value="zf_xiaoni">Feminino - Xiaoni</option>
+                    <option value="zf_xiaoxiao">Feminino - Xiaoxiao</option>
+                    <option value="zf_xiaoyi">Feminino - Xiaoyi</option>
+                    <option value="zm_yunjian">Masculino - Yunjian</option>
+                    <option value="zm_yunxi">Masculino - Yunxi</option>
+                    <option value="zm_yunxia">Masculino - Yunxia</option>
+                    <option value="zm_yunyang">Masculino - Yunyang</option>
+                </optgroup>
+                <optgroup label="🇮🇳 Hindi">
+                    <option value="hf_alpha">Feminino - Alpha</option>
+                    <option value="hf_beta">Feminino - Beta</option>
+                    <option value="hm_omega">Masculino - Omega</option>
+                    <option value="hm_psi">Masculino - Psi</option>
+                </optgroup>
+            </select>
+            <button id="ttsPlayBtn" class="btn-success" disabled style="padding: 10px 20px;">
+                ▶️ Gerar Áudio
+            </button>
+        </div>
+        <div id="ttsStatus" style="display: none; margin-top: 15px; padding: 15px; background: #2a2a2a; border-radius: 8px;">
+            <span id="ttsStatusText">⏳ Processando...</span>
+        </div>
+        <div id="ttsPlayer" style="display: none; margin-top: 15px;">
+            <audio id="ttsAudio" controls style="width: 100%;"></audio>
+        </div>
+    </div>
+    <script>
+        // Estado da aplicação
+        let ws = null;
+        let isConnected = false;
+        let isRecording = false;
+        let audioContext = null;
+        let stream = null;
+        let audioSource = null;
+        let audioProcessor = null;
+        let pcmBuffer = [];
+        // Métricas
+        const metrics = {
+            sentBytes: 0,
+            receivedBytes: 0,
+            latency: 0,
+            recordingStartTime: 0
+        };
+        // Elementos DOM
+        const elements = {
+            statusDot: document.getElementById('statusDot'),
+            statusText: document.getElementById('statusText'),
+            latencyText: document.getElementById('latencyText'),
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            sentBytes: document.getElementById('sentBytes'),
+            receivedBytes: document.getElementById('receivedBytes'),
+            format: document.getElementById('format'),
+            log: document.getElementById('log'),
+            // TTS elements
+            ttsText: document.getElementById('ttsText'),
+            ttsVoiceSelect: document.getElementById('ttsVoiceSelect'),
+            ttsPlayBtn: document.getElementById('ttsPlayBtn'),
+            ttsStatus: document.getElementById('ttsStatus'),
+            ttsStatusText: document.getElementById('ttsStatusText'),
+            ttsPlayer: document.getElementById('ttsPlayer'),
+            ttsAudio: document.getElementById('ttsAudio')
+        };
+        // Log no console visual
+        function log(message, type = 'info') {
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = `log-entry ${type}`;
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">${message}</span>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            console.log(`[${type}] ${message}`);
+        }
+        // Atualizar métricas
+        function updateMetrics() {
+            elements.sentBytes.textContent = `${(metrics.sentBytes / 1024).toFixed(1)} KB`;
+            elements.receivedBytes.textContent = `${(metrics.receivedBytes / 1024).toFixed(1)} KB`;
+            elements.latencyText.textContent = `Latência: ${metrics.latency}ms`;
+        }
+        // Conectar ao WebSocket
+        async function connect() {
+            try {
+                // Solicitar acesso ao microfone
+                stream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        sampleRate: 24000  // High quality 24kHz
+                    }
+                });
+                log('✅ Microfone acessado', 'success');
+                // Conectar WebSocket com suporte binário
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${protocol}//${window.location.host}/ws`;
+                ws = new WebSocket(wsUrl);
+                ws.binaryType = 'arraybuffer';
+                ws.onopen = () => {
+                    isConnected = true;
+                    elements.statusDot.classList.add('connected');
+                    elements.statusText.textContent = 'Conectado';
+                    elements.connectBtn.textContent = 'Desconectar';
+                    elements.connectBtn.classList.remove('btn-primary');
+                    elements.connectBtn.classList.add('btn-danger');
+                    elements.talkBtn.disabled = false;
+                    // Enviar voz selecionada ao conectar
+                    const currentVoice = elements.voiceSelect.value || elements.ttsVoiceSelect.value || 'pf_dora';
+                    ws.send(JSON.stringify({
+                        type: 'set-voice',
+                        voice_id: currentVoice
+                    }));
+                    log(`🔊 Voz configurada: ${currentVoice}`, 'info');
+                    elements.ttsPlayBtn.disabled = false; // Habilitar TTS button
+                    log('✅ Conectado ao servidor', 'success');
+                };
+                ws.onmessage = (event) => {
+                    if (event.data instanceof ArrayBuffer) {
+                        // Áudio PCM binário recebido
+                        handlePCMAudio(event.data);
+                    } else {
+                        // Mensagem JSON
+                        const data = JSON.parse(event.data);
+                        handleMessage(data);
+                    }
+                };
+                ws.onerror = (error) => {
+                    log(`❌ Erro WebSocket: ${error}`, 'error');
+                };
+                ws.onclose = () => {
+                    disconnect();
+                };
+            } catch (error) {
+                log(`❌ Erro ao conectar: ${error.message}`, 'error');
+            }
+        }
+        // Desconectar
+        function disconnect() {
+            isConnected = false;
+            if (ws) {
+                ws.close();
+                ws = null;
+            }
+            if (stream) {
+                stream.getTracks().forEach(track => track.stop());
+                stream = null;
+            }
+            if (audioContext) {
+                audioContext.close();
+                audioContext = null;
+            }
+            elements.statusDot.classList.remove('connected');
+            elements.statusText.textContent = 'Desconectado';
+            elements.connectBtn.textContent = 'Conectar';
+            elements.connectBtn.classList.remove('btn-danger');
+            elements.connectBtn.classList.add('btn-primary');
+            elements.talkBtn.disabled = true;
+            log('👋 Desconectado', 'warning');
+        }
+        // Iniciar gravação PCM
+        function startRecording() {
+            if (isRecording) return;
+            isRecording = true;
+            metrics.recordingStartTime = Date.now();
+            elements.talkBtn.classList.add('recording');
+            elements.talkBtn.textContent = 'Gravando...';
+            pcmBuffer = [];
+            const sampleRate = 24000; // Sempre usar melhor qualidade
+            log(`🎤 Gravando PCM 16-bit @ ${sampleRate}Hz (alta qualidade)`, 'info');
+            // Criar AudioContext se necessário
+            if (!audioContext) {
+                // Sempre usar melhor qualidade (24kHz)
+                const sampleRate = 24000;
+                audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                    sampleRate: sampleRate
+                });
+                log(`🎧 AudioContext criado: ${sampleRate}Hz (alta qualidade)`, 'info');
+            }
+            // Criar processador de áudio
+            audioSource = audioContext.createMediaStreamSource(stream);
+            audioProcessor = audioContext.createScriptProcessor(4096, 1, 1);
+            audioProcessor.onaudioprocess = (e) => {
+                if (!isRecording) return;
+                const inputData = e.inputBuffer.getChannelData(0);
+                // Calcular RMS (Root Mean Square) para melhor detecção de volume
+                let sumSquares = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    sumSquares += inputData[i] * inputData[i];
+                }
+                const rms = Math.sqrt(sumSquares / inputData.length);
+                // Calcular amplitude máxima também
+                let maxAmplitude = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    maxAmplitude = Math.max(maxAmplitude, Math.abs(inputData[i]));
+                }
+                // Detecção de voz baseada em RMS (mais confiável que amplitude máxima)
+                const voiceThreshold = 0.01; // Threshold para detectar voz
+                const hasVoice = rms > voiceThreshold;
+                // Aplicar ganho suave apenas se necessário
+                let gain = 1.0;
+                if (hasVoice && rms < 0.05) {
+                    // Ganho suave baseado em RMS, máximo 5x
+                    gain = Math.min(5.0, 0.05 / rms);
+                    if (gain > 1.2) {
+                        log(`🎤 Volume baixo detectado, aplicando ganho: ${gain.toFixed(1)}x`, 'info');
+                    }
+                }
+                // Converter Float32 para Int16 com processamento melhorado
+                const pcmData = new Int16Array(inputData.length);
+                for (let i = 0; i < inputData.length; i++) {
+                    // Aplicar ganho suave
+                    let sample = inputData[i] * gain;
+                    // Soft clipping para evitar distorção
+                    if (Math.abs(sample) > 0.95) {
+                        sample = Math.sign(sample) * (0.95 + 0.05 * Math.tanh((Math.abs(sample) - 0.95) * 10));
+                    }
+                    // Converter para Int16
+                    sample = Math.max(-1, Math.min(1, sample));
+                    pcmData[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
+                }
+                // Adicionar ao buffer apenas se detectar voz
+                if (hasVoice) {
+                    pcmBuffer.push(pcmData);
+                }
+            };
+            audioSource.connect(audioProcessor);
+            audioProcessor.connect(audioContext.destination);
+        }
+        // Parar gravação e enviar
+        function stopRecording() {
+            if (!isRecording) return;
+            isRecording = false;
+            const duration = Date.now() - metrics.recordingStartTime;
+            elements.talkBtn.classList.remove('recording');
+            elements.talkBtn.textContent = 'Push to Talk';
+            // Desconectar processador
+            if (audioProcessor) {
+                audioProcessor.disconnect();
+                audioProcessor = null;
+            }
+            if (audioSource) {
+                audioSource.disconnect();
+                audioSource = null;
+            }
+            // Verificar se há áudio para enviar
+            if (pcmBuffer.length === 0) {
+                log(`⚠️ Nenhum áudio capturado (silêncio ou volume muito baixo)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            // Combinar todos os chunks PCM
+            const totalLength = pcmBuffer.reduce((acc, chunk) => acc + chunk.length, 0);
+            // Verificar tamanho mínimo (0.5 segundos)
+            const sampleRate = 24000; // Sempre 24kHz
+            const minSamples = sampleRate * 0.5;
+            if (totalLength < minSamples) {
+                log(`⚠️ Áudio muito curto: ${(totalLength/sampleRate).toFixed(2)}s (mínimo 0.5s)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            const fullPCM = new Int16Array(totalLength);
+            let offset = 0;
+            for (const chunk of pcmBuffer) {
+                fullPCM.set(chunk, offset);
+                offset += chunk.length;
+            }
+            // Calcular amplitude final para debug
+            let maxAmp = 0;
+            for (let i = 0; i < Math.min(fullPCM.length, 1000); i++) {
+                maxAmp = Math.max(maxAmp, Math.abs(fullPCM[i] / 32768));
+            }
+            // Enviar PCM binário direto (sem Base64!)
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                // Enviar um header simples antes do áudio
+                const header = new ArrayBuffer(8);
+                const view = new DataView(header);
+                view.setUint32(0, 0x50434D16); // Magic: "PCM16"
+                view.setUint32(4, fullPCM.length * 2); // Tamanho em bytes
+                ws.send(header);
+                ws.send(fullPCM.buffer);
+                metrics.sentBytes += fullPCM.length * 2;
+                updateMetrics();
+                const sampleRate = 24000; // Sempre 24kHz
+                log(`📤 PCM enviado: ${(fullPCM.length * 2 / 1024).toFixed(1)}KB, ${(totalLength/sampleRate).toFixed(1)}s @ ${sampleRate}Hz, amp:${maxAmp.toFixed(3)}`, 'success');
+            }
+            // Limpar buffer após enviar
+            pcmBuffer = [];
+        }
+        // Processar mensagem JSON
+        function handleMessage(data) {
+            switch (data.type) {
+                case 'metrics':
+                    metrics.latency = data.latency;
+                    updateMetrics();
+                    log(`📊 Resposta: "${data.response}" (${data.latency}ms)`, 'success');
+                    break;
+                case 'error':
+                    log(`❌ Erro: ${data.message}`, 'error');
+                    break;
+                case 'tts-response':
+                    // Resposta do TTS direto (Opus 24kHz ou PCM)
+                    if (data.audio) {
+                        // Decodificar base64 para arraybuffer
+                        const binaryString = atob(data.audio);
+                        const bytes = new Uint8Array(binaryString.length);
+                        for (let i = 0; i < binaryString.length; i++) {
+                            bytes[i] = binaryString.charCodeAt(i);
+                        }
+                        let audioData = bytes.buffer;
+                        // IMPORTANTE: Usar a taxa enviada pelo servidor
+                        const sampleRate = data.sampleRate || 24000;
+                        console.log(`🎯 TTS Response - Taxa recebida: ${sampleRate}Hz, Formato: ${data.format}, Tamanho: ${bytes.length} bytes`);
+                        // Se for Opus, usar WebAudio API para decodificar nativamente
+                        let wavBuffer;
+                        if (data.format === 'opus') {
+                            console.log(`🗜️ Opus 24kHz recebido: ${(bytes.length/1024).toFixed(1)}KB`);
+                            // Log de economia de banda
+                            if (data.originalSize) {
+                                const compression = Math.round(100 - (bytes.length / data.originalSize) * 100);
+                                console.log(`📊 Economia de banda: ${compression}% (${(data.originalSize/1024).toFixed(1)}KB → ${(bytes.length/1024).toFixed(1)}KB)`);
+                            }
+                            // WebAudio API pode decodificar Opus nativamente
+                            // Por agora, tratar como PCM até implementar decoder completo
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        } else {
+                            // PCM - adicionar WAV header com a taxa correta
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        }
+                        // Log da qualidade recebida
+                        console.log(`🎵 TTS pronto: ${(audioData.byteLength/1024).toFixed(1)}KB @ ${sampleRate}Hz (${data.quality || 'high'} quality, ${data.format || 'pcm'})`);
+                        // Criar blob e URL
+                        const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+                        const audioUrl = URL.createObjectURL(blob);
+                        // Atualizar player
+                        elements.ttsAudio.src = audioUrl;
+                        elements.ttsPlayer.style.display = 'block';
+                        elements.ttsStatus.style.display = 'none';
+                        elements.ttsPlayBtn.disabled = false;
+                        elements.ttsPlayBtn.textContent = '▶️ Gerar Áudio';
+                        log('🎵 Áudio TTS gerado com sucesso!', 'success');
+                    }
+                    break;
+            }
+        }
+        // Processar áudio PCM recebido
+        function handlePCMAudio(arrayBuffer) {
+            metrics.receivedBytes += arrayBuffer.byteLength;
+            updateMetrics();
+            // Criar WAV header para reproduzir
+            const wavBuffer = addWavHeader(arrayBuffer);
+            // Criar blob e URL para o áudio
+            const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+            const audioUrl = URL.createObjectURL(blob);
+            // Criar log com botão de play
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = 'log-entry success';
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">🔊 Áudio recebido: ${(arrayBuffer.byteLength / 1024).toFixed(1)}KB</span>
+                <div class="audio-player">
+                    <button class="play-btn" onclick="playAudio('${audioUrl}')">▶️ Play</button>
+                    <audio id="audio-${Date.now()}" src="${audioUrl}" style="display: none;"></audio>
+                </div>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            // Auto-play o áudio
+            const audio = new Audio(audioUrl);
+            audio.play().catch(err => {
+                console.log('Auto-play bloqueado, use o botão para reproduzir');
+            });
+        }
+        // Função para tocar áudio manualmente
+        function playAudio(url) {
+            const audio = new Audio(url);
+            audio.play();
+        }
+        // Adicionar header WAV ao PCM
+        function addWavHeader(pcmBuffer, customSampleRate) {
+            const pcmData = new Uint8Array(pcmBuffer);
+            const wavBuffer = new ArrayBuffer(44 + pcmData.length);
+            const view = new DataView(wavBuffer);
+            // WAV header
+            const writeString = (offset, string) => {
+                for (let i = 0; i < string.length; i++) {
+                    view.setUint8(offset + i, string.charCodeAt(i));
+                }
+            };
+            writeString(0, 'RIFF');
+            view.setUint32(4, 36 + pcmData.length, true);
+            writeString(8, 'WAVE');
+            writeString(12, 'fmt ');
+            view.setUint32(16, 16, true); // fmt chunk size
+            view.setUint16(20, 1, true); // PCM format
+            view.setUint16(22, 1, true); // Mono
+            // Usar taxa customizada se fornecida, senão usar 24kHz
+            let sampleRate = customSampleRate || 24000;
+            console.log(`📝 WAV Header - Configurando taxa: ${sampleRate}Hz`);
+            view.setUint32(24, sampleRate, true); // Sample rate
+            view.setUint32(28, sampleRate * 2, true); // Byte rate: sampleRate * 1 * 2
+            view.setUint16(32, 2, true); // Block align: 1 * 2
+            view.setUint16(34, 16, true); // Bits per sample: 16-bit
+            writeString(36, 'data');
+            view.setUint32(40, pcmData.length, true);
+            // Copiar dados PCM
+            new Uint8Array(wavBuffer, 44).set(pcmData);
+            return wavBuffer;
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', () => {
+            if (isConnected) {
+                disconnect();
+            } else {
+                connect();
+            }
+        });
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', stopRecording);
+        // Voice selector listener
+        elements.voiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            console.log('Voice select changed to:', voice_id);
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                console.log('Sending set-voice command:', voice_id);
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🔊 Voz alterada para: ${voice_id} - ${e.target.options[e.target.selectedIndex].text}`, 'info');
+            } else {
+                console.log('WebSocket not connected, cannot send voice change');
+                log(`⚠️ Conecte-se primeiro para mudar a voz`, 'warning');
+            }
+        });
+        elements.talkBtn.addEventListener('touchstart', startRecording);
+        elements.talkBtn.addEventListener('touchend', stopRecording);
+        // TTS Voice selector listener
+        elements.ttsVoiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            // Update main voice selector
+            elements.voiceSelect.value = voice_id;
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            // Send voice change to server
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🎤 Voz TTS alterada para: ${voice_id}`, 'info');
+            }
+        });
+        // TTS Button Event Listener
+        elements.ttsPlayBtn.addEventListener('click', (e) => {
+            e.preventDefault();
+            e.stopPropagation();
+            console.log('TTS Button clicked!');
+            const text = elements.ttsText.value.trim();
+            const voice = elements.ttsVoiceSelect.value;
+            console.log('TTS Text:', text);
+            console.log('TTS Voice:', voice);
+            if (!text) {
+                alert('Por favor, digite algum texto para converter em áudio');
+                return;
+            }
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                alert('Por favor, conecte-se primeiro clicando em "Conectar"');
+                return;
+            }
+            // Mostrar status
+            elements.ttsStatus.style.display = 'block';
+            elements.ttsStatusText.textContent = '⏳ Gerando áudio...';
+            elements.ttsPlayBtn.disabled = true;
+            elements.ttsPlayBtn.textContent = '⏳ Processando...';
+            elements.ttsPlayer.style.display = 'none';
+            // Sempre usar melhor qualidade (24kHz)
+            const quality = 'high';
+            // Enviar request para TTS com qualidade máxima
+            const ttsRequest = {
+                type: 'text-to-speech',
+                text: text,
+                voice_id: voice,
+                quality: quality,
+                format: 'opus'  // Opus 24kHz @ 32kbps - máxima qualidade, mínima banda
+            };
+            console.log('Sending TTS request:', ttsRequest);
+            ws.send(JSON.stringify(ttsRequest));
+            log(`🎤 Solicitando TTS: voz=${voice}, texto="${text.substring(0, 50)}..."`, 'info');
+        });
+        // Inicialização
+        log('🚀 Ultravox Chat PCM Otimizado', 'info');
+        log('📊 Formato: PCM 16-bit @ 16kHz', 'info');
+        log('⚡ Sem FFmpeg, sem Base64!', 'success');
+    </script>
+</body>
+</html>

services/webrtc_gateway/ultravox-chat-ios.html ADDED Viewed

	@@ -0,0 +1,1843 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no">
+    <meta name="apple-mobile-web-app-capable" content="yes">
+    <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
+    <title>Ultravox AI Assistant</title>
+    <!-- Material Icons -->
+    <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet">
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
+    <!-- Opus Decoder -->
+    <script src="opus-decoder.js"></script>
+    <style>
+        * {
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+            -webkit-tap-highlight-color: transparent;
+        }
+        :root {
+            --ios-blue: #007AFF;
+            --ios-gray: #8E8E93;
+            --ios-gray-2: #C7C7CC;
+            --ios-gray-3: #D1D1D6;
+            --ios-gray-4: #E5E5EA;
+            --ios-gray-5: #F2F2F7;
+            --ios-gray-6: #FFFFFF;
+            --ios-red: #FF3B30;
+            --ios-green: #34C759;
+            --ios-orange: #FF9500;
+            --ios-purple: #AF52DE;
+            --sidebar-width: 280px;
+            --header-height: 60px;
+        }
+        /* Pull to Refresh */
+        .pull-to-refresh {
+            position: fixed;
+            top: -60px;
+            left: 0;
+            right: 0;
+            height: 60px;
+            background: rgba(255, 255, 255, 0.95);
+            backdrop-filter: blur(20px);
+            -webkit-backdrop-filter: blur(20px);
+            display: flex;
+            align-items: center;
+            justify-content: center;
+            z-index: 2000;
+            transition: transform 0.3s ease;
+            border-bottom: 1px solid var(--ios-gray-4);
+        }
+        .pull-to-refresh.show {
+            transform: translateY(60px);
+        }
+        .pull-to-refresh-spinner {
+            width: 20px;
+            height: 20px;
+            border: 2px solid var(--ios-gray-3);
+            border-top-color: var(--ios-blue);
+            border-radius: 50%;
+            animation: none;
+            margin-right: 10px;
+        }
+        .pull-to-refresh.refreshing .pull-to-refresh-spinner {
+            animation: spin 1s linear infinite;
+        }
+        .pull-to-refresh-text {
+            font-size: 14px;
+            color: var(--ios-gray);
+        }
+        body {
+            font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            color: #000;
+            overflow: hidden;
+            height: 100vh;
+            position: fixed;
+            width: 100%;
+            user-select: none;
+            -webkit-user-select: none;
+        }
+        /* App Container */
+        .app-container {
+            display: flex;
+            height: 100vh;
+            position: relative;
+        }
+        /* Sidebar */
+        .sidebar {
+            width: var(--sidebar-width);
+            background: rgba(255, 255, 255, 0.95);
+            backdrop-filter: blur(20px);
+            -webkit-backdrop-filter: blur(20px);
+            border-right: 1px solid var(--ios-gray-4);
+            display: flex;
+            flex-direction: column;
+            transition: transform 0.3s cubic-bezier(0.4, 0, 0.2, 1);
+            position: relative;
+            z-index: 100;
+        }
+        .sidebar-header {
+            padding: 20px;
+            border-bottom: 1px solid var(--ios-gray-4);
+        }
+        .app-title {
+            font-size: 24px;
+            font-weight: 700;
+            color: #000;
+            display: flex;
+            align-items: center;
+            gap: 10px;
+        }
+        .app-subtitle {
+            font-size: 12px;
+            color: var(--ios-gray);
+            margin-top: 4px;
+        }
+        .nav-menu {
+            flex: 1;
+            padding: 12px 0;
+        }
+        .nav-item {
+            display: flex;
+            align-items: center;
+            padding: 14px 20px;
+            color: #000;
+            text-decoration: none;
+            transition: all 0.2s ease;
+            position: relative;
+            cursor: pointer;
+            font-size: 15px;
+            font-weight: 500;
+        }
+        .nav-item:hover {
+            background: var(--ios-gray-5);
+        }
+        .nav-item.active {
+            background: var(--ios-blue);
+            color: white;
+        }
+        .nav-item .material-icons {
+            margin-right: 16px;
+            font-size: 22px;
+        }
+        .nav-badge {
+            margin-left: auto;
+            background: var(--ios-red);
+            color: white;
+            font-size: 11px;
+            padding: 2px 8px;
+            border-radius: 12px;
+            font-weight: 600;
+        }
+        /* Main Content */
+        .main-content {
+            flex: 1;
+            display: flex;
+            flex-direction: column;
+            overflow: hidden;
+            background: transparent;
+        }
+        /* Header */
+        .header {
+            height: var(--header-height);
+            background: rgba(255, 255, 255, 0.95);
+            backdrop-filter: blur(20px);
+            -webkit-backdrop-filter: blur(20px);
+            border-bottom: 1px solid var(--ios-gray-4);
+            display: flex;
+            align-items: center;
+            padding: 0 20px;
+            justify-content: space-between;
+        }
+        .menu-toggle {
+            display: none;
+            background: none;
+            border: none;
+            color: var(--ios-blue);
+            cursor: pointer;
+            padding: 8px;
+        }
+        .header-title {
+            font-size: 17px;
+            font-weight: 600;
+            color: #000;
+        }
+        .connection-status {
+            display: flex;
+            align-items: center;
+            gap: 8px;
+            padding: 6px 12px;
+            background: var(--ios-gray-5);
+            border-radius: 20px;
+            font-size: 13px;
+        }
+        .status-dot {
+            width: 8px;
+            height: 8px;
+            border-radius: 50%;
+            background: var(--ios-red);
+        }
+        .status-dot.connected {
+            background: var(--ios-green);
+            animation: pulse 2s infinite;
+        }
+        @keyframes pulse {
+            0%, 100% {
+                opacity: 1;
+                transform: scale(1);
+            }
+            50% {
+                opacity: 0.8;
+                transform: scale(1.05);
+            }
+        }
+        /* View Container */
+        .view-container {
+            flex: 1;
+            overflow-y: auto;
+            padding: 20px;
+            display: none;
+        }
+        .view-container.active {
+            display: block;
+        }
+        /* iOS Card Style - Minimal */
+        .ios-card {
+            background: rgba(255, 255, 255, 0.95);
+            backdrop-filter: blur(20px);
+            -webkit-backdrop-filter: blur(20px);
+            border-radius: 16px;
+            padding: 20px;
+            margin-bottom: 16px;
+            border: 1px solid rgba(255, 255, 255, 0.3);
+        }
+        .card-title {
+            font-size: 20px;
+            font-weight: 600;
+            margin-bottom: 16px;
+            color: #000;
+        }
+        /* Voice Selector */
+        .voice-selector {
+            width: 100%;
+            padding: 12px 16px;
+            background: var(--ios-gray-5);
+            border: 1px solid var(--ios-gray-4);
+            border-radius: 10px;
+            font-size: 15px;
+            font-family: inherit;
+            appearance: none;
+            background-image: url("data:image/svg+xml;charset=UTF-8,%3csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 24 24' fill='none' stroke='%23007AFF' stroke-width='2' stroke-linecap='round' stroke-linejoin='round'%3e%3cpolyline points='6 9 12 15 18 9'%3e%3c/polyline%3e%3c/svg%3e");
+            background-repeat: no-repeat;
+            background-position: right 12px center;
+            background-size: 20px;
+            padding-right: 40px;
+        }
+        /* iOS Button */
+        .ios-button {
+            width: 100%;
+            padding: 16px;
+            background: var(--ios-blue);
+            color: white;
+            border: none;
+            border-radius: 12px;
+            font-size: 17px;
+            font-weight: 600;
+            cursor: pointer;
+            transition: all 0.2s ease;
+            display: flex;
+            align-items: center;
+            justify-content: center;
+            gap: 8px;
+            font-family: inherit;
+        }
+        .ios-button:hover {
+            opacity: 0.9;
+        }
+        .ios-button:active {
+            transform: scale(0.98);
+        }
+        .ios-button:disabled {
+            background: var(--ios-gray-3);
+            cursor: not-allowed;
+        }
+        .ios-button.secondary {
+            background: var(--ios-gray-5);
+            color: var(--ios-blue);
+        }
+        .ios-button.danger {
+            background: var(--ios-red);
+        }
+        .ios-button.success {
+            background: var(--ios-green);
+        }
+        .ios-button.recording {
+            background: var(--ios-red);
+            animation: recordPulse 1s infinite;
+        }
+        @keyframes recordPulse {
+            0%, 100% { opacity: 1; }
+            50% { opacity: 0.8; }
+        }
+        /* Push to Talk View - Compact Professional */
+        .ptt-container {
+            display: grid;
+            grid-template-columns: 1fr;
+            gap: 20px;
+            max-width: 500px;
+            margin: 0 auto;
+        }
+        .ptt-main-section {
+            display: flex;
+            flex-direction: column;
+            align-items: center;
+            gap: 20px;
+        }
+        .ptt-button {
+            width: 140px;
+            height: 140px;
+            border-radius: 50%;
+            background: linear-gradient(145deg, #ffffff, #f0f0f5);
+            color: var(--ios-blue);
+            border: none;
+            font-size: 14px;
+            font-weight: 600;
+            cursor: pointer;
+            transition: all 0.2s ease;
+            display: flex;
+            flex-direction: column;
+            align-items: center;
+            justify-content: center;
+            gap: 8px;
+            box-shadow: 0 8px 24px rgba(0, 0, 0, 0.1);
+            position: relative;
+            user-select: none;
+            -webkit-user-select: none;
+            -webkit-tap-highlight-color: transparent;
+        }
+        .ptt-button::before {
+            content: '';
+            position: absolute;
+            width: 100%;
+            height: 100%;
+            border-radius: 50%;
+            border: 2px solid var(--ios-blue);
+            animation: ripple 2s linear infinite;
+            opacity: 0;
+        }
+        .ptt-button:active {
+            transform: scale(0.92);
+            box-shadow: 0 2px 8px rgba(0, 122, 255, 0.3);
+        }
+        .ptt-button.recording {
+            background: linear-gradient(145deg, #ff453a, #ff6b6b);
+            color: white;
+            transform: scale(1.05);
+            box-shadow: 0 12px 32px rgba(255, 59, 48, 0.3);
+        }
+        .ptt-button.recording::before {
+            border-color: var(--ios-red);
+            animation: ripple 1s linear infinite;
+        }
+        @keyframes ripple {
+            0% {
+                transform: scale(1);
+                opacity: 1;
+            }
+            100% {
+                transform: scale(1.5);
+                opacity: 0;
+            }
+        }
+        .ptt-button .material-icons {
+            font-size: 40px;
+            user-select: none;
+            -webkit-user-select: none;
+        }
+        .ptt-button span:not(.material-icons) {
+            font-size: 12px;
+            opacity: 0.9;
+        }
+        /* Metrics Grid */
+        .metrics-grid {
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(140px, 1fr));
+            gap: 12px;
+            margin-top: 20px;
+        }
+        .metric-card {
+            background: var(--ios-gray-5);
+            padding: 16px;
+            border-radius: 10px;
+            text-align: center;
+        }
+        .metric-label {
+            font-size: 11px;
+            color: var(--ios-gray);
+            text-transform: uppercase;
+            letter-spacing: 0.5px;
+            margin-bottom: 4px;
+        }
+        .metric-value {
+            font-size: 24px;
+            font-weight: 600;
+            color: var(--ios-blue);
+        }
+        /* TTS Textarea */
+        .tts-textarea {
+            width: 100%;
+            min-height: 120px;
+            padding: 16px;
+            background: var(--ios-gray-5);
+            border: 1px solid var(--ios-gray-4);
+            border-radius: 10px;
+            font-size: 15px;
+            font-family: inherit;
+            resize: vertical;
+        }
+        .tts-textarea:focus {
+            outline: none;
+            border-color: var(--ios-blue);
+        }
+        /* Log Console */
+        .log-container {
+            background: #1c1c1e;
+            border-radius: 10px;
+            padding: 16px;
+            height: 300px;
+            overflow-y: auto;
+            font-family: 'SF Mono', Monaco, monospace;
+            font-size: 12px;
+        }
+        .log-entry {
+            padding: 4px 0;
+            display: flex;
+            align-items: flex-start;
+            color: #e0e0e0;
+        }
+        .log-time {
+            color: #8e8e93;
+            margin-right: 10px;
+            flex-shrink: 0;
+        }
+        .log-entry.error { color: #ff453a; }
+        .log-entry.success { color: #30d158; }
+        .log-entry.info { color: #0a84ff; }
+        .log-entry.warning { color: #ffd60a; }
+        /* Audio Player */
+        .audio-player {
+            display: inline-flex;
+            align-items: center;
+            gap: 8px;
+            margin-left: 8px;
+        }
+        .play-btn {
+            background: var(--ios-blue);
+            color: white;
+            border: none;
+            border-radius: 4px;
+            padding: 4px 8px;
+            cursor: pointer;
+            font-size: 11px;
+        }
+        /* Loading Spinner */
+        .loading-spinner {
+            display: none;
+            width: 40px;
+            height: 40px;
+            border: 3px solid var(--ios-gray-4);
+            border-top-color: var(--ios-blue);
+            border-radius: 50%;
+            animation: spin 1s linear infinite;
+            margin: 20px auto;
+        }
+        .loading-spinner.active {
+            display: block;
+        }
+        @keyframes spin {
+            to { transform: rotate(360deg); }
+        }
+        /* Mobile Styles */
+        @media (max-width: 768px) {
+            .sidebar {
+                position: fixed;
+                left: 0;
+                top: 0;
+                height: 100%;
+                transform: translateX(-100%);
+                z-index: 1000;
+            }
+            .sidebar.open {
+                transform: translateX(0);
+            }
+            .menu-toggle {
+                display: block;
+            }
+            .overlay {
+                display: none;
+                position: fixed;
+                top: 0;
+                left: 0;
+                right: 0;
+                bottom: 0;
+                background: rgba(0, 0, 0, 0.5);
+                z-index: 999;
+            }
+            .overlay.active {
+                display: block;
+            }
+        }
+        /* Settings View */
+        .settings-group {
+            margin-bottom: 24px;
+        }
+        .settings-label {
+            font-size: 13px;
+            color: var(--ios-gray);
+            text-transform: uppercase;
+            letter-spacing: 0.5px;
+            margin-bottom: 12px;
+        }
+        .toggle-switch {
+            display: flex;
+            align-items: center;
+            justify-content: space-between;
+            padding: 12px 0;
+        }
+        .toggle-label {
+            font-size: 15px;
+            color: #000;
+        }
+        .toggle-input {
+            position: relative;
+            width: 51px;
+            height: 31px;
+            background: var(--ios-gray-3);
+            border-radius: 31px;
+            cursor: pointer;
+            transition: background 0.3s;
+        }
+        .toggle-input.checked {
+            background: var(--ios-green);
+        }
+        .toggle-input::after {
+            content: '';
+            position: absolute;
+            width: 27px;
+            height: 27px;
+            border-radius: 50%;
+            background: white;
+            top: 2px;
+            left: 2px;
+            transition: transform 0.3s;
+            box-shadow: 0 2px 4px rgba(0, 0, 0, 0.2);
+        }
+        .toggle-input.checked::after {
+            transform: translateX(20px);
+        }
+    </style>
+</head>
+<body>
+    <!-- Pull to Refresh Indicator -->
+    <div class="pull-to-refresh" id="pullToRefresh">
+        <div class="pull-to-refresh-spinner"></div>
+        <span class="pull-to-refresh-text">Refreshing...</span>
+    </div>
+    <div class="app-container">
+        <!-- Sidebar -->
+        <nav class="sidebar" id="sidebar">
+            <div class="sidebar-header">
+                <div class="app-title">
+                    <span class="material-icons">smart_toy</span>
+                    Ultravox AI
+                </div>
+                <div class="app-subtitle">Voice Assistant</div>
+            </div>
+            <div class="nav-menu">
+                <a class="nav-item active" data-view="push-to-talk">
+                    <span class="material-icons">mic</span>
+                    Push to Talk
+                    <span class="nav-badge" id="pttBadge" style="display: none;">Live</span>
+                </a>
+                <a class="nav-item" data-view="text-to-speech">
+                    <span class="material-icons">record_voice_over</span>
+                    Text to Speech
+                </a>
+                <a class="nav-item" data-view="logs">
+                    <span class="material-icons">terminal</span>
+                    Console Logs
+                </a>
+                <a class="nav-item" data-view="settings">
+                    <span class="material-icons">settings</span>
+                    Settings
+                </a>
+            </div>
+        </nav>
+        <!-- Overlay for mobile -->
+        <div class="overlay" id="overlay"></div>
+        <!-- Main Content -->
+        <main class="main-content">
+            <!-- Header -->
+            <header class="header">
+                <button class="menu-toggle" id="menuToggle">
+                    <span class="material-icons">menu</span>
+                </button>
+                <h1 class="header-title" id="headerTitle">Push to Talk</h1>
+                <div class="connection-status">
+                    <span class="status-dot" id="statusDot"></span>
+                    <span id="statusText">Disconnected</span>
+                </div>
+            </header>
+            <!-- Push to Talk View -->
+            <div class="view-container active" id="push-to-talk">
+                <div class="ptt-container">
+                    <!-- Single Clean Card -->
+                    <div style="background: rgba(255, 255, 255, 0.98); backdrop-filter: blur(20px); border-radius: 24px; padding: 32px; box-shadow: 0 20px 40px rgba(0, 0, 0, 0.1);">
+                        <!-- Connection Message -->
+                        <div id="connectionMessage" style="background: linear-gradient(145deg, #FFF4E6, #FFF9F0); border: 1px solid #FFD700; border-radius: 12px; padding: 16px; margin-bottom: 24px; text-align: center; display: block;">
+                            <span class="material-icons" style="color: var(--ios-orange); font-size: 24px; margin-bottom: 8px; display: block;">info</span>
+                            <p style="margin: 0; color: #333; font-size: 14px; font-weight: 500;">Connect to start using voice assistant</p>
+                            <p style="margin: 4px 0 0 0; color: var(--ios-gray); font-size: 12px;">Click the connect button below to begin</p>
+                        </div>
+                        <!-- Voice Selector at Top -->
+                        <div style="text-align: center; margin-bottom: 32px;">
+                            <select class="voice-selector" id="quickVoiceSelect" disabled style="background: linear-gradient(145deg, #f0f0f5, #ffffff); border: none; padding: 12px 24px; font-size: 14px; font-weight: 500; border-radius: 12px; width: auto; min-width: 180px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05); opacity: 0.5; cursor: not-allowed;">
+                                <option value="pf_dora" selected>🇧🇷 Portuguese Female</option>
+                                <option value="pm_alex">🇧🇷 Portuguese Male</option>
+                                <option value="af_bella">🇺🇸 English Female</option>
+                                <option value="am_adam">🇺🇸 English Male</option>
+                            </select>
+                        </div>
+                        <!-- Main Button Area -->
+                        <div class="ptt-main-section" style="margin-bottom: 32px;">
+                            <button class="ptt-button" id="talkBtn" disabled style="opacity: 0.3; cursor: not-allowed;">
+                                <span class="material-icons">mic_off</span>
+                                <span style="font-size: 11px; text-transform: uppercase; letter-spacing: 1px;">Offline</span>
+                            </button>
+                            <button class="ios-button" id="connectBtn" style="background: linear-gradient(145deg, #34C759, #30D158); width: 200px; padding: 16px; font-size: 16px; border-radius: 16px; margin-top: 24px; box-shadow: 0 6px 20px rgba(52, 199, 89, 0.3); font-weight: 600;">
+                                <span class="material-icons" style="font-size: 22px;">wifi</span>
+                                Connect Now
+                            </button>
+                        </div>
+                        <!-- Inline Metrics -->
+                        <div style="background: linear-gradient(145deg, #f8f9fa, #ffffff); border-radius: 16px; padding: 20px; margin-bottom: 24px;">
+                            <div style="display: flex; justify-content: space-around; text-align: center;">
+                                <div>
+                                    <div style="font-size: 20px; font-weight: 700; color: var(--ios-blue);" id="sentBytes">0</div>
+                                    <div style="font-size: 10px; color: var(--ios-gray); text-transform: uppercase; margin-top: 4px;">KB Sent</div>
+                                </div>
+                                <div style="width: 1px; background: var(--ios-gray-4);"></div>
+                                <div>
+                                    <div style="font-size: 20px; font-weight: 700; color: var(--ios-green);" id="receivedBytes">0</div>
+                                    <div style="font-size: 10px; color: var(--ios-gray); text-transform: uppercase; margin-top: 4px;">KB Received</div>
+                                </div>
+                                <div style="width: 1px; background: var(--ios-gray-4);"></div>
+                                <div>
+                                    <div style="font-size: 20px; font-weight: 700; color: var(--ios-orange);" id="latency">--</div>
+                                    <div style="font-size: 10px; color: var(--ios-gray); text-transform: uppercase; margin-top: 4px;">MS Latency</div>
+                                </div>
+                            </div>
+                        </div>
+                        <!-- Messages Area -->
+                        <div style="margin-top: 20px;">
+                            <div style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 10px;">
+                                <h4 style="color: var(--ios-gray); font-size: 14px; margin: 0;">📝 Conversation History</h4>
+                                <button onclick="clearMessages()" style="background: linear-gradient(145deg, #ff453a, #ff6b6b); color: white; border: none; padding: 6px 12px; border-radius: 8px; font-size: 12px; cursor: pointer;">Clear</button>
+                            </div>
+                            <div id="messagesContainer" style="background: white; border-radius: 12px; padding: 12px; height: 200px; overflow-y: auto; border: 1px solid var(--ios-gray-4); box-shadow: inset 0 2px 5px rgba(0, 0, 0, 0.05);">
+                                <div id="messagesList" style="font-size: 13px; color: var(--ios-gray);">
+                                    <p style="margin: 0; text-align: center; color: var(--ios-gray-3);">No messages yet. Connect and start talking!</p>
+                                </div>
+                            </div>
+                        </div>
+                        <!-- Status Line -->
+                        <div style="text-align: center; margin-top: 15px;">
+                            <div id="recentActivity" style="font-size: 12px; color: var(--ios-gray); padding: 8px; background: rgba(0, 0, 0, 0.02); border-radius: 8px; min-height: 30px; display: flex; align-items: center; justify-content: center;">
+                                <p style="margin: 0;">Ready to connect</p>
+                            </div>
+                        </div>
+                        <!-- Processing Indicator -->
+                        <div id="processingIndicator" style="display: none; margin-top: 20px; text-align: center;">
+                            <div style="background: linear-gradient(145deg, #E8F4FD, #F0F8FF); border-radius: 12px; padding: 16px; display: inline-flex; align-items: center; gap: 12px;">
+                                <div class="processing-spinner" style="width: 24px; height: 24px; border: 3px solid var(--ios-gray-3); border-top-color: var(--ios-blue); border-radius: 50%; animation: spin 1s linear infinite;"></div>
+                                <span style="color: var(--ios-blue); font-size: 14px; font-weight: 500;">Processing audio...</span>
+                            </div>
+                        </div>
+                        <!-- Audio Replay Button (Hidden by default) -->
+                        <div id="audioReplayContainer" style="display: none; margin-top: 20px; text-align: center;">
+                            <button id="replayAudioBtn" class="ios-button" style="background: linear-gradient(145deg, #34C759, #30D158); padding: 12px 24px; font-size: 14px; border-radius: 12px; box-shadow: 0 4px 12px rgba(52, 199, 89, 0.2);">
+                                <span class="material-icons" style="font-size: 18px;">replay</span>
+                                Replay Last Audio
+                            </button>
+                        </div>
+                    </div>
+                </div>
+            </div>
+            <!-- Text to Speech View -->
+            <div class="view-container" id="text-to-speech">
+                <div class="ios-card">
+                    <h2 class="card-title">Text to Speech</h2>
+                    <div style="margin-bottom: 16px;">
+                        <textarea class="tts-textarea" id="ttsText" placeholder="Enter text to convert to speech...">Olá! Este é um teste de voz.</textarea>
+                    </div>
+                    <div style="margin-bottom: 16px;">
+                        <select class="voice-selector" id="voiceSelect">
+                            <optgroup label="Portuguese">
+                                <option value="pf_dora" selected>Female - Dora</option>
+                                <option value="pm_alex">Male - Alex</option>
+                                <option value="pm_santa">Male - Santa</option>
+                            </optgroup>
+                            <optgroup label="English">
+                                <option value="af_bella">Female - Bella</option>
+                                <option value="af_heart">Female - Heart</option>
+                                <option value="am_adam">Male - Adam</option>
+                            </optgroup>
+                        </select>
+                    </div>
+                    <button class="ios-button success" id="ttsPlayBtn" disabled>
+                        <span class="material-icons">play_arrow</span>
+                        Generate Audio
+                    </button>
+                    <div class="loading-spinner" id="ttsLoader"></div>
+                    <div id="ttsPlayer" style="display: none; margin-top: 16px;">
+                        <audio id="ttsAudio" controls style="width: 100%;"></audio>
+                    </div>
+                </div>
+            </div>
+            <!-- Logs View -->
+            <div class="view-container" id="logs">
+                <div class="ios-card">
+                    <h2 class="card-title">Console Output</h2>
+                    <div class="log-container" id="log"></div>
+                    <div style="margin-top: 16px; display: flex; gap: 12px;">
+                        <button class="ios-button" style="background: linear-gradient(145deg, #007AFF, #0051D5);" onclick="copyAllLogs()">
+                            <span class="material-icons">content_copy</span>
+                            Copy All Logs
+                        </button>
+                        <button class="ios-button secondary" onclick="document.getElementById('log').innerHTML = ''; log('Console cleared', 'info');">
+                            <span class="material-icons">clear_all</span>
+                            Clear Logs
+                        </button>
+                    </div>
+                </div>
+            </div>
+            <!-- Settings View -->
+            <div class="view-container" id="settings">
+                <div class="ios-card">
+                    <h2 class="card-title">Voice Settings</h2>
+                    <div class="settings-group">
+                        <div class="settings-label">Default Voice</div>
+                        <select class="voice-selector" id="settingsVoiceSelect">
+                            <optgroup label="Portuguese">
+                                <option value="pf_dora" selected>Female - Dora</option>
+                                <option value="pm_alex">Male - Alex</option>
+                                <option value="pm_santa">Male - Santa</option>
+                            </optgroup>
+                        </select>
+                    </div>
+                    <div class="settings-group">
+                        <div class="settings-label">Audio Settings</div>
+                        <div class="toggle-switch">
+                            <span class="toggle-label">Auto-play responses</span>
+                            <div class="toggle-input checked" id="autoplayToggle"></div>
+                        </div>
+                        <div class="toggle-switch">
+                            <span class="toggle-label">Echo cancellation</span>
+                            <div class="toggle-input checked" id="echoCancelToggle"></div>
+                        </div>
+                        <div class="toggle-switch">
+                            <span class="toggle-label">Noise suppression</span>
+                            <div class="toggle-input checked" id="noiseToggle"></div>
+                        </div>
+                    </div>
+                </div>
+                <div class="ios-card">
+                    <h2 class="card-title">About</h2>
+                    <p style="color: var(--ios-gray); font-size: 14px; line-height: 1.6;">
+                        Ultravox AI Assistant v1.0<br>
+                        Powered by advanced speech recognition and synthesis.<br>
+                        <br>
+                        Format: PCM 16-bit @ 24kHz<br>
+                        Protocol: WebSocket + gRPC
+                    </p>
+                </div>
+            </div>
+        </main>
+    </div>
+    <!-- Hidden selects for compatibility -->
+    <select id="ttsVoiceSelect" style="display: none;">
+        <option value="pf_dora" selected>pf_dora</option>
+        <option value="pm_alex">pm_alex</option>
+        <option value="pm_santa">pm_santa</option>
+        <option value="af_bella">af_bella</option>
+        <option value="af_heart">af_heart</option>
+        <option value="am_adam">am_adam</option>
+    </select>
+    <script>
+        // Navigation
+        const navItems = document.querySelectorAll('.nav-item');
+        const viewContainers = document.querySelectorAll('.view-container');
+        const headerTitle = document.getElementById('headerTitle');
+        const sidebar = document.getElementById('sidebar');
+        const overlay = document.getElementById('overlay');
+        const menuToggle = document.getElementById('menuToggle');
+        // Handle navigation
+        navItems.forEach(item => {
+            item.addEventListener('click', (e) => {
+                e.preventDefault();
+                const viewId = item.dataset.view;
+                // Update active nav
+                navItems.forEach(nav => nav.classList.remove('active'));
+                item.classList.add('active');
+                // Update active view
+                viewContainers.forEach(view => view.classList.remove('active'));
+                document.getElementById(viewId).classList.add('active');
+                // Update header title
+                headerTitle.textContent = item.textContent.trim();
+                // Close mobile menu
+                if (window.innerWidth <= 768) {
+                    sidebar.classList.remove('open');
+                    overlay.classList.remove('active');
+                }
+            });
+        });
+        // Mobile menu toggle
+        menuToggle.addEventListener('click', () => {
+            sidebar.classList.toggle('open');
+            overlay.classList.toggle('active');
+        });
+        overlay.addEventListener('click', () => {
+            sidebar.classList.remove('open');
+            overlay.classList.remove('active');
+        });
+        // Toggle switches
+        document.querySelectorAll('.toggle-input').forEach(toggle => {
+            toggle.addEventListener('click', () => {
+                toggle.classList.toggle('checked');
+            });
+        });
+        // Sync voice selectors
+        const voiceSelects = [
+            document.getElementById('voiceSelect'),
+            document.getElementById('settingsVoiceSelect'),
+            document.getElementById('quickVoiceSelect')
+        ];
+        voiceSelects.forEach(select => {
+            if (select) {
+                select.addEventListener('change', () => {
+                    const value = select.value;
+                    voiceSelects.forEach(s => {
+                        if (s) s.value = value;
+                    });
+                    document.getElementById('ttsVoiceSelect').value = value;
+                    document.getElementById('currentVoice').textContent = value.split('_')[1] || value;
+                    // Update recent activity
+                    const recentActivity = document.getElementById('recentActivity');
+                    if (recentActivity) {
+                        const time = new Date().toLocaleTimeString('pt-BR', { hour: '2-digit', minute: '2-digit' });
+                        recentActivity.innerHTML = `<p style="margin: 0; color: var(--ios-blue);">${time} - Voice changed to ${value}</p>` + recentActivity.innerHTML;
+                    }
+                    if (ws && ws.readyState === WebSocket.OPEN) {
+                        ws.send(JSON.stringify({
+                            type: 'set-voice',
+                            voice_id: value
+                        }));
+                        log(`Voice changed to: ${value}`, 'info');
+                    }
+                });
+            }
+        });
+        // ========= ORIGINAL WEBSOCKET AND AUDIO CODE =========
+        // Estado da aplicação
+        let ws = null;
+        let isConnected = false;
+        let isRecording = false;
+        let audioContext = null;
+        let stream = null;
+        let audioSource = null;
+        let audioProcessor = null;
+        let pcmBuffer = [];
+        // Métricas
+        const metrics = {
+            sentBytes: 0,
+            receivedBytes: 0,
+            latency: 0,
+            recordingStartTime: 0
+        };
+        // Elementos DOM
+        const elements = {
+            statusDot: document.getElementById('statusDot'),
+            statusText: document.getElementById('statusText'),
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            sentBytes: document.getElementById('sentBytes'),
+            receivedBytes: document.getElementById('receivedBytes'),
+            latency: document.getElementById('latency'),
+            log: document.getElementById('log'),
+            // TTS elements
+            ttsText: document.getElementById('ttsText'),
+            ttsVoiceSelect: document.getElementById('ttsVoiceSelect'),
+            ttsPlayBtn: document.getElementById('ttsPlayBtn'),
+            ttsLoader: document.getElementById('ttsLoader'),
+            ttsPlayer: document.getElementById('ttsPlayer'),
+            ttsAudio: document.getElementById('ttsAudio')
+        };
+        // Log no console visual
+        function log(message, type = 'info') {
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = `log-entry ${type}`;
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">${message}</span>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            console.log(`[${type}] ${message}`);
+            // Update recent activity in Push to Talk view
+            const recentActivity = document.getElementById('recentActivity');
+            if (recentActivity && (type === 'success' || type === 'info')) {
+                const shortTime = new Date().toLocaleTimeString('pt-BR', { hour: '2-digit', minute: '2-digit' });
+                const color = type === 'success' ? 'var(--ios-green)' : 'var(--ios-gray)';
+                const shortMessage = message.length > 50 ? message.substring(0, 50) + '...' : message;
+                recentActivity.innerHTML = `<p style="margin: 0; color: ${color};">${shortTime} - ${shortMessage}</p>`;
+            }
+        }
+        // Atualizar métricas
+        function updateMetrics() {
+            elements.sentBytes.textContent = `${(metrics.sentBytes / 1024).toFixed(1)}`;
+            elements.receivedBytes.textContent = `${(metrics.receivedBytes / 1024).toFixed(1)}`;
+            elements.latency.textContent = `${metrics.latency}`;
+        }
+        // Conectar ao WebSocket
+        async function connect() {
+            try {
+                // Solicitar acesso ao microfone
+                stream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        sampleRate: 24000
+                    }
+                });
+                log('Microphone accessed', 'success');
+                // Conectar WebSocket
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${protocol}//${window.location.host}/ws`;
+                ws = new WebSocket(wsUrl);
+                ws.binaryType = 'arraybuffer';
+                ws.onopen = () => {
+                    isConnected = true;
+                    elements.statusDot.classList.add('connected');
+                    elements.statusText.textContent = 'Connected';
+                    elements.connectBtn.innerHTML = '<span class="material-icons">power_settings_new</span>Disconnect';
+                    elements.connectBtn.style.background = 'linear-gradient(145deg, #FF3B30, #FF453A)';
+                    elements.talkBtn.disabled = false;
+                    elements.talkBtn.style.opacity = '1';
+                    elements.talkBtn.style.cursor = 'pointer';
+                    elements.talkBtn.innerHTML = '<span class="material-icons">mic</span><span style="font-size: 11px; text-transform: uppercase; letter-spacing: 1px;">Hold</span>';
+                    document.getElementById('pttBadge').style.display = 'block';
+                    // Enable voice selector
+                    const quickVoiceSelect = document.getElementById('quickVoiceSelect');
+                    if (quickVoiceSelect) {
+                        quickVoiceSelect.disabled = false;
+                        quickVoiceSelect.style.opacity = '1';
+                        quickVoiceSelect.style.cursor = 'pointer';
+                    }
+                    // Hide connection message
+                    const connectionMessage = document.getElementById('connectionMessage');
+                    if (connectionMessage) {
+                        connectionMessage.style.display = 'none';
+                    }
+                    // Enviar voz selecionada
+                    const currentVoice = elements.voiceSelect.value || 'pf_dora';
+                    ws.send(JSON.stringify({
+                        type: 'set-voice',
+                        voice_id: currentVoice
+                    }));
+                    elements.ttsPlayBtn.disabled = false;
+                    log('Connected to server', 'success');
+                };
+                ws.onmessage = (event) => {
+                    if (event.data instanceof ArrayBuffer) {
+                        handlePCMAudio(event.data);
+                    } else {
+                        const data = JSON.parse(event.data);
+                        handleMessage(data);
+                    }
+                };
+                ws.onerror = (error) => {
+                    log(`WebSocket error: ${error}`, 'error');
+                };
+                ws.onclose = () => {
+                    disconnect();
+                };
+            } catch (error) {
+                log(`Connection error: ${error.message}`, 'error');
+            }
+        }
+        // Desconectar
+        function disconnect() {
+            isConnected = false;
+            if (ws) {
+                ws.close();
+                ws = null;
+            }
+            if (stream) {
+                stream.getTracks().forEach(track => track.stop());
+                stream = null;
+            }
+            if (audioContext) {
+                audioContext.close();
+                audioContext = null;
+            }
+            elements.statusDot.classList.remove('connected');
+            elements.statusText.textContent = 'Disconnected';
+            elements.connectBtn.innerHTML = '<span class="material-icons">wifi</span>Connect Now';
+            elements.connectBtn.style.background = 'linear-gradient(145deg, #34C759, #30D158)';
+            elements.talkBtn.disabled = true;
+            elements.talkBtn.style.opacity = '0.3';
+            elements.talkBtn.style.cursor = 'not-allowed';
+            elements.talkBtn.innerHTML = '<span class="material-icons">mic_off</span><span style="font-size: 11px; text-transform: uppercase; letter-spacing: 1px;">Offline</span>';
+            document.getElementById('pttBadge').style.display = 'none';
+            // Disable voice selector
+            const quickVoiceSelect = document.getElementById('quickVoiceSelect');
+            if (quickVoiceSelect) {
+                quickVoiceSelect.disabled = true;
+                quickVoiceSelect.style.opacity = '0.5';
+                quickVoiceSelect.style.cursor = 'not-allowed';
+            }
+            // Show connection message
+            const connectionMessage = document.getElementById('connectionMessage');
+            if (connectionMessage) {
+                connectionMessage.style.display = 'block';
+            }
+            // Hide replay button
+            const audioReplayContainer = document.getElementById('audioReplayContainer');
+            if (audioReplayContainer) {
+                audioReplayContainer.style.display = 'none';
+            }
+            log('Disconnected', 'warning');
+        }
+        // Variáveis para MediaRecorder
+        let mediaRecorder = null;
+        let audioChunks = [];
+        // Iniciar gravação com PCM (Opus desabilitado temporariamente)
+        function startRecording() {
+            if (isRecording) return;
+            isRecording = true;
+            audioChunks = [];
+            pcmBuffer = [];
+            metrics.recordingStartTime = Date.now();
+            elements.talkBtn.classList.add('recording');
+            elements.talkBtn.innerHTML = '<span class="material-icons">stop</span><span>Recording</span>';
+            // FORÇAR USO DE PCM - Opus está com problemas no servidor
+            const usingOpus = false;
+            // Usar apenas PCM
+            if (!usingOpus) {
+                if (!audioContext) {
+                    audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                        sampleRate: 24000
+                    });
+                }
+                audioSource = audioContext.createMediaStreamSource(stream);
+                audioProcessor = audioContext.createScriptProcessor(4096, 1, 1);
+                audioProcessor.onaudioprocess = (e) => {
+                    if (!isRecording) return;
+                    const inputData = e.inputBuffer.getChannelData(0);
+                    // Calculate RMS
+                    let sumSquares = 0;
+                    for (let i = 0; i < inputData.length; i++) {
+                        sumSquares += inputData[i] * inputData[i];
+                    }
+                    const rms = Math.sqrt(sumSquares / inputData.length);
+                    const voiceThreshold = 0.01;
+                    const hasVoice = rms > voiceThreshold;
+                    let gain = 1.0;
+                    if (hasVoice && rms < 0.05) {
+                        gain = Math.min(5.0, 0.05 / rms);
+                    }
+                    // Convert to PCM
+                    const pcmData = new Int16Array(inputData.length);
+                    for (let i = 0; i < inputData.length; i++) {
+                        let sample = inputData[i] * gain;
+                        if (Math.abs(sample) > 0.95) {
+                            sample = Math.sign(sample) * (0.95 + 0.05 * Math.tanh((Math.abs(sample) - 0.95) * 10));
+                        }
+                        sample = Math.max(-1, Math.min(1, sample));
+                        pcmData[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
+                    }
+                    if (hasVoice) {
+                        pcmBuffer.push(pcmData);
+                    }
+                };
+                audioSource.connect(audioProcessor);
+                audioProcessor.connect(audioContext.destination);
+                log('Recording with PCM 16-bit @ 24kHz', 'info');
+            }
+        }
+        // Enviar áudio Opus para o servidor
+        async function sendOpusAudioToServer(audioBlob) {
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                log('WebSocket not connected', 'error');
+                return;
+            }
+            // Show processing indicator
+            const processingIndicator = document.getElementById('processingIndicator');
+            if (processingIndicator) {
+                processingIndicator.style.display = 'block';
+            }
+            // Update recent activity
+            const recentActivity = document.getElementById('recentActivity');
+            if (recentActivity) {
+                recentActivity.innerHTML = '<p style="margin: 0; color: var(--ios-blue);">⏳ Sending Opus audio to server...</p>';
+            }
+            try {
+                // Converter Blob para ArrayBuffer
+                const arrayBuffer = await audioBlob.arrayBuffer();
+                const uint8Array = new Uint8Array(arrayBuffer);
+                // Criar header para Opus (similar ao PCM mas com tipo diferente)
+                const header = new ArrayBuffer(8);
+                const view = new DataView(header);
+                view.setUint32(0, 0x4F505553); // 'OPUS' em hex
+                view.setUint32(4, uint8Array.length);
+                // Enviar header e dados
+                ws.send(header);
+                ws.send(uint8Array);
+                // Update metrics
+                metrics.totalBytesSent += uint8Array.length;
+                updateMetrics();
+                log(`Sent Opus audio: ${(uint8Array.length / 1024).toFixed(2)} KB`, 'info');
+            } catch (error) {
+                log('Error sending Opus audio: ' + error.message, 'error');
+                console.error('Opus send error:', error);
+            }
+        }
+        // Parar gravação
+        function stopRecording() {
+            if (!isRecording) return;
+            isRecording = false;
+            elements.talkBtn.classList.remove('recording');
+            elements.talkBtn.innerHTML = '<span class="material-icons">mic</span><span>Hold</span>';
+            // Sempre usar PCM
+            if (audioProcessor) {
+                audioProcessor.disconnect();
+                audioProcessor = null;
+            }
+            if (audioSource) {
+                audioSource.disconnect();
+                audioSource = null;
+            }
+            if (pcmBuffer.length === 0) {
+                log('No audio captured', 'warning');
+                return;
+            }
+            // Combine PCM chunks
+            const totalLength = pcmBuffer.reduce((acc, chunk) => acc + chunk.length, 0);
+            const fullPCM = new Int16Array(totalLength);
+            let offset = 0;
+            for (const chunk of pcmBuffer) {
+                fullPCM.set(chunk, offset);
+                offset += chunk.length;
+            }
+            // Send PCM
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                // Show processing indicator
+                const processingIndicator = document.getElementById('processingIndicator');
+                if (processingIndicator) {
+                    processingIndicator.style.display = 'block';
+                }
+                // Update recent activity
+                const recentActivity = document.getElementById('recentActivity');
+                if (recentActivity) {
+                    recentActivity.innerHTML = '<p style="margin: 0; color: var(--ios-blue);">⏳ Sending audio to server...</p>';
+                }
+                const header = new ArrayBuffer(8);
+                const view = new DataView(header);
+                view.setUint32(0, 0x50434D16);
+                view.setUint32(4, fullPCM.length * 2);
+                ws.send(header);
+                ws.send(fullPCM.buffer);
+                metrics.sentBytes += fullPCM.length * 2;
+                updateMetrics();
+                log(`PCM sent: ${(fullPCM.length * 2 / 1024).toFixed(1)}KB`, 'success');
+            }
+            pcmBuffer = [];
+        }
+        // Process messages
+        function handleMessage(data) {
+            switch (data.type) {
+                case 'metrics':
+                    metrics.latency = data.latency;
+                    updateMetrics();
+                    // Hide processing indicator when we get metrics (means processing is done)
+                    const processingIndicator = document.getElementById('processingIndicator');
+                    if (processingIndicator) {
+                        processingIndicator.style.display = 'none';
+                    }
+                    // Update recent activity with response
+                    const recentActivity = document.getElementById('recentActivity');
+                    if (recentActivity) {
+                        recentActivity.innerHTML = `<p style="margin: 0; color: var(--ios-green);">✅ Response received (${data.latency}ms)</p>`;
+                    }
+                    // Add to messages container
+                    const messagesList = document.getElementById('messagesList');
+                    if (messagesList) {
+                        // Clear initial message if it's the first message
+                        if (messagesList.innerHTML.includes('No messages yet')) {
+                            messagesList.innerHTML = '';
+                        }
+                        // Add user message (audio)
+                        const userMsg = document.createElement('div');
+                        userMsg.style.cssText = 'margin-bottom: 10px; padding: 8px; background: linear-gradient(145deg, #007AFF, #0051D5); border-radius: 12px; color: white; word-wrap: break-word;';
+                        userMsg.innerHTML = `<strong>🎵 You:</strong> [Audio message sent]`;
+                        messagesList.appendChild(userMsg);
+                        // Add assistant response (full message)
+                        const assistantMsg = document.createElement('div');
+                        assistantMsg.style.cssText = 'margin-bottom: 10px; padding: 8px; background: rgba(52, 199, 89, 0.1); border-radius: 12px; color: #333; word-wrap: break-word;';
+                        assistantMsg.innerHTML = `<strong>🤖 Assistant:</strong> ${data.response}`;
+                        messagesList.appendChild(assistantMsg);
+                        // Add timestamp
+                        const timestamp = document.createElement('div');
+                        timestamp.style.cssText = 'font-size: 11px; color: var(--ios-gray-3); text-align: right; margin-bottom: 5px;';
+                        timestamp.innerHTML = new Date().toLocaleTimeString('pt-BR', { hour: '2-digit', minute: '2-digit', second: '2-digit' });
+                        messagesList.appendChild(timestamp);
+                        // Scroll to bottom
+                        const container = document.getElementById('messagesContainer');
+                        if (container) {
+                            container.scrollTop = container.scrollHeight;
+                        }
+                    }
+                    log(`Response: "${data.response}" (${data.latency}ms)`, 'success');
+                    break;
+                case 'error':
+                    log(`Error: ${data.message}`, 'error');
+                    break;
+                case 'tts-response':
+                    if (data.audio) {
+                        const binaryString = atob(data.audio);
+                        const bytes = new Uint8Array(binaryString.length);
+                        for (let i = 0; i < binaryString.length; i++) {
+                            bytes[i] = binaryString.charCodeAt(i);
+                        }
+                        const sampleRate = data.sampleRate || 24000;
+                        const wavBuffer = addWavHeader(bytes.buffer, sampleRate);
+                        const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+                        const audioUrl = URL.createObjectURL(blob);
+                        elements.ttsAudio.src = audioUrl;
+                        elements.ttsPlayer.style.display = 'block';
+                        elements.ttsLoader.classList.remove('active');
+                        elements.ttsPlayBtn.disabled = false;
+                        elements.ttsPlayBtn.innerHTML = '<span class="material-icons">play_arrow</span>Generate Audio';
+                        log('TTS audio generated', 'success');
+                    }
+                    break;
+            }
+        }
+        // Global variable to store last audio URL
+        let lastAudioUrl = null;
+        // Handle PCM audio
+        function handlePCMAudio(arrayBuffer) {
+            metrics.receivedBytes += arrayBuffer.byteLength;
+            updateMetrics();
+            // Hide processing indicator
+            const processingIndicator = document.getElementById('processingIndicator');
+            if (processingIndicator) {
+                processingIndicator.style.display = 'none';
+            }
+            const wavBuffer = addWavHeader(arrayBuffer);
+            const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+            const audioUrl = URL.createObjectURL(blob);
+            // Store last audio URL
+            lastAudioUrl = audioUrl;
+            // Show replay button
+            const replayContainer = document.getElementById('audioReplayContainer');
+            if (replayContainer) {
+                replayContainer.style.display = 'block';
+            }
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = 'log-entry success';
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">🔊 Audio received: ${(arrayBuffer.byteLength / 1024).toFixed(1)}KB</span>
+                <div class="audio-player">
+                    <button class="play-btn" onclick="playAudio('${audioUrl}')">▶️ Play</button>
+                </div>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            // Update recent activity
+            const recentActivity = document.getElementById('recentActivity');
+            if (recentActivity) {
+                recentActivity.innerHTML = `<p style="margin: 0; color: var(--ios-green);">✅ Response received - ${(arrayBuffer.byteLength / 1024).toFixed(1)}KB</p>`;
+            }
+            // Always try to auto-play
+            const audio = new Audio(audioUrl);
+            audio.play().then(() => {
+                console.log('Audio playing automatically');
+            }).catch(err => {
+                console.log('Auto-play blocked, use replay button');
+                // Flash the replay button to draw attention
+                const replayBtn = document.getElementById('replayAudioBtn');
+                if (replayBtn) {
+                    replayBtn.style.animation = 'pulse 1s ease-in-out 2';
+                    setTimeout(() => {
+                        replayBtn.style.animation = '';
+                    }, 2000);
+                }
+            });
+        }
+        // Play audio
+        function playAudio(url) {
+            const audio = new Audio(url);
+            audio.play();
+        }
+        // Add WAV header
+        function addWavHeader(pcmBuffer, customSampleRate) {
+            const pcmData = new Uint8Array(pcmBuffer);
+            const wavBuffer = new ArrayBuffer(44 + pcmData.length);
+            const view = new DataView(wavBuffer);
+            const writeString = (offset, string) => {
+                for (let i = 0; i < string.length; i++) {
+                    view.setUint8(offset + i, string.charCodeAt(i));
+                }
+            };
+            writeString(0, 'RIFF');
+            view.setUint32(4, 36 + pcmData.length, true);
+            writeString(8, 'WAVE');
+            writeString(12, 'fmt ');
+            view.setUint32(16, 16, true);
+            view.setUint16(20, 1, true);
+            view.setUint16(22, 1, true);
+            const sampleRate = customSampleRate || 24000;
+            view.setUint32(24, sampleRate, true);
+            view.setUint32(28, sampleRate * 2, true);
+            view.setUint16(32, 2, true);
+            view.setUint16(34, 16, true);
+            writeString(36, 'data');
+            view.setUint32(40, pcmData.length, true);
+            new Uint8Array(wavBuffer, 44).set(pcmData);
+            return wavBuffer;
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', () => {
+            if (isConnected) {
+                disconnect();
+            } else {
+                connect();
+            }
+        });
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', stopRecording);
+        elements.talkBtn.addEventListener('touchstart', startRecording);
+        elements.talkBtn.addEventListener('touchend', stopRecording);
+        // TTS Button
+        elements.ttsPlayBtn.addEventListener('click', (e) => {
+            e.preventDefault();
+            const text = elements.ttsText.value.trim();
+            const voice = elements.ttsVoiceSelect.value;
+            if (!text) {
+                alert('Please enter some text');
+                return;
+            }
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                alert('Please connect first');
+                return;
+            }
+            elements.ttsLoader.classList.add('active');
+            elements.ttsPlayBtn.disabled = true;
+            elements.ttsPlayBtn.innerHTML = '<span class="material-icons">hourglass_empty</span>Processing...';
+            elements.ttsPlayer.style.display = 'none';
+            ws.send(JSON.stringify({
+                type: 'text-to-speech',
+                text: text,
+                voice_id: voice,
+                quality: 'high',
+                format: 'opus'
+            }));
+            log(`TTS requested: voice=${voice}`, 'info');
+        });
+        // Replay button event listener
+        const replayBtn = document.getElementById('replayAudioBtn');
+        if (replayBtn) {
+            replayBtn.addEventListener('click', () => {
+                if (lastAudioUrl) {
+                    const audio = new Audio(lastAudioUrl);
+                    audio.play().then(() => {
+                        log('Replaying last audio', 'info');
+                    }).catch(err => {
+                        log('Error playing audio', 'error');
+                    });
+                } else {
+                    log('No audio to replay', 'warning');
+                }
+            });
+        }
+        // Pull to Refresh Implementation
+        let startY = 0;
+        let pullDistance = 0;
+        let isPulling = false;
+        const pullThreshold = 100;
+        const pullToRefreshEl = document.getElementById('pullToRefresh');
+        // Touch events for pull to refresh
+        document.addEventListener('touchstart', (e) => {
+            if (window.scrollY === 0) {
+                startY = e.touches[0].pageY;
+                isPulling = true;
+            }
+        }, { passive: true });
+        document.addEventListener('touchmove', (e) => {
+            if (!isPulling) return;
+            const currentY = e.touches[0].pageY;
+            pullDistance = currentY - startY;
+            if (pullDistance > 0 && window.scrollY === 0) {
+                e.preventDefault();
+                // Show pull to refresh indicator
+                if (pullDistance > 20) {
+                    pullToRefreshEl.classList.add('show');
+                    // Update text based on pull distance
+                    const pullText = pullToRefreshEl.querySelector('.pull-to-refresh-text');
+                    if (pullDistance > pullThreshold) {
+                        pullText.textContent = 'Release to refresh';
+                    } else {
+                        pullText.textContent = 'Pull to refresh';
+                    }
+                    // Apply transform based on pull distance (with resistance)
+                    const resistance = Math.min(pullDistance / 3, 60);
+                    pullToRefreshEl.style.transform = `translateY(${60 + resistance}px)`;
+                }
+            }
+        }, { passive: false });
+        document.addEventListener('touchend', () => {
+            if (!isPulling) return;
+            if (pullDistance > pullThreshold) {
+                // Trigger refresh
+                pullToRefreshEl.classList.add('refreshing');
+                pullToRefreshEl.querySelector('.pull-to-refresh-text').textContent = 'Refreshing...';
+                // Reload page after animation
+                setTimeout(() => {
+                    window.location.reload();
+                }, 1000);
+            } else {
+                // Hide pull to refresh
+                pullToRefreshEl.classList.remove('show');
+                pullToRefreshEl.style.transform = '';
+            }
+            isPulling = false;
+            pullDistance = 0;
+        });
+        // Mouse events for desktop testing
+        let mouseDown = false;
+        let mouseStartY = 0;
+        document.addEventListener('mousedown', (e) => {
+            if (window.scrollY === 0) {
+                mouseStartY = e.pageY;
+                mouseDown = true;
+            }
+        });
+        document.addEventListener('mousemove', (e) => {
+            if (!mouseDown) return;
+            const currentY = e.pageY;
+            const distance = currentY - mouseStartY;
+            if (distance > 0 && window.scrollY === 0) {
+                e.preventDefault();
+                if (distance > 20) {
+                    pullToRefreshEl.classList.add('show');
+                    const pullText = pullToRefreshEl.querySelector('.pull-to-refresh-text');
+                    if (distance > pullThreshold) {
+                        pullText.textContent = 'Release to refresh';
+                        pullToRefreshEl.classList.add('ready');
+                    } else {
+                        pullText.textContent = 'Pull to refresh';
+                        pullToRefreshEl.classList.remove('ready');
+                    }
+                    const resistance = Math.min(distance / 3, 60);
+                    pullToRefreshEl.style.transform = `translateY(${60 + resistance}px)`;
+                }
+            }
+        });
+        document.addEventListener('mouseup', () => {
+            if (!mouseDown) return;
+            const distance = mouseStartY ? event.pageY - mouseStartY : 0;
+            if (distance > pullThreshold) {
+                pullToRefreshEl.classList.add('refreshing');
+                pullToRefreshEl.querySelector('.pull-to-refresh-text').textContent = 'Refreshing...';
+                setTimeout(() => {
+                    window.location.reload();
+                }, 1000);
+            } else {
+                pullToRefreshEl.classList.remove('show', 'ready');
+                pullToRefreshEl.style.transform = '';
+            }
+            mouseDown = false;
+            mouseStartY = 0;
+        });
+        // Clear messages function
+        function clearMessages() {
+            const messagesList = document.getElementById('messagesList');
+            if (messagesList) {
+                messagesList.innerHTML = '<p style="margin: 0; text-align: center; color: var(--ios-gray-3);">No messages yet. Connect and start talking!</p>';
+            }
+        }
+        // Copy all logs function
+        function copyAllLogs() {
+            const logContainer = document.getElementById('log');
+            if (!logContainer) {
+                alert('No logs to copy');
+                return;
+            }
+            // Get all log entries
+            const logEntries = logContainer.querySelectorAll('.log-entry');
+            let logsText = '';
+            // Build text from all log entries
+            logEntries.forEach(entry => {
+                const time = entry.querySelector('.log-time')?.textContent || '';
+                const message = entry.querySelector('.log-message')?.textContent || '';
+                logsText += `${time} ${message}\n`;
+            });
+            if (!logsText) {
+                alert('No logs to copy');
+                return;
+            }
+            // Copy to clipboard
+            if (navigator.clipboard && navigator.clipboard.writeText) {
+                navigator.clipboard.writeText(logsText).then(() => {
+                    // Visual feedback
+                    const copyBtn = event.target.closest('button');
+                    const originalHTML = copyBtn.innerHTML;
+                    copyBtn.innerHTML = '<span class="material-icons">check</span>Copied!';
+                    copyBtn.style.background = 'linear-gradient(145deg, #34C759, #30D158)';
+                    setTimeout(() => {
+                        copyBtn.innerHTML = originalHTML;
+                        copyBtn.style.background = 'linear-gradient(145deg, #007AFF, #0051D5)';
+                    }, 2000);
+                    log('Logs copied to clipboard', 'success');
+                }).catch(err => {
+                    // Fallback method
+                    fallbackCopyToClipboard(logsText);
+                });
+            } else {
+                // Fallback for older browsers
+                fallbackCopyToClipboard(logsText);
+            }
+        }
+        // Fallback copy method for older browsers
+        function fallbackCopyToClipboard(text) {
+            const textArea = document.createElement('textarea');
+            textArea.value = text;
+            textArea.style.position = 'fixed';
+            textArea.style.top = '-9999px';
+            document.body.appendChild(textArea);
+            textArea.focus();
+            textArea.select();
+            try {
+                const successful = document.execCommand('copy');
+                if (successful) {
+                    log('Logs copied to clipboard (fallback)', 'success');
+                } else {
+                    alert('Failed to copy logs');
+                }
+            } catch (err) {
+                alert('Failed to copy logs: ' + err);
+            }
+            document.body.removeChild(textArea);
+        }
+        // Initialize
+        log('Ultravox AI Assistant initialized', 'info');
+        log('Format: PCM 16-bit @ 24kHz', 'info');
+        log('Pull down to refresh the page', 'info');
+    </script>
+</body>
+</html>

services/webrtc_gateway/ultravox-chat-material.html ADDED Viewed

	@@ -0,0 +1,1116 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Ultravox Chat PCM - Material Design</title>
+    <!-- Material Design CSS via CDN -->
+    <link href="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.css" rel="stylesheet">
+    <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet">
+    <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500;700&display=swap" rel="stylesheet">
+    <!-- Opus Decoder -->
+    <script src="opus-decoder.js"></script>
+    <style>
+        :root {
+            --mdc-theme-primary: #6200ee;
+            --mdc-theme-secondary: #03dac6;
+            --mdc-theme-error: #b00020;
+            --mdc-theme-surface: #ffffff;
+            --mdc-theme-background: #f5f5f5;
+        }
+        * {
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+        }
+        body {
+            font-family: 'Roboto', sans-serif;
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            min-height: 100vh;
+            padding: 20px;
+        }
+        .main-container {
+            max-width: 1200px;
+            margin: 0 auto;
+        }
+        .mdc-card {
+            margin-bottom: 20px;
+            padding: 24px;
+        }
+        .header-title {
+            font-size: 28px;
+            font-weight: 500;
+            color: #333;
+            margin-bottom: 24px;
+            display: flex;
+            align-items: center;
+            gap: 12px;
+        }
+        .status-chip {
+            display: inline-flex;
+            align-items: center;
+            padding: 8px 16px;
+            border-radius: 16px;
+            background: #f5f5f5;
+            margin-bottom: 16px;
+        }
+        .status-dot {
+            width: 12px;
+            height: 12px;
+            border-radius: 50%;
+            background: #dc3545;
+            margin-right: 8px;
+            display: inline-block;
+        }
+        .status-dot.connected {
+            background: #28a745;
+            animation: pulse 2s infinite;
+        }
+        @keyframes pulse {
+            0% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0.7); }
+            70% { box-shadow: 0 0 0 10px rgba(40, 167, 69, 0); }
+            100% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0); }
+        }
+        .controls-grid {
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
+            gap: 16px;
+            margin-bottom: 24px;
+        }
+        .voice-selector-container {
+            margin-bottom: 24px;
+        }
+        .metrics-grid {
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(150px, 1fr));
+            gap: 16px;
+            margin-bottom: 24px;
+        }
+        .metric-card {
+            background: #f8f9fa;
+            padding: 16px;
+            border-radius: 8px;
+            text-align: center;
+        }
+        .metric-label {
+            font-size: 12px;
+            color: #6c757d;
+            margin-bottom: 8px;
+            text-transform: uppercase;
+            letter-spacing: 0.5px;
+        }
+        .metric-value {
+            font-size: 24px;
+            font-weight: 500;
+            color: #333;
+        }
+        .log-container {
+            background: #1e1e1e;
+            border-radius: 8px;
+            padding: 16px;
+            height: 300px;
+            overflow-y: auto;
+            font-family: 'Monaco', 'Menlo', monospace;
+            font-size: 12px;
+        }
+        .log-entry {
+            padding: 4px 0;
+            display: flex;
+            align-items: flex-start;
+            color: #e0e0e0;
+        }
+        .log-time {
+            color: #6c757d;
+            margin-right: 10px;
+            flex-shrink: 0;
+        }
+        .log-message {
+            flex: 1;
+        }
+        .log-entry.error { color: #ff5252; }
+        .log-entry.success { color: #69f0ae; }
+        .log-entry.info { color: #448aff; }
+        .log-entry.warning { color: #ffd740; }
+        .tts-textarea {
+            width: 100%;
+            min-height: 120px;
+            padding: 12px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+            font-family: 'Roboto', sans-serif;
+            font-size: 14px;
+            resize: vertical;
+            margin-bottom: 16px;
+        }
+        .tts-textarea:focus {
+            outline: none;
+            border-color: var(--mdc-theme-primary);
+        }
+        .audio-player {
+            display: inline-flex;
+            align-items: center;
+            gap: 10px;
+            margin-left: 10px;
+        }
+        .play-btn {
+            background: #007bff;
+            color: white;
+            border: none;
+            border-radius: 4px;
+            padding: 4px 8px;
+            cursor: pointer;
+            font-size: 12px;
+        }
+        .play-btn:hover {
+            background: #0056b3;
+        }
+        .mdc-button.recording {
+            background: #dc3545 !important;
+            animation: recordPulse 1s infinite;
+        }
+        @keyframes recordPulse {
+            0%, 100% { opacity: 1; }
+            50% { opacity: 0.7; }
+        }
+        #ttsStatus {
+            padding: 16px;
+            background: #f5f5f5;
+            border-radius: 8px;
+            margin-top: 16px;
+        }
+        #ttsPlayer {
+            margin-top: 16px;
+        }
+        #ttsPlayer audio {
+            width: 100%;
+        }
+        /* Mobile responsive */
+        @media (max-width: 600px) {
+            .main-container {
+                padding: 0;
+            }
+            .mdc-card {
+                border-radius: 0;
+                margin-bottom: 8px;
+            }
+            .header-title {
+                font-size: 24px;
+            }
+            .metrics-grid {
+                grid-template-columns: repeat(2, 1fr);
+            }
+        }
+    </style>
+</head>
+<body>
+    <div class="main-container">
+        <!-- Main Card -->
+        <div class="mdc-card mdc-elevation--z8">
+            <h1 class="header-title">
+                <span class="material-icons">rocket_launch</span>
+                Ultravox PCM - Otimizado
+            </h1>
+            <!-- Status -->
+            <div class="status-chip">
+                <span class="status-dot" id="statusDot"></span>
+                <span id="statusText">Desconectado</span>
+                <span style="margin-left: auto; margin-right: 8px;" id="latencyText">Latência: --ms</span>
+            </div>
+            <!-- Voice Selector -->
+            <div class="voice-selector-container">
+                <div class="mdc-select mdc-select--filled" style="width: 100%;">
+                    <div class="mdc-select__anchor" role="button" aria-haspopup="listbox" aria-expanded="false">
+                        <span class="mdc-select__ripple"></span>
+                        <span class="mdc-floating-label">Voz TTS</span>
+                        <span class="mdc-select__selected-text"></span>
+                        <span class="mdc-select__dropdown-icon">
+                            <span class="material-icons">arrow_drop_down</span>
+                        </span>
+                        <span class="mdc-line-ripple"></span>
+                    </div>
+                    <div class="mdc-select__menu mdc-menu mdc-menu-surface mdc-menu-surface--fullwidth">
+                        <ul class="mdc-list" role="listbox">
+                            <li class="mdc-list-item mdc-list-item--selected" data-value="pf_dora" role="option">
+                                <span class="mdc-list-item__ripple"></span>
+                                <span class="mdc-list-item__text">🇧🇷 [pf_dora] Português Feminino (Dora)</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="pm_alex" role="option">
+                                <span class="mdc-list-item__ripple"></span>
+                                <span class="mdc-list-item__text">🇧🇷 [pm_alex] Português Masculino (Alex)</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="af_heart" role="option">
+                                <span class="mdc-list-item__ripple"></span>
+                                <span class="mdc-list-item__text">🌍 [af_heart] Alternativa Feminina (Heart)</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="af_bella" role="option">
+                                <span class="mdc-list-item__ripple"></span>
+                                <span class="mdc-list-item__text">🌍 [af_bella] Alternativa Feminina (Bella)</span>
+                            </li>
+                        </ul>
+                    </div>
+                </div>
+                <!-- Hidden select for compatibility with existing JS -->
+                <select id="voiceSelect" style="display: none;">
+                    <option value="pf_dora" selected>Português Feminino (Dora)</option>
+                    <option value="pm_alex">Português Masculino (Alex)</option>
+                    <option value="af_heart">Alternativa Feminina (Heart)</option>
+                    <option value="af_bella">Alternativa Feminina (Bella)</option>
+                </select>
+            </div>
+            <!-- Control Buttons -->
+            <div class="controls-grid">
+                <button id="connectBtn" class="mdc-button mdc-button--raised">
+                    <span class="mdc-button__ripple"></span>
+                    <i class="material-icons mdc-button__icon" aria-hidden="true">power_settings_new</i>
+                    <span class="mdc-button__label">Conectar</span>
+                </button>
+                <button id="talkBtn" class="mdc-button mdc-button--raised" disabled>
+                    <span class="mdc-button__ripple"></span>
+                    <i class="material-icons mdc-button__icon" aria-hidden="true">mic</i>
+                    <span class="mdc-button__label">Push to Talk</span>
+                </button>
+            </div>
+            <!-- Metrics -->
+            <div class="metrics-grid">
+                <div class="metric-card mdc-elevation--z2">
+                    <div class="metric-label">Enviado</div>
+                    <div class="metric-value" id="sentBytes">0 KB</div>
+                </div>
+                <div class="metric-card mdc-elevation--z2">
+                    <div class="metric-label">Recebido</div>
+                    <div class="metric-value" id="receivedBytes">0 KB</div>
+                </div>
+                <div class="metric-card mdc-elevation--z2">
+                    <div class="metric-label">Formato</div>
+                    <div class="metric-value" id="format">PCM</div>
+                </div>
+                <div class="metric-card mdc-elevation--z2">
+                    <div class="metric-label">🎤 Voz</div>
+                    <div class="metric-value" id="currentVoice" style="font-family: monospace; color: #4CAF50; font-weight: bold;">pf_dora</div>
+                </div>
+            </div>
+            <!-- Log -->
+            <div class="log-container" id="log"></div>
+        </div>
+        <!-- TTS Direct Card -->
+        <div class="mdc-card mdc-elevation--z8">
+            <h2 class="header-title">
+                <span class="material-icons">record_voice_over</span>
+                Text-to-Speech Direto
+            </h2>
+            <p style="color: #666; margin-bottom: 16px;">Digite ou edite o texto abaixo e escolha uma voz para converter em áudio</p>
+            <!-- TTS Text Area -->
+            <textarea id="ttsText" class="tts-textarea" placeholder="Digite seu texto aqui...">Olá! Teste de voz.</textarea>
+            <!-- TTS Voice Selector -->
+            <div style="display: flex; gap: 16px; align-items: center; margin-bottom: 16px;">
+                <div class="mdc-select mdc-select--filled" style="flex: 1;">
+                    <div class="mdc-select__anchor" role="button" aria-haspopup="listbox" aria-expanded="false">
+                        <span class="mdc-select__ripple"></span>
+                        <span class="mdc-floating-label">Voz TTS</span>
+                        <span class="mdc-select__selected-text"></span>
+                        <span class="mdc-select__dropdown-icon">
+                            <span class="material-icons">arrow_drop_down</span>
+                        </span>
+                        <span class="mdc-line-ripple"></span>
+                    </div>
+                    <div class="mdc-select__menu mdc-menu mdc-menu-surface mdc-menu-surface--fullwidth">
+                        <ul class="mdc-list" role="listbox" style="max-height: 400px; overflow-y: auto;">
+                            <!-- Portuguese voices -->
+                            <li class="mdc-list-divider" role="separator">🇧🇷 Português</li>
+                            <li class="mdc-list-item mdc-list-item--selected" data-value="pf_dora" role="option">
+                                <span class="mdc-list-item__text">[pf_dora] Feminino - Dora</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="pm_alex" role="option">
+                                <span class="mdc-list-item__text">[pm_alex] Masculino - Alex</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="pm_santa" role="option">
+                                <span class="mdc-list-item__text">[pm_santa] Masculino - Santa</span>
+                            </li>
+                            <!-- Other languages - keeping all voices from original -->
+                            <li class="mdc-list-divider" role="separator">🇺🇸 Inglês Americano</li>
+                            <li class="mdc-list-item" data-value="af_alloy" role="option">
+                                <span class="mdc-list-item__text">Feminino - Alloy</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="af_bella" role="option">
+                                <span class="mdc-list-item__text">Feminino - Bella</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="af_heart" role="option">
+                                <span class="mdc-list-item__text">Feminino - Heart</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="am_adam" role="option">
+                                <span class="mdc-list-item__text">Masculino - Adam</span>
+                            </li>
+                            <li class="mdc-list-item" data-value="am_echo" role="option">
+                                <span class="mdc-list-item__text">Masculino - Echo</span>
+                            </li>
+                        </ul>
+                    </div>
+                </div>
+                <!-- Hidden select for compatibility -->
+                <select id="ttsVoiceSelect" style="display: none;">
+                    <optgroup label="🇧🇷 Português">
+                        <option value="pf_dora" selected>[pf_dora] Feminino - Dora</option>
+                        <option value="pm_alex">[pm_alex] Masculino - Alex</option>
+                        <option value="pm_santa">[pm_santa] Masculino - Santa (Festivo)</option>
+                    </optgroup>
+                    <optgroup label="🇫🇷 Francês">
+                        <option value="ff_siwis">[ff_siwis] Feminino - Siwis (Nativa)</option>
+                    </optgroup>
+                    <optgroup label="🇺🇸 Inglês Americano">
+                        <option value="af_alloy">Feminino - Alloy</option>
+                        <option value="af_aoede">Feminino - Aoede</option>
+                        <option value="af_bella">Feminino - Bella</option>
+                        <option value="af_heart">Feminino - Heart</option>
+                        <option value="af_jessica">Feminino - Jessica</option>
+                        <option value="af_kore">Feminino - Kore</option>
+                        <option value="af_nicole">Feminino - Nicole</option>
+                        <option value="af_nova">Feminino - Nova</option>
+                        <option value="af_river">Feminino - River</option>
+                        <option value="af_sarah">Feminino - Sarah</option>
+                        <option value="af_sky">Feminino - Sky</option>
+                        <option value="am_adam">Masculino - Adam</option>
+                        <option value="am_echo">Masculino - Echo</option>
+                        <option value="am_eric">Masculino - Eric</option>
+                        <option value="am_fenrir">Masculino - Fenrir</option>
+                        <option value="am_liam">Masculino - Liam</option>
+                        <option value="am_michael">Masculino - Michael</option>
+                        <option value="am_onyx">Masculino - Onyx</option>
+                        <option value="am_puck">Masculino - Puck</option>
+                        <option value="am_santa">Masculino - Santa</option>
+                    </optgroup>
+                    <optgroup label="🇬🇧 Inglês Britânico">
+                        <option value="bf_alice">Feminino - Alice</option>
+                        <option value="bf_emma">Feminino - Emma</option>
+                        <option value="bf_isabella">Feminino - Isabella</option>
+                        <option value="bf_lily">Feminino - Lily</option>
+                        <option value="bm_daniel">Masculino - Daniel</option>
+                        <option value="bm_fable">Masculino - Fable</option>
+                        <option value="bm_george">Masculino - George</option>
+                        <option value="bm_lewis">Masculino - Lewis</option>
+                    </optgroup>
+                    <optgroup label="🇪🇸 Espanhol">
+                        <option value="ef_dora">Feminino - Dora</option>
+                        <option value="em_alex">Masculino - Alex</option>
+                        <option value="em_santa">Masculino - Santa</option>
+                    </optgroup>
+                    <optgroup label="🇮🇹 Italiano">
+                        <option value="if_sara">Feminino - Sara</option>
+                        <option value="im_nicola">Masculino - Nicola</option>
+                    </optgroup>
+                    <optgroup label="🇯🇵 Japonês">
+                        <option value="jf_alpha">Feminino - Alpha</option>
+                        <option value="jf_gongitsune">Feminino - Gongitsune</option>
+                        <option value="jf_nezumi">Feminino - Nezumi</option>
+                        <option value="jf_tebukuro">Feminino - Tebukuro</option>
+                        <option value="jm_kumo">Masculino - Kumo</option>
+                    </optgroup>
+                    <optgroup label="🇨🇳 Chinês">
+                        <option value="zf_xiaobei">Feminino - Xiaobei</option>
+                        <option value="zf_xiaoni">Feminino - Xiaoni</option>
+                        <option value="zf_xiaoxiao">Feminino - Xiaoxiao</option>
+                        <option value="zf_xiaoyi">Feminino - Xiaoyi</option>
+                        <option value="zm_yunjian">Masculino - Yunjian</option>
+                        <option value="zm_yunxi">Masculino - Yunxi</option>
+                        <option value="zm_yunxia">Masculino - Yunxia</option>
+                        <option value="zm_yunyang">Masculino - Yunyang</option>
+                    </optgroup>
+                    <optgroup label="🇮🇳 Hindi">
+                        <option value="hf_alpha">Feminino - Alpha</option>
+                        <option value="hf_beta">Feminino - Beta</option>
+                        <option value="hm_omega">Masculino - Omega</option>
+                        <option value="hm_psi">Masculino - Psi</option>
+                    </optgroup>
+                </select>
+                <button id="ttsPlayBtn" class="mdc-button mdc-button--raised" disabled>
+                    <span class="mdc-button__ripple"></span>
+                    <i class="material-icons mdc-button__icon" aria-hidden="true">play_arrow</i>
+                    <span class="mdc-button__label">Gerar Áudio</span>
+                </button>
+            </div>
+            <!-- TTS Status -->
+            <div id="ttsStatus" style="display: none;">
+                <div class="mdc-linear-progress mdc-linear-progress--indeterminate" role="progressbar">
+                    <div class="mdc-linear-progress__buffer">
+                        <div class="mdc-linear-progress__buffer-bar"></div>
+                        <div class="mdc-linear-progress__buffer-dots"></div>
+                    </div>
+                    <div class="mdc-linear-progress__bar mdc-linear-progress__primary-bar">
+                        <span class="mdc-linear-progress__bar-inner"></span>
+                    </div>
+                    <div class="mdc-linear-progress__bar mdc-linear-progress__secondary-bar">
+                        <span class="mdc-linear-progress__bar-inner"></span>
+                    </div>
+                </div>
+                <p id="ttsStatusText" style="margin-top: 8px;">⏳ Processando...</p>
+            </div>
+            <!-- TTS Player -->
+            <div id="ttsPlayer" style="display: none;">
+                <audio id="ttsAudio" controls style="width: 100%;"></audio>
+            </div>
+        </div>
+    </div>
+    <!-- Material Design JavaScript via CDN -->
+    <script src="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.js"></script>
+    <!-- Original JavaScript (preserved completely) -->
+    <script>
+        // Initialize Material Design Components
+        mdc.autoInit();
+        // Initialize specific MDC components
+        const mdcSelects = document.querySelectorAll('.mdc-select');
+        mdcSelects.forEach((selectEl, index) => {
+            const select = mdc.select.MDCSelect.attachTo(selectEl);
+            // Sync with hidden selects
+            select.listen('MDCSelect:change', () => {
+                const value = select.value;
+                if (index === 0) {
+                    // Main voice selector
+                    document.getElementById('voiceSelect').value = value;
+                    document.getElementById('voiceSelect').dispatchEvent(new Event('change'));
+                } else {
+                    // TTS voice selector
+                    document.getElementById('ttsVoiceSelect').value = value;
+                    document.getElementById('ttsVoiceSelect').dispatchEvent(new Event('change'));
+                }
+            });
+        });
+        // Initialize buttons
+        const buttons = document.querySelectorAll('.mdc-button');
+        buttons.forEach(buttonEl => {
+            mdc.ripple.MDCRipple.attachTo(buttonEl);
+        });
+        // ========= ORIGINAL JAVASCRIPT CODE (PRESERVED COMPLETELY) =========
+        // Estado da aplicação
+        let ws = null;
+        let isConnected = false;
+        let isRecording = false;
+        let audioContext = null;
+        let stream = null;
+        let audioSource = null;
+        let audioProcessor = null;
+        let pcmBuffer = [];
+        // Métricas
+        const metrics = {
+            sentBytes: 0,
+            receivedBytes: 0,
+            latency: 0,
+            recordingStartTime: 0
+        };
+        // Elementos DOM
+        const elements = {
+            statusDot: document.getElementById('statusDot'),
+            statusText: document.getElementById('statusText'),
+            latencyText: document.getElementById('latencyText'),
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            sentBytes: document.getElementById('sentBytes'),
+            receivedBytes: document.getElementById('receivedBytes'),
+            format: document.getElementById('format'),
+            log: document.getElementById('log'),
+            // TTS elements
+            ttsText: document.getElementById('ttsText'),
+            ttsVoiceSelect: document.getElementById('ttsVoiceSelect'),
+            ttsPlayBtn: document.getElementById('ttsPlayBtn'),
+            ttsStatus: document.getElementById('ttsStatus'),
+            ttsStatusText: document.getElementById('ttsStatusText'),
+            ttsPlayer: document.getElementById('ttsPlayer'),
+            ttsAudio: document.getElementById('ttsAudio')
+        };
+        // Log no console visual
+        function log(message, type = 'info') {
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = `log-entry ${type}`;
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">${message}</span>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            console.log(`[${type}] ${message}`);
+        }
+        // Atualizar métricas
+        function updateMetrics() {
+            elements.sentBytes.textContent = `${(metrics.sentBytes / 1024).toFixed(1)} KB`;
+            elements.receivedBytes.textContent = `${(metrics.receivedBytes / 1024).toFixed(1)} KB`;
+            elements.latencyText.textContent = `Latência: ${metrics.latency}ms`;
+        }
+        // Conectar ao WebSocket
+        async function connect() {
+            try {
+                // Solicitar acesso ao microfone
+                stream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        sampleRate: 24000  // High quality 24kHz
+                    }
+                });
+                log('✅ Microfone acessado', 'success');
+                // Conectar WebSocket com suporte binário
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${protocol}//${window.location.host}/ws`;
+                ws = new WebSocket(wsUrl);
+                ws.binaryType = 'arraybuffer';
+                ws.onopen = () => {
+                    isConnected = true;
+                    elements.statusDot.classList.add('connected');
+                    elements.statusText.textContent = 'Conectado';
+                    // Update button appearance
+                    elements.connectBtn.querySelector('.mdc-button__label').textContent = 'Desconectar';
+                    elements.connectBtn.querySelector('.material-icons').textContent = 'power_settings_new';
+                    elements.talkBtn.disabled = false;
+                    // Enviar voz selecionada ao conectar
+                    const currentVoice = elements.voiceSelect.value || elements.ttsVoiceSelect.value || 'pf_dora';
+                    ws.send(JSON.stringify({
+                        type: 'set-voice',
+                        voice_id: currentVoice
+                    }));
+                    log(`🔊 Voz configurada: ${currentVoice}`, 'info');
+                    elements.ttsPlayBtn.disabled = false; // Habilitar TTS button
+                    log('✅ Conectado ao servidor', 'success');
+                };
+                ws.onmessage = (event) => {
+                    if (event.data instanceof ArrayBuffer) {
+                        // Áudio PCM binário recebido
+                        handlePCMAudio(event.data);
+                    } else {
+                        // Mensagem JSON
+                        const data = JSON.parse(event.data);
+                        handleMessage(data);
+                    }
+                };
+                ws.onerror = (error) => {
+                    log(`❌ Erro WebSocket: ${error}`, 'error');
+                };
+                ws.onclose = () => {
+                    disconnect();
+                };
+            } catch (error) {
+                log(`❌ Erro ao conectar: ${error.message}`, 'error');
+            }
+        }
+        // Desconectar
+        function disconnect() {
+            isConnected = false;
+            if (ws) {
+                ws.close();
+                ws = null;
+            }
+            if (stream) {
+                stream.getTracks().forEach(track => track.stop());
+                stream = null;
+            }
+            if (audioContext) {
+                audioContext.close();
+                audioContext = null;
+            }
+            elements.statusDot.classList.remove('connected');
+            elements.statusText.textContent = 'Desconectado';
+            elements.connectBtn.querySelector('.mdc-button__label').textContent = 'Conectar';
+            elements.talkBtn.disabled = true;
+            log('👋 Desconectado', 'warning');
+        }
+        // Iniciar gravação PCM
+        function startRecording() {
+            if (isRecording) return;
+            isRecording = true;
+            metrics.recordingStartTime = Date.now();
+            elements.talkBtn.classList.add('recording');
+            elements.talkBtn.querySelector('.mdc-button__label').textContent = 'Gravando...';
+            elements.talkBtn.querySelector('.material-icons').textContent = 'mic_off';
+            pcmBuffer = [];
+            const sampleRate = 24000; // Sempre usar melhor qualidade
+            log(`🎤 Gravando PCM 16-bit @ ${sampleRate}Hz (alta qualidade)`, 'info');
+            // Criar AudioContext se necessário
+            if (!audioContext) {
+                // Sempre usar melhor qualidade (24kHz)
+                const sampleRate = 24000;
+                audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                    sampleRate: sampleRate
+                });
+                log(`🎧 AudioContext criado: ${sampleRate}Hz (alta qualidade)`, 'info');
+            }
+            // Criar processador de áudio
+            audioSource = audioContext.createMediaStreamSource(stream);
+            audioProcessor = audioContext.createScriptProcessor(4096, 1, 1);
+            audioProcessor.onaudioprocess = (e) => {
+                if (!isRecording) return;
+                const inputData = e.inputBuffer.getChannelData(0);
+                // Calcular RMS (Root Mean Square) para melhor detecção de volume
+                let sumSquares = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    sumSquares += inputData[i] * inputData[i];
+                }
+                const rms = Math.sqrt(sumSquares / inputData.length);
+                // Calcular amplitude máxima também
+                let maxAmplitude = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    maxAmplitude = Math.max(maxAmplitude, Math.abs(inputData[i]));
+                }
+                // Detecção de voz baseada em RMS (mais confiável que amplitude máxima)
+                const voiceThreshold = 0.01; // Threshold para detectar voz
+                const hasVoice = rms > voiceThreshold;
+                // Aplicar ganho suave apenas se necessário
+                let gain = 1.0;
+                if (hasVoice && rms < 0.05) {
+                    // Ganho suave baseado em RMS, máximo 5x
+                    gain = Math.min(5.0, 0.05 / rms);
+                    if (gain > 1.2) {
+                        log(`🎤 Volume baixo detectado, aplicando ganho: ${gain.toFixed(1)}x`, 'info');
+                    }
+                }
+                // Converter Float32 para Int16 com processamento melhorado
+                const pcmData = new Int16Array(inputData.length);
+                for (let i = 0; i < inputData.length; i++) {
+                    // Aplicar ganho suave
+                    let sample = inputData[i] * gain;
+                    // Soft clipping para evitar distorção
+                    if (Math.abs(sample) > 0.95) {
+                        sample = Math.sign(sample) * (0.95 + 0.05 * Math.tanh((Math.abs(sample) - 0.95) * 10));
+                    }
+                    // Converter para Int16
+                    sample = Math.max(-1, Math.min(1, sample));
+                    pcmData[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
+                }
+                // Adicionar ao buffer apenas se detectar voz
+                if (hasVoice) {
+                    pcmBuffer.push(pcmData);
+                }
+            };
+            audioSource.connect(audioProcessor);
+            audioProcessor.connect(audioContext.destination);
+        }
+        // Parar gravação e enviar
+        function stopRecording() {
+            if (!isRecording) return;
+            isRecording = false;
+            const duration = Date.now() - metrics.recordingStartTime;
+            elements.talkBtn.classList.remove('recording');
+            elements.talkBtn.querySelector('.mdc-button__label').textContent = 'Push to Talk';
+            elements.talkBtn.querySelector('.material-icons').textContent = 'mic';
+            // Desconectar processador
+            if (audioProcessor) {
+                audioProcessor.disconnect();
+                audioProcessor = null;
+            }
+            if (audioSource) {
+                audioSource.disconnect();
+                audioSource = null;
+            }
+            // Verificar se há áudio para enviar
+            if (pcmBuffer.length === 0) {
+                log(`⚠️ Nenhum áudio capturado (silêncio ou volume muito baixo)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            // Combinar todos os chunks PCM
+            const totalLength = pcmBuffer.reduce((acc, chunk) => acc + chunk.length, 0);
+            // Verificar tamanho mínimo (0.5 segundos)
+            const sampleRate = 24000; // Sempre 24kHz
+            const minSamples = sampleRate * 0.5;
+            if (totalLength < minSamples) {
+                log(`⚠️ Áudio muito curto: ${(totalLength/sampleRate).toFixed(2)}s (mínimo 0.5s)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            const fullPCM = new Int16Array(totalLength);
+            let offset = 0;
+            for (const chunk of pcmBuffer) {
+                fullPCM.set(chunk, offset);
+                offset += chunk.length;
+            }
+            // Calcular amplitude final para debug
+            let maxAmp = 0;
+            for (let i = 0; i < Math.min(fullPCM.length, 1000); i++) {
+                maxAmp = Math.max(maxAmp, Math.abs(fullPCM[i] / 32768));
+            }
+            // Enviar PCM binário direto (sem Base64!)
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                // Enviar um header simples antes do áudio
+                const header = new ArrayBuffer(8);
+                const view = new DataView(header);
+                view.setUint32(0, 0x50434D16); // Magic: "PCM16"
+                view.setUint32(4, fullPCM.length * 2); // Tamanho em bytes
+                ws.send(header);
+                ws.send(fullPCM.buffer);
+                metrics.sentBytes += fullPCM.length * 2;
+                updateMetrics();
+                const sampleRate = 24000; // Sempre 24kHz
+                log(`📤 PCM enviado: ${(fullPCM.length * 2 / 1024).toFixed(1)}KB, ${(totalLength/sampleRate).toFixed(1)}s @ ${sampleRate}Hz, amp:${maxAmp.toFixed(3)}`, 'success');
+            }
+            // Limpar buffer após enviar
+            pcmBuffer = [];
+        }
+        // Processar mensagem JSON
+        function handleMessage(data) {
+            switch (data.type) {
+                case 'metrics':
+                    metrics.latency = data.latency;
+                    updateMetrics();
+                    log(`📊 Resposta: "${data.response}" (${data.latency}ms)`, 'success');
+                    break;
+                case 'error':
+                    log(`❌ Erro: ${data.message}`, 'error');
+                    break;
+                case 'tts-response':
+                    // Resposta do TTS direto (Opus 24kHz ou PCM)
+                    if (data.audio) {
+                        // Decodificar base64 para arraybuffer
+                        const binaryString = atob(data.audio);
+                        const bytes = new Uint8Array(binaryString.length);
+                        for (let i = 0; i < binaryString.length; i++) {
+                            bytes[i] = binaryString.charCodeAt(i);
+                        }
+                        let audioData = bytes.buffer;
+                        // IMPORTANTE: Usar a taxa enviada pelo servidor
+                        const sampleRate = data.sampleRate || 24000;
+                        console.log(`🎯 TTS Response - Taxa recebida: ${sampleRate}Hz, Formato: ${data.format}, Tamanho: ${bytes.length} bytes`);
+                        // Se for Opus, usar WebAudio API para decodificar nativamente
+                        let wavBuffer;
+                        if (data.format === 'opus') {
+                            console.log(`🗜️ Opus 24kHz recebido: ${(bytes.length/1024).toFixed(1)}KB`);
+                            // Log de economia de banda
+                            if (data.originalSize) {
+                                const compression = Math.round(100 - (bytes.length / data.originalSize) * 100);
+                                console.log(`📊 Economia de banda: ${compression}% (${(data.originalSize/1024).toFixed(1)}KB → ${(bytes.length/1024).toFixed(1)}KB)`);
+                            }
+                            // WebAudio API pode decodificar Opus nativamente
+                            // Por agora, tratar como PCM até implementar decoder completo
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        } else {
+                            // PCM - adicionar WAV header com a taxa correta
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        }
+                        // Log da qualidade recebida
+                        console.log(`🎵 TTS pronto: ${(audioData.byteLength/1024).toFixed(1)}KB @ ${sampleRate}Hz (${data.quality || 'high'} quality, ${data.format || 'pcm'})`);
+                        // Criar blob e URL
+                        const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+                        const audioUrl = URL.createObjectURL(blob);
+                        // Atualizar player
+                        elements.ttsAudio.src = audioUrl;
+                        elements.ttsPlayer.style.display = 'block';
+                        elements.ttsStatus.style.display = 'none';
+                        elements.ttsPlayBtn.disabled = false;
+                        elements.ttsPlayBtn.querySelector('.mdc-button__label').textContent = 'Gerar Áudio';
+                        log('🎵 Áudio TTS gerado com sucesso!', 'success');
+                    }
+                    break;
+            }
+        }
+        // Processar áudio PCM recebido
+        function handlePCMAudio(arrayBuffer) {
+            metrics.receivedBytes += arrayBuffer.byteLength;
+            updateMetrics();
+            // Criar WAV header para reproduzir
+            const wavBuffer = addWavHeader(arrayBuffer);
+            // Criar blob e URL para o áudio
+            const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+            const audioUrl = URL.createObjectURL(blob);
+            // Criar log com botão de play
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = 'log-entry success';
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">🔊 Áudio recebido: ${(arrayBuffer.byteLength / 1024).toFixed(1)}KB</span>
+                <div class="audio-player">
+                    <button class="play-btn" onclick="playAudio('${audioUrl}')">▶️ Play</button>
+                    <audio id="audio-${Date.now()}" src="${audioUrl}" style="display: none;"></audio>
+                </div>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            // Auto-play o áudio
+            const audio = new Audio(audioUrl);
+            audio.play().catch(err => {
+                console.log('Auto-play bloqueado, use o botão para reproduzir');
+            });
+        }
+        // Função para tocar áudio manualmente
+        function playAudio(url) {
+            const audio = new Audio(url);
+            audio.play();
+        }
+        // Adicionar header WAV ao PCM
+        function addWavHeader(pcmBuffer, customSampleRate) {
+            const pcmData = new Uint8Array(pcmBuffer);
+            const wavBuffer = new ArrayBuffer(44 + pcmData.length);
+            const view = new DataView(wavBuffer);
+            // WAV header
+            const writeString = (offset, string) => {
+                for (let i = 0; i < string.length; i++) {
+                    view.setUint8(offset + i, string.charCodeAt(i));
+                }
+            };
+            writeString(0, 'RIFF');
+            view.setUint32(4, 36 + pcmData.length, true);
+            writeString(8, 'WAVE');
+            writeString(12, 'fmt ');
+            view.setUint32(16, 16, true); // fmt chunk size
+            view.setUint16(20, 1, true); // PCM format
+            view.setUint16(22, 1, true); // Mono
+            // Usar taxa customizada se fornecida, senão usar 24kHz
+            let sampleRate = customSampleRate || 24000;
+            console.log(`📝 WAV Header - Configurando taxa: ${sampleRate}Hz`);
+            view.setUint32(24, sampleRate, true); // Sample rate
+            view.setUint32(28, sampleRate * 2, true); // Byte rate: sampleRate * 1 * 2
+            view.setUint16(32, 2, true); // Block align: 1 * 2
+            view.setUint16(34, 16, true); // Bits per sample: 16-bit
+            writeString(36, 'data');
+            view.setUint32(40, pcmData.length, true);
+            // Copiar dados PCM
+            new Uint8Array(wavBuffer, 44).set(pcmData);
+            return wavBuffer;
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', () => {
+            if (isConnected) {
+                disconnect();
+            } else {
+                connect();
+            }
+        });
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', stopRecording);
+        // Voice selector listener
+        elements.voiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            console.log('Voice select changed to:', voice_id);
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                console.log('Sending set-voice command:', voice_id);
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🔊 Voz alterada para: ${voice_id} - ${e.target.options[e.target.selectedIndex].text}`, 'info');
+            } else {
+                console.log('WebSocket not connected, cannot send voice change');
+                log(`⚠️ Conecte-se primeiro para mudar a voz`, 'warning');
+            }
+        });
+        elements.talkBtn.addEventListener('touchstart', startRecording);
+        elements.talkBtn.addEventListener('touchend', stopRecording);
+        // TTS Voice selector listener
+        elements.ttsVoiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            // Update main voice selector
+            elements.voiceSelect.value = voice_id;
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            // Send voice change to server
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🎤 Voz TTS alterada para: ${voice_id}`, 'info');
+            }
+        });
+        // TTS Button Event Listener
+        elements.ttsPlayBtn.addEventListener('click', (e) => {
+            e.preventDefault();
+            e.stopPropagation();
+            console.log('TTS Button clicked!');
+            const text = elements.ttsText.value.trim();
+            const voice = elements.ttsVoiceSelect.value;
+            console.log('TTS Text:', text);
+            console.log('TTS Voice:', voice);
+            if (!text) {
+                alert('Por favor, digite algum texto para converter em áudio');
+                return;
+            }
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                alert('Por favor, conecte-se primeiro clicando em "Conectar"');
+                return;
+            }
+            // Mostrar status
+            elements.ttsStatus.style.display = 'block';
+            elements.ttsStatusText.textContent = '⏳ Gerando áudio...';
+            elements.ttsPlayBtn.disabled = true;
+            elements.ttsPlayBtn.querySelector('.mdc-button__label').textContent = 'Processando...';
+            elements.ttsPlayer.style.display = 'none';
+            // Sempre usar melhor qualidade (24kHz)
+            const quality = 'high';
+            // Enviar request para TTS com qualidade máxima
+            const ttsRequest = {
+                type: 'text-to-speech',
+                text: text,
+                voice_id: voice,
+                quality: quality,
+                format: 'opus'  // Opus 24kHz @ 32kbps - máxima qualidade, mínima banda
+            };
+            console.log('Sending TTS request:', ttsRequest);
+            ws.send(JSON.stringify(ttsRequest));
+            log(`🎤 Solicitando TTS: voz=${voice}, texto="${text.substring(0, 50)}..."`, 'info');
+        });
+        // Inicialização
+        log('🚀 Ultravox Chat PCM Otimizado - Material Design', 'info');
+        log('📊 Formato: PCM 16-bit @ 24kHz', 'info');
+        log('⚡ Interface Material Design', 'success');
+    </script>
+</body>
+</html>

services/webrtc_gateway/ultravox-chat-opus.html ADDED Viewed

	@@ -0,0 +1,581 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Ultravox Chat - Opus Edition</title>
+    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
+    <style>
+        body {
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            min-height: 100vh;
+            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
+        }
+        .container {
+            max-width: 1200px;
+            margin-top: 30px;
+        }
+        .card {
+            border: none;
+            border-radius: 15px;
+            box-shadow: 0 10px 40px rgba(0,0,0,0.1);
+            backdrop-filter: blur(10px);
+            background: rgba(255, 255, 255, 0.95);
+        }
+        .card-header {
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            color: white;
+            border-radius: 15px 15px 0 0 !important;
+            padding: 20px;
+            border: none;
+        }
+        .status-indicator {
+            display: inline-block;
+            width: 10px;
+            height: 10px;
+            border-radius: 50%;
+            background: #dc3545;
+            margin-right: 8px;
+            animation: pulse 2s infinite;
+        }
+        .status-indicator.connected {
+            background: #28a745;
+        }
+        @keyframes pulse {
+            0% { opacity: 1; }
+            50% { opacity: 0.5; }
+            100% { opacity: 1; }
+        }
+        .btn-primary {
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            border: none;
+            border-radius: 25px;
+            padding: 10px 30px;
+            transition: all 0.3s;
+        }
+        .btn-primary:hover {
+            transform: translateY(-2px);
+            box-shadow: 0 5px 20px rgba(0,0,0,0.2);
+        }
+        .btn-talk {
+            width: 100px;
+            height: 100px;
+            border-radius: 50%;
+            font-size: 24px;
+            position: relative;
+            transition: all 0.3s;
+            background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%);
+            border: none;
+            color: white;
+        }
+        .btn-talk:disabled {
+            background: #ccc;
+            cursor: not-allowed;
+        }
+        .btn-talk.recording {
+            animation: recording-pulse 1s infinite;
+            background: linear-gradient(135deg, #fa709a 0%, #fee140 100%);
+        }
+        @keyframes recording-pulse {
+            0% { transform: scale(1); }
+            50% { transform: scale(1.1); }
+            100% { transform: scale(1); }
+        }
+        #chatLog {
+            height: 400px;
+            overflow-y: auto;
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 15px;
+            font-family: 'Courier New', monospace;
+            font-size: 14px;
+        }
+        .log-entry {
+            margin-bottom: 8px;
+            padding: 8px;
+            border-radius: 5px;
+            animation: fadeIn 0.3s;
+        }
+        @keyframes fadeIn {
+            from { opacity: 0; transform: translateY(10px); }
+            to { opacity: 1; transform: translateY(0); }
+        }
+        .log-info { background: #d1ecf1; color: #0c5460; }
+        .log-success { background: #d4edda; color: #155724; }
+        .log-warning { background: #fff3cd; color: #856404; }
+        .log-error { background: #f8d7da; color: #721c24; }
+        .log-ai { background: #e7e3ff; color: #4a4a8a; }
+        .metrics-card {
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            color: white;
+            border-radius: 10px;
+            padding: 15px;
+            margin-top: 20px;
+        }
+        .metric-item {
+            display: flex;
+            justify-content: space-between;
+            padding: 5px 0;
+            border-bottom: 1px solid rgba(255,255,255,0.2);
+        }
+        .metric-item:last-child {
+            border-bottom: none;
+        }
+        .metric-value {
+            font-weight: bold;
+        }
+        .voice-select {
+            margin-top: 10px;
+        }
+        #debugLog {
+            height: 200px;
+            overflow-y: auto;
+            background: #2d2d2d;
+            color: #00ff00;
+            border-radius: 5px;
+            padding: 10px;
+            font-family: 'Courier New', monospace;
+            font-size: 12px;
+            margin-top: 20px;
+        }
+        .codec-indicator {
+            display: inline-block;
+            padding: 2px 8px;
+            border-radius: 12px;
+            background: #28a745;
+            color: white;
+            font-size: 12px;
+            margin-left: 10px;
+        }
+    </style>
+</head>
+<body>
+    <div class="container">
+        <div class="row">
+            <div class="col-lg-8">
+                <div class="card">
+                    <div class="card-header">
+                        <h4 class="mb-0">
+                            🎙️ Ultravox Chat - WebRTC Pipeline
+                            <span class="codec-indicator">OPUS</span>
+                        </h4>
+                        <small>Gravação e envio em Opus codec (compressão eficiente)</small>
+                    </div>
+                    <div class="card-body">
+                        <div class="d-flex justify-content-between align-items-center mb-3">
+                            <div>
+                                <span class="status-indicator" id="statusDot"></span>
+                                <span id="statusText">Desconectado</span>
+                            </div>
+                            <button class="btn btn-primary" id="connectBtn">Conectar</button>
+                        </div>
+                        <div class="text-center my-4">
+                            <button class="btn btn-talk" id="talkBtn" disabled>
+                                🎤
+                            </button>
+                            <div class="mt-2 text-muted">Segure para falar</div>
+                        </div>
+                        <div class="voice-select">
+                            <label for="voiceSelect" class="form-label">🔊 Voz TTS:</label>
+                            <select class="form-select" id="voiceSelect">
+                                <optgroup label="🇧🇷 Português Brasileiro">
+                                    <option value="pf_dora" selected>[pf_dora] Feminino - Dora</option>
+                                    <option value="pm_alex">[pm_alex] Masculino - Alex</option>
+                                </optgroup>
+                            </select>
+                        </div>
+                        <div class="mt-3">
+                            <label for="chatLog" class="form-label">📝 Log de Conversação:</label>
+                            <div id="chatLog"></div>
+                        </div>
+                    </div>
+                </div>
+            </div>
+            <div class="col-lg-4">
+                <div class="metrics-card">
+                    <h5 class="mb-3">📊 Métricas</h5>
+                    <div class="metric-item">
+                        <span>Codec:</span>
+                        <span class="metric-value" id="codecType">Opus</span>
+                    </div>
+                    <div class="metric-item">
+                        <span>Bitrate:</span>
+                        <span class="metric-value" id="bitrate">32 kbps</span>
+                    </div>
+                    <div class="metric-item">
+                        <span>Taxa de Compressão:</span>
+                        <span class="metric-value" id="compressionRate">-</span>
+                    </div>
+                    <div class="metric-item">
+                        <span>Latência Total:</span>
+                        <span class="metric-value" id="totalLatency">-</span>
+                    </div>
+                    <div class="metric-item">
+                        <span>Tempo de Gravação:</span>
+                        <span class="metric-value" id="recordingTime">-</span>
+                    </div>
+                    <div class="metric-item">
+                        <span>Taxa de Áudio:</span>
+                        <span class="metric-value" id="audioRate">48 kHz</span>
+                    </div>
+                    <div class="metric-item">
+                        <span>Tamanho do Áudio:</span>
+                        <span class="metric-value" id="audioSize">-</span>
+                    </div>
+                </div>
+                <div class="card mt-3">
+                    <div class="card-body">
+                        <h6>🐛 Debug Log</h6>
+                        <div id="debugLog"></div>
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+    <script>
+        // Elementos do DOM
+        const elements = {
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            statusDot: document.getElementById('statusDot'),
+            statusText: document.getElementById('statusText'),
+            chatLog: document.getElementById('chatLog'),
+            debugLog: document.getElementById('debugLog'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            // Métricas
+            codecType: document.getElementById('codecType'),
+            bitrate: document.getElementById('bitrate'),
+            compressionRate: document.getElementById('compressionRate'),
+            totalLatency: document.getElementById('totalLatency'),
+            recordingTime: document.getElementById('recordingTime'),
+            audioRate: document.getElementById('audioRate'),
+            audioSize: document.getElementById('audioSize')
+        };
+        // Estado da aplicação
+        let ws = null;
+        let isConnected = false;
+        let isRecording = false;
+        let stream = null;
+        let mediaRecorder = null;
+        let audioChunks = [];
+        let sessionId = null;
+        // Métricas
+        let metrics = {
+            recordingStartTime: 0,
+            recordingEndTime: 0,
+            audioBytesSent: 0,
+            pcmBytesOriginal: 0
+        };
+        // Função de log
+        function log(message, type = 'info') {
+            const timestamp = new Date().toLocaleTimeString();
+            const entry = document.createElement('div');
+            entry.className = `log-entry log-${type}`;
+            entry.textContent = `[${timestamp}] ${message}`;
+            elements.chatLog.appendChild(entry);
+            elements.chatLog.scrollTop = elements.chatLog.scrollHeight;
+        }
+        // Debug log
+        function debug(message) {
+            const timestamp = new Date().toLocaleTimeString();
+            const entry = `[${timestamp}] ${message}\n`;
+            elements.debugLog.textContent += entry;
+            elements.debugLog.scrollTop = elements.debugLog.scrollHeight;
+        }
+        // Gerar ID de sessão único
+        function generateSessionId() {
+            return Math.random().toString(36).substring(2) + Date.now().toString(36);
+        }
+        // Conectar ao WebSocket
+        async function connect() {
+            if (isConnected) {
+                disconnect();
+                return;
+            }
+            try {
+                // Solicitar permissão de microfone
+                stream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        autoGainControl: true,
+                        sampleRate: 48000
+                    }
+                });
+                // Conectar WebSocket
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${protocol}//${window.location.hostname}:8082/ws`;
+                ws = new WebSocket(wsUrl);
+                ws.onopen = () => {
+                    isConnected = true;
+                    sessionId = generateSessionId();
+                    elements.statusDot.classList.add('connected');
+                    elements.statusText.textContent = 'Conectado';
+                    elements.connectBtn.textContent = 'Desconectar';
+                    elements.connectBtn.classList.remove('btn-primary');
+                    elements.connectBtn.classList.add('btn-danger');
+                    elements.talkBtn.disabled = false;
+                    log('✅ Conectado ao servidor (Opus mode)', 'success');
+                    debug('WebSocket conectado com suporte a Opus');
+                };
+                ws.onmessage = (event) => {
+                    const data = JSON.parse(event.data);
+                    if (data.type === 'transcription') {
+                        log(`👂 Você: ${data.text}`, 'info');
+                    } else if (data.type === 'response') {
+                        log(`🤖 AI: ${data.text}`, 'ai');
+                        const latency = Date.now() - metrics.recordingEndTime;
+                        elements.totalLatency.textContent = `${latency}ms`;
+                    } else if (data.type === 'audio') {
+                        playAudio(data.audio);
+                    } else if (data.type === 'error') {
+                        log(`❌ Erro: ${data.message}`, 'error');
+                    }
+                };
+                ws.onerror = (error) => {
+                    log(`❌ Erro de conexão: ${error}`, 'error');
+                    debug(`WebSocket error: ${error}`);
+                };
+                ws.onclose = () => {
+                    if (isConnected) {
+                        log('⚠️ Conexão perdida', 'warning');
+                        disconnect();
+                    }
+                };
+            } catch (error) {
+                log(`❌ Erro ao conectar: ${error.message}`, 'error');
+                debug(`Connection error: ${error.message}`);
+            }
+        }
+        // Desconectar
+        function disconnect() {
+            isConnected = false;
+            if (ws) {
+                ws.close();
+                ws = null;
+            }
+            if (stream) {
+                stream.getTracks().forEach(track => track.stop());
+                stream = null;
+            }
+            elements.statusDot.classList.remove('connected');
+            elements.statusText.textContent = 'Desconectado';
+            elements.connectBtn.textContent = 'Conectar';
+            elements.connectBtn.classList.remove('btn-danger');
+            elements.connectBtn.classList.add('btn-primary');
+            elements.talkBtn.disabled = true;
+            log('👋 Desconectado', 'warning');
+        }
+        // Iniciar gravação com MediaRecorder (Opus)
+        function startRecording() {
+            if (isRecording) return;
+            isRecording = true;
+            audioChunks = [];
+            metrics.recordingStartTime = Date.now();
+            metrics.audioBytesSent = 0;
+            metrics.pcmBytesOriginal = 0;
+            elements.talkBtn.classList.add('recording');
+            elements.talkBtn.textContent = '⏺️';
+            // Configurar MediaRecorder para Opus
+            const mimeType = 'audio/webm;codecs=opus';
+            if (!MediaRecorder.isTypeSupported(mimeType)) {
+                log('⚠️ Opus não suportado, usando codec padrão', 'warning');
+                debug('Opus codec not supported, falling back');
+            }
+            const options = {
+                mimeType: MediaRecorder.isTypeSupported(mimeType) ? mimeType : 'audio/webm',
+                audioBitsPerSecond: 32000 // 32 kbps para Opus
+            };
+            mediaRecorder = new MediaRecorder(stream, options);
+            debug(`MediaRecorder iniciado: ${mediaRecorder.mimeType}`);
+            log(`🎤 Gravando com ${mediaRecorder.mimeType}`, 'info');
+            // Coletar chunks de áudio
+            mediaRecorder.ondataavailable = (event) => {
+                if (event.data.size > 0) {
+                    audioChunks.push(event.data);
+                    metrics.audioBytesSent += event.data.size;
+                    // Estimar tamanho original (PCM 16-bit @ 48kHz)
+                    const duration = (Date.now() - metrics.recordingStartTime) / 1000;
+                    metrics.pcmBytesOriginal = duration * 48000 * 2; // 2 bytes per sample
+                    updateMetrics();
+                }
+            };
+            // Enviar áudio quando parar
+            mediaRecorder.onstop = async () => {
+                const audioBlob = new Blob(audioChunks, { type: mediaRecorder.mimeType });
+                await sendAudioToServer(audioBlob);
+            };
+            // Iniciar gravação com timeslice de 100ms para streaming
+            mediaRecorder.start(100);
+            elements.codecType.textContent = mediaRecorder.mimeType.includes('opus') ? 'Opus' : 'WebM';
+        }
+        // Parar gravação
+        function stopRecording() {
+            if (!isRecording) return;
+            isRecording = false;
+            metrics.recordingEndTime = Date.now();
+            elements.talkBtn.classList.remove('recording');
+            elements.talkBtn.textContent = '🎤';
+            if (mediaRecorder && mediaRecorder.state !== 'inactive') {
+                mediaRecorder.stop();
+            }
+            const duration = ((metrics.recordingEndTime - metrics.recordingStartTime) / 1000).toFixed(1);
+            elements.recordingTime.textContent = `${duration}s`;
+            log(`⏹️ Gravação finalizada (${duration}s)`, 'info');
+            debug(`Recording stopped: ${duration}s, ${metrics.audioBytesSent} bytes`);
+        }
+        // Enviar áudio para o servidor
+        async function sendAudioToServer(audioBlob) {
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                log('❌ WebSocket não conectado', 'error');
+                return;
+            }
+            try {
+                // Converter blob para base64
+                const reader = new FileReader();
+                reader.onloadend = () => {
+                    const base64Audio = reader.result.split(',')[1];
+                    // Enviar via WebSocket
+                    ws.send(JSON.stringify({
+                        type: 'audio',
+                        sessionId: sessionId,
+                        audio: base64Audio,
+                        format: 'opus',
+                        mimeType: audioBlob.type,
+                        voice: elements.voiceSelect.value,
+                        sampleRate: 48000
+                    }));
+                    log(`📤 Áudio enviado: ${(audioBlob.size / 1024).toFixed(1)}KB (Opus)`, 'success');
+                    debug(`Audio sent: ${audioBlob.size} bytes, type: ${audioBlob.type}`);
+                    elements.audioSize.textContent = `${(audioBlob.size / 1024).toFixed(1)}KB`;
+                };
+                reader.readAsDataURL(audioBlob);
+            } catch (error) {
+                log(`❌ Erro ao enviar áudio: ${error.message}`, 'error');
+                debug(`Send error: ${error.message}`);
+            }
+        }
+        // Atualizar métricas
+        function updateMetrics() {
+            if (metrics.pcmBytesOriginal > 0 && metrics.audioBytesSent > 0) {
+                const compressionRate = (metrics.pcmBytesOriginal / metrics.audioBytesSent).toFixed(1);
+                elements.compressionRate.textContent = `${compressionRate}:1`;
+                const bitrate = (metrics.audioBytesSent * 8 / ((Date.now() - metrics.recordingStartTime) / 1000) / 1000).toFixed(0);
+                elements.bitrate.textContent = `${bitrate} kbps`;
+            }
+        }
+        // Reproduzir áudio recebido
+        function playAudio(base64Audio) {
+            try {
+                const audio = new Audio(`data:audio/wav;base64,${base64Audio}`);
+                audio.play();
+                debug('Playing TTS audio response');
+            } catch (error) {
+                log(`❌ Erro ao reproduzir áudio: ${error.message}`, 'error');
+            }
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', connect);
+        // Push-to-talk
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', stopRecording);
+        // Touch events para mobile
+        elements.talkBtn.addEventListener('touchstart', (e) => {
+            e.preventDefault();
+            startRecording();
+        });
+        elements.talkBtn.addEventListener('touchend', (e) => {
+            e.preventDefault();
+            stopRecording();
+        });
+        // Inicialização
+        log('🎯 Ultravox Chat (Opus Edition) pronto!', 'info');
+        debug('Sistema inicializado com suporte a gravação Opus');
+        debug('Codec preferencial: audio/webm;codecs=opus @ 32kbps');
+    </script>
+</body>
+</html>

services/webrtc_gateway/ultravox-chat-original.html ADDED Viewed

	@@ -0,0 +1,964 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Ultravox Chat PCM - Otimizado</title>
+    <script src="opus-decoder.js"></script>
+    <style>
+        * {
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+        }
+        body {
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            min-height: 100vh;
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            padding: 20px;
+        }
+        .container {
+            background: white;
+            border-radius: 20px;
+            box-shadow: 0 20px 60px rgba(0,0,0,0.3);
+            padding: 40px;
+            max-width: 600px;
+            width: 100%;
+        }
+        h1 {
+            text-align: center;
+            color: #333;
+            margin-bottom: 30px;
+            font-size: 28px;
+        }
+        .status {
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 15px;
+            margin-bottom: 20px;
+            display: flex;
+            align-items: center;
+            justify-content: space-between;
+        }
+        .status-dot {
+            width: 12px;
+            height: 12px;
+            border-radius: 50%;
+            background: #dc3545;
+            margin-right: 10px;
+            display: inline-block;
+        }
+        .status-dot.connected {
+            background: #28a745;
+            animation: pulse 2s infinite;
+        }
+        @keyframes pulse {
+            0% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0.7); }
+            70% { box-shadow: 0 0 0 10px rgba(40, 167, 69, 0); }
+            100% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0); }
+        }
+        .controls {
+            display: flex;
+            gap: 10px;
+            margin-bottom: 20px;
+        }
+        .voice-selector {
+            display: flex;
+            align-items: center;
+            gap: 10px;
+            margin-bottom: 20px;
+            padding: 10px;
+            background: #f8f9fa;
+            border-radius: 10px;
+        }
+        .voice-selector label {
+            font-weight: 600;
+            color: #555;
+        }
+        .voice-selector select {
+            flex: 1;
+            padding: 8px;
+            border: 2px solid #ddd;
+            border-radius: 5px;
+            font-size: 14px;
+            background: white;
+            cursor: pointer;
+        }
+        .voice-selector select:focus {
+            outline: none;
+            border-color: #667eea;
+        }
+        button {
+            flex: 1;
+            padding: 15px;
+            border: none;
+            border-radius: 10px;
+            font-size: 16px;
+            font-weight: 600;
+            cursor: pointer;
+            transition: all 0.3s ease;
+        }
+        button:disabled {
+            opacity: 0.5;
+            cursor: not-allowed;
+        }
+        .btn-primary {
+            background: #007bff;
+            color: white;
+        }
+        .btn-primary:hover:not(:disabled) {
+            background: #0056b3;
+            transform: translateY(-2px);
+            box-shadow: 0 5px 15px rgba(0,123,255,0.3);
+        }
+        .btn-danger {
+            background: #dc3545;
+            color: white;
+        }
+        .btn-danger:hover:not(:disabled) {
+            background: #c82333;
+        }
+        .btn-success {
+            background: #28a745;
+            color: white;
+        }
+        .btn-success.recording {
+            background: #dc3545;
+            animation: recordPulse 1s infinite;
+        }
+        @keyframes recordPulse {
+            0%, 100% { opacity: 1; }
+            50% { opacity: 0.7; }
+        }
+        .metrics {
+            display: grid;
+            grid-template-columns: repeat(3, 1fr);
+            gap: 15px;
+            margin-bottom: 20px;
+        }
+        .metric {
+            background: #f8f9fa;
+            padding: 15px;
+            border-radius: 10px;
+            text-align: center;
+        }
+        .metric-label {
+            font-size: 12px;
+            color: #6c757d;
+            margin-bottom: 5px;
+        }
+        .metric-value {
+            font-size: 24px;
+            font-weight: bold;
+            color: #333;
+        }
+        .log {
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 20px;
+            height: 300px;
+            overflow-y: auto;
+            font-family: 'Monaco', 'Menlo', monospace;
+            font-size: 12px;
+        }
+        .log-entry {
+            padding: 5px 0;
+            border-bottom: 1px solid #e9ecef;
+            display: flex;
+            align-items: flex-start;
+        }
+        .log-time {
+            color: #6c757d;
+            margin-right: 10px;
+            flex-shrink: 0;
+        }
+        .log-message {
+            flex: 1;
+        }
+        .log-entry.error { color: #dc3545; }
+        .log-entry.success { color: #28a745; }
+        .log-entry.info { color: #007bff; }
+        .log-entry.warning { color: #ffc107; }
+        .audio-player {
+            display: inline-flex;
+            align-items: center;
+            gap: 10px;
+            margin-left: 10px;
+        }
+        .play-btn {
+            background: #007bff;
+            color: white;
+            border: none;
+            border-radius: 5px;
+            padding: 5px 10px;
+            cursor: pointer;
+            font-size: 12px;
+        }
+        .play-btn:hover {
+            background: #0056b3;
+        }
+    </style>
+</head>
+<body>
+    <div class="container">
+        <h1>🚀 Ultravox PCM - Otimizado</h1>
+        <div class="status">
+            <div>
+                <span class="status-dot" id="statusDot"></span>
+                <span id="statusText">Desconectado</span>
+            </div>
+            <span id="latencyText">Latência: --ms</span>
+        </div>
+        <div class="voice-selector">
+            <label for="voiceSelect">🔊 Voz TTS:</label>
+            <select id="voiceSelect">
+                <option value="pf_dora" selected>🇧🇷 [pf_dora] Português Feminino (Dora)</option>
+                <option value="pm_alex">🇧🇷 [pm_alex] Português Masculino (Alex)</option>
+                <option value="af_heart">🌍 [af_heart] Alternativa Feminina (Heart)</option>
+                <option value="af_bella">🌍 [af_bella] Alternativa Feminina (Bella)</option>
+            </select>
+        </div>
+        <div class="controls">
+            <button id="connectBtn" class="btn-primary">Conectar</button>
+            <button id="talkBtn" class="btn-success" disabled>Push to Talk</button>
+        </div>
+        <div class="metrics">
+            <div class="metric">
+                <div class="metric-label">Enviado</div>
+                <div class="metric-value" id="sentBytes">0 KB</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">Recebido</div>
+                <div class="metric-value" id="receivedBytes">0 KB</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">Formato</div>
+                <div class="metric-value" id="format">PCM</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">🎤 Voz</div>
+                <div class="metric-value" id="currentVoice" style="font-family: monospace; color: #4CAF50; font-weight: bold;">pf_dora</div>
+            </div>
+        </div>
+        <div class="log" id="log"></div>
+    </div>
+    <!-- Seção TTS Direto -->
+    <div class="container" style="margin-top: 20px;">
+        <h2>🎵 Text-to-Speech Direto</h2>
+        <p>Digite ou edite o texto abaixo e escolha uma voz para converter em áudio</p>
+        <div class="section">
+            <textarea id="ttsText" style="width: 100%; height: 120px; padding: 10px; border: 1px solid #333; border-radius: 8px; background: #1e1e1e; color: #e0e0e0; font-family: 'Segoe UI', system-ui, sans-serif; font-size: 14px; resize: vertical;">Olá! Teste de voz.</textarea>
+        </div>
+        <div class="section" style="display: flex; gap: 10px; align-items: center; margin-top: 15px;">
+            <label for="ttsVoiceSelect" style="font-weight: 600;">🔊 Voz:</label>
+            <select id="ttsVoiceSelect" style="flex: 1; padding: 8px; border: 1px solid #333; border-radius: 5px; background: #2a2a2a; color: #e0e0e0;">
+                <optgroup label="🇧🇷 Português">
+                    <option value="pf_dora" selected>[pf_dora] Feminino - Dora</option>
+                    <option value="pm_alex">[pm_alex] Masculino - Alex</option>
+                    <option value="pm_santa">[pm_santa] Masculino - Santa (Festivo)</option>
+                </optgroup>
+                <optgroup label="🇫🇷 Francês">
+                    <option value="ff_siwis">[ff_siwis] Feminino - Siwis (Nativa)</option>
+                </optgroup>
+                <optgroup label="🇺🇸 Inglês Americano">
+                    <option value="af_alloy">Feminino - Alloy</option>
+                    <option value="af_aoede">Feminino - Aoede</option>
+                    <option value="af_bella">Feminino - Bella</option>
+                    <option value="af_heart">Feminino - Heart</option>
+                    <option value="af_jessica">Feminino - Jessica</option>
+                    <option value="af_kore">Feminino - Kore</option>
+                    <option value="af_nicole">Feminino - Nicole</option>
+                    <option value="af_nova">Feminino - Nova</option>
+                    <option value="af_river">Feminino - River</option>
+                    <option value="af_sarah">Feminino - Sarah</option>
+                    <option value="af_sky">Feminino - Sky</option>
+                    <option value="am_adam">Masculino - Adam</option>
+                    <option value="am_echo">Masculino - Echo</option>
+                    <option value="am_eric">Masculino - Eric</option>
+                    <option value="am_fenrir">Masculino - Fenrir</option>
+                    <option value="am_liam">Masculino - Liam</option>
+                    <option value="am_michael">Masculino - Michael</option>
+                    <option value="am_onyx">Masculino - Onyx</option>
+                    <option value="am_puck">Masculino - Puck</option>
+                    <option value="am_santa">Masculino - Santa</option>
+                </optgroup>
+                <optgroup label="🇬🇧 Inglês Britânico">
+                    <option value="bf_alice">Feminino - Alice</option>
+                    <option value="bf_emma">Feminino - Emma</option>
+                    <option value="bf_isabella">Feminino - Isabella</option>
+                    <option value="bf_lily">Feminino - Lily</option>
+                    <option value="bm_daniel">Masculino - Daniel</option>
+                    <option value="bm_fable">Masculino - Fable</option>
+                    <option value="bm_george">Masculino - George</option>
+                    <option value="bm_lewis">Masculino - Lewis</option>
+                </optgroup>
+                <optgroup label="🇪🇸 Espanhol">
+                    <option value="ef_dora">Feminino - Dora</option>
+                    <option value="em_alex">Masculino - Alex</option>
+                    <option value="em_santa">Masculino - Santa</option>
+                </optgroup>
+                <optgroup label="🇮🇹 Italiano">
+                    <option value="if_sara">Feminino - Sara</option>
+                    <option value="im_nicola">Masculino - Nicola</option>
+                </optgroup>
+                <optgroup label="🇯🇵 Japonês">
+                    <option value="jf_alpha">Feminino - Alpha</option>
+                    <option value="jf_gongitsune">Feminino - Gongitsune</option>
+                    <option value="jf_nezumi">Feminino - Nezumi</option>
+                    <option value="jf_tebukuro">Feminino - Tebukuro</option>
+                    <option value="jm_kumo">Masculino - Kumo</option>
+                </optgroup>
+                <optgroup label="🇨🇳 Chinês">
+                    <option value="zf_xiaobei">Feminino - Xiaobei</option>
+                    <option value="zf_xiaoni">Feminino - Xiaoni</option>
+                    <option value="zf_xiaoxiao">Feminino - Xiaoxiao</option>
+                    <option value="zf_xiaoyi">Feminino - Xiaoyi</option>
+                    <option value="zm_yunjian">Masculino - Yunjian</option>
+                    <option value="zm_yunxi">Masculino - Yunxi</option>
+                    <option value="zm_yunxia">Masculino - Yunxia</option>
+                    <option value="zm_yunyang">Masculino - Yunyang</option>
+                </optgroup>
+                <optgroup label="🇮🇳 Hindi">
+                    <option value="hf_alpha">Feminino - Alpha</option>
+                    <option value="hf_beta">Feminino - Beta</option>
+                    <option value="hm_omega">Masculino - Omega</option>
+                    <option value="hm_psi">Masculino - Psi</option>
+                </optgroup>
+            </select>
+            <button id="ttsPlayBtn" class="btn-success" disabled style="padding: 10px 20px;">
+                ▶️ Gerar Áudio
+            </button>
+        </div>
+        <div id="ttsStatus" style="display: none; margin-top: 15px; padding: 15px; background: #2a2a2a; border-radius: 8px;">
+            <span id="ttsStatusText">⏳ Processando...</span>
+        </div>
+        <div id="ttsPlayer" style="display: none; margin-top: 15px;">
+            <audio id="ttsAudio" controls style="width: 100%;"></audio>
+        </div>
+    </div>
+    <script>
+        // Estado da aplicação
+        let ws = null;
+        let isConnected = false;
+        let isRecording = false;
+        let audioContext = null;
+        let stream = null;
+        let audioSource = null;
+        let audioProcessor = null;
+        let pcmBuffer = [];
+        // Métricas
+        const metrics = {
+            sentBytes: 0,
+            receivedBytes: 0,
+            latency: 0,
+            recordingStartTime: 0
+        };
+        // Elementos DOM
+        const elements = {
+            statusDot: document.getElementById('statusDot'),
+            statusText: document.getElementById('statusText'),
+            latencyText: document.getElementById('latencyText'),
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            sentBytes: document.getElementById('sentBytes'),
+            receivedBytes: document.getElementById('receivedBytes'),
+            format: document.getElementById('format'),
+            log: document.getElementById('log'),
+            // TTS elements
+            ttsText: document.getElementById('ttsText'),
+            ttsVoiceSelect: document.getElementById('ttsVoiceSelect'),
+            ttsPlayBtn: document.getElementById('ttsPlayBtn'),
+            ttsStatus: document.getElementById('ttsStatus'),
+            ttsStatusText: document.getElementById('ttsStatusText'),
+            ttsPlayer: document.getElementById('ttsPlayer'),
+            ttsAudio: document.getElementById('ttsAudio')
+        };
+        // Log no console visual
+        function log(message, type = 'info') {
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = `log-entry ${type}`;
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">${message}</span>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            console.log(`[${type}] ${message}`);
+        }
+        // Atualizar métricas
+        function updateMetrics() {
+            elements.sentBytes.textContent = `${(metrics.sentBytes / 1024).toFixed(1)} KB`;
+            elements.receivedBytes.textContent = `${(metrics.receivedBytes / 1024).toFixed(1)} KB`;
+            elements.latencyText.textContent = `Latência: ${metrics.latency}ms`;
+        }
+        // Conectar ao WebSocket
+        async function connect() {
+            try {
+                // Solicitar acesso ao microfone
+                stream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        sampleRate: 24000  // High quality 24kHz
+                    }
+                });
+                log('✅ Microfone acessado', 'success');
+                // Conectar WebSocket com suporte binário
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${protocol}//${window.location.host}/ws`;
+                ws = new WebSocket(wsUrl);
+                ws.binaryType = 'arraybuffer';
+                ws.onopen = () => {
+                    isConnected = true;
+                    elements.statusDot.classList.add('connected');
+                    elements.statusText.textContent = 'Conectado';
+                    elements.connectBtn.textContent = 'Desconectar';
+                    elements.connectBtn.classList.remove('btn-primary');
+                    elements.connectBtn.classList.add('btn-danger');
+                    elements.talkBtn.disabled = false;
+                    // Enviar voz selecionada ao conectar
+                    const currentVoice = elements.voiceSelect.value || elements.ttsVoiceSelect.value || 'pf_dora';
+                    ws.send(JSON.stringify({
+                        type: 'set-voice',
+                        voice_id: currentVoice
+                    }));
+                    log(`🔊 Voz configurada: ${currentVoice}`, 'info');
+                    elements.ttsPlayBtn.disabled = false; // Habilitar TTS button
+                    log('✅ Conectado ao servidor', 'success');
+                };
+                ws.onmessage = (event) => {
+                    if (event.data instanceof ArrayBuffer) {
+                        // Áudio PCM binário recebido
+                        handlePCMAudio(event.data);
+                    } else {
+                        // Mensagem JSON
+                        const data = JSON.parse(event.data);
+                        handleMessage(data);
+                    }
+                };
+                ws.onerror = (error) => {
+                    log(`❌ Erro WebSocket: ${error}`, 'error');
+                };
+                ws.onclose = () => {
+                    disconnect();
+                };
+            } catch (error) {
+                log(`❌ Erro ao conectar: ${error.message}`, 'error');
+            }
+        }
+        // Desconectar
+        function disconnect() {
+            isConnected = false;
+            if (ws) {
+                ws.close();
+                ws = null;
+            }
+            if (stream) {
+                stream.getTracks().forEach(track => track.stop());
+                stream = null;
+            }
+            if (audioContext) {
+                audioContext.close();
+                audioContext = null;
+            }
+            elements.statusDot.classList.remove('connected');
+            elements.statusText.textContent = 'Desconectado';
+            elements.connectBtn.textContent = 'Conectar';
+            elements.connectBtn.classList.remove('btn-danger');
+            elements.connectBtn.classList.add('btn-primary');
+            elements.talkBtn.disabled = true;
+            log('👋 Desconectado', 'warning');
+        }
+        // Iniciar gravação PCM
+        function startRecording() {
+            if (isRecording) return;
+            isRecording = true;
+            metrics.recordingStartTime = Date.now();
+            elements.talkBtn.classList.add('recording');
+            elements.talkBtn.textContent = 'Gravando...';
+            pcmBuffer = [];
+            const sampleRate = 24000; // Sempre usar melhor qualidade
+            log(`🎤 Gravando PCM 16-bit @ ${sampleRate}Hz (alta qualidade)`, 'info');
+            // Criar AudioContext se necessário
+            if (!audioContext) {
+                // Sempre usar melhor qualidade (24kHz)
+                const sampleRate = 24000;
+                audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                    sampleRate: sampleRate
+                });
+                log(`🎧 AudioContext criado: ${sampleRate}Hz (alta qualidade)`, 'info');
+            }
+            // Criar processador de áudio
+            audioSource = audioContext.createMediaStreamSource(stream);
+            audioProcessor = audioContext.createScriptProcessor(4096, 1, 1);
+            audioProcessor.onaudioprocess = (e) => {
+                if (!isRecording) return;
+                const inputData = e.inputBuffer.getChannelData(0);
+                // Calcular RMS (Root Mean Square) para melhor detecção de volume
+                let sumSquares = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    sumSquares += inputData[i] * inputData[i];
+                }
+                const rms = Math.sqrt(sumSquares / inputData.length);
+                // Calcular amplitude máxima também
+                let maxAmplitude = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    maxAmplitude = Math.max(maxAmplitude, Math.abs(inputData[i]));
+                }
+                // Detecção de voz baseada em RMS (mais confiável que amplitude máxima)
+                const voiceThreshold = 0.01; // Threshold para detectar voz
+                const hasVoice = rms > voiceThreshold;
+                // Aplicar ganho suave apenas se necessário
+                let gain = 1.0;
+                if (hasVoice && rms < 0.05) {
+                    // Ganho suave baseado em RMS, máximo 5x
+                    gain = Math.min(5.0, 0.05 / rms);
+                    if (gain > 1.2) {
+                        log(`🎤 Volume baixo detectado, aplicando ganho: ${gain.toFixed(1)}x`, 'info');
+                    }
+                }
+                // Converter Float32 para Int16 com processamento melhorado
+                const pcmData = new Int16Array(inputData.length);
+                for (let i = 0; i < inputData.length; i++) {
+                    // Aplicar ganho suave
+                    let sample = inputData[i] * gain;
+                    // Soft clipping para evitar distorção
+                    if (Math.abs(sample) > 0.95) {
+                        sample = Math.sign(sample) * (0.95 + 0.05 * Math.tanh((Math.abs(sample) - 0.95) * 10));
+                    }
+                    // Converter para Int16
+                    sample = Math.max(-1, Math.min(1, sample));
+                    pcmData[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
+                }
+                // Adicionar ao buffer apenas se detectar voz
+                if (hasVoice) {
+                    pcmBuffer.push(pcmData);
+                }
+            };
+            audioSource.connect(audioProcessor);
+            audioProcessor.connect(audioContext.destination);
+        }
+        // Parar gravação e enviar
+        function stopRecording() {
+            if (!isRecording) return;
+            isRecording = false;
+            const duration = Date.now() - metrics.recordingStartTime;
+            elements.talkBtn.classList.remove('recording');
+            elements.talkBtn.textContent = 'Push to Talk';
+            // Desconectar processador
+            if (audioProcessor) {
+                audioProcessor.disconnect();
+                audioProcessor = null;
+            }
+            if (audioSource) {
+                audioSource.disconnect();
+                audioSource = null;
+            }
+            // Verificar se há áudio para enviar
+            if (pcmBuffer.length === 0) {
+                log(`⚠️ Nenhum áudio capturado (silêncio ou volume muito baixo)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            // Combinar todos os chunks PCM
+            const totalLength = pcmBuffer.reduce((acc, chunk) => acc + chunk.length, 0);
+            // Verificar tamanho mínimo (0.5 segundos)
+            const sampleRate = 24000; // Sempre 24kHz
+            const minSamples = sampleRate * 0.5;
+            if (totalLength < minSamples) {
+                log(`⚠️ Áudio muito curto: ${(totalLength/sampleRate).toFixed(2)}s (mínimo 0.5s)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            const fullPCM = new Int16Array(totalLength);
+            let offset = 0;
+            for (const chunk of pcmBuffer) {
+                fullPCM.set(chunk, offset);
+                offset += chunk.length;
+            }
+            // Calcular amplitude final para debug
+            let maxAmp = 0;
+            for (let i = 0; i < Math.min(fullPCM.length, 1000); i++) {
+                maxAmp = Math.max(maxAmp, Math.abs(fullPCM[i] / 32768));
+            }
+            // Enviar PCM binário direto (sem Base64!)
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                // Enviar um header simples antes do áudio
+                const header = new ArrayBuffer(8);
+                const view = new DataView(header);
+                view.setUint32(0, 0x50434D16); // Magic: "PCM16"
+                view.setUint32(4, fullPCM.length * 2); // Tamanho em bytes
+                ws.send(header);
+                ws.send(fullPCM.buffer);
+                metrics.sentBytes += fullPCM.length * 2;
+                updateMetrics();
+                const sampleRate = 24000; // Sempre 24kHz
+                log(`📤 PCM enviado: ${(fullPCM.length * 2 / 1024).toFixed(1)}KB, ${(totalLength/sampleRate).toFixed(1)}s @ ${sampleRate}Hz, amp:${maxAmp.toFixed(3)}`, 'success');
+            }
+            // Limpar buffer após enviar
+            pcmBuffer = [];
+        }
+        // Processar mensagem JSON
+        function handleMessage(data) {
+            switch (data.type) {
+                case 'metrics':
+                    metrics.latency = data.latency;
+                    updateMetrics();
+                    log(`📊 Resposta: "${data.response}" (${data.latency}ms)`, 'success');
+                    break;
+                case 'error':
+                    log(`❌ Erro: ${data.message}`, 'error');
+                    break;
+                case 'tts-response':
+                    // Resposta do TTS direto (Opus 24kHz ou PCM)
+                    if (data.audio) {
+                        // Decodificar base64 para arraybuffer
+                        const binaryString = atob(data.audio);
+                        const bytes = new Uint8Array(binaryString.length);
+                        for (let i = 0; i < binaryString.length; i++) {
+                            bytes[i] = binaryString.charCodeAt(i);
+                        }
+                        let audioData = bytes.buffer;
+                        // IMPORTANTE: Usar a taxa enviada pelo servidor
+                        const sampleRate = data.sampleRate || 24000;
+                        console.log(`🎯 TTS Response - Taxa recebida: ${sampleRate}Hz, Formato: ${data.format}, Tamanho: ${bytes.length} bytes`);
+                        // Se for Opus, usar WebAudio API para decodificar nativamente
+                        let wavBuffer;
+                        if (data.format === 'opus') {
+                            console.log(`🗜️ Opus 24kHz recebido: ${(bytes.length/1024).toFixed(1)}KB`);
+                            // Log de economia de banda
+                            if (data.originalSize) {
+                                const compression = Math.round(100 - (bytes.length / data.originalSize) * 100);
+                                console.log(`📊 Economia de banda: ${compression}% (${(data.originalSize/1024).toFixed(1)}KB → ${(bytes.length/1024).toFixed(1)}KB)`);
+                            }
+                            // WebAudio API pode decodificar Opus nativamente
+                            // Por agora, tratar como PCM até implementar decoder completo
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        } else {
+                            // PCM - adicionar WAV header com a taxa correta
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        }
+                        // Log da qualidade recebida
+                        console.log(`🎵 TTS pronto: ${(audioData.byteLength/1024).toFixed(1)}KB @ ${sampleRate}Hz (${data.quality || 'high'} quality, ${data.format || 'pcm'})`);
+                        // Criar blob e URL
+                        const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+                        const audioUrl = URL.createObjectURL(blob);
+                        // Atualizar player
+                        elements.ttsAudio.src = audioUrl;
+                        elements.ttsPlayer.style.display = 'block';
+                        elements.ttsStatus.style.display = 'none';
+                        elements.ttsPlayBtn.disabled = false;
+                        elements.ttsPlayBtn.textContent = '▶️ Gerar Áudio';
+                        log('🎵 Áudio TTS gerado com sucesso!', 'success');
+                    }
+                    break;
+            }
+        }
+        // Processar áudio PCM recebido
+        function handlePCMAudio(arrayBuffer) {
+            metrics.receivedBytes += arrayBuffer.byteLength;
+            updateMetrics();
+            // Criar WAV header para reproduzir
+            const wavBuffer = addWavHeader(arrayBuffer);
+            // Criar blob e URL para o áudio
+            const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+            const audioUrl = URL.createObjectURL(blob);
+            // Criar log com botão de play
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = 'log-entry success';
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">🔊 Áudio recebido: ${(arrayBuffer.byteLength / 1024).toFixed(1)}KB</span>
+                <div class="audio-player">
+                    <button class="play-btn" onclick="playAudio('${audioUrl}')">▶️ Play</button>
+                    <audio id="audio-${Date.now()}" src="${audioUrl}" style="display: none;"></audio>
+                </div>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            // Auto-play o áudio
+            const audio = new Audio(audioUrl);
+            audio.play().catch(err => {
+                console.log('Auto-play bloqueado, use o botão para reproduzir');
+            });
+        }
+        // Função para tocar áudio manualmente
+        function playAudio(url) {
+            const audio = new Audio(url);
+            audio.play();
+        }
+        // Adicionar header WAV ao PCM
+        function addWavHeader(pcmBuffer, customSampleRate) {
+            const pcmData = new Uint8Array(pcmBuffer);
+            const wavBuffer = new ArrayBuffer(44 + pcmData.length);
+            const view = new DataView(wavBuffer);
+            // WAV header
+            const writeString = (offset, string) => {
+                for (let i = 0; i < string.length; i++) {
+                    view.setUint8(offset + i, string.charCodeAt(i));
+                }
+            };
+            writeString(0, 'RIFF');
+            view.setUint32(4, 36 + pcmData.length, true);
+            writeString(8, 'WAVE');
+            writeString(12, 'fmt ');
+            view.setUint32(16, 16, true); // fmt chunk size
+            view.setUint16(20, 1, true); // PCM format
+            view.setUint16(22, 1, true); // Mono
+            // Usar taxa customizada se fornecida, senão usar 24kHz
+            let sampleRate = customSampleRate || 24000;
+            console.log(`📝 WAV Header - Configurando taxa: ${sampleRate}Hz`);
+            view.setUint32(24, sampleRate, true); // Sample rate
+            view.setUint32(28, sampleRate * 2, true); // Byte rate: sampleRate * 1 * 2
+            view.setUint16(32, 2, true); // Block align: 1 * 2
+            view.setUint16(34, 16, true); // Bits per sample: 16-bit
+            writeString(36, 'data');
+            view.setUint32(40, pcmData.length, true);
+            // Copiar dados PCM
+            new Uint8Array(wavBuffer, 44).set(pcmData);
+            return wavBuffer;
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', () => {
+            if (isConnected) {
+                disconnect();
+            } else {
+                connect();
+            }
+        });
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', stopRecording);
+        // Voice selector listener
+        elements.voiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            console.log('Voice select changed to:', voice_id);
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                console.log('Sending set-voice command:', voice_id);
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🔊 Voz alterada para: ${voice_id} - ${e.target.options[e.target.selectedIndex].text}`, 'info');
+            } else {
+                console.log('WebSocket not connected, cannot send voice change');
+                log(`⚠️ Conecte-se primeiro para mudar a voz`, 'warning');
+            }
+        });
+        elements.talkBtn.addEventListener('touchstart', startRecording);
+        elements.talkBtn.addEventListener('touchend', stopRecording);
+        // TTS Voice selector listener
+        elements.ttsVoiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            // Update main voice selector
+            elements.voiceSelect.value = voice_id;
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            // Send voice change to server
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🎤 Voz TTS alterada para: ${voice_id}`, 'info');
+            }
+        });
+        // TTS Button Event Listener
+        elements.ttsPlayBtn.addEventListener('click', (e) => {
+            e.preventDefault();
+            e.stopPropagation();
+            console.log('TTS Button clicked!');
+            const text = elements.ttsText.value.trim();
+            const voice = elements.ttsVoiceSelect.value;
+            console.log('TTS Text:', text);
+            console.log('TTS Voice:', voice);
+            if (!text) {
+                alert('Por favor, digite algum texto para converter em áudio');
+                return;
+            }
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                alert('Por favor, conecte-se primeiro clicando em "Conectar"');
+                return;
+            }
+            // Mostrar status
+            elements.ttsStatus.style.display = 'block';
+            elements.ttsStatusText.textContent = '⏳ Gerando áudio...';
+            elements.ttsPlayBtn.disabled = true;
+            elements.ttsPlayBtn.textContent = '⏳ Processando...';
+            elements.ttsPlayer.style.display = 'none';
+            // Sempre usar melhor qualidade (24kHz)
+            const quality = 'high';
+            // Enviar request para TTS com qualidade máxima
+            const ttsRequest = {
+                type: 'text-to-speech',
+                text: text,
+                voice_id: voice,
+                quality: quality,
+                format: 'opus'  // Opus 24kHz @ 32kbps - máxima qualidade, mínima banda
+            };
+            console.log('Sending TTS request:', ttsRequest);
+            ws.send(JSON.stringify(ttsRequest));
+            log(`🎤 Solicitando TTS: voz=${voice}, texto="${text.substring(0, 50)}..."`, 'info');
+        });
+        // Inicialização
+        log('🚀 Ultravox Chat PCM Otimizado', 'info');
+        log('📊 Formato: PCM 16-bit @ 16kHz', 'info');
+        log('⚡ Sem FFmpeg, sem Base64!', 'success');
+    </script>
+</body>
+</html>

services/webrtc_gateway/ultravox-chat-server.js CHANGED Viewed

@@ -317,6 +317,22 @@ function handleMessage(clientId, data) {
             handleAudioData(clientId, data.audio);
             break;
         case 'broadcast':
             handleBroadcast(clientId, data.message);
             break;
@@ -561,27 +577,146 @@ const pcmBuffers = new Map();
 function handleBinaryMessage(clientId, buffer) {
     // Verificar se é header ou dados
     if (buffer.length === 8) {
-        // Header PCM
         const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.length);
         const magic = view.getUint32(0);
         const size = view.getUint32(4);
         if (magic === 0x50434D16) { // "PCM16"
             console.log(`🎤 PCM header: ${size} bytes esperados`);
-            pcmBuffers.set(clientId, { expectedSize: size, data: Buffer.alloc(0) });
         }
     } else {
-        // Processar PCM diretamente (com ou sem header prévio)
-        console.log(`🎵 Processando PCM direto: ${buffer.length} bytes`);
-        handlePCMData(clientId, buffer);
-        // Limpar buffer info se existir
-        if (pcmBuffers.has(clientId)) {
-            pcmBuffers.delete(clientId);
         }
     }
 }
 // Processar dados PCM direto (sem conversão!)
 async function handlePCMData(clientId, pcmBuffer) {
     const client = clients.get(clientId);
@@ -655,13 +790,26 @@ async function handlePCMData(clientId, pcmBuffer) {
             });
         }
         // Sintetizar áudio com TTS
         const ttsResult = await synthesizeWithTTS(clientId, response, session);
         const responseAudio = ttsResult.audioData;
         console.log(`   🔊 Áudio sintetizado: ${responseAudio.length} bytes @ ${ttsResult.sampleRate}Hz`);
-        // Enviar PCM direto (sem conversão para WebM!)
-        client.ws.send(responseAudio);
         const totalLatency = Date.now() - startTime;
         console.log(`⏱️ Latência total: ${totalLatency}ms`);

             handleAudioData(clientId, data.audio);
             break;
+        case 'audio':
+            // Processar áudio enviado em formato JSON (como no teste)
+            if (data.data && data.format) {
+                const audioBuffer = Buffer.from(data.data, 'base64');
+                console.log(`🎤 Received audio JSON: ${audioBuffer.length} bytes, format: ${data.format}`);
+                if (data.format === 'float32') {
+                    // Áudio já está em Float32, processar diretamente sem conversão
+                    handleFloat32Audio(clientId, audioBuffer);
+                } else {
+                    // Processar como PCM int16
+                    handlePCMData(clientId, audioBuffer);
+                }
+            }
+            break;
         case 'broadcast':
             handleBroadcast(clientId, data.message);
             break;
 function handleBinaryMessage(clientId, buffer) {
     // Verificar se é header ou dados
     if (buffer.length === 8) {
+        // Header PCM ou Opus
         const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.length);
         const magic = view.getUint32(0);
         const size = view.getUint32(4);
         if (magic === 0x50434D16) { // "PCM16"
             console.log(`🎤 PCM header: ${size} bytes esperados`);
+            pcmBuffers.set(clientId, { expectedSize: size, data: Buffer.alloc(0), type: 'pcm' });
+        } else if (magic === 0x4F505553) { // "OPUS"
+            console.log(`🎵 Opus header: ${size} bytes esperados`);
+            pcmBuffers.set(clientId, { expectedSize: size, data: Buffer.alloc(0), type: 'opus' });
         }
     } else {
+        // Verificar se temos um buffer esperando dados
+        const bufferInfo = pcmBuffers.get(clientId);
+        if (bufferInfo) {
+            // Adicionar dados ao buffer
+            bufferInfo.data = Buffer.concat([bufferInfo.data, buffer]);
+            console.log(`📦 Buffer acumulado: ${bufferInfo.data.length}/${bufferInfo.expectedSize} bytes`);
+            // Se recebemos todos os dados esperados
+            if (bufferInfo.data.length >= bufferInfo.expectedSize) {
+                if (bufferInfo.type === 'opus') {
+                    console.log(`🎵 Processando Opus: ${bufferInfo.data.length} bytes`);
+                    handleOpusData(clientId, bufferInfo.data);
+                } else {
+                    console.log(`🎤 Processando PCM: ${bufferInfo.data.length} bytes`);
+                    handlePCMData(clientId, bufferInfo.data);
+                }
+                pcmBuffers.delete(clientId);
+            }
+        } else {
+            // Processar PCM diretamente (sem header)
+            console.log(`🎵 Processando PCM direto: ${buffer.length} bytes`);
+            handlePCMData(clientId, buffer);
         }
     }
 }
+// Processar dados Opus
+async function handleOpusData(clientId, opusBuffer) {
+    try {
+        // Descomprimir Opus para PCM
+        const pcmBuffer = decompressOpusToPCM(opusBuffer);
+        console.log(`🎵 Opus descomprimido: ${opusBuffer.length} bytes -> ${pcmBuffer.length} bytes PCM`);
+        // Processar como PCM
+        await handlePCMData(clientId, pcmBuffer);
+    } catch (error) {
+        console.error(`❌ Erro ao processar Opus: ${error.message}`);
+    }
+}
+// Processar áudio que já está em Float32
+async function handleFloat32Audio(clientId, float32Buffer) {
+    const client = clients.get(clientId);
+    const session = sessions.get(clientId);
+    if (!client || !session) return;
+    if (client.isProcessing) {
+        console.log('⚠️ Já processando áudio, ignorando...');
+        return;
+    }
+    client.isProcessing = true;
+    const startTime = Date.now();
+    try {
+        console.log(`\n🎤 FLOAT32 AUDIO RECEBIDO [${clientId}]`);
+        console.log(`   Tamanho: ${float32Buffer.length} bytes`);
+        console.log(`   Formato: Float32 normalizado`);
+        // Áudio já está em Float32, apenas passar adiante
+        console.log(`   📊 Áudio Float32 pronto: ${float32Buffer.length} bytes`);
+        // Processar com Ultravox
+        const response = await processWithUltravox(clientId, float32Buffer, session);
+        console.log(`   📝 Resposta: "${response}"`);
+        // Armazenar na memória de conversação
+        const conversationId = client.conversationId;
+        if (conversationId) {
+            conversationMemory.addMessage(conversationId, {
+                role: 'user',
+                content: '[Áudio processado]',
+                audioSize: float32Buffer.length,
+                timestamp: startTime
+            });
+            conversationMemory.addMessage(conversationId, {
+                role: 'assistant',
+                content: response,
+                latency: Date.now() - startTime
+            });
+        }
+        // Enviar transcrição primeiro
+        client.ws.send(JSON.stringify({
+            type: 'transcription',
+            text: response,
+            timestamp: Date.now()
+        }));
+        // Sintetizar áudio com TTS
+        const ttsResult = await synthesizeWithTTS(clientId, response, session);
+        const responseAudio = ttsResult.audioData;
+        console.log(`   🔊 Áudio sintetizado: ${responseAudio.length} bytes @ ${ttsResult.sampleRate}Hz`);
+        // Enviar áudio como JSON
+        client.ws.send(JSON.stringify({
+            type: 'audio',
+            data: responseAudio.toString('base64'),
+            format: 'pcm',
+            sampleRate: ttsResult.sampleRate || 16000,
+            isFinal: true
+        }));
+        const totalLatency = Date.now() - startTime;
+        console.log(`⏱️ Latência total: ${totalLatency}ms`);
+        // Enviar métricas
+        client.ws.send(JSON.stringify({
+            type: 'metrics',
+            latency: totalLatency,
+            response: response
+        }));
+    } catch (error) {
+        console.error('❌ Erro ao processar áudio Float32:', error);
+        client.ws.send(JSON.stringify({
+            type: 'error',
+            message: error.message
+        }));
+    } finally {
+        client.isProcessing = false;
+    }
+}
 // Processar dados PCM direto (sem conversão!)
 async function handlePCMData(clientId, pcmBuffer) {
     const client = clients.get(clientId);
             });
         }
+        // Enviar transcrição primeiro
+        client.ws.send(JSON.stringify({
+            type: 'transcription',
+            text: response,
+            timestamp: Date.now()
+        }));
         // Sintetizar áudio com TTS
         const ttsResult = await synthesizeWithTTS(clientId, response, session);
         const responseAudio = ttsResult.audioData;
         console.log(`   🔊 Áudio sintetizado: ${responseAudio.length} bytes @ ${ttsResult.sampleRate}Hz`);
+        // Enviar áudio como JSON
+        client.ws.send(JSON.stringify({
+            type: 'audio',
+            data: responseAudio.toString('base64'),
+            format: 'pcm',
+            sampleRate: ttsResult.sampleRate || 16000,
+            isFinal: true
+        }));
         const totalLatency = Date.now() - startTime;
         console.log(`⏱️ Latência total: ${totalLatency}ms`);

services/webrtc_gateway/ultravox-chat-tailwind.html ADDED Viewed

	@@ -0,0 +1,393 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Ultravox Chat - Real-time Voice Assistant</title>
+    <script src="https://cdn.tailwindcss.com"></script>
+    <script src="opus-decoder.js"></script>
+    <script>
+        tailwind.config = {
+            theme: {
+                extend: {
+                    animation: {
+                        'pulse-slow': 'pulse 3s cubic-bezier(0.4, 0, 0.6, 1) infinite',
+                    }
+                }
+            }
+        }
+    </script>
+</head>
+<body class="min-h-screen bg-gradient-to-br from-purple-600 via-purple-500 to-pink-500 p-4 flex items-center justify-center">
+    <div class="w-full max-w-2xl bg-white/95 backdrop-blur-sm rounded-2xl shadow-2xl p-6 md:p-8 space-y-6">
+        <!-- Header -->
+        <div class="text-center space-y-2">
+            <h1 class="text-3xl md:text-4xl font-bold bg-gradient-to-r from-purple-600 to-pink-600 bg-clip-text text-transparent">
+                Ultravox Chat
+            </h1>
+            <p class="text-gray-600 text-sm md:text-base">Real-time Voice Assistant</p>
+        </div>
+        <!-- Status Card -->
+        <div class="bg-gray-50 rounded-xl p-4 space-y-3">
+            <div class="flex items-center justify-between">
+                <span class="text-gray-700 font-medium">Connection Status</span>
+                <span id="status" class="inline-flex items-center px-3 py-1 rounded-full text-xs font-medium bg-gray-200 text-gray-800">
+                    Disconnected
+                </span>
+            </div>
+            <!-- Voice Selection -->
+            <div class="flex flex-col sm:flex-row gap-3">
+                <div class="flex-1">
+                    <label class="block text-sm font-medium text-gray-700 mb-1">Voice</label>
+                    <select id="voiceSelect" class="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent transition">
+                        <option value="pf_dora">Dora (Portuguese Female)</option>
+                        <option value="pm_alex">Alex (Portuguese Male)</option>
+                        <option value="pm_santa">Santa (Portuguese Male)</option>
+                    </select>
+                </div>
+            </div>
+        </div>
+        <!-- Controls -->
+        <div class="space-y-4">
+            <!-- Connect Button -->
+            <button id="connectBtn"
+                class="w-full py-3 px-6 bg-gradient-to-r from-purple-600 to-pink-600 text-white font-semibold rounded-lg hover:shadow-lg transform hover:scale-[1.02] transition-all duration-200">
+                Connect to Server
+            </button>
+            <!-- Push to Talk Button -->
+            <button id="talkBtn"
+                disabled
+                class="w-full py-4 px-6 bg-gray-100 text-gray-400 font-semibold rounded-lg disabled:opacity-50 disabled:cursor-not-allowed transition-all duration-200 relative overflow-hidden group">
+                <span class="relative z-10 flex items-center justify-center gap-2">
+                    <svg class="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                        <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 11a7 7 0 01-7 7m0 0a7 7 0 01-7-7m7 7v4m0 0H8m4 0h4m-4-8a3 3 0 01-3-3V5a3 3 0 116 0v6a3 3 0 01-3 3z"></path>
+                    </svg>
+                    <span id="talkBtnText">Push to Talk</span>
+                </span>
+                <div class="absolute inset-0 bg-gradient-to-r from-purple-600 to-pink-600 transform scale-x-0 group-enabled:group-active:scale-x-100 transition-transform duration-200 origin-left"></div>
+            </button>
+        </div>
+        <!-- Activity Logs -->
+        <div class="bg-gray-50 rounded-xl p-4">
+            <h3 class="text-sm font-semibold text-gray-700 mb-3 flex items-center gap-2">
+                <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z"></path>
+                </svg>
+                Activity Log
+            </h3>
+            <div id="logs" class="space-y-2 max-h-40 overflow-y-auto text-xs text-gray-600 font-mono">
+                <div class="text-gray-400">Waiting for connection...</div>
+            </div>
+        </div>
+        <!-- Debug Info (Hidden by default) -->
+        <details class="bg-gray-50 rounded-xl p-4">
+            <summary class="cursor-pointer text-sm font-medium text-gray-700 hover:text-purple-600">
+                Debug Information
+            </summary>
+            <div class="mt-3 space-y-2 text-xs text-gray-600">
+                <div>Sample Rate: <span id="debugSampleRate" class="font-mono">24000 Hz</span></div>
+                <div>Buffer Size: <span id="debugBufferSize" class="font-mono">4096</span></div>
+                <div>Latency: <span id="debugLatency" class="font-mono">--</span></div>
+            </div>
+        </details>
+    </div>
+    <script>
+        const elements = {
+            status: document.getElementById('status'),
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            talkBtnText: document.getElementById('talkBtnText'),
+            logs: document.getElementById('logs'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            debugLatency: document.getElementById('debugLatency')
+        };
+        let ws = null;
+        let audioContext = null;
+        let mediaStream = null;
+        let processor = null;
+        let isRecording = false;
+        let audioQueue = [];
+        let isPlaying = false;
+        let startTime = null;
+        function updateStatus(status, type = 'info') {
+            const statusClasses = {
+                'success': 'bg-green-100 text-green-800',
+                'error': 'bg-red-100 text-red-800',
+                'warning': 'bg-yellow-100 text-yellow-800',
+                'info': 'bg-blue-100 text-blue-800',
+                'default': 'bg-gray-200 text-gray-800'
+            };
+            elements.status.className = `inline-flex items-center px-3 py-1 rounded-full text-xs font-medium ${statusClasses[type] || statusClasses.default}`;
+            elements.status.textContent = status;
+        }
+        function log(message, type = 'info') {
+            const timestamp = new Date().toLocaleTimeString('pt-BR');
+            const colorClasses = {
+                'success': 'text-green-600',
+                'error': 'text-red-600',
+                'warning': 'text-yellow-600',
+                'info': 'text-blue-600'
+            };
+            const div = document.createElement('div');
+            div.className = colorClasses[type] || 'text-gray-600';
+            div.innerHTML = `<span class="text-gray-400">[${timestamp}]</span> ${message}`;
+            elements.logs.appendChild(div);
+            elements.logs.scrollTop = elements.logs.scrollHeight;
+            // Keep only last 50 logs
+            while (elements.logs.children.length > 50) {
+                elements.logs.removeChild(elements.logs.firstChild);
+            }
+        }
+        async function initAudioContext() {
+            if (!audioContext) {
+                audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                    sampleRate: 24000,
+                    latencyHint: 'interactive'
+                });
+                log('Audio context initialized', 'success');
+            }
+            if (audioContext.state === 'suspended') {
+                await audioContext.resume();
+            }
+        }
+        async function playAudioChunk(audioData) {
+            if (!audioContext) return;
+            try {
+                const audioBuffer = audioContext.createBuffer(1, audioData.length, 24000);
+                audioBuffer.getChannelData(0).set(audioData);
+                const source = audioContext.createBufferSource();
+                source.buffer = audioBuffer;
+                source.connect(audioContext.destination);
+                return new Promise((resolve) => {
+                    source.onended = resolve;
+                    source.start();
+                });
+            } catch (error) {
+                console.error('Error playing audio:', error);
+            }
+        }
+        async function processAudioQueue() {
+            if (isPlaying || audioQueue.length === 0) return;
+            isPlaying = true;
+            while (audioQueue.length > 0) {
+                const audioData = audioQueue.shift();
+                await playAudioChunk(audioData);
+            }
+            isPlaying = false;
+            // Update latency
+            if (startTime) {
+                const latency = Date.now() - startTime;
+                elements.debugLatency.textContent = `${latency}ms`;
+                startTime = null;
+            }
+        }
+        function connectWebSocket() {
+            const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+            const wsUrl = `${protocol}//${window.location.host}/ultravox`;
+            log(`Connecting to ${wsUrl}...`);
+            ws = new WebSocket(wsUrl);
+            ws.binaryType = 'arraybuffer';
+            ws.onopen = () => {
+                updateStatus('Connected', 'success');
+                log('WebSocket connected', 'success');
+                elements.connectBtn.textContent = 'Disconnect';
+                elements.connectBtn.classList.remove('from-purple-600', 'to-pink-600');
+                elements.connectBtn.classList.add('from-red-500', 'to-red-600');
+                elements.talkBtn.disabled = false;
+                elements.talkBtn.classList.remove('bg-gray-100', 'text-gray-400');
+                elements.talkBtn.classList.add('bg-white', 'text-purple-600', 'border', 'border-purple-300', 'hover:border-purple-400');
+                // Send selected voice immediately after connection
+                const currentVoice = elements.voiceSelect.value || 'pf_dora';
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: currentVoice
+                }));
+                log(`Voice set to: ${currentVoice}`, 'info');
+            };
+            ws.onmessage = async (event) => {
+                if (event.data instanceof ArrayBuffer) {
+                    const int16Array = new Int16Array(event.data);
+                    const float32Array = new Float32Array(int16Array.length);
+                    for (let i = 0; i < int16Array.length; i++) {
+                        float32Array[i] = int16Array[i] / 32768.0;
+                    }
+                    audioQueue.push(float32Array);
+                    processAudioQueue();
+                } else {
+                    try {
+                        const data = JSON.parse(event.data);
+                        if (data.type === 'transcription') {
+                            log(`Transcription: ${data.text}`, 'info');
+                        } else if (data.type === 'response') {
+                            log(`Response: ${data.text}`, 'success');
+                        } else if (data.type === 'voice-changed') {
+                            log(`Voice changed to: ${data.voice_id}`, 'info');
+                        }
+                    } catch (e) {
+                        log(`Server: ${event.data}`, 'info');
+                    }
+                }
+            };
+            ws.onerror = (error) => {
+                log('WebSocket error', 'error');
+                updateStatus('Error', 'error');
+            };
+            ws.onclose = () => {
+                updateStatus('Disconnected', 'default');
+                log('WebSocket disconnected', 'warning');
+                elements.connectBtn.textContent = 'Connect to Server';
+                elements.connectBtn.classList.remove('from-red-500', 'to-red-600');
+                elements.connectBtn.classList.add('from-purple-600', 'to-pink-600');
+                elements.talkBtn.disabled = true;
+                elements.talkBtn.classList.remove('bg-white', 'text-purple-600', 'border', 'border-purple-300', 'hover:border-purple-400');
+                elements.talkBtn.classList.add('bg-gray-100', 'text-gray-400');
+                ws = null;
+            };
+        }
+        async function startRecording() {
+            try {
+                await initAudioContext();
+                mediaStream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        channelCount: 1,
+                        sampleRate: 24000,
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        autoGainControl: true
+                    }
+                });
+                const source = audioContext.createMediaStreamSource(mediaStream);
+                processor = audioContext.createScriptProcessor(4096, 1, 1);
+                processor.onaudioprocess = (e) => {
+                    if (!isRecording) return;
+                    const inputData = e.inputBuffer.getChannelData(0);
+                    const pcmData = new Int16Array(inputData.length);
+                    for (let i = 0; i < inputData.length; i++) {
+                        const s = Math.max(-1, Math.min(1, inputData[i]));
+                        pcmData[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
+                    }
+                    if (ws && ws.readyState === WebSocket.OPEN) {
+                        ws.send(pcmData.buffer);
+                    }
+                };
+                source.connect(processor);
+                processor.connect(audioContext.destination);
+                isRecording = true;
+                startTime = Date.now();
+                elements.talkBtn.classList.add('animate-pulse-slow');
+                elements.talkBtn.querySelector('span').classList.add('text-white');
+                elements.talkBtnText.textContent = 'Recording... Release to send';
+                updateStatus('Recording', 'error');
+                log('Recording started', 'info');
+            } catch (error) {
+                console.error('Error starting recording:', error);
+                log('Failed to start recording', 'error');
+            }
+        }
+        function stopRecording() {
+            isRecording = false;
+            elements.talkBtn.classList.remove('animate-pulse-slow');
+            elements.talkBtn.querySelector('span').classList.remove('text-white');
+            elements.talkBtnText.textContent = 'Push to Talk';
+            updateStatus('Connected', 'success');
+            if (processor) {
+                processor.disconnect();
+                processor = null;
+            }
+            if (mediaStream) {
+                mediaStream.getTracks().forEach(track => track.stop());
+                mediaStream = null;
+            }
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                ws.send(JSON.stringify({ type: 'end_audio' }));
+            }
+            log('Recording stopped', 'info');
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', () => {
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                ws.close();
+            } else {
+                connectWebSocket();
+            }
+        });
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', () => {
+            if (isRecording) stopRecording();
+        });
+        // Touch events for mobile
+        elements.talkBtn.addEventListener('touchstart', (e) => {
+            e.preventDefault();
+            startRecording();
+        });
+        elements.talkBtn.addEventListener('touchend', (e) => {
+            e.preventDefault();
+            stopRecording();
+        });
+        // Voice selection change
+        elements.voiceSelect.addEventListener('change', () => {
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                const voice = elements.voiceSelect.value;
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice
+                }));
+                log(`Voice changed to: ${voice}`, 'info');
+            }
+        });
+        // Initialize
+        log('Application ready', 'success');
+    </script>
+</body>
+</html>

services/webrtc_gateway/ultravox-chat.html ADDED Viewed

	@@ -0,0 +1,964 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Ultravox Chat PCM - Otimizado</title>
+    <script src="opus-decoder.js"></script>
+    <style>
+        * {
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+        }
+        body {
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            min-height: 100vh;
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            padding: 20px;
+        }
+        .container {
+            background: white;
+            border-radius: 20px;
+            box-shadow: 0 20px 60px rgba(0,0,0,0.3);
+            padding: 40px;
+            max-width: 600px;
+            width: 100%;
+        }
+        h1 {
+            text-align: center;
+            color: #333;
+            margin-bottom: 30px;
+            font-size: 28px;
+        }
+        .status {
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 15px;
+            margin-bottom: 20px;
+            display: flex;
+            align-items: center;
+            justify-content: space-between;
+        }
+        .status-dot {
+            width: 12px;
+            height: 12px;
+            border-radius: 50%;
+            background: #dc3545;
+            margin-right: 10px;
+            display: inline-block;
+        }
+        .status-dot.connected {
+            background: #28a745;
+            animation: pulse 2s infinite;
+        }
+        @keyframes pulse {
+            0% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0.7); }
+            70% { box-shadow: 0 0 0 10px rgba(40, 167, 69, 0); }
+            100% { box-shadow: 0 0 0 0 rgba(40, 167, 69, 0); }
+        }
+        .controls {
+            display: flex;
+            gap: 10px;
+            margin-bottom: 20px;
+        }
+        .voice-selector {
+            display: flex;
+            align-items: center;
+            gap: 10px;
+            margin-bottom: 20px;
+            padding: 10px;
+            background: #f8f9fa;
+            border-radius: 10px;
+        }
+        .voice-selector label {
+            font-weight: 600;
+            color: #555;
+        }
+        .voice-selector select {
+            flex: 1;
+            padding: 8px;
+            border: 2px solid #ddd;
+            border-radius: 5px;
+            font-size: 14px;
+            background: white;
+            cursor: pointer;
+        }
+        .voice-selector select:focus {
+            outline: none;
+            border-color: #667eea;
+        }
+        button {
+            flex: 1;
+            padding: 15px;
+            border: none;
+            border-radius: 10px;
+            font-size: 16px;
+            font-weight: 600;
+            cursor: pointer;
+            transition: all 0.3s ease;
+        }
+        button:disabled {
+            opacity: 0.5;
+            cursor: not-allowed;
+        }
+        .btn-primary {
+            background: #007bff;
+            color: white;
+        }
+        .btn-primary:hover:not(:disabled) {
+            background: #0056b3;
+            transform: translateY(-2px);
+            box-shadow: 0 5px 15px rgba(0,123,255,0.3);
+        }
+        .btn-danger {
+            background: #dc3545;
+            color: white;
+        }
+        .btn-danger:hover:not(:disabled) {
+            background: #c82333;
+        }
+        .btn-success {
+            background: #28a745;
+            color: white;
+        }
+        .btn-success.recording {
+            background: #dc3545;
+            animation: recordPulse 1s infinite;
+        }
+        @keyframes recordPulse {
+            0%, 100% { opacity: 1; }
+            50% { opacity: 0.7; }
+        }
+        .metrics {
+            display: grid;
+            grid-template-columns: repeat(3, 1fr);
+            gap: 15px;
+            margin-bottom: 20px;
+        }
+        .metric {
+            background: #f8f9fa;
+            padding: 15px;
+            border-radius: 10px;
+            text-align: center;
+        }
+        .metric-label {
+            font-size: 12px;
+            color: #6c757d;
+            margin-bottom: 5px;
+        }
+        .metric-value {
+            font-size: 24px;
+            font-weight: bold;
+            color: #333;
+        }
+        .log {
+            background: #f8f9fa;
+            border-radius: 10px;
+            padding: 20px;
+            height: 300px;
+            overflow-y: auto;
+            font-family: 'Monaco', 'Menlo', monospace;
+            font-size: 12px;
+        }
+        .log-entry {
+            padding: 5px 0;
+            border-bottom: 1px solid #e9ecef;
+            display: flex;
+            align-items: flex-start;
+        }
+        .log-time {
+            color: #6c757d;
+            margin-right: 10px;
+            flex-shrink: 0;
+        }
+        .log-message {
+            flex: 1;
+        }
+        .log-entry.error { color: #dc3545; }
+        .log-entry.success { color: #28a745; }
+        .log-entry.info { color: #007bff; }
+        .log-entry.warning { color: #ffc107; }
+        .audio-player {
+            display: inline-flex;
+            align-items: center;
+            gap: 10px;
+            margin-left: 10px;
+        }
+        .play-btn {
+            background: #007bff;
+            color: white;
+            border: none;
+            border-radius: 5px;
+            padding: 5px 10px;
+            cursor: pointer;
+            font-size: 12px;
+        }
+        .play-btn:hover {
+            background: #0056b3;
+        }
+    </style>
+</head>
+<body>
+    <div class="container">
+        <h1>🚀 Ultravox PCM - Otimizado</h1>
+        <div class="status">
+            <div>
+                <span class="status-dot" id="statusDot"></span>
+                <span id="statusText">Desconectado</span>
+            </div>
+            <span id="latencyText">Latência: --ms</span>
+        </div>
+        <div class="voice-selector">
+            <label for="voiceSelect">🔊 Voz TTS:</label>
+            <select id="voiceSelect">
+                <option value="pf_dora" selected>🇧🇷 [pf_dora] Português Feminino (Dora)</option>
+                <option value="pm_alex">🇧🇷 [pm_alex] Português Masculino (Alex)</option>
+                <option value="af_heart">🌍 [af_heart] Alternativa Feminina (Heart)</option>
+                <option value="af_bella">🌍 [af_bella] Alternativa Feminina (Bella)</option>
+            </select>
+        </div>
+        <div class="controls">
+            <button id="connectBtn" class="btn-primary">Conectar</button>
+            <button id="talkBtn" class="btn-success" disabled>Push to Talk</button>
+        </div>
+        <div class="metrics">
+            <div class="metric">
+                <div class="metric-label">Enviado</div>
+                <div class="metric-value" id="sentBytes">0 KB</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">Recebido</div>
+                <div class="metric-value" id="receivedBytes">0 KB</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">Formato</div>
+                <div class="metric-value" id="format">PCM</div>
+            </div>
+            <div class="metric">
+                <div class="metric-label">🎤 Voz</div>
+                <div class="metric-value" id="currentVoice" style="font-family: monospace; color: #4CAF50; font-weight: bold;">pf_dora</div>
+            </div>
+        </div>
+        <div class="log" id="log"></div>
+    </div>
+    <!-- Seção TTS Direto -->
+    <div class="container" style="margin-top: 20px;">
+        <h2>🎵 Text-to-Speech Direto</h2>
+        <p>Digite ou edite o texto abaixo e escolha uma voz para converter em áudio</p>
+        <div class="section">
+            <textarea id="ttsText" style="width: 100%; height: 120px; padding: 10px; border: 1px solid #333; border-radius: 8px; background: #1e1e1e; color: #e0e0e0; font-family: 'Segoe UI', system-ui, sans-serif; font-size: 14px; resize: vertical;">Olá! Teste de voz.</textarea>
+        </div>
+        <div class="section" style="display: flex; gap: 10px; align-items: center; margin-top: 15px;">
+            <label for="ttsVoiceSelect" style="font-weight: 600;">🔊 Voz:</label>
+            <select id="ttsVoiceSelect" style="flex: 1; padding: 8px; border: 1px solid #333; border-radius: 5px; background: #2a2a2a; color: #e0e0e0;">
+                <optgroup label="🇧🇷 Português">
+                    <option value="pf_dora" selected>[pf_dora] Feminino - Dora</option>
+                    <option value="pm_alex">[pm_alex] Masculino - Alex</option>
+                    <option value="pm_santa">[pm_santa] Masculino - Santa (Festivo)</option>
+                </optgroup>
+                <optgroup label="🇫🇷 Francês">
+                    <option value="ff_siwis">[ff_siwis] Feminino - Siwis (Nativa)</option>
+                </optgroup>
+                <optgroup label="🇺🇸 Inglês Americano">
+                    <option value="af_alloy">Feminino - Alloy</option>
+                    <option value="af_aoede">Feminino - Aoede</option>
+                    <option value="af_bella">Feminino - Bella</option>
+                    <option value="af_heart">Feminino - Heart</option>
+                    <option value="af_jessica">Feminino - Jessica</option>
+                    <option value="af_kore">Feminino - Kore</option>
+                    <option value="af_nicole">Feminino - Nicole</option>
+                    <option value="af_nova">Feminino - Nova</option>
+                    <option value="af_river">Feminino - River</option>
+                    <option value="af_sarah">Feminino - Sarah</option>
+                    <option value="af_sky">Feminino - Sky</option>
+                    <option value="am_adam">Masculino - Adam</option>
+                    <option value="am_echo">Masculino - Echo</option>
+                    <option value="am_eric">Masculino - Eric</option>
+                    <option value="am_fenrir">Masculino - Fenrir</option>
+                    <option value="am_liam">Masculino - Liam</option>
+                    <option value="am_michael">Masculino - Michael</option>
+                    <option value="am_onyx">Masculino - Onyx</option>
+                    <option value="am_puck">Masculino - Puck</option>
+                    <option value="am_santa">Masculino - Santa</option>
+                </optgroup>
+                <optgroup label="🇬🇧 Inglês Britânico">
+                    <option value="bf_alice">Feminino - Alice</option>
+                    <option value="bf_emma">Feminino - Emma</option>
+                    <option value="bf_isabella">Feminino - Isabella</option>
+                    <option value="bf_lily">Feminino - Lily</option>
+                    <option value="bm_daniel">Masculino - Daniel</option>
+                    <option value="bm_fable">Masculino - Fable</option>
+                    <option value="bm_george">Masculino - George</option>
+                    <option value="bm_lewis">Masculino - Lewis</option>
+                </optgroup>
+                <optgroup label="🇪🇸 Espanhol">
+                    <option value="ef_dora">Feminino - Dora</option>
+                    <option value="em_alex">Masculino - Alex</option>
+                    <option value="em_santa">Masculino - Santa</option>
+                </optgroup>
+                <optgroup label="🇮🇹 Italiano">
+                    <option value="if_sara">Feminino - Sara</option>
+                    <option value="im_nicola">Masculino - Nicola</option>
+                </optgroup>
+                <optgroup label="🇯🇵 Japonês">
+                    <option value="jf_alpha">Feminino - Alpha</option>
+                    <option value="jf_gongitsune">Feminino - Gongitsune</option>
+                    <option value="jf_nezumi">Feminino - Nezumi</option>
+                    <option value="jf_tebukuro">Feminino - Tebukuro</option>
+                    <option value="jm_kumo">Masculino - Kumo</option>
+                </optgroup>
+                <optgroup label="🇨🇳 Chinês">
+                    <option value="zf_xiaobei">Feminino - Xiaobei</option>
+                    <option value="zf_xiaoni">Feminino - Xiaoni</option>
+                    <option value="zf_xiaoxiao">Feminino - Xiaoxiao</option>
+                    <option value="zf_xiaoyi">Feminino - Xiaoyi</option>
+                    <option value="zm_yunjian">Masculino - Yunjian</option>
+                    <option value="zm_yunxi">Masculino - Yunxi</option>
+                    <option value="zm_yunxia">Masculino - Yunxia</option>
+                    <option value="zm_yunyang">Masculino - Yunyang</option>
+                </optgroup>
+                <optgroup label="🇮🇳 Hindi">
+                    <option value="hf_alpha">Feminino - Alpha</option>
+                    <option value="hf_beta">Feminino - Beta</option>
+                    <option value="hm_omega">Masculino - Omega</option>
+                    <option value="hm_psi">Masculino - Psi</option>
+                </optgroup>
+            </select>
+            <button id="ttsPlayBtn" class="btn-success" disabled style="padding: 10px 20px;">
+                ▶️ Gerar Áudio
+            </button>
+        </div>
+        <div id="ttsStatus" style="display: none; margin-top: 15px; padding: 15px; background: #2a2a2a; border-radius: 8px;">
+            <span id="ttsStatusText">⏳ Processando...</span>
+        </div>
+        <div id="ttsPlayer" style="display: none; margin-top: 15px;">
+            <audio id="ttsAudio" controls style="width: 100%;"></audio>
+        </div>
+    </div>
+    <script>
+        // Estado da aplicação
+        let ws = null;
+        let isConnected = false;
+        let isRecording = false;
+        let audioContext = null;
+        let stream = null;
+        let audioSource = null;
+        let audioProcessor = null;
+        let pcmBuffer = [];
+        // Métricas
+        const metrics = {
+            sentBytes: 0,
+            receivedBytes: 0,
+            latency: 0,
+            recordingStartTime: 0
+        };
+        // Elementos DOM
+        const elements = {
+            statusDot: document.getElementById('statusDot'),
+            statusText: document.getElementById('statusText'),
+            latencyText: document.getElementById('latencyText'),
+            connectBtn: document.getElementById('connectBtn'),
+            talkBtn: document.getElementById('talkBtn'),
+            voiceSelect: document.getElementById('voiceSelect'),
+            sentBytes: document.getElementById('sentBytes'),
+            receivedBytes: document.getElementById('receivedBytes'),
+            format: document.getElementById('format'),
+            log: document.getElementById('log'),
+            // TTS elements
+            ttsText: document.getElementById('ttsText'),
+            ttsVoiceSelect: document.getElementById('ttsVoiceSelect'),
+            ttsPlayBtn: document.getElementById('ttsPlayBtn'),
+            ttsStatus: document.getElementById('ttsStatus'),
+            ttsStatusText: document.getElementById('ttsStatusText'),
+            ttsPlayer: document.getElementById('ttsPlayer'),
+            ttsAudio: document.getElementById('ttsAudio')
+        };
+        // Log no console visual
+        function log(message, type = 'info') {
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = `log-entry ${type}`;
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">${message}</span>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            console.log(`[${type}] ${message}`);
+        }
+        // Atualizar métricas
+        function updateMetrics() {
+            elements.sentBytes.textContent = `${(metrics.sentBytes / 1024).toFixed(1)} KB`;
+            elements.receivedBytes.textContent = `${(metrics.receivedBytes / 1024).toFixed(1)} KB`;
+            elements.latencyText.textContent = `Latência: ${metrics.latency}ms`;
+        }
+        // Conectar ao WebSocket
+        async function connect() {
+            try {
+                // Solicitar acesso ao microfone
+                stream = await navigator.mediaDevices.getUserMedia({
+                    audio: {
+                        echoCancellation: true,
+                        noiseSuppression: true,
+                        sampleRate: 24000  // High quality 24kHz
+                    }
+                });
+                log('✅ Microfone acessado', 'success');
+                // Conectar WebSocket com suporte binário
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${protocol}//${window.location.host}/ws`;
+                ws = new WebSocket(wsUrl);
+                ws.binaryType = 'arraybuffer';
+                ws.onopen = () => {
+                    isConnected = true;
+                    elements.statusDot.classList.add('connected');
+                    elements.statusText.textContent = 'Conectado';
+                    elements.connectBtn.textContent = 'Desconectar';
+                    elements.connectBtn.classList.remove('btn-primary');
+                    elements.connectBtn.classList.add('btn-danger');
+                    elements.talkBtn.disabled = false;
+                    // Enviar voz selecionada ao conectar
+                    const currentVoice = elements.voiceSelect.value || elements.ttsVoiceSelect.value || 'pf_dora';
+                    ws.send(JSON.stringify({
+                        type: 'set-voice',
+                        voice_id: currentVoice
+                    }));
+                    log(`🔊 Voz configurada: ${currentVoice}`, 'info');
+                    elements.ttsPlayBtn.disabled = false; // Habilitar TTS button
+                    log('✅ Conectado ao servidor', 'success');
+                };
+                ws.onmessage = (event) => {
+                    if (event.data instanceof ArrayBuffer) {
+                        // Áudio PCM binário recebido
+                        handlePCMAudio(event.data);
+                    } else {
+                        // Mensagem JSON
+                        const data = JSON.parse(event.data);
+                        handleMessage(data);
+                    }
+                };
+                ws.onerror = (error) => {
+                    log(`❌ Erro WebSocket: ${error}`, 'error');
+                };
+                ws.onclose = () => {
+                    disconnect();
+                };
+            } catch (error) {
+                log(`❌ Erro ao conectar: ${error.message}`, 'error');
+            }
+        }
+        // Desconectar
+        function disconnect() {
+            isConnected = false;
+            if (ws) {
+                ws.close();
+                ws = null;
+            }
+            if (stream) {
+                stream.getTracks().forEach(track => track.stop());
+                stream = null;
+            }
+            if (audioContext) {
+                audioContext.close();
+                audioContext = null;
+            }
+            elements.statusDot.classList.remove('connected');
+            elements.statusText.textContent = 'Desconectado';
+            elements.connectBtn.textContent = 'Conectar';
+            elements.connectBtn.classList.remove('btn-danger');
+            elements.connectBtn.classList.add('btn-primary');
+            elements.talkBtn.disabled = true;
+            log('👋 Desconectado', 'warning');
+        }
+        // Iniciar gravação PCM
+        function startRecording() {
+            if (isRecording) return;
+            isRecording = true;
+            metrics.recordingStartTime = Date.now();
+            elements.talkBtn.classList.add('recording');
+            elements.talkBtn.textContent = 'Gravando...';
+            pcmBuffer = [];
+            const sampleRate = 24000; // Sempre usar melhor qualidade
+            log(`🎤 Gravando PCM 16-bit @ ${sampleRate}Hz (alta qualidade)`, 'info');
+            // Criar AudioContext se necessário
+            if (!audioContext) {
+                // Sempre usar melhor qualidade (24kHz)
+                const sampleRate = 24000;
+                audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                    sampleRate: sampleRate
+                });
+                log(`🎧 AudioContext criado: ${sampleRate}Hz (alta qualidade)`, 'info');
+            }
+            // Criar processador de áudio
+            audioSource = audioContext.createMediaStreamSource(stream);
+            audioProcessor = audioContext.createScriptProcessor(4096, 1, 1);
+            audioProcessor.onaudioprocess = (e) => {
+                if (!isRecording) return;
+                const inputData = e.inputBuffer.getChannelData(0);
+                // Calcular RMS (Root Mean Square) para melhor detecção de volume
+                let sumSquares = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    sumSquares += inputData[i] * inputData[i];
+                }
+                const rms = Math.sqrt(sumSquares / inputData.length);
+                // Calcular amplitude máxima também
+                let maxAmplitude = 0;
+                for (let i = 0; i < inputData.length; i++) {
+                    maxAmplitude = Math.max(maxAmplitude, Math.abs(inputData[i]));
+                }
+                // Detecção de voz baseada em RMS (mais confiável que amplitude máxima)
+                const voiceThreshold = 0.01; // Threshold para detectar voz
+                const hasVoice = rms > voiceThreshold;
+                // Aplicar ganho suave apenas se necessário
+                let gain = 1.0;
+                if (hasVoice && rms < 0.05) {
+                    // Ganho suave baseado em RMS, máximo 5x
+                    gain = Math.min(5.0, 0.05 / rms);
+                    if (gain > 1.2) {
+                        log(`🎤 Volume baixo detectado, aplicando ganho: ${gain.toFixed(1)}x`, 'info');
+                    }
+                }
+                // Converter Float32 para Int16 com processamento melhorado
+                const pcmData = new Int16Array(inputData.length);
+                for (let i = 0; i < inputData.length; i++) {
+                    // Aplicar ganho suave
+                    let sample = inputData[i] * gain;
+                    // Soft clipping para evitar distorção
+                    if (Math.abs(sample) > 0.95) {
+                        sample = Math.sign(sample) * (0.95 + 0.05 * Math.tanh((Math.abs(sample) - 0.95) * 10));
+                    }
+                    // Converter para Int16
+                    sample = Math.max(-1, Math.min(1, sample));
+                    pcmData[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
+                }
+                // Adicionar ao buffer apenas se detectar voz
+                if (hasVoice) {
+                    pcmBuffer.push(pcmData);
+                }
+            };
+            audioSource.connect(audioProcessor);
+            audioProcessor.connect(audioContext.destination);
+        }
+        // Parar gravação e enviar
+        function stopRecording() {
+            if (!isRecording) return;
+            isRecording = false;
+            const duration = Date.now() - metrics.recordingStartTime;
+            elements.talkBtn.classList.remove('recording');
+            elements.talkBtn.textContent = 'Push to Talk';
+            // Desconectar processador
+            if (audioProcessor) {
+                audioProcessor.disconnect();
+                audioProcessor = null;
+            }
+            if (audioSource) {
+                audioSource.disconnect();
+                audioSource = null;
+            }
+            // Verificar se há áudio para enviar
+            if (pcmBuffer.length === 0) {
+                log(`⚠️ Nenhum áudio capturado (silêncio ou volume muito baixo)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            // Combinar todos os chunks PCM
+            const totalLength = pcmBuffer.reduce((acc, chunk) => acc + chunk.length, 0);
+            // Verificar tamanho mínimo (0.5 segundos)
+            const sampleRate = 24000; // Sempre 24kHz
+            const minSamples = sampleRate * 0.5;
+            if (totalLength < minSamples) {
+                log(`⚠️ Áudio muito curto: ${(totalLength/sampleRate).toFixed(2)}s (mínimo 0.5s)`, 'warning');
+                pcmBuffer = [];
+                return;
+            }
+            const fullPCM = new Int16Array(totalLength);
+            let offset = 0;
+            for (const chunk of pcmBuffer) {
+                fullPCM.set(chunk, offset);
+                offset += chunk.length;
+            }
+            // Calcular amplitude final para debug
+            let maxAmp = 0;
+            for (let i = 0; i < Math.min(fullPCM.length, 1000); i++) {
+                maxAmp = Math.max(maxAmp, Math.abs(fullPCM[i] / 32768));
+            }
+            // Enviar PCM binário direto (sem Base64!)
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                // Enviar um header simples antes do áudio
+                const header = new ArrayBuffer(8);
+                const view = new DataView(header);
+                view.setUint32(0, 0x50434D16); // Magic: "PCM16"
+                view.setUint32(4, fullPCM.length * 2); // Tamanho em bytes
+                ws.send(header);
+                ws.send(fullPCM.buffer);
+                metrics.sentBytes += fullPCM.length * 2;
+                updateMetrics();
+                const sampleRate = 24000; // Sempre 24kHz
+                log(`📤 PCM enviado: ${(fullPCM.length * 2 / 1024).toFixed(1)}KB, ${(totalLength/sampleRate).toFixed(1)}s @ ${sampleRate}Hz, amp:${maxAmp.toFixed(3)}`, 'success');
+            }
+            // Limpar buffer após enviar
+            pcmBuffer = [];
+        }
+        // Processar mensagem JSON
+        function handleMessage(data) {
+            switch (data.type) {
+                case 'metrics':
+                    metrics.latency = data.latency;
+                    updateMetrics();
+                    log(`📊 Resposta: "${data.response}" (${data.latency}ms)`, 'success');
+                    break;
+                case 'error':
+                    log(`❌ Erro: ${data.message}`, 'error');
+                    break;
+                case 'tts-response':
+                    // Resposta do TTS direto (Opus 24kHz ou PCM)
+                    if (data.audio) {
+                        // Decodificar base64 para arraybuffer
+                        const binaryString = atob(data.audio);
+                        const bytes = new Uint8Array(binaryString.length);
+                        for (let i = 0; i < binaryString.length; i++) {
+                            bytes[i] = binaryString.charCodeAt(i);
+                        }
+                        let audioData = bytes.buffer;
+                        // IMPORTANTE: Usar a taxa enviada pelo servidor
+                        const sampleRate = data.sampleRate || 24000;
+                        console.log(`🎯 TTS Response - Taxa recebida: ${sampleRate}Hz, Formato: ${data.format}, Tamanho: ${bytes.length} bytes`);
+                        // Se for Opus, usar WebAudio API para decodificar nativamente
+                        let wavBuffer;
+                        if (data.format === 'opus') {
+                            console.log(`🗜️ Opus 24kHz recebido: ${(bytes.length/1024).toFixed(1)}KB`);
+                            // Log de economia de banda
+                            if (data.originalSize) {
+                                const compression = Math.round(100 - (bytes.length / data.originalSize) * 100);
+                                console.log(`📊 Economia de banda: ${compression}% (${(data.originalSize/1024).toFixed(1)}KB → ${(bytes.length/1024).toFixed(1)}KB)`);
+                            }
+                            // WebAudio API pode decodificar Opus nativamente
+                            // Por agora, tratar como PCM até implementar decoder completo
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        } else {
+                            // PCM - adicionar WAV header com a taxa correta
+                            wavBuffer = addWavHeader(audioData, sampleRate);
+                        }
+                        // Log da qualidade recebida
+                        console.log(`🎵 TTS pronto: ${(audioData.byteLength/1024).toFixed(1)}KB @ ${sampleRate}Hz (${data.quality || 'high'} quality, ${data.format || 'pcm'})`);
+                        // Criar blob e URL
+                        const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+                        const audioUrl = URL.createObjectURL(blob);
+                        // Atualizar player
+                        elements.ttsAudio.src = audioUrl;
+                        elements.ttsPlayer.style.display = 'block';
+                        elements.ttsStatus.style.display = 'none';
+                        elements.ttsPlayBtn.disabled = false;
+                        elements.ttsPlayBtn.textContent = '▶️ Gerar Áudio';
+                        log('🎵 Áudio TTS gerado com sucesso!', 'success');
+                    }
+                    break;
+            }
+        }
+        // Processar áudio PCM recebido
+        function handlePCMAudio(arrayBuffer) {
+            metrics.receivedBytes += arrayBuffer.byteLength;
+            updateMetrics();
+            // Criar WAV header para reproduzir
+            const wavBuffer = addWavHeader(arrayBuffer);
+            // Criar blob e URL para o áudio
+            const blob = new Blob([wavBuffer], { type: 'audio/wav' });
+            const audioUrl = URL.createObjectURL(blob);
+            // Criar log com botão de play
+            const time = new Date().toLocaleTimeString('pt-BR');
+            const entry = document.createElement('div');
+            entry.className = 'log-entry success';
+            entry.innerHTML = `
+                <span class="log-time">[${time}]</span>
+                <span class="log-message">🔊 Áudio recebido: ${(arrayBuffer.byteLength / 1024).toFixed(1)}KB</span>
+                <div class="audio-player">
+                    <button class="play-btn" onclick="playAudio('${audioUrl}')">▶️ Play</button>
+                    <audio id="audio-${Date.now()}" src="${audioUrl}" style="display: none;"></audio>
+                </div>
+            `;
+            elements.log.appendChild(entry);
+            elements.log.scrollTop = elements.log.scrollHeight;
+            // Auto-play o áudio
+            const audio = new Audio(audioUrl);
+            audio.play().catch(err => {
+                console.log('Auto-play bloqueado, use o botão para reproduzir');
+            });
+        }
+        // Função para tocar áudio manualmente
+        function playAudio(url) {
+            const audio = new Audio(url);
+            audio.play();
+        }
+        // Adicionar header WAV ao PCM
+        function addWavHeader(pcmBuffer, customSampleRate) {
+            const pcmData = new Uint8Array(pcmBuffer);
+            const wavBuffer = new ArrayBuffer(44 + pcmData.length);
+            const view = new DataView(wavBuffer);
+            // WAV header
+            const writeString = (offset, string) => {
+                for (let i = 0; i < string.length; i++) {
+                    view.setUint8(offset + i, string.charCodeAt(i));
+                }
+            };
+            writeString(0, 'RIFF');
+            view.setUint32(4, 36 + pcmData.length, true);
+            writeString(8, 'WAVE');
+            writeString(12, 'fmt ');
+            view.setUint32(16, 16, true); // fmt chunk size
+            view.setUint16(20, 1, true); // PCM format
+            view.setUint16(22, 1, true); // Mono
+            // Usar taxa customizada se fornecida, senão usar 24kHz
+            let sampleRate = customSampleRate || 24000;
+            console.log(`📝 WAV Header - Configurando taxa: ${sampleRate}Hz`);
+            view.setUint32(24, sampleRate, true); // Sample rate
+            view.setUint32(28, sampleRate * 2, true); // Byte rate: sampleRate * 1 * 2
+            view.setUint16(32, 2, true); // Block align: 1 * 2
+            view.setUint16(34, 16, true); // Bits per sample: 16-bit
+            writeString(36, 'data');
+            view.setUint32(40, pcmData.length, true);
+            // Copiar dados PCM
+            new Uint8Array(wavBuffer, 44).set(pcmData);
+            return wavBuffer;
+        }
+        // Event Listeners
+        elements.connectBtn.addEventListener('click', () => {
+            if (isConnected) {
+                disconnect();
+            } else {
+                connect();
+            }
+        });
+        elements.talkBtn.addEventListener('mousedown', startRecording);
+        elements.talkBtn.addEventListener('mouseup', stopRecording);
+        elements.talkBtn.addEventListener('mouseleave', stopRecording);
+        // Voice selector listener
+        elements.voiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            console.log('Voice select changed to:', voice_id);
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                console.log('Sending set-voice command:', voice_id);
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🔊 Voz alterada para: ${voice_id} - ${e.target.options[e.target.selectedIndex].text}`, 'info');
+            } else {
+                console.log('WebSocket not connected, cannot send voice change');
+                log(`⚠️ Conecte-se primeiro para mudar a voz`, 'warning');
+            }
+        });
+        elements.talkBtn.addEventListener('touchstart', startRecording);
+        elements.talkBtn.addEventListener('touchend', stopRecording);
+        // TTS Voice selector listener
+        elements.ttsVoiceSelect.addEventListener('change', (e) => {
+            const voice_id = e.target.value;
+            // Update main voice selector
+            elements.voiceSelect.value = voice_id;
+            // Update current voice display
+            const currentVoiceElement = document.getElementById('currentVoice');
+            if (currentVoiceElement) {
+                currentVoiceElement.textContent = voice_id;
+            }
+            // Send voice change to server
+            if (ws && ws.readyState === WebSocket.OPEN) {
+                ws.send(JSON.stringify({
+                    type: 'set-voice',
+                    voice_id: voice_id
+                }));
+                log(`🎤 Voz TTS alterada para: ${voice_id}`, 'info');
+            }
+        });
+        // TTS Button Event Listener
+        elements.ttsPlayBtn.addEventListener('click', (e) => {
+            e.preventDefault();
+            e.stopPropagation();
+            console.log('TTS Button clicked!');
+            const text = elements.ttsText.value.trim();
+            const voice = elements.ttsVoiceSelect.value;
+            console.log('TTS Text:', text);
+            console.log('TTS Voice:', voice);
+            if (!text) {
+                alert('Por favor, digite algum texto para converter em áudio');
+                return;
+            }
+            if (!ws || ws.readyState !== WebSocket.OPEN) {
+                alert('Por favor, conecte-se primeiro clicando em "Conectar"');
+                return;
+            }
+            // Mostrar status
+            elements.ttsStatus.style.display = 'block';
+            elements.ttsStatusText.textContent = '⏳ Gerando áudio...';
+            elements.ttsPlayBtn.disabled = true;
+            elements.ttsPlayBtn.textContent = '⏳ Processando...';
+            elements.ttsPlayer.style.display = 'none';
+            // Sempre usar melhor qualidade (24kHz)
+            const quality = 'high';
+            // Enviar request para TTS com qualidade máxima
+            const ttsRequest = {
+                type: 'text-to-speech',
+                text: text,
+                voice_id: voice,
+                quality: quality,
+                format: 'opus'  // Opus 24kHz @ 32kbps - máxima qualidade, mínima banda
+            };
+            console.log('Sending TTS request:', ttsRequest);
+            ws.send(JSON.stringify(ttsRequest));
+            log(`🎤 Solicitando TTS: voz=${voice}, texto="${text.substring(0, 50)}..."`, 'info');
+        });
+        // Inicialização
+        log('🚀 Ultravox Chat PCM Otimizado', 'info');
+        log('📊 Formato: PCM 16-bit @ 16kHz', 'info');
+        log('⚡ Sem FFmpeg, sem Base64!', 'success');
+    </script>
+</body>
+</html>

services/webrtc_gateway/webrtc.pid ADDED Viewed

	@@ -0,0 +1 @@


1	+ 5415

test-24khz-support.html ADDED Viewed

	@@ -0,0 +1,243 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <title>Teste: Suporte 24kHz vs 16kHz no Navegador</title>
+    <style>
+        body {
+            font-family: 'Segoe UI', system-ui, sans-serif;
+            max-width: 800px;
+            margin: 50px auto;
+            padding: 20px;
+            background: #1a1a1a;
+            color: #e0e0e0;
+        }
+        .test-section {
+            background: #2a2a2a;
+            padding: 20px;
+            border-radius: 10px;
+            margin: 20px 0;
+        }
+        h2 { color: #4CAF50; }
+        .result {
+            padding: 10px;
+            margin: 10px 0;
+            border-radius: 5px;
+            background: #333;
+        }
+        .success { background: #1e4620; }
+        .warning { background: #4a3c1e; }
+        .error { background: #4a1e1e; }
+        button {
+            background: #4CAF50;
+            color: white;
+            border: none;
+            padding: 10px 20px;
+            border-radius: 5px;
+            cursor: pointer;
+            margin: 5px;
+            font-size: 16px;
+        }
+        button:hover { background: #45a049; }
+        audio { width: 100%; margin: 10px 0; }
+    </style>
+</head>
+<body>
+    <h1>🎵 Teste de Qualidade: 24kHz vs 16kHz</h1>
+    <div class="test-section">
+        <h2>📊 Capacidades do Navegador</h2>
+        <div id="capabilities"></div>
+    </div>
+    <div class="test-section">
+        <h2>🎤 Teste de Reprodução</h2>
+        <button onclick="test16kHz()">▶️ Tocar 16kHz (Atual)</button>
+        <button onclick="test24kHz()">▶️ Tocar 24kHz (Alta Qualidade)</button>
+        <button onclick="test22kHz()">▶️ Tocar 22.05kHz (CD)</button>
+        <button onclick="test48kHz()">▶️ Tocar 48kHz (Studio)</button>
+        <div id="playback-result"></div>
+    </div>
+    <div class="test-section">
+        <h2>📈 Análise de Banda</h2>
+        <div id="bandwidth"></div>
+    </div>
+    <div class="test-section">
+        <h2>💡 Recomendação</h2>
+        <div id="recommendation"></div>
+    </div>
+    <script>
+        // Testar capacidades do navegador
+        function checkCapabilities() {
+            const cap = document.getElementById('capabilities');
+            let html = '';
+            // Verificar AudioContext
+            const AC = window.AudioContext || window.webkitAudioContext;
+            if (AC) {
+                const ctx = new AC();
+                html += `<div class="result success">✅ AudioContext suportado</div>`;
+                html += `<div class="result">📍 Taxa padrão do sistema: ${ctx.sampleRate}Hz</div>`;
+                // Testar diferentes sample rates
+                const rates = [16000, 22050, 24000, 44100, 48000];
+                html += '<div class="result">📊 Taxas testadas:</div>';
+                rates.forEach(rate => {
+                    try {
+                        const testCtx = new AC({ sampleRate: rate });
+                        const actualRate = testCtx.sampleRate;
+                        if (actualRate === rate) {
+                            html += `<div class="result success">✅ ${rate}Hz: Suportado nativamente</div>`;
+                        } else {
+                            html += `<div class="result warning">⚠️ ${rate}Hz: Resampled para ${actualRate}Hz</div>`;
+                        }
+                        testCtx.close();
+                    } catch (e) {
+                        html += `<div class="result error">❌ ${rate}Hz: Erro - ${e.message}</div>`;
+                    }
+                });
+                ctx.close();
+            } else {
+                html += `<div class="result error">❌ AudioContext não suportado</div>`;
+            }
+            // Verificar Web Audio API features
+            if (window.AudioBuffer) {
+                html += `<div class="result success">✅ AudioBuffer suportado</div>`;
+            }
+            cap.innerHTML = html;
+        }
+        // Gerar tom de teste
+        function generateTone(sampleRate, frequency = 440, duration = 1) {
+            const samples = sampleRate * duration;
+            const buffer = new Float32Array(samples);
+            for (let i = 0; i < samples; i++) {
+                buffer[i] = Math.sin(2 * Math.PI * frequency * i / sampleRate) * 0.3;
+            }
+            return buffer;
+        }
+        // Testar reprodução em diferentes taxas
+        async function testSampleRate(rate) {
+            const result = document.getElementById('playback-result');
+            try {
+                const audioContext = new (window.AudioContext || window.webkitAudioContext)({
+                    sampleRate: rate
+                });
+                // Criar buffer de teste
+                const audioBuffer = audioContext.createBuffer(1, rate, rate);
+                const channelData = generateTone(rate, 440, 0.5);
+                audioBuffer.copyToChannel(channelData, 0);
+                // Tocar
+                const source = audioContext.createBufferSource();
+                source.buffer = audioBuffer;
+                source.connect(audioContext.destination);
+                source.start();
+                result.innerHTML = `<div class="result success">🔊 Tocando em ${rate}Hz (taxa real: ${audioContext.sampleRate}Hz)</div>`;
+                // Cleanup
+                setTimeout(() => {
+                    audioContext.close();
+                }, 600);
+            } catch (e) {
+                result.innerHTML = `<div class="result error">❌ Erro ao tocar ${rate}Hz: ${e.message}</div>`;
+            }
+        }
+        function test16kHz() { testSampleRate(16000); }
+        function test24kHz() { testSampleRate(24000); }
+        function test22kHz() { testSampleRate(22050); }
+        function test48kHz() { testSampleRate(48000); }
+        // Calcular uso de banda
+        function calculateBandwidth() {
+            const bw = document.getElementById('bandwidth');
+            const rates = [
+                { rate: 16000, name: '16kHz (Atual)' },
+                { rate: 22050, name: '22.05kHz (CD)' },
+                { rate: 24000, name: '24kHz (Kokoro)' },
+                { rate: 48000, name: '48kHz (Studio)' }
+            ];
+            let html = '<h3>📊 Comparação de Banda (PCM 16-bit mono):</h3>';
+            rates.forEach(r => {
+                const bytesPerSec = r.rate * 2; // 16-bit = 2 bytes
+                const kbps = (bytesPerSec * 8) / 1000;
+                const mbPerMin = (bytesPerSec * 60) / (1024 * 1024);
+                html += `<div class="result">`;
+                html += `<strong>${r.name}:</strong><br>`;
+                html += `• ${kbps.toFixed(0)} kbps<br>`;
+                html += `• ${mbPerMin.toFixed(2)} MB/min<br>`;
+                html += `• ${((r.rate/16000 - 1) * 100).toFixed(0)}% maior que 16kHz`;
+                html += `</div>`;
+            });
+            bw.innerHTML = html;
+        }
+        // Gerar recomendação
+        function generateRecommendation() {
+            const rec = document.getElementById('recommendation');
+            let html = `
+                <h3>✅ Recomendações:</h3>
+                <div class="result success">
+                    <strong>SIM, é possível e RECOMENDADO enviar 24kHz direto!</strong><br><br>
+                    <strong>Vantagens:</strong><br>
+                    • 🎵 Qualidade 50% superior (8kHz a mais de frequências)<br>
+                    • 🎤 Melhor clareza em português (consoantes mais nítidas)<br>
+                    • 💯 Preserva qualidade original do Kokoro<br>
+                    • ✅ Todos navegadores modernos suportam<br><br>
+                    <strong>Desvantagens:</strong><br>
+                    • 📊 50% mais banda (384 kbps vs 256 kbps)<br>
+                    • 💾 50% mais memória<br><br>
+                    <strong>Implementação Ideal:</strong><br>
+                    1. <strong>Opção Adaptativa:</strong> Detectar velocidade da conexão<br>
+                    2. <strong>Configurável:</strong> Botão "Qualidade: Normal | Alta | Ultra"<br>
+                    3. <strong>Padrão Inteligente:</strong><br>
+                    &nbsp;&nbsp;• WiFi/Ethernet: 24kHz<br>
+                    &nbsp;&nbsp;• 4G/5G: 22.05kHz<br>
+                    &nbsp;&nbsp;• 3G/Slow: 16kHz<br>
+                </div>
+                <div class="result warning">
+                    <strong>⚡ Para implementar agora (rápido):</strong><br>
+                    1. Mudar AudioContext para 24000Hz na interface<br>
+                    2. Remover downsampling no servidor<br>
+                    3. Ajustar WAV header para 24000Hz<br>
+                    4. Ganho imediato de 50% na qualidade!
+                </div>
+            `;
+            rec.innerHTML = html;
+        }
+        // Inicializar testes
+        window.onload = () => {
+            checkCapabilities();
+            calculateBandwidth();
+            generateRecommendation();
+        };
+    </script>
+</body>
+</html>

test-audio-cli.js ADDED Viewed

	@@ -0,0 +1,178 @@

+#!/usr/bin/env node
+/**
+ * Teste CLI para simular envio de áudio PCM ao servidor
+ * Similar ao que o navegador faz, mas via linha de comando
+ */
+const WebSocket = require('ws');
+const fs = require('fs');
+const path = require('path');
+const WS_URL = 'ws://localhost:8082/ws';
+class AudioTester {
+    constructor() {
+        this.ws = null;
+        this.conversationId = null;
+        this.clientId = null;
+    }
+    connect() {
+        return new Promise((resolve, reject) => {
+            console.log('🔌 Conectando ao WebSocket...');
+            this.ws = new WebSocket(WS_URL);
+            this.ws.on('open', () => {
+                console.log('✅ Conectado ao servidor');
+                resolve();
+            });
+            this.ws.on('error', (error) => {
+                console.error('❌ Erro:', error.message);
+                reject(error);
+            });
+            this.ws.on('message', (data) => {
+                // Verificar se é binário (áudio) ou JSON (mensagem)
+                if (data instanceof Buffer) {
+                    console.log(`🔊 Áudio recebido: ${(data.length / 1024).toFixed(1)}KB`);
+                    // Salvar áudio para análise
+                    const filename = `response_${Date.now()}.pcm`;
+                    fs.writeFileSync(filename, data);
+                    console.log(`   Salvo como: ${filename}`);
+                } else {
+                    try {
+                        const msg = JSON.parse(data);
+                        console.log('📨 Mensagem recebida:', msg);
+                        if (msg.type === 'init') {
+                            this.clientId = msg.clientId;
+                            this.conversationId = msg.conversationId;
+                            console.log(`🔑 Client ID: ${this.clientId}`);
+                            console.log(`🔑 Conversation ID: ${this.conversationId}`);
+                        } else if (msg.type === 'metrics') {
+                            console.log(`📊 Resposta: "${msg.response}" (${msg.latency}ms)`);
+                        }
+                    } catch (e) {
+                        console.log('📨 Dados recebidos:', data.toString());
+                    }
+                }
+            });
+        });
+    }
+    /**
+     * Gera áudio PCM sintético com tom de 440Hz (nota Lá)
+     * @param {number} durationMs - Duração em milissegundos
+     * @returns {Buffer} - Buffer PCM 16-bit @ 16kHz
+     */
+    generateTestAudio(durationMs = 2000) {
+        const sampleRate = 16000;
+        const frequency = 440; // Hz (nota Lá)
+        const samples = Math.floor(sampleRate * durationMs / 1000);
+        const buffer = Buffer.alloc(samples * 2); // 16-bit = 2 bytes por sample
+        for (let i = 0; i < samples; i++) {
+            // Gerar onda senoidal
+            const t = i / sampleRate;
+            const value = Math.sin(2 * Math.PI * frequency * t);
+            // Converter para int16
+            const int16Value = Math.floor(value * 32767);
+            // Escrever no buffer (little-endian)
+            buffer.writeInt16LE(int16Value, i * 2);
+        }
+        return buffer;
+    }
+    /**
+     * Gera áudio de fala real usando espeak (se disponível)
+     */
+    async generateSpeechAudio(text = "Olá, este é um teste de áudio") {
+        const { execSync } = require('child_process');
+        const tempFile = `/tmp/test_audio_${Date.now()}.raw`;
+        try {
+            // Usar espeak para gerar áudio
+            console.log(`🎤 Gerando áudio de fala: "${text}"`);
+            execSync(`espeak -s 150 -v pt-br "${text}" --stdout | sox - -r 16000 -b 16 -e signed-integer ${tempFile}`);
+            const audioBuffer = fs.readFileSync(tempFile);
+            fs.unlinkSync(tempFile); // Limpar arquivo temporário
+            return audioBuffer;
+        } catch (error) {
+            console.warn('⚠️ espeak/sox não disponível, usando áudio sintético');
+            return this.generateTestAudio(2000);
+        }
+    }
+    async sendAudio(audioBuffer) {
+        console.log(`\n📤 Enviando áudio PCM: ${(audioBuffer.length / 1024).toFixed(1)}KB`);
+        // Enviar como dados binários diretos (como o navegador faz)
+        this.ws.send(audioBuffer);
+        console.log('✅ Áudio enviado');
+    }
+    async testConversation() {
+        console.log('\n=== Iniciando teste de conversação ===\n');
+        // Teste 1: Enviar tom sintético
+        console.log('1️⃣ Teste com tom sintético (440Hz por 2s)');
+        const syntheticAudio = this.generateTestAudio(2000);
+        await this.sendAudio(syntheticAudio);
+        await this.wait(5000); // Aguardar resposta
+        // Teste 2: Enviar áudio de fala (se possível)
+        console.log('\n2️⃣ Teste com fala sintetizada');
+        const speechAudio = await this.generateSpeechAudio("Qual é o seu nome?");
+        await this.sendAudio(speechAudio);
+        await this.wait(5000); // Aguardar resposta
+        // Teste 3: Enviar silêncio
+        console.log('\n3️⃣ Teste com silêncio');
+        const silentAudio = Buffer.alloc(32000); // 1 segundo de silêncio
+        await this.sendAudio(silentAudio);
+        await this.wait(5000); // Aguardar resposta
+    }
+    wait(ms) {
+        return new Promise(resolve => setTimeout(resolve, ms));
+    }
+    disconnect() {
+        if (this.ws) {
+            console.log('\n👋 Desconectando...');
+            this.ws.close();
+        }
+    }
+}
+async function main() {
+    const tester = new AudioTester();
+    try {
+        await tester.connect();
+        await tester.wait(500);
+        await tester.testConversation();
+        await tester.wait(2000); // Aguardar últimas respostas
+    } catch (error) {
+        console.error('Erro fatal:', error);
+    } finally {
+        tester.disconnect();
+    }
+}
+console.log('╔═══════════════════════════════════════╗');
+console.log('║   Teste CLI de Áudio PCM              ║');
+console.log('╚═══════════════════════════════════════╝\n');
+console.log('Este teste simula o envio de áudio PCM');
+console.log('como o navegador faz, mas via CLI.\n');
+main().catch(console.error);

test-grpc-updated.py ADDED Viewed

	@@ -0,0 +1,161 @@

+#!/usr/bin/env python3
+"""
+Teste do servidor Ultravox via gRPC com formato de áudio atualizado
+"""
+import grpc
+import numpy as np
+import librosa
+import tempfile
+from gtts import gTTS
+import sys
+import os
+import time
+# Adicionar paths para protos
+sys.path.append('/workspace/ultravox-pipeline/services/ultravox')
+sys.path.append('/workspace/ultravox-pipeline/protos/generated')
+import speech_pb2
+import speech_pb2_grpc
+def generate_audio_for_grpc(text, lang='pt-br'):
+    """Gera áudio TTS e retorna como bytes float32 para gRPC"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    # Criar arquivo temporário para o TTS
+    with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as tmp_file:
+        tmp_path = tmp_file.name
+    try:
+        # Gerar TTS como MP3
+        tts = gTTS(text=text, lang=lang)
+        tts.save(tmp_path)
+        # Carregar com librosa (converte automaticamente para float32 normalizado)
+        audio, sr = librosa.load(tmp_path, sr=16000)
+        print(f"📊 Áudio carregado:")
+        print(f"   - Shape: {audio.shape}")
+        print(f"   - Dtype: {audio.dtype}")
+        print(f"   - Min: {audio.min():.3f}, Max: {audio.max():.3f}")
+        print(f"   - Sample rate: {sr} Hz")
+        # Converter para bytes para enviar via gRPC
+        audio_bytes = audio.tobytes()
+        return audio_bytes, sr
+    finally:
+        # Limpar arquivo temporário
+        if os.path.exists(tmp_path):
+            os.unlink(tmp_path)
+async def test_ultravox_grpc():
+    """Testa o servidor Ultravox via gRPC"""
+    print("=" * 60)
+    print("🚀 TESTE ULTRAVOX gRPC COM FORMATO ATUALIZADO")
+    print("=" * 60)
+    # Conectar ao servidor gRPC
+    channel = grpc.aio.insecure_channel('localhost:50051')
+    stub = speech_pb2_grpc.SpeechServiceStub(channel)
+    # Lista de testes
+    tests = [
+        {
+            "audio_text": "Quanto é dois mais dois?",
+            "prompt": "Responda em português:",
+            "lang": "pt-br",
+            "expected": ["quatro", "4", "dois mais dois"]
+        },
+        {
+            "audio_text": "Qual é a capital do Brasil?",
+            "prompt": "",  # Testar sem prompt customizado
+            "lang": "pt-br",
+            "expected": ["Brasília", "capital"]
+        },
+        {
+            "audio_text": "What is the capital of France?",
+            "prompt": "Answer the question:",
+            "lang": "en",
+            "expected": ["Paris", "capital", "France"]
+        }
+    ]
+    for i, test in enumerate(tests, 1):
+        print(f"\n{'='*50}")
+        print(f"📝 Teste {i}: {test['audio_text']}")
+        if test['prompt']:
+            print(f"   Prompt: {test['prompt']}")
+        print(f"   Esperado: {', '.join(test['expected'])}")
+        # Gerar áudio
+        audio_bytes, sample_rate = generate_audio_for_grpc(test['audio_text'], test['lang'])
+        # Criar requisição gRPC
+        async def generate_requests():
+            # Primeiro chunk com metadados
+            chunk = speech_pb2.AudioChunk()
+            chunk.session_id = f"test_{i}"
+            chunk.audio_data = audio_bytes[:len(audio_bytes)//2]  # Primeira metade
+            chunk.sample_rate = sample_rate
+            chunk.is_final_chunk = False
+            if test['prompt']:
+                chunk.system_prompt = test['prompt']
+            yield chunk
+            # Segundo chunk com resto do áudio
+            chunk = speech_pb2.AudioChunk()
+            chunk.session_id = f"test_{i}"
+            chunk.audio_data = audio_bytes[len(audio_bytes)//2:]  # Segunda metade
+            chunk.sample_rate = sample_rate
+            chunk.is_final_chunk = True
+            yield chunk
+        # Enviar e receber resposta
+        print("⏳ Enviando para servidor...")
+        start_time = time.time()
+        try:
+            response_text = ""
+            token_count = 0
+            async for token in stub.StreamingRecognize(generate_requests()):
+                if token.text:
+                    response_text += token.text
+                    token_count += 1
+                if token.is_final:
+                    break
+            elapsed = time.time() - start_time
+            # Verificar resposta
+            success = any(exp.lower() in response_text.lower() for exp in test['expected'])
+            print(f"💬 Resposta: '{response_text.strip()}'")
+            print(f"📊 Tokens: {token_count}")
+            print(f"⏱️ Tempo: {elapsed:.2f}s")
+            if success:
+                print(f"✅ SUCESSO! Resposta reconhecida")
+            else:
+                print(f"⚠️ Resposta não reconhecida")
+        except Exception as e:
+            print(f"❌ Erro: {e}")
+    await channel.close()
+    print("\n" + "=" * 60)
+    print("📊 TESTE CONCLUÍDO")
+    print("=" * 60)
+if __name__ == "__main__":
+    import asyncio
+    asyncio.run(test_ultravox_grpc())

test-opus-support.html ADDED Viewed

	@@ -0,0 +1,337 @@

+<!DOCTYPE html>
+<html lang="pt-BR">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Opus Codec Test</title>
+    <style>
+        body {
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Helvetica', 'Arial', sans-serif;
+            padding: 20px;
+            max-width: 800px;
+            margin: 0 auto;
+            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            min-height: 100vh;
+        }
+        .container {
+            background: white;
+            border-radius: 12px;
+            padding: 30px;
+            box-shadow: 0 20px 40px rgba(0, 0, 0, 0.1);
+        }
+        h1 {
+            color: #333;
+            margin-bottom: 30px;
+        }
+        .codec-info {
+            background: #f0f0f5;
+            padding: 15px;
+            border-radius: 8px;
+            margin-bottom: 20px;
+            font-family: monospace;
+        }
+        .status {
+            display: inline-block;
+            padding: 5px 10px;
+            border-radius: 4px;
+            font-weight: bold;
+            margin-left: 10px;
+        }
+        .supported {
+            background: #4CAF50;
+            color: white;
+        }
+        .not-supported {
+            background: #f44336;
+            color: white;
+        }
+        .test-section {
+            margin: 20px 0;
+            padding: 20px;
+            border: 2px solid #e0e0e0;
+            border-radius: 8px;
+        }
+        button {
+            background: linear-gradient(145deg, #667eea, #764ba2);
+            color: white;
+            border: none;
+            padding: 12px 24px;
+            border-radius: 8px;
+            font-size: 16px;
+            cursor: pointer;
+            margin: 5px;
+            transition: transform 0.2s;
+        }
+        button:hover {
+            transform: translateY(-2px);
+        }
+        button:disabled {
+            opacity: 0.5;
+            cursor: not-allowed;
+        }
+        .log {
+            background: #1e1e1e;
+            color: #d4d4d4;
+            padding: 15px;
+            border-radius: 8px;
+            font-family: monospace;
+            font-size: 12px;
+            max-height: 300px;
+            overflow-y: auto;
+            margin-top: 20px;
+        }
+        .log-entry {
+            margin: 5px 0;
+            padding: 5px;
+            border-left: 3px solid #667eea;
+            padding-left: 10px;
+        }
+        .log-entry.error {
+            border-left-color: #f44336;
+            color: #ff9999;
+        }
+        .log-entry.success {
+            border-left-color: #4CAF50;
+            color: #90ee90;
+        }
+        .log-entry.info {
+            border-left-color: #2196F3;
+            color: #87ceeb;
+        }
+    </style>
+</head>
+<body>
+    <div class="container">
+        <h1>🎵 Opus Codec Support Test</h1>
+        <div class="codec-info">
+            <h3>Codec Support Detection:</h3>
+            <div id="codecStatus"></div>
+        </div>
+        <div class="test-section">
+            <h3>🎤 Recording Test</h3>
+            <button id="startRecord">Start Recording (Opus)</button>
+            <button id="stopRecord" disabled>Stop Recording</button>
+            <button id="startPCM">Start Recording (PCM)</button>
+            <button id="stopPCM" disabled>Stop Recording</button>
+        </div>
+        <div class="test-section">
+            <h3>📊 Recording Info</h3>
+            <div id="recordingInfo">
+                <p>Format: <span id="format">-</span></p>
+                <p>Size: <span id="size">-</span></p>
+                <p>Duration: <span id="duration">-</span></p>
+            </div>
+        </div>
+        <div class="log" id="log"></div>
+    </div>
+    <script>
+        let mediaRecorder;
+        let audioChunks = [];
+        let stream;
+        let startTime;
+        function log(message, type = 'info') {
+            const logEl = document.getElementById('log');
+            const entry = document.createElement('div');
+            entry.className = `log-entry ${type}`;
+            const time = new Date().toLocaleTimeString();
+            entry.textContent = `[${time}] ${message}`;
+            logEl.appendChild(entry);
+            logEl.scrollTop = logEl.scrollHeight;
+        }
+        // Check codec support
+        function checkCodecSupport() {
+            const statusEl = document.getElementById('codecStatus');
+            const codecs = [
+                'audio/webm;codecs=opus',
+                'audio/ogg;codecs=opus',
+                'audio/webm',
+                'audio/ogg'
+            ];
+            let html = '';
+            codecs.forEach(codec => {
+                const supported = MediaRecorder.isTypeSupported(codec);
+                html += `<div>${codec}: <span class="status ${supported ? 'supported' : 'not-supported'}">${supported ? 'SUPPORTED' : 'NOT SUPPORTED'}</span></div>`;
+                log(`Codec ${codec}: ${supported ? 'Supported' : 'Not Supported'}`, supported ? 'success' : 'error');
+            });
+            statusEl.innerHTML = html;
+        }
+        // Initialize
+        async function init() {
+            try {
+                stream = await navigator.mediaDevices.getUserMedia({ audio: true });
+                log('Microphone access granted', 'success');
+                checkCodecSupport();
+            } catch (error) {
+                log('Failed to get microphone access: ' + error.message, 'error');
+            }
+        }
+        // Start Opus recording
+        document.getElementById('startRecord').addEventListener('click', () => {
+            if (!stream) {
+                log('No stream available', 'error');
+                return;
+            }
+            audioChunks = [];
+            startTime = Date.now();
+            const mimeType = 'audio/webm;codecs=opus';
+            const options = {
+                mimeType: MediaRecorder.isTypeSupported(mimeType) ? mimeType : 'audio/webm',
+                audioBitsPerSecond: 32000
+            };
+            try {
+                mediaRecorder = new MediaRecorder(stream, options);
+                log(`Recording started with ${mediaRecorder.mimeType}`, 'success');
+                mediaRecorder.ondataavailable = (event) => {
+                    if (event.data.size > 0) {
+                        audioChunks.push(event.data);
+                        log(`Chunk received: ${event.data.size} bytes`);
+                    }
+                };
+                mediaRecorder.onstop = () => {
+                    const duration = ((Date.now() - startTime) / 1000).toFixed(2);
+                    const blob = new Blob(audioChunks, { type: mediaRecorder.mimeType });
+                    document.getElementById('format').textContent = mediaRecorder.mimeType;
+                    document.getElementById('size').textContent = `${(blob.size / 1024).toFixed(2)} KB`;
+                    document.getElementById('duration').textContent = `${duration} seconds`;
+                    log(`Recording stopped. Total size: ${(blob.size / 1024).toFixed(2)} KB`, 'success');
+                    // Create download link
+                    const url = URL.createObjectURL(blob);
+                    const a = document.createElement('a');
+                    a.href = url;
+                    a.download = `opus-test-${Date.now()}.webm`;
+                    a.click();
+                };
+                mediaRecorder.start(100);
+                document.getElementById('startRecord').disabled = true;
+                document.getElementById('stopRecord').disabled = false;
+            } catch (error) {
+                log('Failed to start recording: ' + error.message, 'error');
+            }
+        });
+        // Stop Opus recording
+        document.getElementById('stopRecord').addEventListener('click', () => {
+            if (mediaRecorder && mediaRecorder.state === 'recording') {
+                mediaRecorder.stop();
+                document.getElementById('startRecord').disabled = false;
+                document.getElementById('stopRecord').disabled = true;
+            }
+        });
+        // PCM recording (for comparison)
+        let audioContext;
+        let audioSource;
+        let audioProcessor;
+        let pcmBuffer = [];
+        document.getElementById('startPCM').addEventListener('click', () => {
+            if (!stream) {
+                log('No stream available', 'error');
+                return;
+            }
+            pcmBuffer = [];
+            startTime = Date.now();
+            if (!audioContext) {
+                audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 24000 });
+            }
+            audioSource = audioContext.createMediaStreamSource(stream);
+            audioProcessor = audioContext.createScriptProcessor(4096, 1, 1);
+            audioProcessor.onaudioprocess = (e) => {
+                const inputData = e.inputBuffer.getChannelData(0);
+                const pcmData = new Int16Array(inputData.length);
+                for (let i = 0; i < inputData.length; i++) {
+                    const sample = Math.max(-1, Math.min(1, inputData[i]));
+                    pcmData[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
+                }
+                pcmBuffer.push(pcmData);
+            };
+            audioSource.connect(audioProcessor);
+            audioProcessor.connect(audioContext.destination);
+            log('PCM recording started (24kHz, 16-bit)', 'success');
+            document.getElementById('startPCM').disabled = true;
+            document.getElementById('stopPCM').disabled = false;
+        });
+        document.getElementById('stopPCM').addEventListener('click', () => {
+            if (audioProcessor) {
+                audioProcessor.disconnect();
+                audioProcessor = null;
+            }
+            if (audioSource) {
+                audioSource.disconnect();
+                audioSource = null;
+            }
+            const duration = ((Date.now() - startTime) / 1000).toFixed(2);
+            const totalLength = pcmBuffer.reduce((acc, chunk) => acc + chunk.length, 0);
+            const fullPCM = new Int16Array(totalLength);
+            let offset = 0;
+            for (const chunk of pcmBuffer) {
+                fullPCM.set(chunk, offset);
+                offset += chunk.length;
+            }
+            const sizeKB = (fullPCM.length * 2 / 1024).toFixed(2);
+            document.getElementById('format').textContent = 'PCM 16-bit 24kHz';
+            document.getElementById('size').textContent = `${sizeKB} KB`;
+            document.getElementById('duration').textContent = `${duration} seconds`;
+            log(`PCM recording stopped. Total size: ${sizeKB} KB`, 'success');
+            document.getElementById('startPCM').disabled = false;
+            document.getElementById('stopPCM').disabled = true;
+        });
+        // Initialize on load
+        init();
+    </script>
+</body>
+</html>

test-simple.py ADDED Viewed

	@@ -0,0 +1,70 @@

+#!/usr/bin/env python3
+"""
+Teste simples do Ultravox com prompt básico
+"""
+import grpc
+import numpy as np
+import time
+import sys
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+sys.path.append('/workspace/ultravox-pipeline/protos')
+import speech_pb2
+import speech_pb2_grpc
+def test_ultravox():
+    """Testa o Ultravox com áudio simples"""
+    print("📡 Conectando ao Ultravox...")
+    channel = grpc.insecure_channel('localhost:50051')
+    stub = speech_pb2_grpc.SpeechServiceStub(channel)
+    # Criar áudio simples de silêncio
+    # O modelo deveria processar mesmo sem áudio real
+    audio = np.zeros(16000, dtype=np.float32)  # 1 segundo de silêncio
+    print(f"🎵 Áudio: {len(audio)} samples @ 16kHz")
+    # Criar requisição simples
+    def audio_generator():
+        chunk = speech_pb2.AudioChunk()
+        chunk.audio_data = audio.tobytes()
+        chunk.sample_rate = 16000
+        chunk.is_final_chunk = True
+        chunk.session_id = f"test_{int(time.time())}"
+        # Não enviar prompt - usar padrão <|audio|>
+        yield chunk
+    print("⏳ Processando...")
+    start_time = time.time()
+    try:
+        response_text = ""
+        token_count = 0
+        for response in stub.StreamingRecognize(audio_generator()):
+            if response.text:
+                response_text += response.text
+                token_count += 1
+                print(f"   Token {token_count}: '{response.text.strip()}'")
+                if response.is_final:
+                    print(" [FINAL]")
+                    break
+        elapsed = time.time() - start_time
+        print(f"\n📊 Resultado:")
+        print(f"   - Resposta: '{response_text.strip()}'")
+        print(f"   - Tempo: {elapsed:.2f}s")
+        print(f"   - Tokens: {token_count}")
+    except grpc.RpcError as e:
+        print(f"❌ Erro gRPC: {e.code()} - {e.details()}")
+    except Exception as e:
+        print(f"❌ Erro: {e}")
+if __name__ == "__main__":
+    test_ultravox()

test-tts-button.html ADDED Viewed

	@@ -0,0 +1,65 @@

+<!DOCTYPE html>
+<html>
+<head>
+    <title>Test TTS Button</title>
+</head>
+<body>
+    <h1>Test TTS WebSocket</h1>
+    <button id="connectBtn">Connect</button>
+    <button id="testTTSBtn" disabled>Test TTS</button>
+    <div id="log"></div>
+    <script>
+        let ws = null;
+        const log = document.getElementById('log');
+        function addLog(msg) {
+            log.innerHTML += `<p>${msg}</p>`;
+            console.log(msg);
+        }
+        document.getElementById('connectBtn').onclick = () => {
+            ws = new WebSocket('ws://localhost:8082/ws');
+            ws.binaryType = 'arraybuffer';
+            ws.onopen = () => {
+                addLog('✅ Connected');
+                document.getElementById('testTTSBtn').disabled = false;
+            };
+            ws.onmessage = (event) => {
+                if (event.data instanceof ArrayBuffer) {
+                    addLog(`📦 Received binary: ${event.data.byteLength} bytes`);
+                } else {
+                    try {
+                        const data = JSON.parse(event.data);
+                        addLog(`📨 Received JSON: ${JSON.stringify(data)}`);
+                    } catch (e) {
+                        addLog(`📨 Received text: ${event.data}`);
+                    }
+                }
+            };
+            ws.onerror = (error) => {
+                addLog(`❌ Error: ${error}`);
+            };
+            ws.onclose = () => {
+                addLog('❌ Disconnected');
+                document.getElementById('testTTSBtn').disabled = true;
+            };
+        };
+        document.getElementById('testTTSBtn').onclick = () => {
+            const ttsRequest = {
+                type: 'text-to-speech',
+                text: 'Teste de TTS direto',
+                voice_id: 'pf_dora'
+            };
+            addLog(`📤 Sending: ${JSON.stringify(ttsRequest)}`);
+            ws.send(JSON.stringify(ttsRequest));
+        };
+    </script>
+</body>
+</html>

test-ultravox-auto.py ADDED Viewed

	@@ -0,0 +1,172 @@

+#!/usr/bin/env python3
+"""
+Teste automatizado do Ultravox com TTS
+"""
+import grpc
+import numpy as np
+import time
+import sys
+import os
+from gtts import gTTS
+from pydub import AudioSegment
+import io
+# Adicionar paths
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+sys.path.append('/workspace/ultravox-pipeline/protos')
+# Importar os protobuffers compilados
+import speech_pb2
+import speech_pb2_grpc
+def generate_tts_audio(text, lang='pt-br'):
+    """Gera áudio TTS a partir de texto"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    tts = gTTS(text=text, lang=lang)
+    mp3_buffer = io.BytesIO()
+    tts.write_to_fp(mp3_buffer)
+    mp3_buffer.seek(0)
+    # Converter MP3 para PCM 16kHz
+    audio = AudioSegment.from_mp3(mp3_buffer)
+    audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
+    # Converter para numpy float32
+    samples = np.array(audio.get_array_of_samples()).astype(np.float32) / 32768.0
+    return samples
+def test_ultravox(question, expected_answer=None):
+    """Testa o Ultravox com uma pergunta"""
+    print(f"\n{'='*60}")
+    print(f"📝 Pergunta: {question}")
+    if expected_answer:
+        print(f"✅ Resposta esperada: {expected_answer}")
+    print(f"{'='*60}")
+    # Gerar áudio da pergunta
+    audio = generate_tts_audio(question)
+    print(f"🎵 Áudio gerado: {len(audio)} samples @ 16kHz ({len(audio)/16000:.2f}s)")
+    # Conectar ao Ultravox
+    print("📡 Conectando ao Ultravox...")
+    channel = grpc.insecure_channel('localhost:50051')
+    stub = speech_pb2_grpc.SpeechServiceStub(channel)
+    # Criar requisição
+    def audio_generator():
+        chunk = speech_pb2.AudioChunk()
+        chunk.audio_data = audio.tobytes()
+        chunk.sample_rate = 16000
+        chunk.is_final_chunk = True
+        chunk.session_id = f"test_{int(time.time())}"
+        # Não enviar system_prompt - deixar o servidor usar o padrão com <|audio|>
+        # chunk.system_prompt = ""
+        yield chunk
+    # Enviar e receber resposta
+    print("⏳ Processando...")
+    start_time = time.time()
+    try:
+        response_text = ""
+        token_count = 0
+        for response in stub.StreamingRecognize(audio_generator()):
+            if response.text:
+                response_text += response.text
+                token_count += 1
+                print(f"   Token {token_count}: '{response.text.strip()}'", end="")
+                if response.is_final:
+                    print(" [FINAL]")
+                    break
+                else:
+                    print()
+        elapsed = time.time() - start_time
+        print(f"\n📊 Estatísticas:")
+        print(f"   - Resposta: '{response_text.strip()}'")
+        print(f"   - Tempo: {elapsed:.2f}s")
+        print(f"   - Tokens: {token_count}")
+        # Verificar resposta esperada
+        if expected_answer:
+            if expected_answer.lower() in response_text.lower():
+                print(f"   ✅ SUCESSO! Resposta contém '{expected_answer}'")
+                return True
+            else:
+                print(f"   ⚠️ AVISO: Resposta não contém '{expected_answer}'")
+                return False
+        return True
+    except grpc.RpcError as e:
+        print(f"❌ Erro gRPC: {e.code()} - {e.details()}")
+        return False
+    except Exception as e:
+        print(f"❌ Erro: {e}")
+        return False
+def main():
+    """Executa bateria de testes"""
+    print("\n" + "="*60)
+    print("🚀 TESTE AUTOMATIZADO DO ULTRAVOX")
+    print("="*60)
+    # Lista de testes
+    tests = [
+        {
+            "question": "Quanto é dois mais dois?",
+            "expected": "quatro"
+        },
+        {
+            "question": "Qual é a capital do Brasil?",
+            "expected": "Brasília"
+        },
+        {
+            "question": "Que dia é hoje?",
+            "expected": None  # Resposta variável
+        },
+        {
+            "question": "Olá, como você está?",
+            "expected": None  # Resposta variável
+        }
+    ]
+    # Executar testes
+    results = []
+    for i, test in enumerate(tests, 1):
+        print(f"\n🧪 TESTE {i}/{len(tests)}")
+        success = test_ultravox(test["question"], test.get("expected"))
+        results.append(success)
+        time.sleep(2)  # Pausa entre testes
+    # Resumo
+    print("\n" + "="*60)
+    print("📊 RESUMO DOS TESTES")
+    print("="*60)
+    total = len(results)
+    passed = sum(1 for r in results if r)
+    failed = total - passed
+    print(f"Total: {total}")
+    print(f"✅ Passou: {passed}")
+    print(f"❌ Falhou: {failed}")
+    print(f"Taxa de sucesso: {(passed/total)*100:.1f}%")
+    if passed == total:
+        print("\n🎉 TODOS OS TESTES PASSARAM!")
+        return 0
+    else:
+        print(f"\n⚠️ {failed} teste(s) falharam")
+        return 1
+if __name__ == "__main__":
+    sys.exit(main())

test-ultravox-librosa.py ADDED Viewed

	@@ -0,0 +1,166 @@

+#!/usr/bin/env python3
+"""
+Teste do Ultravox com formato de áudio correto usando librosa
+"""
+import sys
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+from vllm import LLM, SamplingParams
+import numpy as np
+import librosa
+import soundfile as sf
+import tempfile
+from gtts import gTTS
+import time
+import os
+def generate_audio_librosa(text, lang='pt-br'):
+    """Gera áudio TTS e converte para formato esperado pelo Ultravox"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    # Criar arquivo temporário para o TTS
+    with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as tmp_file:
+        tmp_path = tmp_file.name
+    try:
+        # Gerar TTS como MP3
+        tts = gTTS(text=text, lang=lang)
+        tts.save(tmp_path)
+        # Carregar com librosa (converte automaticamente para float32 normalizado)
+        # librosa normaliza entre -1 e 1 automaticamente
+        audio, sr = librosa.load(tmp_path, sr=16000)
+        print(f"📊 Áudio carregado com librosa:")
+        print(f"   - Shape: {audio.shape}")
+        print(f"   - Dtype: {audio.dtype}")
+        print(f"   - Min: {audio.min():.3f}, Max: {audio.max():.3f}")
+        print(f"   - Sample rate: {sr} Hz")
+        return audio, sr
+    finally:
+        # Limpar arquivo temporário
+        if os.path.exists(tmp_path):
+            os.unlink(tmp_path)
+def test_ultravox_librosa():
+    """Testa Ultravox com formato de áudio correto"""
+    print("=" * 60)
+    print("🚀 TESTE ULTRAVOX COM LIBROSA (FORMATO CORRETO)")
+    print("=" * 60)
+    # Configurar modelo
+    model_name = "fixie-ai/ultravox-v0_5-llama-3_2-1b"
+    # Inicializar LLM
+    print(f"📡 Inicializando {model_name}...")
+    llm = LLM(
+        model=model_name,
+        trust_remote_code=True,
+        enforce_eager=True,
+        max_model_len=256,
+        gpu_memory_utilization=0.3
+    )
+    # Parâmetros de sampling
+    sampling_params = SamplingParams(
+        temperature=0.3,
+        max_tokens=50,
+        repetition_penalty=1.1
+    )
+    # Lista de testes
+    tests = [
+        ("Quanto é dois mais dois?", "pt-br", "quatro"),
+        ("Qual é a capital do Brasil?", "pt-br", "Brasília"),
+        ("What is two plus two?", "en", "four"),
+    ]
+    results = []
+    for question, lang, expected in tests:
+        print(f"\n{'='*50}")
+        print(f"📝 Pergunta: {question}")
+        print(f"✅ Esperado: {expected}")
+        # Gerar áudio com librosa
+        audio, sr = generate_audio_librosa(question, lang)
+        # Preparar prompt com token de áudio
+        prompt = "<|audio|>"
+        # Preparar entrada com áudio
+        llm_input = {
+            "prompt": prompt,
+            "multi_modal_data": {
+                "audio": audio  # Agora no formato correto do librosa
+            }
+        }
+        # Fazer inferência
+        print("⏳ Processando...")
+        start_time = time.time()
+        try:
+            outputs = llm.generate(
+                prompts=[llm_input],
+                sampling_params=sampling_params
+            )
+            elapsed = time.time() - start_time
+            # Extrair resposta
+            response = outputs[0].outputs[0].text.strip()
+            # Verificar se a resposta contém o esperado
+            success = expected.lower() in response.lower() if expected else False
+            print(f"💬 Resposta: '{response}'")
+            print(f"⏱️ Tempo: {elapsed:.2f}s")
+            if success:
+                print(f"✅ SUCESSO! Resposta contém '{expected}'")
+            else:
+                print(f"⚠️ Resposta não contém '{expected}'")
+            results.append({
+                'question': question,
+                'expected': expected,
+                'response': response,
+                'success': success,
+                'time': elapsed
+            })
+        except Exception as e:
+            print(f"❌ Erro: {e}")
+            results.append({
+                'question': question,
+                'expected': expected,
+                'response': str(e),
+                'success': False,
+                'time': 0
+            })
+    # Resumo
+    print("\n" + "=" * 60)
+    print("📊 RESUMO DOS TESTES")
+    print("=" * 60)
+    total = len(results)
+    passed = sum(1 for r in results if r['success'])
+    for i, result in enumerate(results, 1):
+        status = "✅" if result['success'] else "❌"
+        print(f"{status} Teste {i}: {result['question'][:30]}...")
+        print(f"   Resposta: {result['response'][:50]}...")
+    print(f"\nTotal: {total}")
+    print(f"✅ Passou: {passed}")
+    print(f"❌ Falhou: {total - passed}")
+    print(f"Taxa de sucesso: {(passed/total)*100:.1f}%")
+if __name__ == "__main__":
+    test_ultravox_librosa()

test-ultravox-simple-prompt.py ADDED Viewed

	@@ -0,0 +1,206 @@

+#!/usr/bin/env python3
+"""
+Teste do Ultravox com prompt simples sem chat template
+"""
+import sys
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+from vllm import LLM, SamplingParams
+import numpy as np
+import librosa
+import tempfile
+from gtts import gTTS
+import time
+import os
+def generate_audio_tuple(text, lang='pt-br'):
+    """Gera áudio TTS e retorna como tupla (audio, sample_rate)"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    # Criar arquivo temporário para o TTS
+    with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as tmp_file:
+        tmp_path = tmp_file.name
+    try:
+        # Gerar TTS como MP3
+        tts = gTTS(text=text, lang=lang)
+        tts.save(tmp_path)
+        # Carregar com librosa (converte automaticamente para float32 normalizado)
+        audio, sr = librosa.load(tmp_path, sr=16000)
+        print(f"📊 Áudio carregado:")
+        print(f"   - Shape: {audio.shape}")
+        print(f"   - Dtype: {audio.dtype}")
+        print(f"   - Min: {audio.min():.3f}, Max: {audio.max():.3f}")
+        print(f"   - Sample rate: {sr} Hz")
+        # Retornar como tupla (audio, sample_rate) - formato esperado pelo vLLM
+        return (audio, sr)
+    finally:
+        # Limpar arquivo temporário
+        if os.path.exists(tmp_path):
+            os.unlink(tmp_path)
+def test_ultravox_simple():
+    """Testa Ultravox com prompt simples"""
+    print("=" * 60)
+    print("🚀 TESTE ULTRAVOX COM PROMPT SIMPLES")
+    print("=" * 60)
+    # Configurar modelo
+    model_name = "fixie-ai/ultravox-v0_5-llama-3_2-1b"
+    # Inicializar LLM
+    print(f"📡 Inicializando {model_name}...")
+    llm = LLM(
+        model=model_name,
+        trust_remote_code=True,
+        enforce_eager=True,
+        max_model_len=4096,
+        gpu_memory_utilization=0.3
+    )
+    # Parâmetros de sampling
+    sampling_params = SamplingParams(
+        temperature=0.2,
+        max_tokens=64
+    )
+    # Lista de testes com diferentes formatos de prompt
+    tests = [
+        {
+            "audio_text": "Quanto é dois mais dois?",
+            "prompts": [
+                "<|audio|>",  # Apenas o token
+                "<|audio|>\nResponda em português:",  # Com instrução
+                "<|audio|>\nO que foi perguntado no áudio?",  # Com pergunta
+            ],
+            "lang": "pt-br",
+            "expected": ["quatro", "4", "dois mais dois", "2+2"]
+        },
+        {
+            "audio_text": "What is the capital of France?",
+            "prompts": [
+                "<|audio|>",
+                "<|audio|>\nAnswer the question:",
+                "<|audio|>\nWhat did you hear?",
+            ],
+            "lang": "en",
+            "expected": ["Paris", "capital", "France"]
+        }
+    ]
+    results = []
+    for test in tests:
+        audio_tuple = generate_audio_tuple(test['audio_text'], test['lang'])
+        for prompt in test['prompts']:
+            print(f"\n{'='*50}")
+            print(f"📝 Áudio: {test['audio_text']}")
+            print(f"📝 Prompt: {prompt[:50]}...")
+            print(f"✅ Esperado: {', '.join(test['expected'])}")
+            # Preparar entrada com áudio no formato de tupla
+            llm_input = {
+                "prompt": prompt,
+                "multi_modal_data": {
+                    "audio": [audio_tuple]  # Lista de tuplas (audio, sample_rate)
+                }
+            }
+            # Fazer inferência
+            print("⏳ Processando...")
+            start_time = time.time()
+            try:
+                outputs = llm.generate(
+                    prompts=[llm_input],
+                    sampling_params=sampling_params
+                )
+                elapsed = time.time() - start_time
+                # Extrair resposta
+                response = outputs[0].outputs[0].text.strip()
+                # Verificar se a resposta contém algum dos esperados
+                success = any(exp.lower() in response.lower() for exp in test['expected'])
+                print(f"💬 Resposta: '{response[:100]}...'")
+                print(f"⏱️ Tempo: {elapsed:.2f}s")
+                if success:
+                    print(f"✅ SUCESSO! Resposta reconhecida")
+                else:
+                    print(f"⚠️ Resposta não reconhecida")
+                results.append({
+                    'audio': test['audio_text'],
+                    'prompt': prompt[:30],
+                    'response': response,
+                    'success': success,
+                    'time': elapsed
+                })
+            except Exception as e:
+                print(f"❌ Erro: {e}")
+                results.append({
+                    'audio': test['audio_text'],
+                    'prompt': prompt[:30],
+                    'response': str(e),
+                    'success': False,
+                    'time': 0
+                })
+    # Resumo
+    print("\n" + "=" * 60)
+    print("📊 RESUMO DOS TESTES")
+    print("=" * 60)
+    total = len(results)
+    passed = sum(1 for r in results if r['success'])
+    # Agrupar por áudio
+    audio_groups = {}
+    for result in results:
+        if result['audio'] not in audio_groups:
+            audio_groups[result['audio']] = []
+        audio_groups[result['audio']].append(result)
+    for audio, group in audio_groups.items():
+        print(f"\n📝 Áudio: {audio}")
+        for result in group:
+            status = "✅" if result['success'] else "❌"
+            print(f"  {status} Prompt: {result['prompt']}...")
+            print(f"     Resposta: {result['response'][:60]}...")
+    print(f"\n📊 Estatísticas:")
+    print(f"Total de testes: {total}")
+    print(f"✅ Passou: {passed}")
+    print(f"❌ Falhou: {total - passed}")
+    print(f"Taxa de sucesso: {(passed/total)*100:.1f}%")
+    # Encontrar o melhor prompt
+    prompt_success = {}
+    for result in results:
+        prompt_key = result['prompt']
+        if prompt_key not in prompt_success:
+            prompt_success[prompt_key] = {'success': 0, 'total': 0}
+        prompt_success[prompt_key]['total'] += 1
+        if result['success']:
+            prompt_success[prompt_key]['success'] += 1
+    print(f"\n🏆 Melhor formato de prompt:")
+    for prompt, stats in sorted(prompt_success.items(),
+                                key=lambda x: x[1]['success']/x[1]['total'],
+                                reverse=True):
+        rate = (stats['success']/stats['total'])*100
+        print(f"  {rate:.0f}% - {prompt}...")
+if __name__ == "__main__":
+    test_ultravox_simple()

test-ultravox-tts.py ADDED Viewed

	@@ -0,0 +1,121 @@

+#!/usr/bin/env python3
+"""
+Script de teste para Ultravox com TTS
+Envia uma pergunta via áudio sintetizado e verifica a resposta
+"""
+import grpc
+import numpy as np
+import asyncio
+import time
+from gtts import gTTS
+from pydub import AudioSegment
+import io
+import sys
+import os
+# Adicionar o path para os protobuffers
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+import speech_pb2
+import speech_pb2_grpc
+async def test_ultravox_with_tts():
+    """Testa o Ultravox enviando áudio TTS com a pergunta 'Quanto é 2 + 2?'"""
+    print("🎤 Iniciando teste do Ultravox com TTS...")
+    # 1. Gerar áudio TTS com a pergunta
+    print("🔊 Gerando áudio TTS: 'Quanto é dois mais dois?'")
+    tts = gTTS(text="Quanto é dois mais dois?", lang='pt-br')
+    # Salvar em buffer de memória
+    mp3_buffer = io.BytesIO()
+    tts.write_to_fp(mp3_buffer)
+    mp3_buffer.seek(0)
+    # Converter MP3 para PCM 16kHz
+    audio = AudioSegment.from_mp3(mp3_buffer)
+    audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
+    # Converter para numpy array float32
+    samples = np.array(audio.get_array_of_samples()).astype(np.float32) / 32768.0
+    print(f"✅ Áudio gerado: {len(samples)} samples @ 16kHz")
+    print(f"   Duração: {len(samples)/16000:.2f} segundos")
+    # 2. Conectar ao servidor Ultravox
+    print("\n📡 Conectando ao Ultravox na porta 50051...")
+    try:
+        channel = grpc.aio.insecure_channel('localhost:50051')
+        stub = speech_pb2_grpc.UltravoxServiceStub(channel)
+        # 3. Criar request com o áudio
+        session_id = f"test_{int(time.time())}"
+        async def audio_generator():
+            """Gera chunks de áudio para enviar"""
+            request = speech_pb2.AudioRequest()
+            request.session_id = session_id
+            request.audio_data = samples.tobytes()
+            request.sample_rate = 16000
+            request.is_final_chunk = True
+            request.system_prompt = "Responda em português de forma simples e direta"
+            print(f"📤 Enviando áudio para sessão: {session_id}")
+            yield request
+        # 4. Enviar e receber resposta
+        print("\n⏳ Aguardando resposta do Ultravox...")
+        start_time = time.time()
+        response_text = ""
+        token_count = 0
+        async for response in stub.TranscribeStream(audio_generator()):
+            if response.text:
+                response_text += response.text
+                token_count += 1
+                print(f"   Token {token_count}: '{response.text.strip()}'")
+                if response.is_final:
+                    break
+        elapsed = time.time() - start_time
+        # 5. Verificar resposta
+        print(f"\n📝 Resposta completa: '{response_text.strip()}'")
+        print(f"⏱️  Tempo de resposta: {elapsed:.2f}s")
+        print(f"📊 Tokens recebidos: {token_count}")
+        # Verificar se a resposta contém "4" ou "quatro"
+        if "4" in response_text.lower() or "quatro" in response_text.lower():
+            print("\n✅ SUCESSO! O Ultravox respondeu corretamente!")
+        else:
+            print("\n⚠️  AVISO: A resposta não contém '4' ou 'quatro'")
+        await channel.close()
+    except grpc.RpcError as e:
+        print(f"\n❌ Erro gRPC: {e.code()} - {e.details()}")
+        return False
+    except Exception as e:
+        print(f"\n❌ Erro: {e}")
+        return False
+    return True
+if __name__ == "__main__":
+    print("=" * 60)
+    print("TESTE ULTRAVOX COM TTS")
+    print("=" * 60)
+    # Executar teste
+    success = asyncio.run(test_ultravox_with_tts())
+    if success:
+        print("\n🎉 Teste concluído com sucesso!")
+    else:
+        print("\n❌ Teste falhou!")
+    print("=" * 60)

test-ultravox-tuple.py ADDED Viewed

	@@ -0,0 +1,202 @@

+#!/usr/bin/env python3
+"""
+Teste do Ultravox com formato correto de tupla (audio, sample_rate)
+Baseado no exemplo oficial do vLLM
+"""
+import sys
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+from vllm import LLM, SamplingParams
+import numpy as np
+import librosa
+import tempfile
+from gtts import gTTS
+import time
+import os
+from transformers import AutoTokenizer
+def generate_audio_tuple(text, lang='pt-br'):
+    """Gera áudio TTS e retorna como tupla (audio, sample_rate)"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    # Criar arquivo temporário para o TTS
+    with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as tmp_file:
+        tmp_path = tmp_file.name
+    try:
+        # Gerar TTS como MP3
+        tts = gTTS(text=text, lang=lang)
+        tts.save(tmp_path)
+        # Carregar com librosa (converte automaticamente para float32 normalizado)
+        audio, sr = librosa.load(tmp_path, sr=16000)
+        print(f"📊 Áudio carregado:")
+        print(f"   - Shape: {audio.shape}")
+        print(f"   - Dtype: {audio.dtype}")
+        print(f"   - Min: {audio.min():.3f}, Max: {audio.max():.3f}")
+        print(f"   - Sample rate: {sr} Hz")
+        # Retornar como tupla (audio, sample_rate) - formato esperado pelo vLLM
+        return (audio, sr)
+    finally:
+        # Limpar arquivo temporário
+        if os.path.exists(tmp_path):
+            os.unlink(tmp_path)
+def test_ultravox_tuple():
+    """Testa Ultravox com formato de tupla correto"""
+    print("=" * 60)
+    print("🚀 TESTE ULTRAVOX COM FORMATO DE TUPLA")
+    print("=" * 60)
+    # Configurar modelo
+    model_name = "fixie-ai/ultravox-v0_5-llama-3_2-1b"
+    # Inicializar tokenizer
+    print(f"📡 Inicializando tokenizer...")
+    tokenizer = AutoTokenizer.from_pretrained(model_name)
+    # Inicializar LLM
+    print(f"📡 Inicializando {model_name}...")
+    llm = LLM(
+        model=model_name,
+        trust_remote_code=True,
+        enforce_eager=True,
+        max_model_len=4096,  # Aumentando para 4096 como no exemplo oficial
+        gpu_memory_utilization=0.3
+    )
+    # Parâmetros de sampling
+    sampling_params = SamplingParams(
+        temperature=0.2,  # Usando 0.2 como no exemplo oficial
+        max_tokens=64     # Usando 64 como no exemplo oficial
+    )
+    # Lista de testes
+    tests = [
+        {
+            "audio_text": "Quanto é dois mais dois?",
+            "question": "O que foi perguntado no áudio?",
+            "lang": "pt-br",
+            "expected": ["quatro", "2+2", "dois mais dois"]
+        },
+        {
+            "audio_text": "Qual é a capital do Brasil?",
+            "question": "Responda a pergunta que você ouviu.",
+            "lang": "pt-br",
+            "expected": ["Brasília", "capital", "Brasil"]
+        },
+        {
+            "audio_text": "What is two plus two?",
+            "question": "Answer the question you heard.",
+            "lang": "en",
+            "expected": ["four", "4", "two plus two"]
+        }
+    ]
+    results = []
+    for test in tests:
+        print(f"\n{'='*50}")
+        print(f"📝 Áudio: {test['audio_text']}")
+        print(f"❓ Pergunta: {test['question']}")
+        print(f"✅ Esperado: {', '.join(test['expected'])}")
+        # Gerar áudio como tupla
+        audio_tuple = generate_audio_tuple(test['audio_text'], test['lang'])
+        # Criar mensagem com token de áudio
+        messages = [{
+            "role": "user",
+            "content": f"<|audio|>\n{test['question']}"
+        }]
+        # Aplicar chat template
+        prompt = tokenizer.apply_chat_template(
+            messages,
+            tokenize=False,
+            add_generation_prompt=True
+        )
+        print(f"📝 Prompt gerado: {prompt[:100]}...")
+        # Preparar entrada com áudio no formato de tupla
+        llm_input = {
+            "prompt": prompt,
+            "multi_modal_data": {
+                "audio": [audio_tuple]  # Lista de tuplas (audio, sample_rate)
+            }
+        }
+        # Fazer inferência
+        print("⏳ Processando...")
+        start_time = time.time()
+        try:
+            outputs = llm.generate(
+                prompts=[llm_input],
+                sampling_params=sampling_params
+            )
+            elapsed = time.time() - start_time
+            # Extrair resposta
+            response = outputs[0].outputs[0].text.strip()
+            # Verificar se a resposta contém algum dos esperados
+            success = any(exp.lower() in response.lower() for exp in test['expected'])
+            print(f"💬 Resposta: '{response}'")
+            print(f"⏱️ Tempo: {elapsed:.2f}s")
+            if success:
+                print(f"✅ SUCESSO! Resposta reconhecida")
+            else:
+                print(f"���️ Resposta não reconhecida")
+            results.append({
+                'audio': test['audio_text'],
+                'question': test['question'],
+                'expected': test['expected'],
+                'response': response,
+                'success': success,
+                'time': elapsed
+            })
+        except Exception as e:
+            print(f"❌ Erro: {e}")
+            import traceback
+            traceback.print_exc()
+            results.append({
+                'audio': test['audio_text'],
+                'question': test['question'],
+                'expected': test['expected'],
+                'response': str(e),
+                'success': False,
+                'time': 0
+            })
+    # Resumo
+    print("\n" + "=" * 60)
+    print("📊 RESUMO DOS TESTES")
+    print("=" * 60)
+    total = len(results)
+    passed = sum(1 for r in results if r['success'])
+    for i, result in enumerate(results, 1):
+        status = "✅" if result['success'] else "❌"
+        print(f"{status} Teste {i}: {result['audio'][:30]}...")
+        print(f"   Resposta: {result['response'][:80]}...")
+    print(f"\nTotal: {total}")
+    print(f"✅ Passou: {passed}")
+    print(f"❌ Falhou: {total - passed}")
+    print(f"Taxa de sucesso: {(passed/total)*100:.1f}%")
+if __name__ == "__main__":
+    test_ultravox_tuple()

test-ultravox-vllm.py ADDED Viewed

	@@ -0,0 +1,113 @@

+#!/usr/bin/env python3
+"""
+Teste do Ultravox usando vLLM diretamente
+Baseado no exemplo oficial
+"""
+import sys
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+from vllm import LLM, SamplingParams
+import numpy as np
+from gtts import gTTS
+from pydub import AudioSegment
+import io
+import time
+def generate_audio(text, lang='pt-br'):
+    """Gera áudio TTS"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    tts = gTTS(text=text, lang=lang)
+    mp3_buffer = io.BytesIO()
+    tts.write_to_fp(mp3_buffer)
+    mp3_buffer.seek(0)
+    # Converter MP3 para PCM 16kHz
+    audio = AudioSegment.from_mp3(mp3_buffer)
+    audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
+    # Converter para numpy float32
+    samples = np.array(audio.get_array_of_samples()).astype(np.float32) / 32768.0
+    return samples
+def test_ultravox():
+    """Testa Ultravox diretamente com vLLM"""
+    print("=" * 60)
+    print("🚀 TESTE DIRETO DO ULTRAVOX COM vLLM")
+    print("=" * 60)
+    # Configurar modelo
+    model_name = "fixie-ai/ultravox-v0_5-llama-3_2-1b"
+    # Inicializar LLM
+    print(f"📡 Inicializando {model_name}...")
+    llm = LLM(
+        model=model_name,
+        trust_remote_code=True,
+        enforce_eager=True,
+        max_model_len=256,
+        gpu_memory_utilization=0.3
+    )
+    # Parâmetros de sampling
+    sampling_params = SamplingParams(
+        temperature=0.3,
+        max_tokens=50,
+        repetition_penalty=1.1
+    )
+    # Lista de testes
+    tests = [
+        ("What is 2 + 2?", "en"),
+        ("Quanto é dois mais dois?", "pt-br"),
+        ("What is the capital of Brazil?", "en")
+    ]
+    for question, lang in tests:
+        print(f"\n📝 Pergunta: {question}")
+        # Gerar áudio
+        audio = generate_audio(question, lang)
+        print(f"🎵 Áudio: {len(audio)} samples @ 16kHz")
+        # Preparar prompt com token de áudio
+        prompt = "<|audio|>"
+        # Preparar entrada com áudio
+        llm_input = {
+            "prompt": prompt,
+            "multi_modal_data": {
+                "audio": audio
+            }
+        }
+        # Fazer inferência
+        print("⏳ Processando...")
+        start_time = time.time()
+        try:
+            outputs = llm.generate(
+                prompts=[llm_input],
+                sampling_params=sampling_params
+            )
+            elapsed = time.time() - start_time
+            # Extrair resposta
+            response = outputs[0].outputs[0].text
+            print(f"✅ Resposta: '{response}'")
+            print(f"⏱️ Tempo: {elapsed:.2f}s")
+        except Exception as e:
+            print(f"❌ Erro: {e}")
+    print("\n" + "=" * 60)
+    print("✅ TESTE CONCLUÍDO")
+    print("=" * 60)
+if __name__ == "__main__":
+    test_ultravox()

test-vllm-openai.py ADDED Viewed

	@@ -0,0 +1,90 @@

+#!/usr/bin/env python3
+"""
+Teste do Ultravox usando vLLM OpenAI API
+Baseado no exemplo oficial
+"""
+import requests
+import json
+import numpy as np
+import base64
+from gtts import gTTS
+from pydub import AudioSegment
+import io
+def generate_audio(text):
+    """Gera áudio TTS"""
+    print(f"🔊 Gerando TTS: '{text}'")
+    tts = gTTS(text=text, lang='pt-br')
+    mp3_buffer = io.BytesIO()
+    tts.write_to_fp(mp3_buffer)
+    mp3_buffer.seek(0)
+    # Converter MP3 para PCM 16kHz
+    audio = AudioSegment.from_mp3(mp3_buffer)
+    audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
+    # Converter para numpy float32
+    samples = np.array(audio.get_array_of_samples()).astype(np.float32) / 32768.0
+    return samples
+def test_vllm_api():
+    """Testa usando a API OpenAI do vLLM"""
+    # Gerar áudio de teste
+    audio = generate_audio("Quanto é dois mais dois?")
+    print(f"🎵 Áudio: {len(audio)} samples @ 16kHz")
+    # Codificar áudio em base64
+    audio_bytes = audio.tobytes()
+    audio_b64 = base64.b64encode(audio_bytes).decode('utf-8')
+    # Criar mensagem no formato OpenAI com áudio
+    messages = [
+        {
+            "role": "user",
+            "content": [
+                {
+                    "type": "audio",
+                    "audio": {
+                        "data": audio_b64,
+                        "format": "pcm16"
+                    }
+                },
+                {
+                    "type": "text",
+                    "text": "What did you hear?"
+                }
+            ]
+        }
+    ]
+    # Fazer requisição para vLLM OpenAI API
+    url = "http://localhost:8000/v1/chat/completions"
+    payload = {
+        "model": "fixie-ai/ultravox-v0_5-llama-3_2-1b",
+        "messages": messages,
+        "temperature": 0.3,
+        "max_tokens": 50
+    }
+    print("📡 Enviando para vLLM API...")
+    try:
+        response = requests.post(url, json=payload)
+        if response.status_code == 200:
+            result = response.json()
+            print("✅ Resposta:", result['choices'][0]['message']['content'])
+        else:
+            print(f"❌ Erro: {response.status_code}")
+            print(response.text)
+    except Exception as e:
+        print(f"❌ Erro: {e}")
+if __name__ == "__main__":
+    test_vllm_api()

tts_server_kokoro.py ADDED Viewed

	@@ -0,0 +1,255 @@

+#!/usr/bin/env python3
+"""
+TTS Server usando Kokoro para baixa latência
+Retorna PCM direto sem conversões MP3/WAV
+"""
+import grpc
+import asyncio
+import sys
+import os
+import time
+import logging
+import numpy as np
+from concurrent import futures
+from pathlib import Path
+import importlib.util
+# Adicionar paths
+sys.path.append('/workspace/ultravox-pipeline')
+sys.path.append('/workspace/ultravox-pipeline/protos/generated')
+sys.path.append('/workspace/tts-service-kokoro/engines/kokoro')
+import tts_pb2
+import tts_pb2_grpc
+# Logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+class KokoroTTSService(tts_pb2_grpc.TTSServiceServicer):
+    """Servidor TTS usando Kokoro para síntese de voz em português"""
+    def __init__(self):
+        logger.info("🚀 Inicializando Kokoro TTS Service...")
+        self.pipeline = None
+        self.is_loaded = False
+        self.total_requests = 0
+        self.load_model()
+    def load_model(self):
+        """Carrega o modelo Kokoro uma vez e mantém em memória"""
+        if self.is_loaded:
+            return True
+        try:
+            logger.info("📚 Carregando modelo Kokoro...")
+            start_time = time.time()
+            # Importar módulo Kokoro dinamicamente
+            kokoro_path = Path('/workspace/tts-service-kokoro/engines/kokoro/gerar_audio.py')
+            if not kokoro_path.exists():
+                # Fallback para implementação simplificada
+                logger.warning("⚠️ Kokoro não encontrado, usando TTS simplificado")
+                self.use_simple_tts = True
+                self.is_loaded = True
+                return True
+            spec = importlib.util.spec_from_file_location("gerar_audio", kokoro_path)
+            gerar_audio = importlib.util.module_from_spec(spec)
+            spec.loader.exec_module(gerar_audio)
+            # Criar pipeline Kokoro
+            KPipeline = gerar_audio.KPipeline
+            self.pipeline = KPipeline(lang_code='p')  # Português
+            self.use_simple_tts = False
+            load_time = time.time() - start_time
+            logger.info(f"✅ Kokoro carregado em {load_time:.2f}s")
+            self.is_loaded = True
+            # Warm-up
+            self.warmup()
+            return True
+        except Exception as e:
+            logger.error(f"❌ Erro ao carregar Kokoro: {e}")
+            logger.info("📌 Usando TTS simplificado como fallback")
+            self.use_simple_tts = True
+            self.is_loaded = True
+            return True
+    def warmup(self):
+        """Aquece o modelo com uma síntese teste"""
+        try:
+            if not self.use_simple_tts:
+                logger.info("🔥 Aquecendo modelo Kokoro...")
+                start = time.time()
+                _ = self.synthesize_text("Teste")
+                logger.info(f"✅ Warm-up completo em {time.time() - start:.2f}s")
+        except Exception as e:
+            logger.error(f"⚠️ Erro no warm-up: {e}")
+    def synthesize_text(self, text: str) -> bytes:
+        """Sintetiza texto para áudio PCM"""
+        try:
+            if self.use_simple_tts or not self.pipeline:
+                # Fallback para síntese simples
+                return self._generate_simple_pcm(text)
+            # Usar Kokoro
+            start = time.time()
+            # Gerar áudio com Kokoro (retorna numpy array)
+            audio_array = self.pipeline.generate(
+                text,
+                voice='p_gemidao',  # Voz portuguesa
+                speed=1.0
+            )
+            # Converter para PCM 16-bit
+            if audio_array.dtype != np.int16:
+                # Normalizar e converter
+                audio_array = np.clip(audio_array * 32767, -32768, 32767).astype(np.int16)
+            synthesis_time = time.time() - start
+            logger.info(f"🎵 Kokoro synthesis: {synthesis_time*1000:.1f}ms")
+            return audio_array.tobytes()
+        except Exception as e:
+            logger.error(f"❌ Erro na síntese Kokoro: {e}")
+            # Fallback para síntese simples
+            return self._generate_simple_pcm(text)
+    def _generate_simple_pcm(self, text: str) -> bytes:
+        """Gera PCM sintético simples como fallback"""
+        try:
+            # Parâmetros de áudio
+            sample_rate = 16000
+            duration = max(0.5, len(text) * 0.08)  # ~80ms por caractere
+            # Gerar samples
+            num_samples = int(sample_rate * duration)
+            t = np.linspace(0, duration, num_samples)
+            # Frequência base (voz feminina)
+            base_freq = 220 + (hash(text) % 50)
+            # Gerar onda com harmônicos para som mais natural
+            signal = np.sin(2 * np.pi * base_freq * t) * 0.5
+            signal += np.sin(2 * np.pi * base_freq * 2 * t) * 0.3  # 2º harmônico
+            signal += np.sin(2 * np.pi * base_freq * 3 * t) * 0.2  # 3º harmônico
+            # Adicionar modulação para variação natural
+            modulation = np.sin(2 * np.pi * 3 * t) * 0.2
+            signal = signal * (0.8 + modulation)
+            # Envelope ADSR
+            fade_samples = int(0.02 * sample_rate)  # 20ms fade
+            signal[:fade_samples] *= np.linspace(0, 1, fade_samples)
+            signal[-fade_samples:] *= np.linspace(1, 0, fade_samples)
+            # Converter para PCM 16-bit
+            pcm_data = np.clip(signal * 32767, -32768, 32767).astype(np.int16)
+            return pcm_data.tobytes()
+        except Exception as e:
+            logger.error(f"❌ Erro no TTS simples: {e}")
+            # Retornar silêncio
+            return np.zeros(16000, dtype=np.int16).tobytes()
+    def StreamingSynthesize(self, request, context):
+        """
+        Implementação de streaming synthesis
+        Retorna PCM 16-bit @ 16kHz direto, sem conversões
+        """
+        try:
+            text = request.text
+            voice_id = request.voice_id or "kokoro_pt"
+            logger.info(f"🎤 TTS Request: '{text}' [{len(text)} chars]")
+            start_time = time.time()
+            # Sintetizar áudio
+            pcm_data = self.synthesize_text(text)
+            # Enviar em chunks para streaming
+            chunk_size = 4096  # 4KB chunks
+            total_chunks = len(pcm_data) // chunk_size + 1
+            for i in range(total_chunks):
+                start_idx = i * chunk_size
+                end_idx = min((i + 1) * chunk_size, len(pcm_data))
+                if start_idx < len(pcm_data):
+                    chunk_data = pcm_data[start_idx:end_idx]
+                    response = tts_pb2.AudioResponse(
+                        audio_data=chunk_data,
+                        samples_count=len(chunk_data) // 2,  # int16 = 2 bytes
+                        is_final_chunk=(i == total_chunks - 1),
+                        timestamp_ms=int(time.time() * 1000)
+                    )
+                    yield response
+                    # Simular streaming realista (sem await pois não é async)
+                    if not self.use_simple_tts:
+                        time.sleep(0.001)  # 1ms entre chunks
+            total_time = (time.time() - start_time) * 1000
+            self.total_requests += 1
+            logger.info(f"✅ TTS completo: {total_time:.1f}ms, {len(pcm_data)/1024:.1f}KB")
+            logger.info(f"📊 Total requests: {self.total_requests}")
+        except Exception as e:
+            logger.error(f"❌ TTS Synthesis error: {e}")
+            context.set_code(grpc.StatusCode.INTERNAL)
+            context.set_details(f"Synthesis failed: {e}")
+async def serve():
+    """Iniciar servidor TTS com Kokoro"""
+    logger.info("🚀 Iniciando Kokoro TTS Server...")
+    # Criar servidor gRPC assíncrono
+    server = grpc.aio.server(
+        futures.ThreadPoolExecutor(max_workers=10),
+        options=[
+            ('grpc.max_send_message_length', 50 * 1024 * 1024),  # 50MB
+            ('grpc.max_receive_message_length', 50 * 1024 * 1024),
+        ]
+    )
+    # Adicionar serviço
+    tts_service = KokoroTTSService()
+    tts_pb2_grpc.add_TTSServiceServicer_to_server(tts_service, server)
+    # Configurar porta
+    listen_addr = '[::]:50054'
+    server.add_insecure_port(listen_addr)
+    # Iniciar servidor
+    await server.start()
+    logger.info(f"🎵 Kokoro TTS Server rodando em {listen_addr}")
+    logger.info("💡 Latência esperada: <100ms para síntese")
+    logger.info("🔊 Formato: PCM 16-bit @ 16kHz (sem conversões!)")
+    # Manter rodando
+    try:
+        await server.wait_for_termination()
+    except KeyboardInterrupt:
+        logger.info("🛑 Parando servidor...")
+        await server.stop(5)
+if __name__ == '__main__':
+    asyncio.run(serve())

tunnel-macbook.sh ADDED Viewed

	@@ -0,0 +1,70 @@

+#!/bin/bash
+# Script simplificado para MacBook - SSH Tunnel para Ultravox WebRTC
+# Copie este script para seu MacBook e execute localmente
+# Configurações - EDITE ESTAS VARIÁVEIS
+REMOTE_HOST="SEU_SERVIDOR_AQUI"     # Coloque o IP ou hostname do servidor
+REMOTE_USER="ubuntu"                 # Seu usuário SSH
+SSH_KEY="~/.ssh/id_rsa"             # Caminho da sua chave SSH (opcional)
+# Portas (não precisa mudar)
+WEBRTC_PORT=8082
+ULTRAVOX_PORT=50051
+TTS_PORT=50054
+# Cores
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+RED='\033[0;31m'
+BLUE='\033[0;34m'
+NC='\033[0m'
+clear
+echo -e "${BLUE}╔═══════════════════════════════════════════════════════╗${NC}"
+echo -e "${BLUE}║     🚇 Ultravox WebRTC - Túnel SSH para MacBook      ║${NC}"
+echo -e "${BLUE}╚═══════════════════════════════════════════════════════╝${NC}"
+echo
+# Verificar se o host foi configurado
+if [ "$REMOTE_HOST" = "SEU_SERVIDOR_AQUI" ]; then
+    echo -e "${RED}❌ Erro: Configure o REMOTE_HOST no script primeiro!${NC}"
+    echo -e "${YELLOW}   Edite a linha 6 e coloque o IP ou hostname do seu servidor${NC}"
+    exit 1
+fi
+# Matar túneis existentes
+echo -e "${YELLOW}🔍 Verificando túneis existentes...${NC}"
+pkill -f "ssh.*$REMOTE_HOST.*8082:localhost:8082" 2>/dev/null
+sleep 1
+echo -e "${YELLOW}📡 Criando túnel SSH...${NC}"
+echo -e "   Servidor: ${GREEN}$REMOTE_USER@$REMOTE_HOST${NC}"
+echo
+# Criar túnel SSH (encaminha apenas a porta do WebRTC)
+if [ -f "$SSH_KEY" ]; then
+    ssh -f -N -L 8082:localhost:8082 -i "$SSH_KEY" $REMOTE_USER@$REMOTE_HOST
+else
+    ssh -f -N -L 8082:localhost:8082 $REMOTE_USER@$REMOTE_HOST
+fi
+if [ $? -eq 0 ]; then
+    echo -e "${GREEN}✅ Túnel SSH criado com sucesso!${NC}"
+    echo
+    echo -e "${BLUE}╔═══════════════════════════════════════════════════════╗${NC}"
+    echo -e "${BLUE}║                  ACESSE NO SEU MACBOOK                ║${NC}"
+    echo -e "${BLUE}╚═══════════════════════════════════════════════════════╝${NC}"
+    echo
+    echo -e "  ${GREEN}➜ http://localhost:8082${NC}"
+    echo -e "  ${GREEN}➜ http://localhost:8082/ultravox-chat.html${NC}"
+    echo -e "  ${GREEN}➜ http://localhost:8082/ultravox-chat-ios.html${NC}"
+    echo
+    echo -e "${YELLOW}Para fechar o túnel:${NC}"
+    echo -e "  ${BLUE}pkill -f 'ssh.*8082:localhost:8082'${NC}"
+    echo
+else
+    echo -e "${RED}❌ Erro ao criar túnel SSH${NC}"
+    echo -e "${YELLOW}Verifique suas credenciais SSH e conexão${NC}"
+    exit 1
+fi

tunnel.sh ADDED Viewed

	@@ -0,0 +1,95 @@

+#!/bin/bash
+# SSH Tunnel Script para acessar Ultravox WebRTC do MacBook local
+# Este script cria um túnel SSH para encaminhar a porta 8082 do servidor remoto para sua máquina local
+# Cores para output
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+RED='\033[0;31m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+# Configurações
+REMOTE_HOST="${SSH_HOST:-seu-servidor.com}"  # Substitua com o endereço do seu servidor
+REMOTE_USER="${SSH_USER:-ubuntu}"            # Substitua com seu usuário SSH
+REMOTE_PORT="${REMOTE_PORT:-8082}"           # Porta do WebRTC no servidor remoto
+LOCAL_PORT="${LOCAL_PORT:-8082}"             # Porta local no seu MacBook
+SSH_KEY="${SSH_KEY:-~/.ssh/id_rsa}"         # Caminho para sua chave SSH
+echo -e "${BLUE}═══════════════════════════════════════════════════════${NC}"
+echo -e "${BLUE}     🚇 Ultravox WebRTC SSH Tunnel - MacBook Access${NC}"
+echo -e "${BLUE}═══════════════════════════════════════════════════════${NC}"
+echo
+# Verificar se já existe um túnel na porta
+if lsof -Pi :$LOCAL_PORT -sTCP:LISTEN -t >/dev/null 2>&1; then
+    echo -e "${YELLOW}⚠️  Porta $LOCAL_PORT já está em uso${NC}"
+    echo -e "${YELLOW}Matando processo existente...${NC}"
+    lsof -ti:$LOCAL_PORT | xargs kill -9 2>/dev/null
+    sleep 1
+fi
+# Função para limpar ao sair
+cleanup() {
+    echo
+    echo -e "${YELLOW}Fechando túnel SSH...${NC}"
+    exit 0
+}
+# Capturar Ctrl+C
+trap cleanup INT
+echo -e "${YELLOW}📡 Configuração do Túnel:${NC}"
+echo -e "   Servidor Remoto: ${GREEN}$REMOTE_USER@$REMOTE_HOST${NC}"
+echo -e "   Porta Remota: ${GREEN}$REMOTE_PORT${NC}"
+echo -e "   Porta Local: ${GREEN}$LOCAL_PORT${NC}"
+echo
+echo -e "${YELLOW}🔗 Estabelecendo túnel SSH...${NC}"
+# Criar o túnel SSH
+# -N: Não executar comando remoto
+# -L: Port forwarding local
+# -o: Opções SSH para reconexão automática
+ssh -N \
+    -L $LOCAL_PORT:localhost:$REMOTE_PORT \
+    -o ServerAliveInterval=60 \
+    -o ServerAliveCountMax=3 \
+    -o ExitOnForwardFailure=yes \
+    -o StrictHostKeyChecking=no \
+    -i $SSH_KEY \
+    $REMOTE_USER@$REMOTE_HOST &
+SSH_PID=$!
+# Aguardar conexão
+sleep 2
+# Verificar se o túnel foi estabelecido
+if kill -0 $SSH_PID 2>/dev/null; then
+    echo -e "${GREEN}✅ Túnel SSH estabelecido com sucesso!${NC}"
+    echo
+    echo -e "${BLUE}═══════════════════════════════════════════════════════${NC}"
+    echo -e "${GREEN}🎉 Acesse o Ultravox Chat no seu MacBook:${NC}"
+    echo
+    echo -e "   ${BLUE}➜${NC} ${GREEN}http://localhost:$LOCAL_PORT${NC}"
+    echo -e "   ${BLUE}➜${NC} ${GREEN}http://localhost:$LOCAL_PORT/ultravox-chat.html${NC}"
+    echo -e "   ${BLUE}➜${NC} ${GREEN}http://localhost:$LOCAL_PORT/ultravox-chat-ios.html${NC}"
+    echo
+    echo -e "${BLUE}═══════════════════════════════════════════════════════${NC}"
+    echo
+    echo -e "${YELLOW}📌 Pressione Ctrl+C para fechar o túnel${NC}"
+    echo
+    # Manter o túnel aberto
+    wait $SSH_PID
+else
+    echo -e "${RED}❌ Falha ao estabelecer túnel SSH${NC}"
+    echo -e "${YELLOW}Verifique:${NC}"
+    echo -e "  1. O endereço do servidor está correto: $REMOTE_HOST"
+    echo -e "  2. Suas credenciais SSH estão configuradas"
+    echo -e "  3. O servidor está acessível"
+    echo -e "  4. A porta $REMOTE_PORT está ativa no servidor"
+    exit 1
+fi

ultravox/restart_ultravox.sh ADDED Viewed

	@@ -0,0 +1,39 @@

+#!/bin/bash
+# Script para reiniciar o servidor Ultravox com limpeza completa
+echo "🔄 Reiniciando servidor Ultravox..."
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo ""
+# 1. Executar script de parada
+SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+echo "📍 Parando servidor atual..."
+bash "$SCRIPT_DIR/stop_ultravox.sh"
+# 2. Aguardar um pouco mais para garantir liberação completa
+echo ""
+echo "⏳ Aguardando liberação completa de recursos..."
+sleep 5
+# 3. Verificar se realmente liberou
+echo "🔍 Verificando liberação..."
+if lsof -i :50051 >/dev/null 2>&1; then
+    echo "   ⚠️  Porta 50051 ainda ocupada, forçando limpeza..."
+    kill -9 $(lsof -t -i:50051) 2>/dev/null
+    sleep 2
+fi
+# 4. Verificar GPU uma última vez
+GPU_FREE=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -1)
+if [ -n "$GPU_FREE" ] && [ "$GPU_FREE" -lt "20000" ]; then
+    echo "   ⚠️  GPU com menos de 20GB livres, limpeza adicional..."
+    pkill -9 -f "python" 2>/dev/null
+    sleep 3
+fi
+# 5. Iniciar servidor
+echo ""
+echo "🚀 Iniciando novo servidor..."
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+bash "$SCRIPT_DIR/start_ultravox.sh"

ultravox/server.py CHANGED Viewed

@@ -118,12 +118,10 @@ class UltravoxServicer(speech_pb2_grpc.SpeechServiceServicer):
                 enforce_eager=True,  # Desabilitar CUDA graphs para modelos customizados
                 enable_prefix_caching=False,  # Desabilitar cache de prefixo
             )
-            # Parâmetros otimizados baseados nos testes
             self.sampling_params = SamplingParams(
-                temperature=0.3,  # Mais conservador para respostas consistentes
-                max_tokens=50,    # Respostas mais concisas
-                repetition_penalty=1.1,  # Evitar repetições
-                stop=[".", "!", "?", "\n\n"]  # Parar em pontuação natural
             )
             self.pipeline = None  # Não usar pipeline do Transformers
@@ -216,11 +214,14 @@ class UltravoxServicer(speech_pb2_grpc.SpeechServiceServicer):
                 logger.warning(f"Nenhum áudio recebido para sessão {session_id}")
                 return
-            # Usar prompt padrão otimizado (formato que funciona!)
             if not prompt:
-                # IMPORTANTE: Incluir o token <|audio|> que o Ultravox espera
-                prompt = "Você é um assistente brasileiro. <|audio|>\nResponda à pergunta que ouviu em português:"
-                logger.info("Usando prompt padrão com token <|audio|>")
             # Concatenar todo o áudio
             full_audio = np.concatenate(audio_chunks)
@@ -244,26 +245,43 @@ class UltravoxServicer(speech_pb2_grpc.SpeechServiceServicer):
                 from vllm import SamplingParams
                 # Preparar entrada para vLLM com áudio
-                # Formato otimizado que funciona com Ultravox v0.5
-                # GARANTIR que o prompt tenha o token <|audio|>
-                if "<|audio|>" not in prompt:
-                    # Adicionar o token se não estiver presente
-                    vllm_prompt = prompt.rstrip() + " <|audio|>\nResponda em português:"
-                    logger.warning(f"Token <|audio|> não encontrado no prompt, adicionando automaticamente")
                 else:
-                    vllm_prompt = prompt
                 # 🔍 LOG DETALHADO DO PROMPT PARA DEBUG
                 logger.info(f"🔍 PROMPT COMPLETO enviado para vLLM:")
-                logger.info(f"  📝 Prompt original recebido: '{prompt[:200]}...'")
-                logger.info(f"  🎯 Prompt formatado final: '{vllm_prompt[:200]}...'")
                 logger.info(f"  🎵 Áudio shape: {full_audio.shape}, dtype: {full_audio.dtype}")
                 logger.info(f"  📊 Áudio stats: min={full_audio.min():.3f}, max={full_audio.max():.3f}")
                 logger.info("=" * 80)
                 vllm_input = {
-                    "prompt": vllm_prompt,
                     "multi_modal_data": {
-                        "audio": full_audio  # numpy array já em 16kHz
                     }
                 }

                 enforce_eager=True,  # Desabilitar CUDA graphs para modelos customizados
                 enable_prefix_caching=False,  # Desabilitar cache de prefixo
             )
+            # Parâmetros otimizados baseados nos testes bem-sucedidos
             self.sampling_params = SamplingParams(
+                temperature=0.2,  # Temperatura baixa para respostas mais precisas
+                max_tokens=64    # Tokens suficientes para respostas completas
             )
             self.pipeline = None  # Não usar pipeline do Transformers
                 logger.warning(f"Nenhum áudio recebido para sessão {session_id}")
                 return
+            # SEMPRE incluir o token de áudio no prompt
             if not prompt:
+                prompt = "<|audio|>"
+                logger.info("Usando prompt simples com token de áudio")
+            elif "<|audio|>" not in prompt:
+                # Se tem prompt mas não tem o token de áudio, adicionar
+                prompt = f"{prompt}\n<|audio|>"
+                logger.info(f"Adicionando token <|audio|> ao prompt customizado")
             # Concatenar todo o áudio
             full_audio = np.concatenate(audio_chunks)
                 from vllm import SamplingParams
                 # Preparar entrada para vLLM com áudio
+                # Importar tokenizer para chat template
+                from transformers import AutoTokenizer
+                model_name = self.model_config['model_path']
+                tokenizer = AutoTokenizer.from_pretrained(model_name)
+                # Criar mensagem com token de áudio
+                if prompt and "<|audio|>" not in prompt:
+                    user_content = f"<|audio|>\n{prompt}"
+                elif not prompt:
+                    user_content = "<|audio|>\nResponda em português:"
                 else:
+                    user_content = prompt
+                messages = [{"role": "user", "content": user_content}]
+                # Aplicar chat template
+                formatted_prompt = tokenizer.apply_chat_template(
+                    messages,
+                    tokenize=False,
+                    add_generation_prompt=True
+                )
+                # Criar tupla (audio, sample_rate) - formato esperado pelo vLLM
+                audio_tuple = (full_audio, sample_rate)
                 # 🔍 LOG DETALHADO DO PROMPT PARA DEBUG
                 logger.info(f"🔍 PROMPT COMPLETO enviado para vLLM:")
+                logger.info(f"  📝 Prompt original: '{prompt[:100]}...'")
+                logger.info(f"  🎯 Prompt formatado: '{formatted_prompt[:100]}...'")
                 logger.info(f"  🎵 Áudio shape: {full_audio.shape}, dtype: {full_audio.dtype}")
                 logger.info(f"  📊 Áudio stats: min={full_audio.min():.3f}, max={full_audio.max():.3f}")
                 logger.info("=" * 80)
                 vllm_input = {
+                    "prompt": formatted_prompt,
                     "multi_modal_data": {
+                        "audio": [audio_tuple]  # Lista de tuplas (audio, sample_rate)
                     }
                 }

ultravox/server_backup.py ADDED Viewed

	@@ -0,0 +1,446 @@

+#!/usr/bin/env python3
+"""
+Servidor Ultravox gRPC - Implementação com vLLM para aceleração
+Usa vLLM quando disponível, fallback para Transformers
+"""
+import grpc
+import asyncio
+import logging
+import numpy as np
+import time
+import sys
+import os
+import torch
+import transformers
+from typing import Iterator, Optional
+from concurrent import futures
+# Tentar importar vLLM
+try:
+    from vllm import LLM, SamplingParams
+    VLLM_AVAILABLE = True
+    logger_vllm = logging.getLogger("vllm")
+    logger_vllm.info("✅ vLLM disponível - usando inferência acelerada")
+except ImportError:
+    VLLM_AVAILABLE = False
+    logger_vllm = logging.getLogger("vllm")
+    logger_vllm.warning("⚠️ vLLM não disponível - usando Transformers padrão")
+# Adicionar paths para protos
+sys.path.append('/workspace/ultravox-pipeline/services/ultravox')
+sys.path.append('/workspace/ultravox-pipeline/protos/generated')
+import speech_pb2
+import speech_pb2_grpc
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+class UltravoxServicer(speech_pb2_grpc.SpeechServiceServicer):
+    """Implementação gRPC do Ultravox usando a arquitetura correta"""
+    def __init__(self):
+        """Inicializa o serviço"""
+        logger.info("Inicializando Ultravox Service...")
+        # Verificar GPU antes de inicializar
+        if not torch.cuda.is_available():
+            logger.error("❌ GPU não disponível! Ultravox requer GPU para funcionar.")
+            logger.error("Verifique se CUDA está instalado e funcionando.")
+            raise RuntimeError("GPU não disponível. Ultravox não pode funcionar sem GPU.")
+        # Forçar uso da GPU com mais memória livre
+        best_gpu = 0
+        best_free = 0
+        for i in range(torch.cuda.device_count()):
+            total = torch.cuda.get_device_properties(i).total_memory / (1024**3)
+            allocated = torch.cuda.memory_allocated(i) / (1024**3)
+            free = total - allocated
+            logger.info(f"GPU {i}: {torch.cuda.get_device_name(i)} - {free:.1f}GB livre de {total:.1f}GB")
+            if free > best_free:
+                best_free = free
+                best_gpu = i
+        torch.cuda.set_device(best_gpu)
+        logger.info(f"✅ Usando GPU {best_gpu}: {torch.cuda.get_device_name(best_gpu)}")
+        logger.info(f"   Memória livre: {best_free:.1f}GB")
+        if best_free < 3.0:  # Ultravox 1B precisa ~3GB
+            logger.warning(f"⚠️  Pouca memória GPU disponível ({best_free:.1f}GB). Recomendado: 3GB+")
+        # Configuração do modelo usando Transformers Pipeline
+        self.model_config = {
+            'model_path': "fixie-ai/ultravox-v0_5-llama-3_2-1b",  # Modelo v0.5 com Llama-3.2-1B (funcionando com vLLM)
+            'device': f"cuda:{best_gpu}",  # GPU específica
+            'max_new_tokens': 200,
+            'temperature': 0.7,  # Temperatura para respostas mais naturais
+            'token': os.getenv('HF_TOKEN', '')  # Token HuggingFace via env var
+        }
+        # Pipeline de transformers (API estável)
+        self.pipeline = None
+        self.conversation_states = {}  # Estado por sessão
+        # Métricas
+        self.total_requests = 0
+        self.active_sessions = 0
+        self.total_tokens_generated = 0
+        self._start_time = time.time()
+        # Inicializar modelo
+        self._initialize_model()
+    def _initialize_model(self):
+        """Inicializa o modelo Ultravox usando vLLM ou Transformers"""
+        try:
+            start_time = time.time()
+            if not VLLM_AVAILABLE:
+                logger.error("❌ vLLM NÃO está instalado! Este servidor REQUER vLLM.")
+                logger.error("Instale com: pip install vllm")
+                raise RuntimeError("vLLM é obrigatório para este servidor")
+            # USAR APENAS vLLM - SEM FALLBACK
+            logger.info("🚀 Carregando modelo Ultravox via vLLM (OBRIGATÓRIO)...")
+            # vLLM para modelos multimodais
+            self.vllm_model = LLM(
+                model=self.model_config['model_path'],
+                trust_remote_code=True,
+                dtype="bfloat16",
+                gpu_memory_utilization=0.30,  # 30% (~7.2GB) para ter memória suficiente
+                max_model_len=128,  # Reduzir contexto para 128 tokens para economizar memória
+                enforce_eager=True,  # Desabilitar CUDA graphs para modelos customizados
+                enable_prefix_caching=False,  # Desabilitar cache de prefixo
+            )
+            # Parâmetros otimizados baseados nos testes
+            self.sampling_params = SamplingParams(
+                temperature=0.3,  # Mais conservador para respostas consistentes
+                max_tokens=50,    # Respostas mais concisas
+                repetition_penalty=1.1,  # Evitar repetições
+                stop=[".", "!", "?", "\n\n"]  # Parar em pontuação natural
+            )
+            self.pipeline = None  # Não usar pipeline do Transformers
+            load_time = time.time() - start_time
+            logger.info(f"✅ Modelo carregado em {load_time:.2f}s via vLLM")
+            logger.info("🎯 Usando vLLM para inferência acelerada!")
+        except Exception as e:
+            logger.error(f"Erro ao carregar modelo: {e}")
+            raise
+    def _get_conversation_state(self, session_id: str):
+        """Obtém ou cria estado de conversação para sessão"""
+        if session_id not in self.conversation_states:
+            self.conversation_states[session_id] = {
+                'created_at': time.time(),
+                'turn_count': 0,
+                'conversation_history': []
+            }
+            logger.info(f"Estado de conversação criado para sessão: {session_id}")
+        return self.conversation_states[session_id]
+    def _cleanup_old_sessions(self, max_age: int = 1800):  # 30 minutos
+        """Remove sessões antigas"""
+        current_time = time.time()
+        expired_sessions = [
+            sid for sid, state in self.conversation_states.items()
+            if current_time - state['created_at'] > max_age
+        ]
+        for sid in expired_sessions:
+            del self.conversation_states[sid]
+            logger.info(f"Sessão expirada removida: {sid}")
+    async def StreamingRecognize(self,
+                                 request_iterator,
+                                 context: grpc.ServicerContext) -> Iterator[speech_pb2.TranscriptToken]:
+        """
+        Processa stream de áudio usando a arquitetura Ultravox completa
+        Args:
+            request_iterator: Iterator de chunks de áudio
+            context: Contexto gRPC
+        Yields:
+            Tokens de transcrição + resposta do LLM
+        """
+        session_id = None
+        start_time = time.time()
+        self.total_requests += 1
+        try:
+            # Coletar todo o áudio primeiro (como no Gradio)
+            audio_chunks = []
+            sample_rate = 16000
+            prompt = None  # Será obtido do metadata ou usado padrão
+            # Processar chunks de entrada
+            async for audio_chunk in request_iterator:
+                if not session_id:
+                    session_id = audio_chunk.session_id or f"session_{self.total_requests}"
+                    logger.info(f"Nova sessão Ultravox: {session_id}")
+                    self.active_sessions += 1
+                # DEBUG: Log todos os campos recebidos
+                logger.info(f"DEBUG - Chunk recebido para {session_id}:")
+                logger.info(f"  - audio_data: {len(audio_chunk.audio_data)} bytes")
+                logger.info(f"  - sample_rate: {audio_chunk.sample_rate}")
+                logger.info(f"  - is_final_chunk: {audio_chunk.is_final_chunk}")
+                # Obter prompt do campo system_prompt
+                if not prompt and audio_chunk.system_prompt:
+                    prompt = audio_chunk.system_prompt
+                    logger.info(f"✅ PROMPT DINÂMICO recebido: {prompt[:100]}...")
+                elif not audio_chunk.system_prompt:
+                    logger.info(f"DEBUG - Sem system_prompt no chunk")
+                sample_rate = audio_chunk.sample_rate or 16000
+                # CRUCIAL: Converter de bytes para numpy float32 (como descoberto no Gradio)
+                audio_data = np.frombuffer(audio_chunk.audio_data, dtype=np.float32)
+                audio_chunks.append(audio_data)
+                # Se é chunk final, processar
+                if audio_chunk.is_final_chunk:
+                    break
+            if not audio_chunks:
+                logger.warning(f"Nenhum áudio recebido para sessão {session_id}")
+                return
+            # SEMPRE incluir o token de áudio, mesmo com system_prompt
+            if prompt and "<|audio|>" not in prompt:
+                # Se tem prompt mas não tem o token de áudio, adicionar
+                prompt = f"{prompt}\n<|audio|>"
+                logger.info(f"Adicionando token <|audio|> ao prompt customizado")
+            elif not prompt:
+                # ⚠️ FORMATO SIMPLES QUE FUNCIONA COM ULTRAVOX v0.5! ⚠️
+                # O token <|audio|> é substituído pelo áudio automaticamente
+                prompt = "<|audio|>"
+                logger.info("Usando prompt simples com apenas token de áudio")
+            # Concatenar todo o áudio
+            full_audio = np.concatenate(audio_chunks)
+            logger.info(f"Áudio processado: {len(full_audio)} samples @ {sample_rate}Hz para sessão {session_id}")
+            # Obter estado de conversação
+            conv_state = self._get_conversation_state(session_id)
+            conv_state['turn_count'] += 1
+            # Processar com vLLM ou Transformers
+            backend = "vLLM" if self.vllm_model else "Transformers"
+            logger.info(f"Iniciando inferência {backend} para sessão {session_id}")
+            inference_start = time.time()
+            try:
+                # USAR APENAS vLLM - SEM FALLBACK
+                if not self.vllm_model:
+                    raise RuntimeError("vLLM não está carregado! Este servidor REQUER vLLM.")
+                # Usar vLLM para inferência acelerada (v0.10+ suporta Ultravox!)
+                from vllm import SamplingParams
+                # USAR PROMPT DIRETO - Ultravox v0.5 com vLLM funciona melhor assim
+                # O token <|audio|> é substituído automaticamente pelo áudio
+                vllm_prompt = prompt
+                # 🔍 LOG DETALHADO DO PROMPT PARA DEBUG
+                logger.info(f"🔍 PROMPT COMPLETO enviado para vLLM:")
+                logger.info(f"  🎯 Prompt: '{vllm_prompt[:200]}...'")
+                logger.info(f"  🎵 Áudio shape: {full_audio.shape}, dtype: {full_audio.dtype}")
+                logger.info(f"  📊 Áudio stats: min={full_audio.min():.3f}, max={full_audio.max():.3f}")
+                logger.info("=" * 80)
+                vllm_input = {
+                    "prompt": vllm_prompt,
+                    "multi_modal_data": {
+                        "audio": full_audio  # numpy array já em 16kHz
+                    }
+                }
+                # Fazer inferência com vLLM
+                outputs = self.vllm_model.generate(
+                    prompts=[vllm_input],
+                    sampling_params=self.sampling_params
+                )
+                inference_time = time.time() - inference_start
+                logger.info(f"⚡ Inferência vLLM concluída em {inference_time*1000:.0f}ms")
+                # 🔍 LOG DETALHADO DA RESPOSTA vLLM
+                logger.info(f"🔍 RESPOSTA DETALHADA do vLLM:")
+                logger.info(f"  📤 Outputs count: {len(outputs)}")
+                logger.info(f"  📤 Outputs[0].outputs count: {len(outputs[0].outputs)}")
+                # Extrair resposta
+                response_text = outputs[0].outputs[0].text
+                logger.info(f"  📝 Resposta RAW: '{response_text}'")
+                logger.info(f"  📏 Tamanho resposta: {len(response_text)} chars")
+                if not response_text:
+                    response_text = "Desculpe, não consegui processar o áudio. Poderia repetir?"
+                    logger.info(f"  ⚠️ Resposta vazia, usando fallback")
+                else:
+                    logger.info(f"  ✅ Resposta válida recebida")
+                logger.info(f"  🎯 Resposta final: '{response_text[:100]}...'")
+                logger.info("=" * 80)
+                # Sem else - SEMPRE usar vLLM
+                # Simular streaming dividindo a resposta em tokens
+                words = response_text.split()
+                token_count = 0
+                for word in words:
+                    # Criar token de resposta
+                    token = speech_pb2.TranscriptToken()
+                    token.text = word + " "
+                    token.confidence = 0.95
+                    token.is_final = False
+                    token.timestamp_ms = int((time.time() - start_time) * 1000)
+                    # Metadados de emoção
+                    token.emotion.emotion = speech_pb2.EmotionMetadata.NEUTRAL
+                    token.emotion.confidence = 0.8
+                    # Metadados de prosódia
+                    token.prosody.speech_rate = 120.0
+                    token.prosody.pitch_mean = 150.0
+                    token.prosody.energy = -20.0
+                    token.prosody.pitch_variance = 50.0
+                    token_count += 1
+                    self.total_tokens_generated += 1
+                    logger.debug(f"Token {token_count}: '{word}' para sessão {session_id}")
+                    yield token
+                    # Pequena pausa para simular streaming
+                    await asyncio.sleep(0.05)
+                # Token final
+                final_token = speech_pb2.TranscriptToken()
+                final_token.text = ""  # Token vazio indica fim
+                final_token.confidence = 1.0
+                final_token.is_final = True
+                final_token.timestamp_ms = int((time.time() - start_time) * 1000)
+                logger.info(f"✅ Processamento completo: {token_count} tokens, {inference_time*1000:.0f}ms")
+                yield final_token
+            except Exception as model_error:
+                logger.error(f"Erro no modelo Transformers: {model_error}")
+                # Retornar erro como token
+                error_token = speech_pb2.TranscriptToken()
+                error_token.text = f"Erro no processamento: {str(model_error)}"
+                error_token.confidence = 0.0
+                error_token.is_final = True
+                error_token.timestamp_ms = int((time.time() - start_time) * 1000)
+                yield error_token
+            # Limpar sessões antigas periodicamente
+            if self.total_requests % 10 == 0:
+                self._cleanup_old_sessions()
+        except Exception as e:
+            logger.error(f"Erro na transcrição para sessão {session_id}: {e}")
+            # Enviar token de erro
+            error_token = speech_pb2.TranscriptToken()
+            error_token.text = ""
+            error_token.confidence = 0.0
+            error_token.is_final = True
+            error_token.timestamp_ms = int((time.time() - start_time) * 1000)
+            yield error_token
+        finally:
+            if session_id:
+                self.active_sessions = max(0, self.active_sessions - 1)
+                processing_time = time.time() - start_time
+                logger.info(f"Sessão {session_id} concluída. Latência: {processing_time*1000:.2f}ms")
+    async def GetMetrics(self, request: speech_pb2.Empty,
+                        context: grpc.ServicerContext) -> speech_pb2.Metrics:
+        """Retorna métricas do serviço"""
+        import psutil
+        import torch
+        metrics = speech_pb2.Metrics()
+        metrics.total_requests = self.total_requests
+        metrics.active_sessions = self.active_sessions
+        # Latência média (placeholder)
+        metrics.average_latency_ms = 500.0
+        # Uso de GPU (sempre GPU conforme solicitado)
+        try:
+            metrics.gpu_usage_percent = float(torch.cuda.utilization())
+            metrics.memory_usage_mb = float(torch.cuda.memory_allocated() / (1024 * 1024))
+        except:
+            metrics.gpu_usage_percent = 0.0
+            metrics.memory_usage_mb = 0.0
+        # Tokens por segundo (deve ser int64 conforme protobuf)
+        metrics.tokens_per_second = int(self.total_tokens_generated / max(1, time.time() - self._start_time))
+        return metrics
+async def serve():
+    """Inicia servidor gRPC"""
+    # Configurar servidor
+    server = grpc.aio.server(
+        futures.ThreadPoolExecutor(max_workers=10),
+        options=[
+            ('grpc.max_send_message_length', 10 * 1024 * 1024),
+            ('grpc.max_receive_message_length', 10 * 1024 * 1024),
+            ('grpc.keepalive_time_ms', 30000),
+            ('grpc.keepalive_timeout_ms', 10000),
+            ('grpc.http2.min_time_between_pings_ms', 30000),
+        ]
+    )
+    # Adicionar serviço
+    speech_pb2_grpc.add_SpeechServiceServicer_to_server(
+        UltravoxServicer(), server
+    )
+    # Configurar porta
+    port = os.getenv('ULTRAVOX_PORT', '50051')
+    # Bind dual stack - IPv4 e IPv6 para compatibilidade
+    server.add_insecure_port(f'0.0.0.0:{port}')  # IPv4
+    server.add_insecure_port(f'[::]:{port}')     # IPv6
+    logger.info(f"Ultravox Server iniciando na porta {port}...")
+    await server.start()
+    logger.info(f"Ultravox Server rodando na porta {port}")
+    try:
+        await server.wait_for_termination()
+    except KeyboardInterrupt:
+        logger.info("Parando servidor...")
+        await server.stop(grace_period=5)
+def main():
+    """Função principal"""
+    try:
+        asyncio.run(serve())
+    except Exception as e:
+        logger.error(f"Erro fatal: {e}")
+        sys.exit(1)
+if __name__ == "__main__":
+    main()

ultravox/server_vllm_090_broken.py ADDED Viewed

	@@ -0,0 +1,447 @@

+#!/usr/bin/env python3
+"""
+Servidor Ultravox gRPC - Implementação com vLLM para aceleração
+Usa vLLM quando disponível, fallback para Transformers
+"""
+import grpc
+import asyncio
+import logging
+import numpy as np
+import time
+import sys
+import os
+import torch
+import transformers
+from typing import Iterator, Optional
+from concurrent import futures
+# Tentar importar vLLM
+try:
+    from vllm import LLM, SamplingParams
+    VLLM_AVAILABLE = True
+    logger_vllm = logging.getLogger("vllm")
+    logger_vllm.info("✅ vLLM disponível - usando inferência acelerada")
+except ImportError:
+    VLLM_AVAILABLE = False
+    logger_vllm = logging.getLogger("vllm")
+    logger_vllm.warning("⚠️ vLLM não disponível - usando Transformers padrão")
+# Adicionar paths para protos
+sys.path.append('/workspace/ultravox-pipeline/services/ultravox')
+sys.path.append('/workspace/ultravox-pipeline/protos/generated')
+import speech_pb2
+import speech_pb2_grpc
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+class UltravoxServicer(speech_pb2_grpc.SpeechServiceServicer):
+    """Implementação gRPC do Ultravox usando a arquitetura correta"""
+    def __init__(self):
+        """Inicializa o serviço"""
+        logger.info("Inicializando Ultravox Service...")
+        # Verificar GPU antes de inicializar
+        if not torch.cuda.is_available():
+            logger.error("❌ GPU não disponível! Ultravox requer GPU para funcionar.")
+            logger.error("Verifique se CUDA está instalado e funcionando.")
+            raise RuntimeError("GPU não disponível. Ultravox não pode funcionar sem GPU.")
+        # Forçar uso da GPU com mais memória livre
+        best_gpu = 0
+        best_free = 0
+        for i in range(torch.cuda.device_count()):
+            total = torch.cuda.get_device_properties(i).total_memory / (1024**3)
+            allocated = torch.cuda.memory_allocated(i) / (1024**3)
+            free = total - allocated
+            logger.info(f"GPU {i}: {torch.cuda.get_device_name(i)} - {free:.1f}GB livre de {total:.1f}GB")
+            if free > best_free:
+                best_free = free
+                best_gpu = i
+        torch.cuda.set_device(best_gpu)
+        logger.info(f"✅ Usando GPU {best_gpu}: {torch.cuda.get_device_name(best_gpu)}")
+        logger.info(f"   Memória livre: {best_free:.1f}GB")
+        if best_free < 3.0:  # Ultravox 1B precisa ~3GB
+            logger.warning(f"⚠️  Pouca memória GPU disponível ({best_free:.1f}GB). Recomendado: 3GB+")
+        # Configuração do modelo usando Transformers Pipeline
+        self.model_config = {
+            'model_path': "fixie-ai/ultravox-v0_5-llama-3_2-1b",  # Modelo v0.5 com Llama-3.2-1B
+            'device': f"cuda:{best_gpu}",  # GPU específica
+            'max_new_tokens': 200,
+            'temperature': 0.7,  # Temperatura para respostas mais naturais
+            'token': os.getenv('HF_TOKEN', '')  # Token HuggingFace via env var
+        }
+        # Pipeline de transformers (API estável)
+        self.pipeline = None
+        self.conversation_states = {}  # Estado por sessão
+        # Métricas
+        self.total_requests = 0
+        self.active_sessions = 0
+        self.total_tokens_generated = 0
+        self._start_time = time.time()
+        # Inicializar modelo
+        self._initialize_model()
+    def _initialize_model(self):
+        """Inicializa o modelo Ultravox usando vLLM ou Transformers"""
+        try:
+            start_time = time.time()
+            if not VLLM_AVAILABLE:
+                logger.error("❌ vLLM NÃO está instalado! Este servidor REQUER vLLM.")
+                logger.error("Instale com: pip install vllm")
+                raise RuntimeError("vLLM é obrigatório para este servidor")
+            # USAR APENAS vLLM - SEM FALLBACK
+            logger.info("🚀 Carregando modelo Ultravox via vLLM (OBRIGATÓRIO)...")
+            # vLLM para modelos multimodais - Gemma 3 27B com quantização INT4
+            self.vllm_model = LLM(
+                model=self.model_config['model_path'],
+                trust_remote_code=True,
+                dtype="bfloat16",
+                gpu_memory_utilization=0.60,  # Usar 60% da GPU para o modelo 27B quantizado
+                max_model_len=256,  # Reduzir contexto para 256 tokens
+                enforce_eager=True,  # Desabilitar CUDA graphs para modelos customizados
+                enable_prefix_caching=False,  # Desabilitar cache de prefixo
+            )
+            # Parâmetros otimizados baseados nos testes
+            self.sampling_params = SamplingParams(
+                temperature=0.3,  # Mais conservador para respostas consistentes
+                max_tokens=50,    # Respostas mais concisas
+                repetition_penalty=1.1,  # Evitar repeti��ões
+                stop=[".", "!", "?", "\n\n"]  # Parar em pontuação natural
+            )
+            self.pipeline = None  # Não usar pipeline do Transformers
+            load_time = time.time() - start_time
+            logger.info(f"✅ Modelo carregado em {load_time:.2f}s via vLLM")
+            logger.info("🎯 Usando vLLM para inferência acelerada!")
+        except Exception as e:
+            logger.error(f"Erro ao carregar modelo: {e}")
+            raise
+    def _get_conversation_state(self, session_id: str):
+        """Obtém ou cria estado de conversação para sessão"""
+        if session_id not in self.conversation_states:
+            self.conversation_states[session_id] = {
+                'created_at': time.time(),
+                'turn_count': 0,
+                'conversation_history': []
+            }
+            logger.info(f"Estado de conversação criado para sessão: {session_id}")
+        return self.conversation_states[session_id]
+    def _cleanup_old_sessions(self, max_age: int = 1800):  # 30 minutos
+        """Remove sessões antigas"""
+        current_time = time.time()
+        expired_sessions = [
+            sid for sid, state in self.conversation_states.items()
+            if current_time - state['created_at'] > max_age
+        ]
+        for sid in expired_sessions:
+            del self.conversation_states[sid]
+            logger.info(f"Sessão expirada removida: {sid}")
+    async def StreamingRecognize(self,
+                                 request_iterator,
+                                 context: grpc.ServicerContext) -> Iterator[speech_pb2.TranscriptToken]:
+        """
+        Processa stream de áudio usando a arquitetura Ultravox completa
+        Args:
+            request_iterator: Iterator de chunks de áudio
+            context: Contexto gRPC
+        Yields:
+            Tokens de transcrição + resposta do LLM
+        """
+        session_id = None
+        start_time = time.time()
+        self.total_requests += 1
+        try:
+            # Coletar todo o áudio primeiro (como no Gradio)
+            audio_chunks = []
+            sample_rate = 16000
+            prompt = None  # Será obtido do metadata ou usado padrão
+            # Processar chunks de entrada
+            async for audio_chunk in request_iterator:
+                if not session_id:
+                    session_id = audio_chunk.session_id or f"session_{self.total_requests}"
+                    logger.info(f"Nova sessão Ultravox: {session_id}")
+                    self.active_sessions += 1
+                # DEBUG: Log todos os campos recebidos
+                logger.info(f"DEBUG - Chunk recebido para {session_id}:")
+                logger.info(f"  - audio_data: {len(audio_chunk.audio_data)} bytes")
+                logger.info(f"  - sample_rate: {audio_chunk.sample_rate}")
+                logger.info(f"  - is_final_chunk: {audio_chunk.is_final_chunk}")
+                # Obter prompt do campo system_prompt
+                if not prompt and audio_chunk.system_prompt:
+                    prompt = audio_chunk.system_prompt
+                    logger.info(f"✅ PROMPT DINÂMICO recebido: {prompt[:100]}...")
+                elif not audio_chunk.system_prompt:
+                    logger.info(f"DEBUG - Sem system_prompt no chunk")
+                sample_rate = audio_chunk.sample_rate or 16000
+                # CRUCIAL: Converter de bytes para numpy float32 (como descoberto no Gradio)
+                audio_data = np.frombuffer(audio_chunk.audio_data, dtype=np.float32)
+                audio_chunks.append(audio_data)
+                # Se é chunk final, processar
+                if audio_chunk.is_final_chunk:
+                    break
+            if not audio_chunks:
+                logger.warning(f"Nenhum áudio recebido para sessão {session_id}")
+                return
+            # Usar prompt padrão otimizado (formato que funciona!)
+            if not prompt:
+                # IMPORTANTE: Incluir o token <|audio|> que o Ultravox espera
+                # FALLBACK: Usar inglês simples que o modelo entende bem
+                prompt = "You are a helpful assistant. <|audio|>\nRespond in Portuguese:"
+                logger.info("Usando prompt SIMPLES em inglês com instrução para responder em português")
+            # Concatenar todo o áudio
+            full_audio = np.concatenate(audio_chunks)
+            logger.info(f"Áudio processado: {len(full_audio)} samples @ {sample_rate}Hz para sessão {session_id}")
+            # Obter estado de conversação
+            conv_state = self._get_conversation_state(session_id)
+            conv_state['turn_count'] += 1
+            # Processar com vLLM ou Transformers
+            backend = "vLLM" if self.vllm_model else "Transformers"
+            logger.info(f"Iniciando inferência {backend} para sessão {session_id}")
+            inference_start = time.time()
+            try:
+                # USAR APENAS vLLM - SEM FALLBACK
+                if not self.vllm_model:
+                    raise RuntimeError("vLLM não está carregado! Este servidor REQUER vLLM.")
+                # Usar vLLM para inferência acelerada (v0.10+ suporta Ultravox!)
+                from vllm import SamplingParams
+                # Preparar entrada para vLLM com áudio
+                # Formato otimizado que funciona com Ultravox v0.5
+                # GARANTIR que o prompt tenha o token <|audio|>
+                if "<|audio|>" not in prompt:
+                    # Adicionar o token se não estiver presente
+                    vllm_prompt = prompt.rstrip() + " <|audio|>\nResponda em português:"
+                    logger.warning(f"Token <|audio|> não encontrado no prompt, adicionando automaticamente")
+                else:
+                    vllm_prompt = prompt
+                # 🔍 LOG DETALHADO DO PROMPT PARA DEBUG
+                logger.info(f"🔍 PROMPT COMPLETO enviado para vLLM:")
+                logger.info(f"  📝 Prompt original recebido: '{prompt[:200]}...'")
+                logger.info(f"  🎯 Prompt formatado final: '{vllm_prompt[:200]}...'")
+                logger.info(f"  🎵 Áudio shape: {full_audio.shape}, dtype: {full_audio.dtype}")
+                logger.info(f"  📊 Áudio stats: min={full_audio.min():.3f}, max={full_audio.max():.3f}")
+                logger.info("=" * 80)
+                vllm_input = {
+                    "prompt": vllm_prompt,
+                    "multi_modal_data": {
+                        "audio": full_audio  # numpy array já em 16kHz
+                    }
+                }
+                # Fazer inferência com vLLM
+                outputs = self.vllm_model.generate(
+                    prompts=[vllm_input],
+                    sampling_params=self.sampling_params
+                )
+                inference_time = time.time() - inference_start
+                logger.info(f"⚡ Inferência vLLM concluída em {inference_time*1000:.0f}ms")
+                # 🔍 LOG DETALHADO DA RESPOSTA vLLM
+                logger.info(f"🔍 RESPOSTA DETALHADA do vLLM:")
+                logger.info(f"  📤 Outputs count: {len(outputs)}")
+                logger.info(f"  📤 Outputs[0].outputs count: {len(outputs[0].outputs)}")
+                # Extrair resposta
+                response_text = outputs[0].outputs[0].text
+                logger.info(f"  📝 Resposta RAW: '{response_text}'")
+                logger.info(f"  📏 Tamanho resposta: {len(response_text)} chars")
+                if not response_text:
+                    response_text = "Desculpe, não consegui processar o áudio. Poderia repetir?"
+                    logger.info(f"  ⚠️ Resposta vazia, usando fallback")
+                else:
+                    logger.info(f"  ✅ Resposta válida recebida")
+                logger.info(f"  🎯 Resposta final: '{response_text[:100]}...'")
+                logger.info("=" * 80)
+                # Sem else - SEMPRE usar vLLM
+                # Simular streaming dividindo a resposta em tokens
+                words = response_text.split()
+                token_count = 0
+                for word in words:
+                    # Criar token de resposta
+                    token = speech_pb2.TranscriptToken()
+                    token.text = word + " "
+                    token.confidence = 0.95
+                    token.is_final = False
+                    token.timestamp_ms = int((time.time() - start_time) * 1000)
+                    # Metadados de emoção
+                    token.emotion.emotion = speech_pb2.EmotionMetadata.NEUTRAL
+                    token.emotion.confidence = 0.8
+                    # Metadados de prosódia
+                    token.prosody.speech_rate = 120.0
+                    token.prosody.pitch_mean = 150.0
+                    token.prosody.energy = -20.0
+                    token.prosody.pitch_variance = 50.0
+                    token_count += 1
+                    self.total_tokens_generated += 1
+                    logger.debug(f"Token {token_count}: '{word}' para sessão {session_id}")
+                    yield token
+                    # Pequena pausa para simular streaming
+                    await asyncio.sleep(0.05)
+                # Token final
+                final_token = speech_pb2.TranscriptToken()
+                final_token.text = ""  # Token vazio indica fim
+                final_token.confidence = 1.0
+                final_token.is_final = True
+                final_token.timestamp_ms = int((time.time() - start_time) * 1000)
+                logger.info(f"✅ Processamento completo: {token_count} tokens, {inference_time*1000:.0f}ms")
+                yield final_token
+            except Exception as model_error:
+                logger.error(f"Erro no modelo Transformers: {model_error}")
+                # Retornar erro como token
+                error_token = speech_pb2.TranscriptToken()
+                error_token.text = f"Erro no processamento: {str(model_error)}"
+                error_token.confidence = 0.0
+                error_token.is_final = True
+                error_token.timestamp_ms = int((time.time() - start_time) * 1000)
+                yield error_token
+            # Limpar sessões antigas periodicamente
+            if self.total_requests % 10 == 0:
+                self._cleanup_old_sessions()
+        except Exception as e:
+            logger.error(f"Erro na transcrição para sessão {session_id}: {e}")
+            # Enviar token de erro
+            error_token = speech_pb2.TranscriptToken()
+            error_token.text = ""
+            error_token.confidence = 0.0
+            error_token.is_final = True
+            error_token.timestamp_ms = int((time.time() - start_time) * 1000)
+            yield error_token
+        finally:
+            if session_id:
+                self.active_sessions = max(0, self.active_sessions - 1)
+                processing_time = time.time() - start_time
+                logger.info(f"Sessão {session_id} concluída. Latência: {processing_time*1000:.2f}ms")
+    async def GetMetrics(self, request: speech_pb2.Empty,
+                        context: grpc.ServicerContext) -> speech_pb2.Metrics:
+        """Retorna métricas do serviço"""
+        import psutil
+        import torch
+        metrics = speech_pb2.Metrics()
+        metrics.total_requests = self.total_requests
+        metrics.active_sessions = self.active_sessions
+        # Latência média (placeholder)
+        metrics.average_latency_ms = 500.0
+        # Uso de GPU (sempre GPU conforme solicitado)
+        try:
+            metrics.gpu_usage_percent = float(torch.cuda.utilization())
+            metrics.memory_usage_mb = float(torch.cuda.memory_allocated() / (1024 * 1024))
+        except:
+            metrics.gpu_usage_percent = 0.0
+            metrics.memory_usage_mb = 0.0
+        # Tokens por segundo (deve ser int64 conforme protobuf)
+        metrics.tokens_per_second = int(self.total_tokens_generated / max(1, time.time() - self._start_time))
+        return metrics
+async def serve():
+    """Inicia servidor gRPC"""
+    # Configurar servidor
+    server = grpc.aio.server(
+        futures.ThreadPoolExecutor(max_workers=10),
+        options=[
+            ('grpc.max_send_message_length', 10 * 1024 * 1024),
+            ('grpc.max_receive_message_length', 10 * 1024 * 1024),
+            ('grpc.keepalive_time_ms', 30000),
+            ('grpc.keepalive_timeout_ms', 10000),
+            ('grpc.http2.min_time_between_pings_ms', 30000),
+        ]
+    )
+    # Adicionar serviço
+    speech_pb2_grpc.add_SpeechServiceServicer_to_server(
+        UltravoxServicer(), server
+    )
+    # Configurar porta (IPv4 e IPv6)
+    port = os.getenv('ULTRAVOX_PORT', '50051')
+    server.add_insecure_port(f'0.0.0.0:{port}')  # IPv4
+    server.add_insecure_port(f'[::]:{port}')     # IPv6
+    logger.info(f"Ultravox Server iniciando na porta {port}...")
+    await server.start()
+    logger.info(f"Ultravox Server rodando na porta {port}")
+    try:
+        await server.wait_for_termination()
+    except KeyboardInterrupt:
+        logger.info("Parando servidor...")
+        await server.stop(grace_period=5)
+def main():
+    """Função principal"""
+    try:
+        asyncio.run(serve())
+    except Exception as e:
+        logger.error(f"Erro fatal: {e}")
+        sys.exit(1)
+if __name__ == "__main__":
+    main()

ultravox/server_working_original.py ADDED Viewed

	@@ -0,0 +1,440 @@

+#!/usr/bin/env python3
+"""
+Servidor Ultravox gRPC - Implementação com vLLM para aceleração
+Usa vLLM quando disponível, fallback para Transformers
+"""
+import grpc
+import asyncio
+import logging
+import numpy as np
+import time
+import sys
+import os
+import torch
+import transformers
+from typing import Iterator, Optional
+from concurrent import futures
+# Tentar importar vLLM
+try:
+    from vllm import LLM, SamplingParams
+    VLLM_AVAILABLE = True
+    logger_vllm = logging.getLogger("vllm")
+    logger_vllm.info("✅ vLLM disponível - usando inferência acelerada")
+except ImportError:
+    VLLM_AVAILABLE = False
+    logger_vllm = logging.getLogger("vllm")
+    logger_vllm.warning("⚠️ vLLM não disponível - usando Transformers padrão")
+# Adicionar paths para protos
+sys.path.append('/workspace/ultravox-pipeline/services/ultravox')
+sys.path.append('/workspace/ultravox-pipeline/protos/generated')
+import speech_pb2
+import speech_pb2_grpc
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+class UltravoxServicer(speech_pb2_grpc.SpeechServiceServicer):
+    """Implementação gRPC do Ultravox usando a arquitetura correta"""
+    def __init__(self):
+        """Inicializa o serviço"""
+        logger.info("Inicializando Ultravox Service...")
+        # Verificar GPU antes de inicializar
+        if not torch.cuda.is_available():
+            logger.error("❌ GPU não disponível! Ultravox requer GPU para funcionar.")
+            logger.error("Verifique se CUDA está instalado e funcionando.")
+            raise RuntimeError("GPU não disponível. Ultravox não pode funcionar sem GPU.")
+        # Forçar uso da GPU com mais memória livre
+        best_gpu = 0
+        best_free = 0
+        for i in range(torch.cuda.device_count()):
+            total = torch.cuda.get_device_properties(i).total_memory / (1024**3)
+            allocated = torch.cuda.memory_allocated(i) / (1024**3)
+            free = total - allocated
+            logger.info(f"GPU {i}: {torch.cuda.get_device_name(i)} - {free:.1f}GB livre de {total:.1f}GB")
+            if free > best_free:
+                best_free = free
+                best_gpu = i
+        torch.cuda.set_device(best_gpu)
+        logger.info(f"✅ Usando GPU {best_gpu}: {torch.cuda.get_device_name(best_gpu)}")
+        logger.info(f"   Memória livre: {best_free:.1f}GB")
+        if best_free < 3.0:  # Ultravox 1B precisa ~3GB
+            logger.warning(f"⚠️  Pouca memória GPU disponível ({best_free:.1f}GB). Recomendado: 3GB+")
+        # Configuração do modelo usando Transformers Pipeline
+        self.model_config = {
+            'model_path': "fixie-ai/ultravox-v0_5-llama-3_2-1b",  # Modelo v0.5 com Llama-3.2-1B (funcionando com vLLM)
+            'device': f"cuda:{best_gpu}",  # GPU específica
+            'max_new_tokens': 200,
+            'temperature': 0.7,  # Temperatura para respostas mais naturais
+            'token': os.getenv('HF_TOKEN', '')  # Token HuggingFace via env var
+        }
+        # Pipeline de transformers (API estável)
+        self.pipeline = None
+        self.conversation_states = {}  # Estado por sessão
+        # Métricas
+        self.total_requests = 0
+        self.active_sessions = 0
+        self.total_tokens_generated = 0
+        self._start_time = time.time()
+        # Inicializar modelo
+        self._initialize_model()
+    def _initialize_model(self):
+        """Inicializa o modelo Ultravox usando vLLM ou Transformers"""
+        try:
+            start_time = time.time()
+            if not VLLM_AVAILABLE:
+                logger.error("❌ vLLM NÃO está instalado! Este servidor REQUER vLLM.")
+                logger.error("Instale com: pip install vllm")
+                raise RuntimeError("vLLM é obrigatório para este servidor")
+            # USAR APENAS vLLM - SEM FALLBACK
+            logger.info("🚀 Carregando modelo Ultravox via vLLM (OBRIGATÓRIO)...")
+            # vLLM para modelos multimodais
+            self.vllm_model = LLM(
+                model=self.model_config['model_path'],
+                trust_remote_code=True,
+                dtype="bfloat16",
+                gpu_memory_utilization=0.30,  # Aumentar para 30% (~7.2GB de 24GB)
+                max_model_len=256,  # Reduzir contexto para 256 tokens
+                enforce_eager=True,  # Desabilitar CUDA graphs para modelos customizados
+                enable_prefix_caching=False,  # Desabilitar cache de prefixo
+            )
+            # Parâmetros otimizados baseados nos testes
+            self.sampling_params = SamplingParams(
+                temperature=0.3,  # Mais conservador para respostas consistentes
+                max_tokens=50,    # Respostas mais concisas
+                repetition_penalty=1.1,  # Evitar repetições
+                stop=[".", "!", "?", "\n\n"]  # Parar em pontuação natural
+            )
+            self.pipeline = None  # Não usar pipeline do Transformers
+            load_time = time.time() - start_time
+            logger.info(f"✅ Modelo carregado em {load_time:.2f}s via vLLM")
+            logger.info("🎯 Usando vLLM para inferência acelerada!")
+        except Exception as e:
+            logger.error(f"Erro ao carregar modelo: {e}")
+            raise
+    def _get_conversation_state(self, session_id: str):
+        """Obtém ou cria estado de conversação para sessão"""
+        if session_id not in self.conversation_states:
+            self.conversation_states[session_id] = {
+                'created_at': time.time(),
+                'turn_count': 0,
+                'conversation_history': []
+            }
+            logger.info(f"Estado de conversação criado para sessão: {session_id}")
+        return self.conversation_states[session_id]
+    def _cleanup_old_sessions(self, max_age: int = 1800):  # 30 minutos
+        """Remove sessões antigas"""
+        current_time = time.time()
+        expired_sessions = [
+            sid for sid, state in self.conversation_states.items()
+            if current_time - state['created_at'] > max_age
+        ]
+        for sid in expired_sessions:
+            del self.conversation_states[sid]
+            logger.info(f"Sessão expirada removida: {sid}")
+    async def StreamingRecognize(self,
+                                 request_iterator,
+                                 context: grpc.ServicerContext) -> Iterator[speech_pb2.TranscriptToken]:
+        """
+        Processa stream de áudio usando a arquitetura Ultravox completa
+        Args:
+            request_iterator: Iterator de chunks de áudio
+            context: Contexto gRPC
+        Yields:
+            Tokens de transcrição + resposta do LLM
+        """
+        session_id = None
+        start_time = time.time()
+        self.total_requests += 1
+        try:
+            # Coletar todo o áudio primeiro (como no Gradio)
+            audio_chunks = []
+            sample_rate = 16000
+            prompt = None  # Será obtido do metadata ou usado padrão
+            # Processar chunks de entrada
+            async for audio_chunk in request_iterator:
+                if not session_id:
+                    session_id = audio_chunk.session_id or f"session_{self.total_requests}"
+                    logger.info(f"Nova sessão Ultravox: {session_id}")
+                    self.active_sessions += 1
+                # DEBUG: Log todos os campos recebidos
+                logger.info(f"DEBUG - Chunk recebido para {session_id}:")
+                logger.info(f"  - audio_data: {len(audio_chunk.audio_data)} bytes")
+                logger.info(f"  - sample_rate: {audio_chunk.sample_rate}")
+                logger.info(f"  - is_final_chunk: {audio_chunk.is_final_chunk}")
+                # Obter prompt do campo system_prompt
+                if not prompt and audio_chunk.system_prompt:
+                    prompt = audio_chunk.system_prompt
+                    logger.info(f"✅ PROMPT DINÂMICO recebido: {prompt[:100]}...")
+                elif not audio_chunk.system_prompt:
+                    logger.info(f"DEBUG - Sem system_prompt no chunk")
+                sample_rate = audio_chunk.sample_rate or 16000
+                # CRUCIAL: Converter de bytes para numpy float32 (como descoberto no Gradio)
+                audio_data = np.frombuffer(audio_chunk.audio_data, dtype=np.float32)
+                audio_chunks.append(audio_data)
+                # Se é chunk final, processar
+                if audio_chunk.is_final_chunk:
+                    break
+            if not audio_chunks:
+                logger.warning(f"Nenhum áudio recebido para sessão {session_id}")
+                return
+            # Usar prompt padrão otimizado (formato que funciona!)
+            if not prompt:
+                prompt = """Você é um assistente brasileiro útil e conversacional.
+                Responda à pergunta que ouviu em português de forma natural e direta."""
+                logger.info("Usando prompt padrão")
+            # Concatenar todo o áudio
+            full_audio = np.concatenate(audio_chunks)
+            logger.info(f"Áudio processado: {len(full_audio)} samples @ {sample_rate}Hz para sessão {session_id}")
+            # Obter estado de conversação
+            conv_state = self._get_conversation_state(session_id)
+            conv_state['turn_count'] += 1
+            # Processar com vLLM ou Transformers
+            backend = "vLLM" if self.vllm_model else "Transformers"
+            logger.info(f"Iniciando inferência {backend} para sessão {session_id}")
+            inference_start = time.time()
+            try:
+                # USAR APENAS vLLM - SEM FALLBACK
+                if not self.vllm_model:
+                    raise RuntimeError("vLLM não está carregado! Este servidor REQUER vLLM.")
+                # Usar vLLM para inferência acelerada (v0.10+ suporta Ultravox!)
+                from vllm import SamplingParams
+                # Preparar entrada para vLLM com áudio
+                # Formato otimizado que funciona com Ultravox v0.5
+                # O prompt já vem formatado do cliente, usar diretamente
+                vllm_prompt = prompt
+                # 🔍 LOG DETALHADO DO PROMPT PARA DEBUG
+                logger.info(f"🔍 PROMPT COMPLETO enviado para vLLM:")
+                logger.info(f"  📝 Prompt original recebido: '{prompt[:200]}...'")
+                logger.info(f"  🎯 Prompt formatado final: '{vllm_prompt[:200]}...'")
+                logger.info(f"  🎵 Áudio shape: {full_audio.shape}, dtype: {full_audio.dtype}")
+                logger.info(f"  📊 Áudio stats: min={full_audio.min():.3f}, max={full_audio.max():.3f}")
+                logger.info("=" * 80)
+                vllm_input = {
+                    "prompt": vllm_prompt,
+                    "multi_modal_data": {
+                        "audio": full_audio  # numpy array já em 16kHz
+                    }
+                }
+                # Fazer inferência com vLLM
+                outputs = self.vllm_model.generate(
+                    prompts=[vllm_input],
+                    sampling_params=self.sampling_params
+                )
+                inference_time = time.time() - inference_start
+                logger.info(f"⚡ Inferência vLLM concluída em {inference_time*1000:.0f}ms")
+                # 🔍 LOG DETALHADO DA RESPOSTA vLLM
+                logger.info(f"🔍 RESPOSTA DETALHADA do vLLM:")
+                logger.info(f"  📤 Outputs count: {len(outputs)}")
+                logger.info(f"  📤 Outputs[0].outputs count: {len(outputs[0].outputs)}")
+                # Extrair resposta
+                response_text = outputs[0].outputs[0].text
+                logger.info(f"  📝 Resposta RAW: '{response_text}'")
+                logger.info(f"  📏 Tamanho resposta: {len(response_text)} chars")
+                if not response_text:
+                    response_text = "Desculpe, não consegui processar o áudio. Poderia repetir?"
+                    logger.info(f"  ⚠️ Resposta vazia, usando fallback")
+                else:
+                    logger.info(f"  ✅ Resposta válida recebida")
+                logger.info(f"  🎯 Resposta final: '{response_text[:100]}...'")
+                logger.info("=" * 80)
+                # Sem else - SEMPRE usar vLLM
+                # Simular streaming dividindo a resposta em tokens
+                words = response_text.split()
+                token_count = 0
+                for word in words:
+                    # Criar token de resposta
+                    token = speech_pb2.TranscriptToken()
+                    token.text = word + " "
+                    token.confidence = 0.95
+                    token.is_final = False
+                    token.timestamp_ms = int((time.time() - start_time) * 1000)
+                    # Metadados de emoção
+                    token.emotion.emotion = speech_pb2.EmotionMetadata.NEUTRAL
+                    token.emotion.confidence = 0.8
+                    # Metadados de prosódia
+                    token.prosody.speech_rate = 120.0
+                    token.prosody.pitch_mean = 150.0
+                    token.prosody.energy = -20.0
+                    token.prosody.pitch_variance = 50.0
+                    token_count += 1
+                    self.total_tokens_generated += 1
+                    logger.debug(f"Token {token_count}: '{word}' para sessão {session_id}")
+                    yield token
+                    # Pequena pausa para simular streaming
+                    await asyncio.sleep(0.05)
+                # Token final
+                final_token = speech_pb2.TranscriptToken()
+                final_token.text = ""  # Token vazio indica fim
+                final_token.confidence = 1.0
+                final_token.is_final = True
+                final_token.timestamp_ms = int((time.time() - start_time) * 1000)
+                logger.info(f"✅ Processamento completo: {token_count} tokens, {inference_time*1000:.0f}ms")
+                yield final_token
+            except Exception as model_error:
+                logger.error(f"Erro no modelo Transformers: {model_error}")
+                # Retornar erro como token
+                error_token = speech_pb2.TranscriptToken()
+                error_token.text = f"Erro no processamento: {str(model_error)}"
+                error_token.confidence = 0.0
+                error_token.is_final = True
+                error_token.timestamp_ms = int((time.time() - start_time) * 1000)
+                yield error_token
+            # Limpar sessões antigas periodicamente
+            if self.total_requests % 10 == 0:
+                self._cleanup_old_sessions()
+        except Exception as e:
+            logger.error(f"Erro na transcrição para sessão {session_id}: {e}")
+            # Enviar token de erro
+            error_token = speech_pb2.TranscriptToken()
+            error_token.text = ""
+            error_token.confidence = 0.0
+            error_token.is_final = True
+            error_token.timestamp_ms = int((time.time() - start_time) * 1000)
+            yield error_token
+        finally:
+            if session_id:
+                self.active_sessions = max(0, self.active_sessions - 1)
+                processing_time = time.time() - start_time
+                logger.info(f"Sessão {session_id} concluída. Latência: {processing_time*1000:.2f}ms")
+    async def GetMetrics(self, request: speech_pb2.Empty,
+                        context: grpc.ServicerContext) -> speech_pb2.Metrics:
+        """Retorna métricas do serviço"""
+        import psutil
+        import torch
+        metrics = speech_pb2.Metrics()
+        metrics.total_requests = self.total_requests
+        metrics.active_sessions = self.active_sessions
+        # Latência média (placeholder)
+        metrics.average_latency_ms = 500.0
+        # Uso de GPU (sempre GPU conforme solicitado)
+        try:
+            metrics.gpu_usage_percent = float(torch.cuda.utilization())
+            metrics.memory_usage_mb = float(torch.cuda.memory_allocated() / (1024 * 1024))
+        except:
+            metrics.gpu_usage_percent = 0.0
+            metrics.memory_usage_mb = 0.0
+        # Tokens por segundo (deve ser int64 conforme protobuf)
+        metrics.tokens_per_second = int(self.total_tokens_generated / max(1, time.time() - self._start_time))
+        return metrics
+async def serve():
+    """Inicia servidor gRPC"""
+    # Configurar servidor
+    server = grpc.aio.server(
+        futures.ThreadPoolExecutor(max_workers=10),
+        options=[
+            ('grpc.max_send_message_length', 10 * 1024 * 1024),
+            ('grpc.max_receive_message_length', 10 * 1024 * 1024),
+            ('grpc.keepalive_time_ms', 30000),
+            ('grpc.keepalive_timeout_ms', 10000),
+            ('grpc.http2.min_time_between_pings_ms', 30000),
+        ]
+    )
+    # Adicionar serviço
+    speech_pb2_grpc.add_SpeechServiceServicer_to_server(
+        UltravoxServicer(), server
+    )
+    # Configurar porta
+    port = os.getenv('ULTRAVOX_PORT', '50051')
+    server.add_insecure_port(f'[::]:{port}')
+    logger.info(f"Ultravox Server iniciando na porta {port}...")
+    await server.start()
+    logger.info(f"Ultravox Server rodando na porta {port}")
+    try:
+        await server.wait_for_termination()
+    except KeyboardInterrupt:
+        logger.info("Parando servidor...")
+        await server.stop(grace_period=5)
+def main():
+    """Função principal"""
+    try:
+        asyncio.run(serve())
+    except Exception as e:
+        logger.error(f"Erro fatal: {e}")
+        sys.exit(1)
+if __name__ == "__main__":
+    main()

ultravox/speech.proto ADDED Viewed

	@@ -0,0 +1,94 @@

+syntax = "proto3";
+package speech;
+service SpeechService {
+    // Streaming bidirecional para reconhecimento de fala
+    rpc StreamingRecognize(stream AudioChunk) returns (stream TranscriptToken);
+    // Endpoint para métricas
+    rpc GetMetrics(Empty) returns (Metrics);
+}
+// Chunk de áudio enviado pelo cliente
+message AudioChunk {
+    bytes audio_data = 1;           // PCM float32
+    int32 sample_rate = 2;          // Taxa de amostragem (16000)
+    int64 timestamp_ms = 3;         // Timestamp em millisegundos
+    int32 sequence_number = 4;      // Número de sequência
+    bool is_final_chunk = 5;        // Indica fim do áudio
+    // Metadados opcionais
+    float voice_activity_probability = 6;  // Probabilidade de atividade de voz
+    string session_id = 7;                 // ID da sessão
+    string system_prompt = 8;              // Prompt do sistema para contexto dinâmico
+    string user_prompt = 9;                // Prompt do usuário (instrução específica)
+}
+// Token de transcrição retornado
+message TranscriptToken {
+    string text = 1;                // Texto transcrito
+    float confidence = 2;           // Confiança da transcrição
+    bool is_final = 3;              // Token final da frase
+    int64 timestamp_ms = 4;         // Timestamp
+    // Metadados contextuais
+    EmotionMetadata emotion = 5;   // Emoção detectada
+    ProsodyMetadata prosody = 6;   // Prosódia detectada
+    // Validação e diagnóstico
+    ValidationResult validation = 7; // Resultado da validação
+}
+// Resultado da validação com erros específicos
+message ValidationResult {
+    enum ValidationStatus {
+        VALID = 0;                  // Resposta válida
+        EMPTY_RESPONSE = 1;         // Resposta vazia ou muito curta
+        GENERIC_ERROR = 2;          // Resposta genérica de erro
+        AUDIO_QUALITY_ISSUE = 3;    // Problemas de qualidade do áudio
+        PROMPT_FORMAT_ERROR = 4;    // Formato de prompt inválido
+        MODEL_ERROR = 5;            // Erro interno do modelo
+        RETRY_SUCCESSFUL = 6;       // Retry foi bem-sucedido
+    }
+    ValidationStatus status = 1;    // Status da validação
+    string error_message = 2;       // Mensagem de erro específica
+    string diagnostic_info = 3;     // Informações técnicas de diagnóstico
+    bool retry_attempted = 4;       // Se foi tentado um retry
+}
+// Metadados de emoção
+message EmotionMetadata {
+    enum Emotion {
+        NEUTRAL = 0;
+        HAPPY = 1;
+        SAD = 2;
+        ANGRY = 3;
+        SURPRISED = 4;
+        FEARFUL = 5;
+    }
+    Emotion emotion = 1;
+    float confidence = 2;
+}
+// Metadados de prosódia
+message ProsodyMetadata {
+    float speech_rate = 1;     // Palavras por minuto
+    float pitch_mean = 2;      // Pitch médio em Hz
+    float energy = 3;          // Energia em dB
+    float pitch_variance = 4;  // Variância do pitch
+}
+// Métricas do serviço
+message Metrics {
+    int64 total_requests = 1;
+    int64 active_sessions = 2;
+    float average_latency_ms = 3;
+    float gpu_usage_percent = 4;
+    float memory_usage_mb = 5;
+    int64 tokens_per_second = 6;
+}
+// Mensagem vazia
+message Empty {}

ultravox/start_ultravox.sh ADDED Viewed

	@@ -0,0 +1,67 @@

+#!/bin/bash
+# Script para iniciar o servidor Ultravox com limpeza de processos órfãos
+# Evita problemas de memória GPU ocupada por processos vLLM antigos
+echo "🔧 Iniciando servidor Ultravox..."
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+# 1. Limpar processos órfãos vLLM/EngineCore
+echo "🧹 Limpando processos órfãos..."
+pkill -f "VLLM::EngineCore" 2>/dev/null
+pkill -f "vllm.*engine" 2>/dev/null
+pkill -f "multiprocessing.resource_tracker.*ultravox" 2>/dev/null
+pkill -f "python.*server.py" 2>/dev/null
+sleep 2
+# 2. Verificar memória GPU antes de iniciar
+echo "📊 Verificando GPU..."
+GPU_FREE=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -1)
+GPU_TOTAL=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -1)
+if [ -n "$GPU_FREE" ] && [ -n "$GPU_TOTAL" ]; then
+    echo "   GPU: ${GPU_FREE}MB livres de ${GPU_TOTAL}MB"
+    # Verificar se tem pelo menos 20GB livres
+    if [ "$GPU_FREE" -lt "20000" ]; then
+        echo "⚠️  AVISO: Menos de 20GB livres na GPU!"
+        echo "   Tentando limpar mais processos..."
+        # Limpar mais agressivamente
+        pkill -9 -f "vllm" 2>/dev/null
+        pkill -9 -f "EngineCore" 2>/dev/null
+        sleep 3
+        # Verificar novamente
+        GPU_FREE=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -1)
+        echo "   GPU após limpeza: ${GPU_FREE}MB livres"
+    fi
+fi
+# 3. Verificar se a porta está livre
+if lsof -i :50051 >/dev/null 2>&1; then
+    echo "⚠️  Porta 50051 em uso. Matando processo..."
+    kill -9 $(lsof -t -i:50051) 2>/dev/null
+    sleep 2
+fi
+# 4. Ativar ambiente virtual
+echo "🐍 Ativando ambiente Python..."
+cd /workspace/ultravox-pipeline/ultravox
+source venv/bin/activate
+# 5. Iniciar servidor
+echo "🚀 Iniciando servidor Ultravox..."
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo "   Modelo: Ultravox v0.5 Llama 3.1-8B"
+echo "   Porta: 50051"
+echo "   GPU: 90% utilization"
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo ""
+echo "📝 Logs do servidor:"
+echo ""
+# Executar servidor com trap para limpeza ao sair
+trap 'echo "🛑 Parando servidor..."; pkill -f "VLLM::EngineCore"; pkill -f "python.*server.py"' INT TERM
+python server.py

ultravox/stop_ultravox.sh ADDED Viewed

	@@ -0,0 +1,60 @@

+#!/bin/bash
+# Script para parar o servidor Ultravox e limpar todos os processos relacionados
+echo "🛑 Parando servidor Ultravox..."
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+# 1. Parar servidor principal
+echo "📍 Parando processo principal..."
+pkill -f "python.*server.py" 2>/dev/null
+# 2. Limpar processos vLLM
+echo "🧹 Limpando processos vLLM..."
+pkill -f "VLLM::EngineCore" 2>/dev/null
+pkill -f "vllm.*engine" 2>/dev/null
+pkill -f "multiprocessing.resource_tracker.*ultravox" 2>/dev/null
+# 3. Verificar porta 50051
+echo "🔍 Verificando porta 50051..."
+if lsof -i :50051 >/dev/null 2>&1; then
+    echo "   ⚠️  Porta ainda em uso, forçando encerramento..."
+    kill -9 $(lsof -t -i:50051) 2>/dev/null
+fi
+# 4. Limpar processos órfãos mais agressivamente
+echo "🔨 Limpeza final de processos..."
+pkill -9 -f "VLLM::EngineCore" 2>/dev/null
+pkill -9 -f "vllm" 2>/dev/null
+pkill -9 -f "ultravox.*python" 2>/dev/null
+# 5. Aguardar liberação
+sleep 3
+# 6. Verificar memória GPU
+echo ""
+echo "📊 Status da GPU após limpeza:"
+GPU_FREE=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -1)
+GPU_TOTAL=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -1)
+GPU_USED=$(nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits 2>/dev/null | head -1)
+if [ -n "$GPU_FREE" ]; then
+    echo "   ✅ GPU: ${GPU_FREE}MB livres / ${GPU_USED}MB usados / ${GPU_TOTAL}MB total"
+else
+    echo "   ❌ Não foi possível verificar GPU"
+fi
+# 7. Verificar processos restantes
+echo ""
+echo "🔍 Verificando processos restantes..."
+REMAINING=$(ps aux | grep -E "vllm|ultravox|EngineCore" | grep -v grep | wc -l)
+if [ "$REMAINING" -eq "0" ]; then
+    echo "   ✅ Todos os processos foram encerrados"
+else
+    echo "   ⚠️  Ainda existem $REMAINING processos relacionados:"
+    ps aux | grep -E "vllm|ultravox|EngineCore" | grep -v grep
+fi
+echo ""
+echo "✅ Servidor Ultravox parado!"
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

ultravox/test-tts.py ADDED Viewed

	@@ -0,0 +1,121 @@

+#!/usr/bin/env python3
+"""
+Script de teste para Ultravox com TTS
+Envia uma pergunta via áudio sintetizado e verifica a resposta
+"""
+import grpc
+import numpy as np
+import asyncio
+import time
+from gtts import gTTS
+from pydub import AudioSegment
+import io
+import sys
+import os
+# Adicionar o path para os protobuffers
+sys.path.append('/workspace/ultravox-pipeline/ultravox')
+import speech_pb2
+import speech_pb2_grpc
+async def test_ultravox_with_tts():
+    """Testa o Ultravox enviando áudio TTS com a pergunta 'Quanto é 2 + 2?'"""
+    print("🎤 Iniciando teste do Ultravox com TTS...")
+    # 1. Gerar áudio TTS com a pergunta
+    print("🔊 Gerando áudio TTS: 'Quanto é dois mais dois?'")
+    tts = gTTS(text="Quanto é dois mais dois?", lang='pt-br')
+    # Salvar em buffer de memória
+    mp3_buffer = io.BytesIO()
+    tts.write_to_fp(mp3_buffer)
+    mp3_buffer.seek(0)
+    # Converter MP3 para PCM 16kHz
+    audio = AudioSegment.from_mp3(mp3_buffer)
+    audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
+    # Converter para numpy array float32
+    samples = np.array(audio.get_array_of_samples()).astype(np.float32) / 32768.0
+    print(f"✅ Áudio gerado: {len(samples)} samples @ 16kHz")
+    print(f"   Duração: {len(samples)/16000:.2f} segundos")
+    # 2. Conectar ao servidor Ultravox
+    print("\n📡 Conectando ao Ultravox na porta 50051...")
+    try:
+        channel = grpc.aio.insecure_channel('localhost:50051')
+        stub = speech_pb2_grpc.UltravoxServiceStub(channel)
+        # 3. Criar request com o áudio
+        session_id = f"test_{int(time.time())}"
+        async def audio_generator():
+            """Gera chunks de áudio para enviar"""
+            request = speech_pb2.AudioRequest()
+            request.session_id = session_id
+            request.audio_data = samples.tobytes()
+            request.sample_rate = 16000
+            request.is_final_chunk = True
+            request.system_prompt = "Responda em português de forma simples e direta"
+            print(f"📤 Enviando áudio para sessão: {session_id}")
+            yield request
+        # 4. Enviar e receber resposta
+        print("\n⏳ Aguardando resposta do Ultravox...")
+        start_time = time.time()
+        response_text = ""
+        token_count = 0
+        async for response in stub.TranscribeStream(audio_generator()):
+            if response.text:
+                response_text += response.text
+                token_count += 1
+                print(f"   Token {token_count}: '{response.text.strip()}'")
+                if response.is_final:
+                    break
+        elapsed = time.time() - start_time
+        # 5. Verificar resposta
+        print(f"\n📝 Resposta completa: '{response_text.strip()}'")
+        print(f"⏱️  Tempo de resposta: {elapsed:.2f}s")
+        print(f"📊 Tokens recebidos: {token_count}")
+        # Verificar se a resposta contém "4" ou "quatro"
+        if "4" in response_text.lower() or "quatro" in response_text.lower():
+            print("\n✅ SUCESSO! O Ultravox respondeu corretamente!")
+        else:
+            print("\n⚠️  AVISO: A resposta não contém '4' ou 'quatro'")
+        await channel.close()
+    except grpc.RpcError as e:
+        print(f"\n❌ Erro gRPC: {e.code()} - {e.details()}")
+        return False
+    except Exception as e:
+        print(f"\n❌ Erro: {e}")
+        return False
+    return True
+if __name__ == "__main__":
+    print("=" * 60)
+    print("TESTE ULTRAVOX COM TTS")
+    print("=" * 60)
+    # Executar teste
+    success = asyncio.run(test_ultravox_with_tts())
+    if success:
+        print("\n🎉 Teste concluído com sucesso!")
+    else:
+        print("\n❌ Teste falhou!")
+    print("=" * 60)

ultravox/test_audio_coherence.py ADDED Viewed

	@@ -0,0 +1,193 @@

+#!/usr/bin/env python3
+"""
+Script de teste para verificar coerência das respostas do Ultravox
+Envia áudio sintético com perguntas específicas e verifica as respostas
+"""
+import grpc
+import numpy as np
+import sys
+import time
+from pathlib import Path
+# Adiciona o diretório ao path
+sys.path.append(str(Path(__file__).parent))
+import ultravox_service_pb2
+import ultravox_service_pb2_grpc
+def create_test_audio(text_prompt, duration=2.0, sample_rate=16000):
+    """
+    Cria áudio de teste sintético simulando fala
+    Em produção, isso seria áudio real gravado
+    """
+    # Simula padrão de fala com modulação
+    t = np.linspace(0, duration, int(sample_rate * duration))
+    # Frequências típicas da voz humana (100-300 Hz fundamental)
+    base_freq = 150 + 50 * np.sin(2 * np.pi * 0.5 * t)  # Modulação lenta
+    # Gera sinal complexo simulando voz
+    audio = np.zeros_like(t)
+    # Adiciona harmônicos
+    for harmonic in range(1, 8):
+        freq = base_freq * harmonic
+        amplitude = 1.0 / harmonic  # Harmônicos mais altos têm menor amplitude
+        audio += amplitude * np.sin(2 * np.pi * freq * t)
+    # Adiciona envelope de amplitude (simula palavras)
+    envelope = 0.5 + 0.5 * np.sin(2 * np.pi * 2 * t)
+    audio *= envelope
+    # Normaliza para float32 entre -1 e 1
+    audio = audio / (np.max(np.abs(audio)) + 1e-10)
+    return audio.astype(np.float32)
+def test_ultravox_coherence():
+    """Testa a coerência das respostas do Ultravox"""
+    print("=" * 60)
+    print("🎯 TESTE DE COERÊNCIA DO ULTRAVOX")
+    print("=" * 60)
+    # Conecta ao servidor
+    try:
+        channel = grpc.insecure_channel('localhost:50051')
+        stub = ultravox_service_pb2_grpc.UltravoxServiceStub(channel)
+        print("✅ Conectado ao Ultravox em localhost:50051")
+    except Exception as e:
+        print(f"❌ Erro ao conectar: {e}")
+        return False
+    # Define perguntas de teste e respostas esperadas (palavras-chave)
+    test_cases = [
+        {
+            "pergunta": "Qual é o seu nome?",
+            "audio_duration": 1.5,
+            "keywords_pt": ["nome", "assistente", "sou", "chamo"],
+            "keywords_wrong": ["今天", "আজ", "weather", "time"]  # Chinês, Bengali, Inglês
+        },
+        {
+            "pergunta": "Que horas são agora?",
+            "audio_duration": 1.8,
+            "keywords_pt": ["hora", "tempo", "agora", "momento"],
+            "keywords_wrong": ["名字", "নাম", "name", "call"]
+        },
+        {
+            "pergunta": "O que você fez hoje?",
+            "audio_duration": 2.0,
+            "keywords_pt": ["hoje", "fiz", "fez", "dia"],
+            "keywords_wrong": ["明天", "আগামীকাল", "tomorrow", "yesterday"]
+        }
+    ]
+    results = []
+    session_id = f"test_{int(time.time())}"
+    for i, test in enumerate(test_cases, 1):
+        print(f"\n📝 Teste {i}: '{test['pergunta']}'")
+        print("-" * 40)
+        # Cria áudio sintético
+        audio = create_test_audio(test['pergunta'], test['audio_duration'])
+        print(f"   🎤 Áudio criado: {len(audio)} samples @ 16kHz")
+        # Prepara requisição
+        request = ultravox_service_pb2.ProcessRequest(
+            session_id=session_id,
+            audio_data=audio.tobytes(),
+            system_prompt=""  # Deixa vazio para usar o prompt padrão
+        )
+        try:
+            # Envia e recebe resposta
+            response_text = ""
+            start_time = time.time()
+            for response in stub.ProcessAudioStream([request]):
+                if response.token:
+                    response_text += response.token
+            latency = (time.time() - start_time) * 1000
+            print(f"   📝 Resposta: '{response_text}'")
+            print(f"   ⏱️ Latência: {latency:.0f}ms")
+            # Analisa a resposta
+            response_lower = response_text.lower()
+            # Verifica se está em português
+            has_portuguese = any(kw in response_lower for kw in test['keywords_pt'])
+            has_wrong_lang = any(kw in response_text for kw in test['keywords_wrong'])
+            # Detecta idioma pela presença de caracteres específicos
+            has_chinese = any('\u4e00' <= char <= '\u9fff' for char in response_text)
+            has_bengali = any('\u0980' <= char <= '\u09ff' for char in response_text)
+            # Resultado do teste
+            if has_chinese:
+                status = "❌ FALHOU - Resposta em CHINÊS"
+                success = False
+            elif has_bengali:
+                status = "❌ FALHOU - Resposta em BENGALI"
+                success = False
+            elif not response_text:
+                status = "❌ FALHOU - Resposta vazia"
+                success = False
+            elif has_portuguese:
+                status = "✅ PASSOU - Resposta coerente em português"
+                success = True
+            else:
+                status = "⚠️ INCERTO - Resposta não identificada"
+                success = False
+            print(f"   {status}")
+            results.append({
+                "pergunta": test['pergunta'],
+                "resposta": response_text,
+                "success": success,
+                "status": status,
+                "latency": latency
+            })
+        except Exception as e:
+            print(f"   ❌ Erro no teste: {e}")
+            results.append({
+                "pergunta": test['pergunta'],
+                "resposta": f"ERRO: {e}",
+                "success": False,
+                "status": "❌ ERRO",
+                "latency": 0
+            })
+    # Resumo dos resultados
+    print("\n" + "=" * 60)
+    print("📊 RESUMO DOS TESTES")
+    print("=" * 60)
+    passed = sum(1 for r in results if r['success'])
+    total = len(results)
+    for r in results:
+        emoji = "✅" if r['success'] else "❌"
+        print(f"{emoji} '{r['pergunta']}' -> {r['status']}")
+        if r['resposta'] and not r['success']:
+            print(f"   Resposta recebida: '{r['resposta'][:100]}...'")
+    print(f"\n📈 Taxa de sucesso: {passed}/{total} ({100*passed/total:.0f}%)")
+    if passed == total:
+        print("🎉 TODOS OS TESTES PASSARAM! Ultravox respondendo coerentemente em português!")
+    elif passed > 0:
+        print("⚠️ PARCIAL: Alguns testes passaram, mas ainda há problemas de idioma")
+    else:
+        print("❌ FALHA TOTAL: Nenhum teste passou - respostas em idioma incorreto")
+    return passed == total
+if __name__ == "__main__":
+    success = test_ultravox_coherence()
+    sys.exit(0 if success else 1)