Phase 2 Implementation Checklist

Use this checklist to track your progress while implementing the Phase 2 voice features.

Setup & Dependencies

  • Review PHASE_2_VOICE_IMPLEMENTATION_PLAN.md
  • Review PHASE_2_README.md
  • Install system dependencies (portaudio, ffmpeg)
  • Install Python dependencies: pip install -r requirements-phase2.txt
  • Copy Phase 2 settings from .env.phase2.example to .env
  • Set PHASE_2_ENABLED=true in .env
  • Verify Whisper model downloads successfully

Core Modules

ASR Module (app/voice/asr.py)

  • Create app/voice/asr.py
  • Implement ASREngine class
  • Implement transcribe() method
  • Add confidence calculation
  • Add language detection
  • Test with sample audio files
  • Test with Hindi audio
  • Test with English audio
  • Test with Gujarati audio
  • Verify latency <2s
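
One possible approach to the confidence calculation item is to aggregate per-segment log-probabilities. The sketch below assumes the ASR backend returns segments shaped like those of the openai-whisper package, each with an `avg_logprob` value. `confidence_from_segments` is a hypothetical helper name, not something this project defines.

```python
import math
from typing import Iterable, Mapping


def confidence_from_segments(segments: Iterable[Mapping[str, float]]) -> float:
    """Collapse per-segment average log-probabilities into one 0..1 score.

    Assumption: each segment is a mapping with an ``avg_logprob`` key,
    as in openai-whisper's transcribe() result. Other backends may differ.
    """
    probs = [math.exp(seg["avg_logprob"]) for seg in segments]
    if not probs:
        return 0.0  # nothing was recognized, so report no confidence
    mean = sum(probs) / len(probs)
    # Clamp in case a backend reports a slightly positive log-probability.
    return max(0.0, min(1.0, mean))
```

An unweighted mean treats short and long segments alike; weighting by segment duration would be a reasonable alternative.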

TTS Module (app/voice/tts.py)

  • Create app/voice/tts.py
  • Implement TTSEngine class
  • Implement synthesize() method
  • Add language mapping (en, hi, gu, etc.)
  • Test with English text
  • Test with Hindi text
  • Test with Gujarati text
  • Verify audio quality
  • Verify latency <1s
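
The language mapping item could be handled with a small lookup that normalizes whatever tag the ASR step reports. This is a minimal sketch: the table and the `resolve_tts_language` name are illustrative, and the English fallback is an assumption rather than a documented project decision.

```python
# Hypothetical mapping from detected language tags to TTS language codes.
# The checklist names only en, hi, and gu; extend as the TTS engine allows.
SUPPORTED_TTS_LANGUAGES = {"en": "en", "hi": "hi", "gu": "gu"}


def resolve_tts_language(code: str, default: str = "en") -> str:
    """Return a TTS language code for an ASR-detected language tag."""
    # Normalize tags such as "hi-IN" or "EN_us" down to the base language.
    base = code.strip().lower().replace("_", "-").split("-")[0]
    return SUPPORTED_TTS_LANGUAGES.get(base, default)
```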

Voice Fraud Detector (Optional) (app/voice/fraud_detector.py)

  • Create app/voice/fraud_detector.py
  • Implement VoiceFraudDetector class
  • Implement detect_synthetic_voice() method
  • Add resemblyzer integration (if enabled)
  • Test with synthetic audio
  • Test with real audio
  • Verify detection accuracy
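
To make the accuracy check concrete, the detector's predictions on the synthetic and real test clips can be compared against known labels. The helper below is a sketch under the assumption that `detect_synthetic_voice()` can be reduced to a boolean per clip; `detection_metrics` is a hypothetical name.

```python
from typing import Sequence


def detection_metrics(predicted: Sequence[bool], actual: Sequence[bool]) -> dict:
    """Compare synthetic-voice predictions with ground-truth labels.

    True means "synthetic". Returns accuracy, precision, and recall.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labeled sample")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    correct = sum(1 for p, a in pairs if bool(p) == bool(a))
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```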

API Layer

Voice Endpoints (app/api/voice_endpoints.py)

  • Create app/api/voice_endpoints.py
  • Implement POST /api/v1/voice/engage
  • Add file upload handling
  • Add ASR integration
  • Add Phase 1 pipeline integration
  • Add TTS integration
  • Add voice fraud integration (optional)
  • Implement GET /api/v1/voice/audio/{filename}
  • Implement GET /api/v1/voice/health
  • Add error handling
  • Add logging
  • Test with curl
  • Test with Postman
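
The integration items for the engage endpoint amount to chaining ASR, the Phase 1 pipeline, and TTS, with the fraud check as an optional extra. The sketch below shows that flow with the stages passed in as plain callables, so it is independent of any web framework. `VoiceTurn`, `engage`, and the parameter names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoiceTurn:
    transcript: str
    reply_text: str
    reply_audio: bytes
    synthetic_voice: Optional[bool] = None  # None when the check is disabled


def engage(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    run_phase1: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    detect_synthetic: Optional[Callable[[bytes], bool]] = None,
) -> VoiceTurn:
    """Run one voice turn: ASR, then the text pipeline, then TTS."""
    if not audio:
        raise ValueError("empty audio upload")
    transcript = transcribe(audio)
    reply_text = run_phase1(transcript)
    reply_audio = synthesize(reply_text)
    # The fraud check is optional, mirroring the checklist item above.
    flagged = detect_synthetic(audio) if detect_synthetic else None
    return VoiceTurn(transcript, reply_text, reply_audio, flagged)
```

Keeping the stages injectable also makes the handler easy to exercise in unit tests with stub functions before the real engines are wired in.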

Voice Schemas (app/api/voice_schemas.py)

  • Create app/api/voice_schemas.py
  • Define VoiceEngageRequest
  • Define VoiceEngageResponse
  • Define TranscriptionMetadata
  • Define VoiceFraudMetadata
  • Add validation rules
  • Test schema validation
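
As an illustration of the validation rules item, here is a standard-library stand-in for `TranscriptionMetadata`. The field names and the 0 to 1 confidence range are assumptions; the real schema may use a validation library and different fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionMetadata:
    """Stand-in for the schema of the same name; field names are assumed."""

    text: str
    language: str
    confidence: float

    def __post_init__(self) -> None:
        if not self.language:
            raise ValueError("language must be a non-empty code such as 'en'")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
```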

UI Layer

Voice HTML (ui/voice.html)

  • Create ui/voice.html
  • Add header and title
  • Add recording controls section
  • Add recording status indicator
  • Add start/stop buttons
  • Add upload button
  • Add session ID display
  • Add conversation section
  • Add message display area
  • Add metadata section
  • Add transcription display
  • Add detection display
  • Add voice fraud display (optional)
  • Add intelligence section
  • Test in Chrome
  • Test in Firefox
  • Test in Safari

Voice JavaScript (ui/voice.js)

  • Create ui/voice.js
  • Implement startRecording()
  • Implement stopRecording()
  • Implement uploadAudio()
  • Implement sendAudioToAPI()
  • Implement handleAPIResponse()
  • Implement addMessage()
  • Implement updateMetadata()
  • Implement updateIntelligence()
  • Add error handling
  • Test microphone access
  • Test file upload
  • Test API integration
  • Test audio playback

Voice CSS (ui/voice.css)

  • Create ui/voice.css
  • Style header
  • Style recording controls
  • Style recording status
  • Style buttons
  • Style conversation area
  • Style messages (user/ai/system)
  • Style metadata cards
  • Style intelligence display
  • Add responsive design
  • Test on desktop
  • Test on tablet
  • Test on mobile

Integration

Main App Integration

  • Update app/main.py to include voice router
  • Add conditional import (only if PHASE_2_ENABLED=true)
  • Add error handling for missing dependencies
  • Test server startup with Phase 2 enabled
  • Test server startup with Phase 2 disabled
  • Verify Phase 1 endpoints still work
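
The conditional import and missing-dependency items can be combined into a single loader. This is a sketch, not the project's actual code: `load_voice_router` is a hypothetical helper, and it assumes `app/api/voice_endpoints.py` exposes its routes as a module attribute named `router`.

```python
import importlib
import logging
import os
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)


def load_voice_router(env: Mapping[str, str] = os.environ) -> Optional[Any]:
    """Return the voice router, or None when Phase 2 is off or unavailable."""
    if env.get("PHASE_2_ENABLED", "false").strip().lower() != "true":
        return None
    try:
        module = importlib.import_module("app.api.voice_endpoints")
    except ImportError as exc:
        # Missing optional dependencies should not take down Phase 1.
        logger.warning("Phase 2 voice features disabled: %s", exc)
        return None
    # Assumption: the module exposes its routes as an attribute named "router".
    return getattr(module, "router", None)
```

In `app/main.py`, the result would be registered only when it is not `None`, so the server starts the same way whether Phase 2 is enabled, disabled, or missing its dependencies.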

Config Integration

  • Update app/config.py with Phase 2 settings
  • Add PHASE_2_ENABLED field
  • Add WHISPER_MODEL field
  • Add TTS_ENGINE field
  • Add VOICE_FRAUD_DETECTION field
  • Add AUDIO_SAMPLE_RATE field
  • Add AUDIO_CHUNK_DURATION field
  • Test config loading
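
A minimal sketch of how the six settings above might be loaded and typed. The setting names come from this checklist; the class name, the default values, and the interpretation of `AUDIO_CHUNK_DURATION` as seconds are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Phase2Settings:
    phase_2_enabled: bool
    whisper_model: str
    tts_engine: str
    voice_fraud_detection: bool
    audio_sample_rate: int
    audio_chunk_duration: float  # assumed to be in seconds

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Phase2Settings":
        # All default values below are placeholders, not project defaults.
        return cls(
            phase_2_enabled=_as_bool(env.get("PHASE_2_ENABLED", "false")),
            whisper_model=env.get("WHISPER_MODEL", "base"),
            tts_engine=env.get("TTS_ENGINE", "gtts"),
            voice_fraud_detection=_as_bool(env.get("VOICE_FRAUD_DETECTION", "false")),
            audio_sample_rate=int(env.get("AUDIO_SAMPLE_RATE", "16000")),
            audio_chunk_duration=float(env.get("AUDIO_CHUNK_DURATION", "5")),
        )
```

Passing a plain dictionary instead of `os.environ` makes it straightforward to test different configurations without touching the real environment.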

Environment Variables

  • Update .env.example with Phase 2 variables
  • Create .env.phase2.example
  • Document all Phase 2 settings
  • Test with different configurations

Testing

Unit Tests

  • Create tests/unit/test_voice_asr.py
  • Test ASR transcription
  • Test language detection
  • Test confidence calculation
  • Create tests/unit/test_voice_tts.py
  • Test TTS synthesis
  • Test language mapping
  • Create tests/unit/test_voice_fraud.py (optional)
  • Test fraud detection
  • Run all unit tests: pytest tests/unit/test_voice_*.py

Integration Tests

  • Create tests/integration/test_voice_api.py
  • Test voice engage endpoint
  • Test audio file upload
  • Test transcription flow
  • Test Phase 1 integration
  • Test TTS flow
  • Test audio download
  • Test health endpoint
  • Run integration tests: pytest tests/integration/test_voice_api.py

End-to-End Tests

  • Test full voice loop (record → transcribe → process → TTS → play)
  • Test with English scam message
  • Test with Hindi scam message
  • Test with Gujarati scam message
  • Test multi-turn conversation
  • Test intelligence extraction from voice
  • Test session persistence
  • Verify latency <5s for full loop

Regression Tests

  • Run the full test suite, including all Phase 1 tests: pytest tests/
  • Verify Phase 1 text endpoints work
  • Verify Phase 1 UI works
  • Verify no breaking changes

Performance

  • Measure ASR latency
  • Measure TTS latency
  • Measure total loop latency
  • Test with concurrent requests
  • Test with large audio files
  • Optimize any stage that misses its latency target
  • Document performance metrics
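
The latency measurements can be taken with a small timing wrapper and compared against the targets this checklist states (ASR under 2s, TTS under 1s, full loop under 5s). The helper names `timed` and `within_budget` are hypothetical.

```python
import time
from typing import Any, Callable, Tuple

# Latency targets stated in this checklist, in seconds.
LATENCY_BUDGETS = {"asr": 2.0, "tts": 1.0, "full_loop": 5.0}


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def within_budget(stage: str, elapsed: float) -> bool:
    """True when the measured latency is under the target for the stage."""
    return elapsed < LATENCY_BUDGETS[stage]
```

A single measurement is noisy, so repeating each call several times and recording the median or a high percentile would give more useful numbers to document.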

Documentation

  • Review PHASE_2_VOICE_IMPLEMENTATION_PLAN.md
  • Review PHASE_2_README.md
  • Add inline code comments
  • Add docstrings to all functions
  • Update main README.md with Phase 2 info
  • Create API documentation for voice endpoints
  • Add troubleshooting guide
  • Add examples

Deployment

Docker

  • Update Dockerfile with Phase 2 dependencies
  • Add conditional installation
  • Test Docker build
  • Test Docker run with Phase 2 enabled
  • Test Docker run with Phase 2 disabled

Environment Setup

  • Document system dependencies
  • Document Python dependencies
  • Create setup script (optional)
  • Test on clean environment
  • Test on Windows
  • Test on Linux
  • Test on Mac

Production Readiness

  • Add monitoring for voice endpoints
  • Add logging for voice operations
  • Add error tracking
  • Add rate limiting
  • Add audio file cleanup
  • Add security headers
  • Test with production settings
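
For the audio file cleanup item, one simple approach is to delete generated files once they pass a maximum age. The sketch below assumes generated audio lives in a single flat directory; the function name and the age threshold are illustrative.

```python
import time
from pathlib import Path
from typing import List, Optional


def cleanup_audio_files(
    directory: Path, max_age_seconds: float, now: Optional[float] = None
) -> List[Path]:
    """Delete regular files older than max_age_seconds; return what was removed."""
    now = time.time() if now is None else now
    removed = []
    for path in directory.iterdir():
        # Age is judged by last modification time; subdirectories are skipped.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```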

Quality Assurance

Code Quality

  • Run linter: flake8 app/voice/
  • Run type checker: mypy app/voice/
  • Run formatter: black app/voice/
  • Fix all linting errors
  • Fix all type errors
  • Review code for best practices

Security

  • Validate audio file uploads
  • Add file size limits
  • Add file type validation
  • Sanitize file names
  • Add rate limiting
  • Test with malicious files
  • Review security best practices
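
The file size, file type, and file name items can be covered by one validation step that runs before an upload is written to disk. This is a sketch under stated assumptions: the size limit and the extension allow-list are placeholders, and both function names are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Placeholder limits; the checklist does not specify actual values.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".ogg", ".webm", ".m4a"}


def sanitize_filename(name: str) -> str:
    """Drop any directory part and replace unsafe characters."""
    # Treat backslashes as separators too, so Windows-style paths are stripped.
    base = PurePosixPath(name.replace("\\", "/")).name
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return a safe file name, or raise ValueError if the upload is rejected."""
    safe = sanitize_filename(filename)
    suffix = PurePosixPath(safe).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or 'none'}")
    if not 0 < size_bytes <= MAX_UPLOAD_BYTES:
        raise ValueError("file is empty or exceeds the size limit")
    return safe
```

An extension check only inspects the name. Confirming that the bytes are actually decodable audio would be a separate step, which is what the malicious-file tests should exercise.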

Accessibility

  • Test keyboard navigation
  • Test screen reader compatibility
  • Add ARIA labels
  • Test with assistive technologies

Final Checks

  • All tests passing
  • No linting errors
  • Documentation complete
  • Performance within targets (ASR <2s, TTS <1s, full loop <5s)
  • Security reviewed
  • Phase 1 unaffected
  • Ready for deployment

Post-Implementation

  • Demo video recorded
  • User guide created
  • Training materials prepared
  • Feedback collected
  • Issues documented
  • Future improvements planned

Progress Summary

Total Tasks: 200+

Completed: _____ / 200+

In Progress: _____

Blocked: _____

Estimated Time Remaining: _____ hours


Notes

Use this space to track issues, blockers, or important decisions:

[Date] [Note]
- 
- 
- 

Last Updated: [Date]

Status: 🚧 Not Started | 🟡 In Progress | ✅ Complete