How to Run the VQA Mobile App
Quick Overview
You now have a complete React Native mobile app for Visual Question Answering! Here's what was created:
What's Built
Backend API (backend_api.py)
- FastAPI server wrapping your ensemble VQA models
- Automatic routing between base and spatial models
- Image upload and question answering endpoints
Mobile App (ui/ folder)
- Beautiful React Native app with Expo
- Google OAuth authentication
- Camera and gallery image picker
- Question input and answer display
- Model routing visualization
Running the App (4 Steps)
Step 1: Start the Backend Server
# Open PowerShell/Terminal
cd c:\Users\rdeva\Downloads\vqa_coes
# Install API dependencies (FIRST TIME ONLY)
# If you get import errors, run this:
pip install fastapi uvicorn python-multipart
# Start the server
python start_backend.py
# Or: python backend_api.py
Note: If you get "ModuleNotFoundError", see IMPORT_ERRORS_FIX.md for solutions.
Keep this window open! The server must stay running.
You should see:
INITIALIZING ENSEMBLE VQA SYSTEM
Ensemble ready!
Step 2: Configure the Mobile App
Find your local IP address:
ipconfig
Look for "IPv4 Address" (e.g., 192.168.1.100).
Update the API URL:
- Open: ui\src\config\api.js
- Change line 8: export const API_BASE_URL = 'http://YOUR_IP_HERE:8000';
- Example: export const API_BASE_URL = 'http://192.168.1.100:8000';
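If ipconfig lists several adapters, the following minimal Python sketch can help pick the address to use. It asks the operating system which IPv4 address it would use for outbound traffic and prints the matching config line. The helper names find_local_ip and api_config_line are made up for this illustration, and 8.8.8.8 is only an arbitrary public address used to select a network interface; no data is sent to it.

```python
import socket


def find_local_ip() -> str:
    """Return the IPv4 address this machine uses for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only makes the OS
        # choose the outgoing interface. 8.8.8.8 is an arbitrary address.
        sock.connect(("8.8.8.8", 80))
        return sock.getsockname()[0]
    except OSError:
        # No route available (for example, offline): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


def api_config_line(ip: str, port: int = 8000) -> str:
    """Build the line to paste into ui/src/config/api.js."""
    return f"export const API_BASE_URL = 'http://{ip}:{port}';"


if __name__ == "__main__":
    print(api_config_line(find_local_ip()))
```

On a machine with an active VPN this may report the VPN adapter's address rather than the WiFi one, so compare the result against the ipconfig output before pasting it.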
Step 3: Start the Mobile App
# Open a NEW PowerShell/Terminal window
cd c:\Users\rdeva\Downloads\vqa_coes\ui
# Start Expo
npm start
You'll see a QR code in the terminal.
Step 4: Run on Your Phone
Install Expo Go on your smartphone from the Google Play Store (Android) or the App Store (iOS).
Scan the QR code:
- Android: Open Expo Go → Scan QR
- iOS: Open Camera → Scan QR → Tap notification
Wait for the app to load (first time takes ~1-2 minutes)
Using the App
Option A: Test Without Google Login
For quick testing, you can bypass Google authentication:
- Open ui\App.js
- Find lines 23-27 and replace them with: <Stack.Screen name="Home" component={HomeScreen} />
- Save and reload the app (shake phone → Reload)
Option B: Set Up Google Login
- Go to Google Cloud Console
- Create a new project
- Configure the OAuth consent screen (the Google+ API has been shut down and is not needed for sign-in)
- Create OAuth 2.0 credentials
- Update ui\src\config\google.js with your client IDs
Testing VQA Functionality
Select an image:
- Tap "Camera" to take a photo
- Tap "Gallery" to choose existing image
Ask a question:
- Type your question (e.g., "What color is the car?")
- Tap "Ask Question"
View the answer:
- See the AI-generated answer
- Check which model was used:
  - Base Model: general questions
  - Spatial Model: spatial questions (left, right, above, etc.)
Example Questions to Try
General Questions (Base Model)
- "What color is the car?"
- "How many people are in the image?"
- "What room is this?"
- "Is there a dog?"
Spatial Questions (Spatial Model)
- "What is to the right of the table?"
- "What is above the chair?"
- "What is next to the door?"
- "What is on the left side?"
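To give a feel for how these two groups could be told apart, here is a minimal Python sketch of keyword-based routing. This is an assumption made for illustration only: the guide does not say how the ensemble actually decides, and the real system may use a different keyword list or a learned classifier. The name route_question and the keyword pattern are hypothetical.

```python
import re

# Hypothetical keyword list; the real ensemble may route differently.
SPATIAL_PATTERN = re.compile(
    r"\b(left|right|above|below|behind|beside|under|next to|in front of)\b",
    re.IGNORECASE,
)


def route_question(question: str) -> str:
    """Return "spatial" if the question contains a spatial term, else "base"."""
    return "spatial" if SPATIAL_PATTERN.search(question) else "base"


if __name__ == "__main__":
    for q in ("What color is the car?", "What is next to the door?"):
        print(f"{q} -> {route_question(q)}")
```

A plain keyword match like this misfires on questions such as "Is this the right answer?", which is one reason a production router might rely on something more robust.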
Troubleshooting
"Cannot connect to server"
- Check backend is running (python backend_api.py)
- Verify IP address in api.js matches your computer's IP
- Ensure phone and computer are on the same WiFi network
- Check Windows Firewall isn't blocking port 8000
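The first and last checks above can be partly automated. The sketch below, with a hypothetical helper named is_port_open, tries to open a TCP connection to port 8000. Run it on the computer with 127.0.0.1 to confirm the server is listening, then from another machine on the same network with the computer's LAN address to see whether a firewall is in the way.

```python
import socket


def is_port_open(host: str, port: int = 8000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # replace with the computer's LAN IP to test remotely
    status = "reachable" if is_port_open(host) else "NOT reachable"
    print(f"Backend at {host}:8000 is {status}")
```

A successful connection only shows that something is listening on the port; it does not prove that the models finished loading.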
"Model not loaded"
- Ensure these files exist in c:\Users\rdeva\Downloads\vqa_coes\:
  - vqa_checkpoint.pt
  - vqa_spatial_checkpoint.pt
- Check backend terminal for error messages
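A quick way to confirm the checkpoint files are in place is a small script like the one below. It is only a sketch: the two file names come from this guide, while the helper missing_checkpoints is made up for the example. Run it from the vqa_coes folder.

```python
from pathlib import Path

# File names taken from this guide.
REQUIRED_CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def missing_checkpoints(project_dir: str) -> list[str]:
    """Return the required checkpoint names not found in project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")  # "." assumes you run it from vqa_coes
    print("All checkpoints found" if not missing else f"Missing: {missing}")
```

This checks only that the files exist, not that they are complete or loadable, so the backend terminal remains the place to look for load errors.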
App won't load on phone
- Verify Expo Go is installed
- Make sure both devices are on the same WiFi network
- Try restarting Expo: press Ctrl+C, then run npm start
- Clear cache: npm start -- --clear
Camera/Gallery not working
- Grant permissions when prompted
- Check phone Settings → App Permissions
Project Structure
vqa_coes/
├── backend_api.py              # FastAPI backend server
├── ensemble_vqa_app.py         # Your existing ensemble system
├── model_spatial.py            # Spatial model
├── models/model.py             # Base model
├── vqa_checkpoint.pt           # Base model weights
├── vqa_spatial_checkpoint.pt   # Spatial model weights
├── requirements_api.txt        # Backend dependencies
├── QUICK_START.md              # This guide
└── ui/                         # Mobile app
    ├── App.js                  # Main app component
    ├── app.json                # Expo configuration
    ├── package.json            # Dependencies
    └── src/
        ├── config/
        │   ├── api.js          # ⚠️ UPDATE YOUR IP HERE
        │   └── google.js       # Google OAuth config
        ├── contexts/
        │   └── AuthContext.js  # Authentication
        ├── screens/
        │   ├── LoginScreen.js  # Login UI
        │   └── HomeScreen.js   # Main VQA UI
        ├── services/
        │   └── api.js          # API client
        └── styles/
            ├── theme.js        # Design system
            └── globalStyles.js
Documentation
- Quick Start: QUICK_START.md (this file)
- Full README: ui/README.md
- Implementation Details: See walkthrough artifact
Customization
Change Colors
Edit ui/src/styles/theme.js:
colors: {
primary: '#6366F1', // Change to your color
secondary: '#EC4899', // Change to your color
// ...
}
Change App Name
Edit ui/app.json:
{
"expo": {
"name": "Your App Name",
"slug": "your-app-slug"
}
}
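If you prefer to script this change, the sketch below updates both fields in one go and derives the slug from the name. It is an illustration under assumptions: the helpers slugify and rename_app are hypothetical, the slug rule (lowercase words joined by hyphens) is a simple convention chosen here, and the file is assumed to have the "expo" layout shown above.

```python
import json
import re
from pathlib import Path


def slugify(name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def rename_app(app_json_path: str, new_name: str) -> dict:
    """Set expo.name and expo.slug in app.json, keeping all other keys."""
    path = Path(app_json_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    expo = config.setdefault("expo", {})
    expo["name"] = new_name
    expo["slug"] = slugify(new_name)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, rename_app("ui/app.json", "My VQA App") would set the name to "My VQA App" and the slug to "my-vqa-app". Back up app.json first, since the file is rewritten with two-space indentation.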
Next Steps
Once everything works:
- Add Google OAuth for production
- Create custom icons (see ui/assets/ICONS_README.md)
- Build standalone app: npx eas-cli build --platform android
Tips
- Backend must run first before starting the mobile app
- Same WiFi network is required for phone and computer
- First load is slow - subsequent loads are faster
- Shake phone to access Expo developer menu
- Check logs in both terminals for debugging
Need Help?
- Check the troubleshooting section above
- Review backend terminal for errors
- Check Expo console in terminal
- Verify all configuration steps
Ready to test? Follow the 4 steps above and start asking questions about images!