Hugging Face Token Setup - Working Models
✅ Current Configuration
Model Selected: facebook/blenderbot-400M-distill
Why this model:
- ✅ Publicly available (no gating required)
- ✅ Works with HF Inference API
- ✅ Text generation task
- ✅ No special permissions needed
- ✅ Fast response times
- ✅ Stable and reliable
Fallback: gpt2 (guaranteed to work on HF API)
Setting Up Your HF Token
Step 1: Get Your Token
- Go to https://huggingface.co/settings/tokens
- Click "New token"
- Name it: "Research Assistant"
- Set role: Read (this is sufficient for inference)
- Generate token
- Copy it immediately (won't show again)
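Before wiring the token into the Space, you can confirm it is valid with the huggingface_hub client. A minimal sketch, assuming huggingface_hub is installed (pip install huggingface_hub):

```python
# One-off token sanity check; paste your token in place of "hf_...".
from huggingface_hub import HfApi

user = HfApi().whoami(token="hf_...")  # raises an error if the token is invalid
print(f"Token belongs to: {user['name']}")
```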
Step 2: Add to Hugging Face Space
In your HF Space settings:
- Go to your Space: https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE
- Click "Settings" (gear icon)
- Under "Repository secrets" or "Space secrets"
- Add new secret:
  - Name: HF_TOKEN
  - Value: (paste your token)
- Save
Step 3: Verify Token Works
The code will automatically:
- ✅ Load token from environment: os.getenv('HF_TOKEN')
- ✅ Use it in API calls
- ✅ Log success/failure
Check logs for:
llm_router - INFO - Calling HF API for model: facebook/blenderbot-400M-distill
llm_router - INFO - HF API returned response (length: XXX)
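The router code itself is not shown here, but the token lookup it performs amounts to something like the following sketch (module and function names are illustrative, not the project's confirmed code):

```python
# Illustrative sketch of the env-var lookup and logging the router performs.
import logging
import os

logger = logging.getLogger("llm_router")

def get_hf_token():
    """Return the HF token from the environment, logging the outcome."""
    token = os.getenv("HF_TOKEN")
    if token:
        logger.info("HF_TOKEN loaded from environment")
    else:
        logger.warning("HF_TOKEN not set; HF API calls will fail with 401")
    return token
```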
Alternative Models (Tested & Working)
If you want to try different models:
Option 1: GPT-2 (Very Reliable)
"model_id": "gpt2"
- ⚡ Fast
- ✅ Always available
- ⚠️ Simple responses
Option 2: Flan-T5 Large (Better Quality)
"model_id": "google/flan-t5-large"
- 📈 Better quality
- ⚡ Fast
- ✅ Public access
Option 3: Blenderbot (Conversational)
"model_id": "facebook/blenderbot-400M-distill"
- 💬 Good for conversation
- ✅ Current selection
- ⚡ Fast
Option 4: DistilGPT-2 (Faster)
"model_id": "distilgpt2"
- ⚡ Very fast
- ✅ Guaranteed available
- ⚠️ Smaller, less capable
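To compare these options yourself, a rough latency check against the public Inference API endpoint looks like this (the model list and prompt are examples; timings will vary with cold starts):

```python
# Rough latency comparison across the candidate models.
import os
import time

import requests

MODELS = [
    "gpt2",
    "google/flan-t5-large",
    "facebook/blenderbot-400M-distill",
    "distilgpt2",
]
headers = {"Authorization": f"Bearer {os.getenv('HF_TOKEN')}"}

for model_id in MODELS:
    url = f"https://api-inference.huggingface.co/models/{model_id}"
    start = time.monotonic()
    resp = requests.post(url, headers=headers,
                         json={"inputs": "What is 2+2?"}, timeout=60)
    print(f"{model_id}: HTTP {resp.status_code} in {time.monotonic() - start:.1f}s")
```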
How the System Works Now
API Call Flow:
- User question → Synthesis Agent
- Synthesis Agent → Tries LLM call
- LLM Router → Calls HF Inference API with token (sketched below)
- HF API → Returns generated text
- System → Uses real LLM response ✅
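Boiled down, step 3 of that flow is a single authenticated POST to the Inference API. A minimal sketch, not the project's actual router code:

```python
# Minimal authenticated call to the HF Inference API.
import os

import requests

API_URL = "https://api-inference.huggingface.co/models/facebook/blenderbot-400M-distill"

def call_hf_api(prompt):
    """Send one generation request and return the generated text."""
    headers = {"Authorization": f"Bearer {os.getenv('HF_TOKEN')}"}
    resp = requests.post(API_URL, headers=headers,
                         json={"inputs": prompt}, timeout=60)
    resp.raise_for_status()
    data = resp.json()
    # Response shape varies by task: text-generation models usually return
    # [{"generated_text": ...}]; conversational models return a dict.
    if isinstance(data, list):
        return data[0].get("generated_text", "")
    return data.get("generated_text", str(data))
```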
No More Fallbacks
- ❌ No knowledge base fallback
- ❌ No template responses
- ✅ Always uses real LLM when available
- ✅ GPT-2 fallback if model loading (503 error)
Verification
Test Your Setup:
Ask: "What is 2+2?"
Expected: a real LLM-generated response (not a template)
Check logs for:
llm_router - INFO - Calling HF API for model: facebook/blenderbot-400M-distill
llm_router - INFO - HF API returned response (length: XX)
src.agents.synthesis_agent - INFO - RESP_SYNTH_001 received LLM response
If You See 401 Error:
HF API error: 401 - Unauthorized
Fix: Token not set correctly in HF Space settings
If You See 404 Error:
HF API error: 404 - Not Found
Fix: Model ID not valid (very unlikely with current models)
If You See 503 Error:
Model loading (503), trying fallback
Fix: none needed. This happens on a first-time model load, and the system automatically retries with GPT-2.
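The retry logic behind that message can be as simple as the following sketch (the function name and structure are illustrative, not the actual llm_router implementation):

```python
# Retry against gpt2 when the primary model returns 503 (cold-loading).
import os

import requests

BASE_URL = "https://api-inference.huggingface.co/models/"

def generate(prompt, model_id="facebook/blenderbot-400M-distill"):
    headers = {"Authorization": f"Bearer {os.getenv('HF_TOKEN')}"}
    resp = requests.post(BASE_URL + model_id, headers=headers,
                         json={"inputs": prompt}, timeout=60)
    if resp.status_code == 503 and model_id != "gpt2":
        # Primary model is still loading; retry once with the fallback.
        return generate(prompt, model_id="gpt2")
    resp.raise_for_status()
    data = resp.json()
    return data[0]["generated_text"] if isinstance(data, list) else str(data)
```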
Current Models in Config
File: models_config.py
"reasoning_primary": {
"model_id": "facebook/blenderbot-400M-distill",
"max_tokens": 500,
"temperature": 0.7
}
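If you add the gpt2 fallback as its own entry, it could sit alongside the primary like this (the schema beyond "reasoning_primary" is an assumption, not the file's confirmed layout):

```python
# Hypothetical extension of models_config.py with an explicit fallback entry.
MODELS = {
    "reasoning_primary": {
        "model_id": "facebook/blenderbot-400M-distill",
        "max_tokens": 500,
        "temperature": 0.7,
    },
    "reasoning_fallback": {
        "model_id": "gpt2",
        "max_tokens": 500,
        "temperature": 0.7,
    },
}
```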
Performance Notes
Latency:
- Blenderbot: ~2-4 seconds
- GPT-2: ~1-2 seconds
- Flan-T5: ~3-5 seconds
Quality:
- Blenderbot: Good for conversational responses
- GPT-2: Basic but coherent
- Flan-T5: More factual, less conversational
Troubleshooting
Token Not Working?
- Verify in HF Dashboard β Settings β Access Tokens
- Check it has "Read" permissions
- Regenerate if needed
- Update in Space settings
Model Not Loading?
- First request may take 10-30 seconds (cold start)
- Subsequent requests are faster
- 503 errors auto-retry with fallback
Still Seeing Placeholders?
- Restart your Space
- Check logs for HF API calls
- Verify token is in environment
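To rule out the environment, a one-line check confirms whether the secret actually reached the process (run it in a console or temporarily at app startup):

```python
# Prints True if the Space secret is visible to the running process.
import os

print("HF_TOKEN set:", bool(os.getenv("HF_TOKEN")))
```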
Next Steps
- ✅ Add token to HF Space settings
- ✅ Restart Space
- ✅ Test with a question
- ✅ Check logs for "HF API returned response"
- ✅ Enjoy real LLM responses!
Summary
Model: facebook/blenderbot-400M-distill
Fallback: gpt2
Status: ✅ Configured and ready
Requirement: Valid HF token in Space settings
No template fallbacks: the system always tries a real LLM first