Groq API Setup Guide for HuggingClaw
⚡ Why Groq?
Groq is one of the fastest inference providers available - up to 500+ tokens/second on its LPU hardware!
| Feature | Groq | Others |
|---|---|---|
| Speed | ⚡⚡⚡⚡⚡ 500+ t/s | ⚡⚡ 50-100 t/s |
| Latency | <100ms | 500ms-2s |
| Free Tier | ✅ Yes, generous | ⚠️ Limited |
| Models | Llama 3/4, Qwen, Kimi, GPT-OSS | Varies |
⚠️ SECURITY WARNING
Never share your API key publicly! If you've shared it:
- Go to https://console.groq.com/api-keys
- Delete the compromised key
- Create a new one
- Store it securely (password manager, HF Spaces secrets)
Quick Start
Step 1: Get Your Groq API Key
- Go to https://console.groq.com
- Sign in or create an account (free)
- Navigate to API Keys in left sidebar
- Click Create API Key
- Copy your key (starts with `gsk_`) and keep it secret!
Step 2: Configure HuggingFace Spaces
In your Space Settings → Repository secrets, add:
```
GROQ_API_KEY=gsk_your-actual-api-key-here
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
Step 3: Deploy
Push changes or redeploy the Space. Groq will be automatically configured.
Step 4: Use
- Open the Space URL
- Enter the gateway token (default: `huggingclaw`)
- Select "Llama 3.3 70B (Versatile)" from the model dropdown
- Experience blazing-fast responses! ⚡
Available Models (Verified 2025)
Chat Models
| Model ID | Name | Context | Speed | Best For |
|---|---|---|---|---|
| `llama-3.3-70b-versatile` | Llama 3.3 70B | 128K | ⚡⚡⚡⚡ | Best overall |
| `llama-3.1-8b-instant` | Llama 3.1 8B | 128K | ⚡⚡⚡⚡⚡ | Ultra-fast |
| `meta-llama/llama-4-maverick-17b-128e-instruct` | Llama 4 Maverick | 128K | ⚡⚡⚡⚡ | Latest Llama 4 |
| `meta-llama/llama-4-scout-17b-16e-instruct` | Llama 4 Scout | 128K | ⚡⚡⚡⚡ | Latest Llama 4 |
| `qwen/qwen3-32b` | Qwen3 32B | 128K | ⚡⚡⚡ | Alibaba model |
| `moonshotai/kimi-k2-instruct` | Kimi K2 | 128K | ⚡⚡⚡ | Moonshot AI |
| `openai/gpt-oss-20b` | GPT-OSS 20B | 128K | ⚡⚡⚡ | OpenAI open-source |
| `allam-2-7b` | Allam-2 7B | 4K | ⚡⚡⚡⚡ | Arabic/English |
Audio Models
| Model ID | Name | Purpose |
|---|---|---|
| `whisper-large-v3-turbo` | Whisper Large V3 Turbo | Speech-to-text |
| `whisper-large-v3` | Whisper Large V3 | Speech-to-text |
Safety Models
| Model ID | Name | Purpose |
|---|---|---|
| `meta-llama/llama-guard-4-12b` | Llama Guard 4 | Content moderation |
| `meta-llama/llama-prompt-guard-2-86m` | Llama Prompt Guard 2 | Prompt injection detection |
Configuration Options
Basic Setup (Recommended)
```
GROQ_API_KEY=gsk_xxxxx
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
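In application code, these two variables can be read with a sensible fallback to the recommended default. A minimal Python sketch (`load_groq_config` is a hypothetical helper for illustration, not part of HuggingClaw):

```python
import os

def load_groq_config(env=None):
    """Read Groq settings, falling back to the default model recommended above."""
    env = os.environ if env is None else env
    return {
        "api_key": env.get("GROQ_API_KEY", ""),
        "model": env.get("OPENCLAW_DEFAULT_MODEL", "groq/llama-3.3-70b-versatile"),
    }
```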
Multiple Providers
Use Groq as primary with fallbacks:
```
# Groq (primary - fastest)
GROQ_API_KEY=gsk_xxxxx

# OpenRouter (fallback - more models)
OPENROUTER_API_KEY=sk-or-v1-xxxxx

# Local Ollama (free backup)
LOCAL_MODEL_ENABLED=true
LOCAL_MODEL_NAME=neuralnexuslab/hacking
```
Priority order:
1. Groq (if `GROQ_API_KEY` set) ← Fastest!
2. xAI (if `XAI_API_KEY` set)
3. OpenAI (if `OPENAI_API_KEY` set)
4. OpenRouter (if `OPENROUTER_API_KEY` set)
5. Local (if `LOCAL_MODEL_ENABLED=true`)
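The fallback order above can be sketched as a small selection function (an illustration of the priority logic, not HuggingClaw's actual implementation):

```python
# Providers in the priority order listed above; each is enabled by its env var.
PRIORITY = [
    ("groq", "GROQ_API_KEY"),
    ("xai", "XAI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]

def pick_provider(env):
    """Return the first configured provider name, or None if nothing is set."""
    for name, key_var in PRIORITY:
        if env.get(key_var):
            return name
    if env.get("LOCAL_MODEL_ENABLED") == "true":
        return "local"
    return None
```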
Model Recommendations
Best for General Use
```
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
- Excellent quality
- 128K context window
- Fast (500+ tokens/s)
Fastest Responses
```
OPENCLAW_DEFAULT_MODEL=groq/llama-3.1-8b-instant
```
- Instant responses
- Good for simple Q&A
- Highest rate limits
Latest & Greatest
```
OPENCLAW_DEFAULT_MODEL=groq/meta-llama/llama-4-maverick-17b-128e-instruct
```
- Llama 4 architecture
- Best reasoning
- Cutting-edge performance
Long Documents
```
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
- 128K context window
- Can process entire books
- Excellent summarization
Pricing
Free Tier (Generous!)
| Model | Rate Limit |
|---|---|
| Llama 3.1 8B | ~30 req/min |
| Llama 3.3 70B | ~30 req/min |
| Llama 4 Maverick | ~30 req/min |
| Llama 4 Scout | ~30 req/min |
| Qwen3 32B | ~30 req/min |
| Kimi K2 | ~30 req/min |
Perfect for personal bots! Most users never need the paid tier.
Paid Plans
Check https://groq.com/pricing for enterprise pricing.
Performance Comparison
| Provider | Tokens/sec | Latency | Cost |
|---|---|---|---|
| Groq Llama 3.3 | 500+ | <100ms | Free |
| Groq Llama 4 | 400+ | <150ms | Free |
| xAI Grok | 100-200 | 200-500ms | $ |
| OpenAI GPT-4 | 50-100 | 500ms-1s | $$$ |
| Local Ollama | 20-50 | 100-200ms | Free |
Troubleshooting
"Invalid API key"
- Verify the key starts with `gsk_`
- No spaces or newlines
- Check the key at https://console.groq.com/api-keys
- Regenerate if compromised
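These formatting problems can be caught before making any request. A small sketch (`check_groq_key` is a hypothetical helper, not part of any SDK):

```python
import re

def check_groq_key(key):
    """Return a list of problems with a Groq API key string (empty list = looks OK)."""
    problems = []
    if key != key.strip():
        problems.append("leading/trailing whitespace")
    if re.search(r"\s", key.strip()):
        problems.append("embedded spaces or newlines")
    if not key.strip().startswith("gsk_"):
        problems.append("does not start with gsk_")
    return problems
```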
"Rate limit exceeded"
- Free tier: ~30 requests/minute
- Use `llama-3.1-8b-instant` for higher limits
- Add delays between requests
- Consider paid plan for heavy usage
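The "add delays" advice can be automated with exponential backoff on HTTP 429 responses. A minimal sketch (`RateLimitError` here is a local stand-in for whatever rate-limit exception your client raises):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 rate-limit error from the API client."""

def with_backoff(call, max_retries=4, base_delay=2.0):
    """Retry `call` on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
```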
"Model not found"
- Use exact model ID from table above
- Check model is active in Groq console
- Some models may be region-restricted
Slow Responses
- Groq's time to first token is typically under 100ms
- Check internet connection
- HF Spaces region matters (US = fastest)
Example: WhatsApp Bot with Groq
```
# HF Spaces secrets
GROQ_API_KEY=gsk_xxxxx
HF_TOKEN=hf_xxxxx
AUTO_CREATE_DATASET=true

# WhatsApp (configure in Control UI)
WHATSAPP_PHONE=+1234567890
WHATSAPP_CODE=ABC123
```
Result: Ultra-fast WhatsApp AI bot! ⚡
API Reference
Test Your Key
```bash
curl https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer gsk_xxxxx"
```
Chat Completion
```bash
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gsk_xxxxx" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
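The same request can be built from Python with only the standard library. A sketch that constructs the request shown in the curl example (actually sending it requires a valid key and network access):

```python
import json
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(api_key, model, messages):
    """Build the HTTP request for Groq's OpenAI-compatible chat endpoint."""
    return urllib.request.Request(
        GROQ_URL,
        data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# To send it (needs a real key and network access):
# with urllib.request.urlopen(build_chat_request(key, "llama-3.3-70b-versatile",
#         [{"role": "user", "content": "Hello!"}])) as resp:
#     reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```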
Best Practices
1. Choose Right Model
- Chat: `llama-3.3-70b-versatile`
- Fast Q&A: `llama-3.1-8b-instant`
- Complex tasks: `meta-llama/llama-4-maverick-17b-128e-instruct`
- Long docs: `llama-3.3-70b-versatile` (128K context)
2. Monitor Usage
Check https://console.groq.com/usage
3. Secure Your Key
- Never commit to git
- Use HF Spaces secrets
- Rotate keys periodically
4. Set Up Alerts
Configure usage alerts in Groq console.
Next Steps
- ✅ Get API key from https://console.groq.com
- ✅ Set `GROQ_API_KEY` in HF Spaces secrets
- ✅ Deploy and test in Control UI
- ✅ Configure WhatsApp/Telegram channels
- 🎉 Enjoy sub-second AI responses!
Speed Test
After setup, test Groq's speed:
1. Open Control UI
2. Select "Llama 3.3 70B (Versatile)"
3. Send: "Write a 100-word story about a robot"
4. Watch it generate in <0.5 seconds! ⚡⚡⚡
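To put a number on it, you can time a streamed response and divide tokens by elapsed seconds. A small sketch that works with any token iterator your client yields:

```python
import time

def measure_throughput(token_stream):
    """Consume a token stream; return (token_count, tokens_per_second)."""
    start = time.perf_counter()
    count = sum(1 for _ in token_stream)
    elapsed = time.perf_counter() - start
    return count, count / elapsed if elapsed > 0 else float("inf")
```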
Support
- Groq Docs: https://console.groq.com/docs
- API Status: https://status.groq.com
- HuggingClaw: https://github.com/openclaw/openclaw/issues
Available via OpenAI-Compatible API
All Groq models work via the OpenAI-compatible endpoint:
```
OPENAI_API_KEY=gsk_xxxxx
OPENAI_BASE_URL=https://api.groq.com/openai/v1
```
This allows using Groq with any OpenAI-compatible client!