Qwen FastAPI Inference API
This repository contains a secure, OpenAI-compatible REST API for the Qwen2.5-0.5B-Instruct model, built with FastAPI.
Features
- OpenAI Compatible: Implements the
/v1/chat/completionsendpoint. - Secure: Uses JWT Authentication (HS256) for all inference requests.
- Lightweight: Optimized to run on consumer hardware or free-tier cloud environments like Google Colab.
Structure
application.py: The main FastAPI server.client.py: A test client to verify the API.generate_token.py: Utility to generate JWT tokens for authentication.
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support