Qwen FastAPI Inference API

This repository contains a secure, OpenAI-compatible REST API for the Qwen2.5-0.5B-Instruct model, built with FastAPI.

Features

  • OpenAI Compatible: Implements the /v1/chat/completions endpoint.
  • Secure: Uses JWT Authentication (HS256) for all inference requests.
  • Lightweight: Optimized to run on consumer hardware or free-tier cloud environments like Google Colab.

Structure

  • application.py: The main FastAPI server.
  • client.py: A test client to verify the API.
  • generate_token.py: Utility to generate JWT tokens for authentication.
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support