---
title: Astra - Ayurvedic AI Assistant
emoji: 🌿
colorFrom: green
colorTo: green
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# Astra - Your Ayurvedic AI Assistant 🌿

Meet Astra, an intelligent Ayurvedic AI Assistant powered by Llama 3.2 11B with specialized Ayurveda knowledge. Astra provides complete, thorough information about Ayurvedic medicine, herbs, wellness practices, and holistic health.

## What Makes Astra Special

✨ Complete Responses: Astra never gives partial answers - every response is thorough and comprehensive

🌿 Ayurvedic Expertise: Specialized knowledge about herbs, treatments, doshas, and traditional wellness

🤖 Advanced AI: Powered by Llama 3.2 11B with Ayurveda-specific LoRA adapters

📚 Comprehensive Information: Covers benefits, usage, precautions, and traditional wisdom

💡 Clear Communication: Complex Ayurvedic concepts explained in accessible language

## Features

- Complete, thorough responses - never incomplete or partial information
- Ayurvedic knowledge base covering herbs, treatments, and wellness practices
- Dosha guidance - personalized insights for Vata, Pitta, and Kapha
- RESTful API with FastAPI
- Interactive documentation via Swagger UI
- Optimized for production with 4-bit quantization

## Quick Start

### 1. View API Documentation

Visit the interactive docs at `https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/docs`

### 2. Check API Status

```bash
curl https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/health
```

### 3. Chat with Astra

First, load the model:

```bash
curl -X POST https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/load-model
```

Then ask Astra a question:

```bash
curl -X POST https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What are the complete benefits and uses of Ashwagandha in Ayurveda?",
    "max_length": 1024,
    "temperature": 0.7
  }'
```

Note: Astra is configured to provide complete, thorough responses. The default max_length is 1024 tokens to ensure comprehensive answers.
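
The same flow works from Python. Below is a minimal client sketch using the `requests` package (an assumption; any HTTP client works), with the same host placeholder as the curl examples above.

```python
# Minimal Python client sketch for the Quick Start flow above (illustrative only).
# Replace the placeholder host with your own Space URL.
import requests

BASE_URL = "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"

# 1. Load the model once; the first load can take a long time (see Notes below).
requests.post(f"{BASE_URL}/load-model", timeout=1800).raise_for_status()

# 2. Ask Astra a question via /generate.
payload = {
    "prompt": "What are the complete benefits and uses of Ashwagandha in Ayurveda?",
    "max_length": 1024,
    "temperature": 0.7,
}
response = requests.post(f"{BASE_URL}/generate", json=payload, timeout=600)
response.raise_for_status()
print(response.json()["generated_text"])
```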

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | API information |
| GET | `/health` | Health check |
| GET | `/status` | Model status |
| POST | `/load-model` | Load AI model |
| POST | `/generate` | Generate text |
| GET | `/docs` | Swagger UI |
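
For orientation, here is a rough sketch of how these routes could be laid out in FastAPI. It is illustrative only, not the Space's actual server code; the handler names and the `GenerateRequest` model are assumptions.

```python
# Illustrative FastAPI layout for the endpoints listed above; not the Space's
# actual implementation. Handler names and GenerateRequest are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Astra - Ayurvedic AI Assistant")
# /docs (Swagger UI) is served automatically by FastAPI.

class GenerateRequest(BaseModel):
    prompt: str
    max_length: int = 1024
    temperature: float = 0.7
    top_p: float = 0.9
    top_k: int = 50

@app.get("/")
def root():
    return {"assistant": "Astra - Ayurvedic AI Assistant"}

@app.get("/health")
def health():
    return {"status": "ok"}

@app.get("/status")
def status():
    return {"model_loaded": False}  # would reflect real state in the actual app

@app.post("/load-model")
def load_model():
    # Download and initialize the base model plus LoRA adapter (see Models below).
    return {"status": "loading"}

@app.post("/generate")
def generate(request: GenerateRequest):
    # Run inference and return the generated text (see Response Format below).
    return {"generated_text": "...", "prompt": request.prompt}
```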

## Models

- Base: `unsloth/llama-3.2-11b-vision-instruct-bnb-4bit`
- LoRA: `ayureasehealthcare/llama3-ayurveda-lora-v3`
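
As a rough sketch, the base checkpoint and the LoRA adapter can be combined with `transformers` and `peft` as shown below. This is an assumption about the loading path (the Tech Stack suggests the Space itself uses Unsloth), not the actual startup code; the 4-bit checkpoint also requires `bitsandbytes` and a CUDA GPU.

```python
# Rough sketch: attach the Ayurveda LoRA adapter to the 4-bit base checkpoint.
# This is an assumed transformers + PEFT loading path, not the Space's real code.
from transformers import AutoTokenizer, MllamaForConditionalGeneration
from peft import PeftModel

BASE_MODEL = "unsloth/llama-3.2-11b-vision-instruct-bnb-4bit"
LORA_MODEL = "ayureasehealthcare/llama3-ayurveda-lora-v3"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)  # text-only usage assumed
base = MllamaForConditionalGeneration.from_pretrained(
    BASE_MODEL,
    device_map="auto",  # place the quantized weights automatically
)
model = PeftModel.from_pretrained(base, LORA_MODEL)  # apply the LoRA weights
model.eval()
```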

## Request Parameters

`/generate` endpoint:

```json
{
  "prompt": "Your question or prompt",
  "max_length": 1024,
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 50
}
```
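
These sampling parameters correspond to standard Hugging Face `generate()` arguments. The sketch below shows the typical mapping and reuses the `model` and `tokenizer` from the loading sketch above; it is not the Space's actual inference code, and treating `max_length` as a cap on newly generated tokens is an assumption.

```python
# Typical mapping from the /generate parameters to a transformers generate() call.
# Assumes `model` and `tokenizer` from the loading sketch in the Models section.
import torch

def run_generation(model, tokenizer, prompt: str, max_length: int = 1024,
                   temperature: float = 0.7, top_p: float = 0.9, top_k: int = 50) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_length,  # assumed: max_length caps the new tokens
            do_sample=True,             # sampling must be on for the knobs below
            temperature=temperature,    # lower values give more focused answers
            top_p=top_p,                # nucleus sampling cutoff
            top_k=top_k,                # sample only from the k most likely tokens
        )
    # Decode only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```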

## Response Format

```json
{
  "generated_text": "AI-generated response...",
  "prompt": "Your original prompt",
  "model_info": {
    "assistant": "Astra - Ayurvedic AI Assistant",
    "base_model": "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    "lora_model": "ayureasehealthcare/llama3-ayurveda-lora-v3",
    "parameters": {
      "max_length": 1024,
      "min_length": 100,
      "temperature": 0.7,
      "top_p": 0.9,
      "top_k": 50
    }
  }
}
```

## Example Use Cases

- Query Ayurvedic herb benefits
- Ask about traditional wellness practices
- Learn about doshas and body types
- Discover natural remedies
- Understand Ayurvedic nutrition

## Hardware

This Space runs on:

- CPU Basic (free tier) - for testing
- Upgrade to GPU recommended for production use

## Notes

⚠️ First-time model loading: The first request may take 10-30 minutes as the model downloads from Hugging Face. Subsequent requests will be much faster.

💡 Tip: For faster responses, consider upgrading to GPU hardware in Space settings.

## Tech Stack

- FastAPI
- Uvicorn
- PyTorch
- Transformers (Hugging Face)
- PEFT (LoRA adapters)
- Unsloth (optimized inference)

## Source Code

Full source code and documentation are available in the repository.

## License

Apache 2.0


Built with ❤️ using Hugging Face Spaces