
title: LLM Code Deployment API
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860

LLM Code Deployment System

An automated system for building, deploying, and evaluating LLM-generated web applications with GitHub Pages integration.

Overview

This project implements a complete workflow for:

Students: Receive task requests, use LLMs to generate code, deploy to GitHub Pages, and submit for evaluation
Instructors: Generate task requests, receive submissions, run automated evaluations (static, dynamic, and LLM-based)

🚀 Quick Deployment for Students

Deploy to Hugging Face Spaces in 10 minutes:

Read DEPLOYMENT.md for complete step-by-step instructions
Read README_SPACES.md for Hugging Face Spaces configuration
Get your AIPipe token from https://aipipe.org/login (IIT Madras students get $2/month of free credit)
Create a Space at https://huggingface.co/new-space
Configure environment variables and deploy!

Already deployed? Just submit your endpoint URL to the instructor's Google Form!

Architecture

┌─────────────┐         ┌──────────────┐         ┌─────────────┐
│  Instructor │         │   Student    │         │   GitHub    │
│   System    │────────▶│     API      │────────▶│    Pages    │
│             │  POST   │              │  Deploy │             │
└─────────────┘  Task   └──────────────┘         └─────────────┘
      │                        │                         │
      │                        │                         │
      │                        └────────POST─────────────┘
      │                         Submission               │
      │                                                  │
      ▼                                                  ▼
┌─────────────┐                                   ┌─────────────┐
│  Evaluation │◀──────────────────────────────────│  Validation │
│   Database  │                                   │   & Checks  │
└─────────────┘                                   └─────────────┘

Features

Student-Side

API Endpoint: Receives task requests via HTTP POST
LLM Code Generation: Uses Claude/GPT to generate complete web apps
GitHub Integration: Automatically creates repos, pushes code, enables Pages
Automatic Notification: Sends repo details to evaluation endpoint
Round 2 Support: Handles update requests for existing repos

Instructor-Side

Task Templates: YAML-based parametrizable task definitions
Round 1 & 2 Scripts: Automated task generation and distribution
Evaluation API: Receives and validates student submissions
Multi-Level Checks:
- Static: License, README, repo creation time, secrets detection
- LLM: Code quality, documentation quality
- Dynamic: Playwright-based functional testing
Database: PostgreSQL storage for tasks, repos, and results
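
The secrets detection listed under the static checks could work along the following lines. This is a minimal sketch, not the contents of static_checks.py: the `find_secrets` name and the pattern set are assumptions chosen for illustration.

```python
import re

# Hypothetical patterns; the real check may use a different or larger set.
SECRET_PATTERNS = {
    "github_token": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "anthropic_key": re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list[str]:
    """Return the names of all patterns that match somewhere in text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```

Running a function like this over every blob in the git history, rather than only the latest commit, is what makes it catch secrets that were committed and later deleted.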

Project Structure

tds-p1/
├── shared/                 # Shared utilities and models
│   ├── config.py          # Configuration management
│   ├── models.py          # Pydantic data models
│   ├── logger.py          # Logging setup
│   └── utils.py           # Utility functions
├── student/               # Student-side components
│   ├── api.py             # FastAPI endpoint
│   ├── code_generator.py  # LLM-based code generation
│   ├── github_manager.py  # GitHub operations
│   └── notification_client.py  # Evaluation notification
├── instructor/            # Instructor-side components
│   ├── api.py             # Evaluation endpoint
│   ├── database.py        # Database models and operations
│   ├── task_templates.py  # Template management
│   ├── round1.py          # Round 1 task generation
│   ├── round2.py          # Round 2 task generation
│   ├── evaluate.py        # Main evaluation script
│   └── checks/            # Evaluation modules
│       ├── static_checks.py   # Static analysis
│       ├── dynamic_checks.py  # Playwright tests
│       └── llm_checks.py      # LLM evaluations
├── templates/             # Task template YAML files
│   ├── sum-of-sales.yaml
│   ├── markdown-to-html.yaml
│   └── github-user-created.yaml
├── pyproject.toml         # Project dependencies
├── .env.example           # Environment variables template
└── README.md              # This file

Setup

Prerequisites

Python 3.10+
PostgreSQL database
GitHub account with personal access token
Anthropic or OpenAI API key

Installation

Clone the repository

git clone <your-repo-url>
cd tds-p1

Install dependencies

pip install -e .

Install Playwright browsers

playwright install chromium

Configure environment

cp .env.example .env
# Edit .env with your credentials

Set up database

# Create PostgreSQL database
createdb llm_deployment

# Initialize tables
python -c "from instructor.database import Database; Database().create_tables()"

Configuration

Edit .env with your settings:

Student Configuration

STUDENT_SECRET=your-secret-key
STUDENT_EMAIL=your-email@example.com
STUDENT_API_PORT=8000

GitHub

GITHUB_TOKEN=ghp_your_personal_access_token
GITHUB_USERNAME=your-username

LLM Provider

# Choose one provider: anthropic or openai
LLM_PROVIDER=anthropic
LLM_MODEL=claude-3-5-sonnet-20241022
# Set the API key that matches LLM_PROVIDER
ANTHROPIC_API_KEY=sk-ant-...
# OR
OPENAI_API_KEY=sk-...

Instructor

DATABASE_URL=postgresql://user:password@localhost:5432/llm_deployment
EVALUATION_API_URL=http://your-server:8001/api/evaluate
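
To illustrate how the student-side variables above might be read at startup, here is a small sketch using only the standard library. The `StudentSettings` class and `load_student_settings` function are hypothetical names, not the API of shared/config.py, and the fallback port of 8000 is an assumption taken from the example value above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentSettings:
    secret: str
    email: str
    api_port: int
    github_token: str
    github_username: str


def load_student_settings(env=None) -> StudentSettings:
    """Build settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    required = ["STUDENT_SECRET", "STUDENT_EMAIL", "GITHUB_TOKEN", "GITHUB_USERNAME"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a message that names every missing variable.
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return StudentSettings(
        secret=env["STUDENT_SECRET"],
        email=env["STUDENT_EMAIL"],
        api_port=int(env.get("STUDENT_API_PORT", "8000")),  # assumed default
        github_token=env["GITHUB_TOKEN"],
        github_username=env["GITHUB_USERNAME"],
    )
```

Failing at startup when a variable is missing tends to be easier to debug than a failure in the middle of a build request.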

Usage

For Students

Start the Student API

python -m student.api

The API listens on port 8000 and accepts task requests via POST at http://localhost:8000/api/build

Test with a sample request

curl -X POST http://localhost:8000/api/build \
  -H "Content-Type: application/json" \
  -d '{
    "email": "your-email@example.com",
    "secret": "your-secret-key",
    "task": "test-task-abc",
    "round": 1,
    "nonce": "unique-nonce-123",
    "brief": "Create a simple Hello World page",
    "checks": ["Page displays Hello World"],
    "evaluation_url": "http://localhost:8001/api/evaluate",
    "attachments": []
  }'
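
The same request can be built from Python with the standard library. This sketch mirrors the curl call above; `build_task_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_task_request(url: str, payload: dict) -> urllib.request.Request:
    """Create a JSON POST request equivalent to the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "email": "your-email@example.com",
    "secret": "your-secret-key",
    "task": "test-task-abc",
    "round": 1,
    "nonce": "unique-nonce-123",
    "brief": "Create a simple Hello World page",
    "checks": ["Page displays Hello World"],
    "evaluation_url": "http://localhost:8001/api/evaluate",
    "attachments": [],
}
request = build_task_request("http://localhost:8000/api/build", payload)
# To send it: urllib.request.urlopen(request, timeout=30)
```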

For Instructors

Start the Evaluation API

python -m instructor.api

Prepare submissions.csv

timestamp,email,endpoint,secret
2025-01-15T10:00:00,student1@example.com,http://student1.com/api/build,secret1
2025-01-15T10:05:00,student2@example.com,http://student2.com/api/build,secret2
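
A file in this shape can be read with `csv.DictReader`. The sketch below is one possible way to load it; the `Submission` dataclass and `read_submissions` function are illustrative names, not necessarily what instructor/round1.py uses.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Submission:
    timestamp: str
    email: str
    endpoint: str
    secret: str


def read_submissions(handle) -> list[Submission]:
    """Parse rows from an open submissions.csv file object."""
    fields = ("timestamp", "email", "endpoint", "secret")
    reader = csv.DictReader(handle)
    # Strip stray whitespace, which is common in form exports.
    return [Submission(**{name: row[name].strip() for name in fields}) for row in reader]


sample = io.StringIO(
    "timestamp,email,endpoint,secret\n"
    "2025-01-15T10:00:00,student1@example.com,http://student1.com/api/build,secret1\n"
)
submissions = read_submissions(sample)
```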

Run Round 1 task generation

python -m instructor.round1

This will:

Load submissions from CSV
Generate unique tasks from templates
POST tasks to student endpoints
Log results to database
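
The four steps above amount to a loop over submissions. The following is a rough sketch under stated assumptions: `distribute_tasks`, `make_task`, and `post` are hypothetical names, and the returned list stands in for the rows that would be logged to the database.

```python
from typing import Callable, Iterable


def distribute_tasks(
    submissions: Iterable[dict],
    make_task: Callable[[dict], dict],
    post: Callable[[str, dict], int],
) -> list[dict]:
    """Send one generated task per submission and record the outcome."""
    log = []
    for submission in submissions:
        task = make_task(submission)
        try:
            status, error = post(submission["endpoint"], task), None
        except Exception as exc:
            # One unreachable endpoint should not stop the whole run.
            status, error = None, str(exc)
        log.append(
            {
                "email": submission["email"],
                "task": task["task"],
                "status": status,
                "error": error,
            }
        )
    return log
```

Passing `post` in as a parameter keeps the loop testable without real student endpoints.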

Run evaluations

python -m instructor.evaluate

This will:

Fetch pending submissions
Clone repositories
Run static, LLM, and Playwright checks
Save results to database

Run Round 2 task generation

python -m instructor.round2

This will:

Find all Round 1 submissions
Generate Round 2 update tasks
POST to student endpoints

Task Templates

Task templates are YAML files in the templates/ directory. Example:

id: sum-of-sales
brief: Publish a single-page site that fetches data.csv from attachments...
attachments:
  - name: data.csv
    url: data:text/csv;base64,placeholder
checks:
  - "Page title equals 'Sales Summary {{ seed }}'"
  - "Bootstrap 5 CSS loaded from jsdelivr"
round2:
  - brief: Add a Bootstrap table #product-sales...
    checks:
      - "Table #product-sales exists"

Template Variables

{{ seed }}: Unique seed based on email and timestamp
{{ hash }}: Deterministic hash value
{{ result }}: Generated numeric value
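
As an illustration of how these variables might be filled in, the sketch below derives a seed from the email and timestamp and substitutes `{{ name }}` placeholders. The hashing scheme, the 8-character length, and the `make_seed` and `render` names are all assumptions; the actual logic in task_templates.py may differ.

```python
import hashlib
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def make_seed(email: str, timestamp: str) -> str:
    """Derive a short, repeatable seed from email and timestamp (assumed scheme)."""
    digest = hashlib.sha256(f"{email}:{timestamp}".encode("utf-8")).hexdigest()
    return digest[:8]


def render(text: str, variables: dict) -> str:
    """Replace {{ name }} placeholders; unknown names are left untouched."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)

    return PLACEHOLDER.sub(replace, text)


seed = make_seed("student1@example.com", "2025-01-15T10:00:00")
check = render("Page title equals 'Sales Summary {{ seed }}'", {"seed": seed})
```

Because the seed is a pure function of its inputs, the instructor can regenerate the same expected values at evaluation time.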

API Endpoints

Student API

POST /api/build

Receives task request
Returns 200 on acceptance
Processes in background
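
The accept-then-process behaviour can be sketched without any web framework. In this example `handle_build` is a hypothetical function, a thread pool stands in for whatever background mechanism student/api.py uses, and the 403 response for a wrong secret is an assumption.

```python
import hmac
from concurrent.futures import ThreadPoolExecutor

# Shared pool for build jobs; the size is an arbitrary choice for this sketch.
executor = ThreadPoolExecutor(max_workers=2)


def handle_build(request: dict, expected_secret: str, worker) -> tuple[int, dict]:
    """Check the secret, queue the build, and answer immediately."""
    supplied = str(request.get("secret", ""))
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(supplied.encode("utf-8"), expected_secret.encode("utf-8")):
        return 403, {"error": "invalid secret"}
    executor.submit(worker, request)  # runs after the response is returned
    return 200, {"status": "accepted", "task": request.get("task")}
```

Returning before the build finishes matters because code generation and a GitHub Pages deployment can take far longer than a typical HTTP client timeout.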

GET /api/status/{task_id}

Returns task status

GET /health

Health check

Instructor API

POST /api/evaluate

Receives repo submission
Validates against task record
Returns 200 on success
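
Validating a submission against its task record could look roughly like this. The sketch assumes records are matched on email, task, round, and nonce, and that a submission must carry repo_url, commit_sha, and pages_url; `validate_submission` is an illustrative name.

```python
MATCH_FIELDS = ("email", "task", "round", "nonce")
REQUIRED_FIELDS = ("repo_url", "commit_sha", "pages_url")


def validate_submission(submission: dict, task_records: list[dict]) -> tuple[bool, str]:
    """Return (is_valid, reason) for a repo submission."""
    missing = [name for name in REQUIRED_FIELDS if not submission.get(name)]
    if missing:
        return False, "missing fields: " + ", ".join(missing)
    for record in task_records:
        # Every identifying field must agree with a task that was actually sent.
        if all(submission.get(name) == record.get(name) for name in MATCH_FIELDS):
            return True, "ok"
    return False, "no matching task record"
```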

GET /api/submissions/{email}

Returns all submissions for email

GET /api/results/{email}

Returns all evaluation results

Evaluation Criteria

Static Checks

✓ Repository created after task sent
✓ MIT LICENSE exists in root
✓ README.md present with good structure
✓ No secrets in git history
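
Two of these checks are simple enough to sketch directly. The function names below are hypothetical, timestamps without a timezone are assumed to be UTC, and the license test is a heuristic text match rather than a full license parser.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, treating naive values as UTC (assumption)."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def created_after_task(repo_created_at: str, task_sent_at: str) -> bool:
    """True when the repository was created after the task was sent."""
    return parse_utc(repo_created_at) > parse_utc(task_sent_at)


def looks_like_mit(license_text: str) -> bool:
    """Heuristic: the title and the opening grant clause of the MIT license."""
    normalized = " ".join(license_text.split()).lower()
    return (
        "mit license" in normalized
        and "permission is hereby granted, free of charge" in normalized
    )
```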

LLM Checks

✓ README.md professional quality (0-1 score)
✓ Code quality and best practices (0-1 score)

Dynamic Checks

✓ Task-specific requirements (from template)
✓ Page loads successfully
✓ JavaScript evaluations
✓ Element presence and content

Database Schema

Tasks Table

Task requests sent to students
Fields: email, task, round, nonce, brief, checks, etc.

Repos Table

Repository submissions from students
Fields: email, task, round, repo_url, commit_sha, pages_url

Results Table

Evaluation results
Fields: email, task, round, check, score, reason, logs
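
The three tables can be pictured with the following sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere; the column types are assumptions, and only the fields named above are included (the "etc." on the tasks table is left out).

```python
import sqlite3

# "check" is quoted because CHECK is a reserved word in SQL.
SCHEMA = """
CREATE TABLE tasks (
    email TEXT, task TEXT, round INTEGER, nonce TEXT, brief TEXT, checks TEXT
);
CREATE TABLE repos (
    email TEXT, task TEXT, round INTEGER, repo_url TEXT, commit_sha TEXT, pages_url TEXT
);
CREATE TABLE results (
    email TEXT, task TEXT, round INTEGER, "check" TEXT, score REAL, reason TEXT, logs TEXT
);
"""

connection = sqlite3.connect(":memory:")
connection.executescript(SCHEMA)
connection.execute(
    'INSERT INTO results (email, task, round, "check", score, reason, logs) '
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("student1@example.com", "sum-of-sales", 1, "MIT LICENSE exists", 1.0, "found", ""),
)
```

Since email, task, and round appear in all three tables, they can serve as the join key between a task, its repo submission, and its results.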

Troubleshooting

Common Issues

Student API not receiving requests

Check firewall settings
Ensure port 8000 is accessible
Verify endpoint URL in submissions.csv

GitHub Pages not deploying

Verify that GITHUB_TOKEN has the repo scope
Check that the repository is public
Wait up to 60 seconds for Pages to activate
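
Waiting for Pages to come up can be automated with a small polling loop. This is a sketch under stated assumptions: `fetch_status` and `wait_for_pages` are hypothetical helpers, and the 5-second interval is an arbitrary choice within the 60-second window mentioned above.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url, or 0 if the request could not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 404 while the site is still being published
    except OSError:
        return 0  # DNS failure, refused connection, timeout


def wait_for_pages(url, fetch=fetch_status, max_wait=60.0, interval=5.0, sleep=time.sleep) -> bool:
    """Poll url until it answers 200 or max_wait seconds have elapsed."""
    deadline = time.monotonic() + max_wait
    while True:
        if fetch(url) == 200:
            return True
        if time.monotonic() + interval > deadline:
            return False
        sleep(interval)
```

The `fetch` and `sleep` parameters exist so the loop can be exercised in tests without network access or real delays.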

LLM generation fails

Check API key is valid
Verify API quota/credits
Review logs for error details

Playwright tests fail

Ensure chromium is installed: playwright install chromium
Check Pages URL is accessible
Increase timeout if needed

Database connection errors

Verify PostgreSQL is running
Check DATABASE_URL credentials
Ensure database exists

Development

Running Tests

pytest tests/

Code Formatting

black .
ruff check .

Type Checking

mypy .

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contributing

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

Support

For issues and questions:

Check the troubleshooting section
Review logs in logs/app.log
Open an issue on GitHub