---
title: Resume Profile Extractor
emoji: π
colorFrom: yellow
colorTo: pink
sdk: docker
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Resume Profile Extractor

An AI-powered application that automatically extracts professional profiles from PDF resumes. It uses LLMs (via Groq) to parse resume content and generate structured profile data that can be used for portfolio generation, professional websites, and more.
## Features
- PDF Resume Parsing: Extract text from PDF resumes automatically
- AI-Powered Information Extraction: Uses large language models to extract structured information
- Interactive Web UI: Clean Streamlit interface for uploading and editing profiles
- RESTful API: Access extracted profiles via a FastAPI backend
- Grammar Correction: Clean up extracted text with AI grammar correction
- Data Storage: Persistent SQLite storage for extracted profiles
- Profile Image Support: Upload and store profile images
- Docker Ready: Easy deployment with included Dockerfile
## Architecture
The application consists of two main components:
- Streamlit Web UI: A user-friendly interface for uploading resumes, editing extracted information, and managing profiles
- FastAPI Backend: A RESTful API service for accessing profiles programmatically
Both components run simultaneously in a single container when deployed.
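As a rough sketch of how the two services can be launched side by side (an illustrative approximation, not necessarily the shipped `run_combined.py`; the `api:app` module path is an assumption, while the ports come from this README):

```python
# Illustrative launcher; the real run_combined.py may differ.
import subprocess
import sys

def main() -> None:
    # Start the FastAPI backend with Uvicorn on port 8000.
    api = subprocess.Popen(
        [sys.executable, "-m", "uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8000"]
    )
    # Start the Streamlit UI on port 7860 and wait for it to exit.
    ui = subprocess.Popen(
        [sys.executable, "-m", "streamlit", "run", "app.py",
         "--server.port", "7860", "--server.address", "0.0.0.0"]
    )
    try:
        ui.wait()
    finally:
        api.terminate()

if __name__ == "__main__":
    main()
```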
## Technical Stack
- Python 3.9+
- Streamlit: Web interface framework
- FastAPI: API framework
- LangChain + Groq: AI language models for text extraction & processing (see the sketch after this list)
- SQLite: Lightweight database for profile storage
- PyPDF2: PDF parsing
- Pydantic: Data validation and settings management
- Uvicorn: ASGI server
- Docker: Containerization
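To make the flow concrete, here is a minimal sketch of how these pieces typically fit together. It is illustrative only: the `Profile` fields, the Groq model name, and the function are assumptions rather than the repository's actual `agents/` code, which defines its own schema in `models.py`.

```python
# Sketch only; the real extraction agents in agents/ may differ.
from typing import List

from PyPDF2 import PdfReader
from pydantic import BaseModel
from langchain_groq import ChatGroq  # reads GROQ_API_KEY from the environment

class Profile(BaseModel):
    """Illustrative schema; the actual fields live in models.py."""
    name: str
    headline: str
    skills: List[str]

def extract_profile(pdf_path: str) -> Profile:
    # 1. Pull raw text out of the uploaded PDF.
    reader = PdfReader(pdf_path)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

    # 2. Ask a Groq-hosted model (via LangChain) for structured output.
    llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0)
    return llm.with_structured_output(Profile).invoke(
        f"Extract the candidate's professional profile from this resume:\n\n{text}"
    )
```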
## Quick Start

### Local Development
- Clone the repository
- Install dependencies: `pip install -r requirements.txt`
- Create a `.env` file from the sample: `cp .env.sample .env`
- Add your Groq API key to the `.env` file (see the example below)
- Run the application: `python run_combined.py`
- Open http://localhost:7860 in your browser
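The `.env` file referenced above typically looks something like this (the variable names come from the table in the deployment section below; the values shown are placeholders):

```
GROQ_API_KEY=your_groq_api_key_here
# Optional overrides
EXTERNAL_API_URL=http://localhost:8000
DEBUG=false
```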
### Using Docker
```bash
# Build the Docker image
docker build -t profile-extractor .

# Run the container
docker run -p 7860:7860 -p 8000:8000 -e GROQ_API_KEY=your_key_here profile-extractor
```
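If you have already created a `.env` file locally, you can pass it to the container instead of setting the key inline:

```bash
docker run --env-file .env -p 7860:7860 -p 8000:8000 profile-extractor
```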
## Deployment on Hugging Face Spaces
This application is designed to be easily deployed on Hugging Face Spaces:
- Create a new Space on Hugging Face
- Select Docker as the Space SDK
- Link your GitHub repository or upload the files directly
- Add your `GROQ_API_KEY` in the Settings > Variables section
- (Optional) Set `EXTERNAL_API_URL` to your Space's URL (e.g., `https://your-username-your-space-name.hf.space`)
- Deploy the Space!
### Required Environment Variables

| Variable | Description | Required |
|---|---|---|
| `GROQ_API_KEY` | Your Groq API key for LLM access | Yes |
| `EXTERNAL_API_URL` | Public URL of your API (for production) | No |
| `DEBUG` | Enable debug logging (`true`/`false`) | No |
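For illustration, these variables can be loaded into a typed settings object with Pydantic. This is a sketch assuming the `pydantic-settings` package, not necessarily how the repository's `config.py` is written:

```python
# Sketch of a settings module; the actual config.py may differ.
from typing import Optional

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    groq_api_key: str                       # GROQ_API_KEY (required)
    external_api_url: Optional[str] = None  # EXTERNAL_API_URL (optional)
    debug: bool = False                     # DEBUG (optional)

    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

settings = Settings()  # raises a validation error if GROQ_API_KEY is missing
```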
## API Endpoints
The API is available at port 8000 when running locally, or through the Hugging Face Space URL.
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check endpoint |
| `/api/profile/{id}` | GET | Get a complete profile by ID |
| `/api/profile/{id}/image` | GET | Get just the profile image |
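For example, once a profile has been saved you can query it from the command line (replace `{id}` with the profile ID shown in the UI):

```bash
# Health check
curl http://localhost:8000/health

# Fetch a complete profile as JSON
curl http://localhost:8000/api/profile/{id}

# Against a deployed Space, use the Space URL instead
curl https://your-username-your-space-name.hf.space/api/profile/{id}
```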
## Usage Guide
- Upload Resume: Start by uploading a PDF resume
- Review & Edit: The system will extract information and allow you to review and edit
- Save Profile: Save your profile to get a unique profile ID
- Access API: Use the API endpoints to access your profile data
- Build Portfolio: Use the structured data to build dynamic portfolios and websites
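As a small sketch of the last two steps, the profile JSON can be pulled from the API and dropped into a page template. The field names used here (`name`, `summary`, `skills`) are hypothetical; the real schema is defined in `models.py`:

```python
# Illustrative only; actual field names come from models.py.
import json
from urllib.request import urlopen

API_BASE = "http://localhost:8000"   # or your Space URL in production
PROFILE_ID = "YOUR_PROFILE_ID"       # shown in the UI after saving a profile

with urlopen(f"{API_BASE}/api/profile/{PROFILE_ID}") as resp:
    profile = json.load(resp)

# Drop the structured data into a minimal portfolio snippet.
skills_html = "".join(f"<li>{s}</li>" for s in profile.get("skills", []))
html = (
    f"<h1>{profile.get('name', 'Unnamed')}</h1>\n"
    f"<p>{profile.get('summary', '')}</p>\n"
    f"<ul>{skills_html}</ul>"
)
print(html)
```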
## Project Structure

```
agentAi/
├── agents/           # AI agents for extraction and processing
├── services/         # Backend services (storage, etc.)
├── utils/            # Utility functions
├── app.py            # Streamlit web application
├── api.py            # FastAPI endpoints
├── models.py         # Pydantic data models
├── config.py         # Application configuration
├── run_combined.py   # Script to run both services
├── requirements.txt  # Python dependencies
├── Dockerfile        # For containerized deployment
└── README.md         # Documentation
```
## License
MIT License
## Acknowledgements
- Groq for the LLM API
- Streamlit for the web framework
- FastAPI for the API framework
- LangChain for LLM interactions
- Hugging Face for hosting capabilities