title: Document Conversion API
emoji: 📄
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860

📄 Document Conversion API - Word to PDF

A free, self-hosted document conversion service built on LibreOffice. Deploy it on Hugging Face Spaces at no cost.

✨ Features

  • 100% FREE - No API keys, no usage quotas, no credit card
  • High Quality - Uses LibreOffice for professional PDF conversion
  • Fast - Converts documents in seconds
  • Self-Hosted - Complete control and privacy
  • Multiple Formats - Supports DOCX, DOC, ODT, RTF, TXT → PDF

🚀 Quick Deploy to Hugging Face Spaces

Step 1: Create a New Space

  1. Go to Hugging Face Spaces
  2. Click "Create new Space"
  3. Fill in:
    • Space name: nextools-doc-converter (or your choice)
    • License: Apache 2.0
    • Select the SDK: Docker
    • Space hardware: CPU basic (FREE)
    • Visibility: Public

Step 2: Upload Files

Upload these 3 files to your Space:

  1. Dockerfile
  2. app.py
  3. requirements.txt

Step 3: Wait for Build

  • Hugging Face will automatically build your Docker container
  • Takes about 5-10 minutes (first time only)
  • Watch the logs for "Application startup complete"

Step 4: Get Your API URL

Your API will be available at:

https://YOUR-USERNAME-nextools-doc-converter.hf.space

Step 5: Add to Your Vercel .env.local

# Document Conversion API
DOC_CONVERSION_API_URL=https://YOUR-USERNAME-nextools-doc-converter.hf.space

📡 API Usage

Convert Document to PDF

Endpoint: POST /convert

cURL Example:

curl -X POST \
  https://YOUR-USERNAME-nextools-doc-converter.hf.space/convert \
  -F "file=@document.docx" \
  --output converted.pdf

JavaScript Example:

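// `file` is a File or Blob, for example from an <input type="file"> element
const formData = new FormData();
formData.append('file', file);

const response = await fetch('https://YOUR-API-URL/convert', {
  method: 'POST',
  body: formData
});

// fetch() resolves even on HTTP error statuses, so check explicitly
if (!response.ok) {
  throw new Error(`Conversion failed with status ${response.status}`);
}

const pdfBlob = await response.blob();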
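
Python Example (sketch):

The same request can be made from Python using only the standard library. This is a minimal sketch, not part of the service itself: the helper names `build_multipart` and `convert_to_pdf` are hypothetical, and it assumes the endpoint accepts a single multipart field named `file`, as in the cURL example above.

```python
import uuid
import urllib.request
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def convert_to_pdf(api_url: str, filename: str, data: bytes, timeout: float = 120.0) -> bytes:
    """POST the file to /convert and return the response body (the PDF bytes)."""
    body, content_type = build_multipart("file", filename, data)
    request = urllib.request.Request(
        f"{api_url.rstrip('/')}/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

To use it, read the document with `open("document.docx", "rb").read()`, pass the bytes to `convert_to_pdf` together with your Space URL, and write the returned bytes to `converted.pdf`.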

Health Check

Endpoint: GET /health

curl https://YOUR-API-URL/health

Response:

{
  "status": "healthy",
  "libreoffice": true,
  "message": "Service is running"
}
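
The health endpoint is handy for waiting until a freshly built or restarted Space is ready. The sketch below assumes the response has exactly the shape shown above; `is_healthy`, `fetch_health`, and `wait_until_ready` are hypothetical helper names, and the attempt count and delay are arbitrary defaults.

```python
import json
import time
import urllib.request
from typing import Callable


def is_healthy(payload: dict) -> bool:
    """True when the payload reports a healthy service with LibreOffice available."""
    return payload.get("status") == "healthy" and payload.get("libreoffice") is True


def fetch_health(api_url: str, timeout: float = 10.0) -> dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{api_url.rstrip('/')}/health", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_ready(fetch: Callable[[], dict], attempts: int = 30, delay: float = 10.0) -> bool:
    """Poll until the service reports healthy or the attempts run out."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            # Connection failures, timeouts, and HTTP errors from urllib land here.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: fetch_health("https://YOUR-API-URL"))`. Passing the fetch function in keeps the polling loop testable without a network connection.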

🔧 Test Locally (Optional)

Using Docker:

# Build
docker build -t doc-converter .

# Run
docker run -p 7860:7860 doc-converter

# Test
curl -X POST http://localhost:7860/convert \
  -F "file=@test.docx" \
  --output converted.pdf

Using Python (requires LibreOffice installed):

# Install LibreOffice first:
# Ubuntu/Debian: sudo apt install libreoffice
# Mac: brew install libreoffice
# Windows: Download from libreoffice.org

# Install dependencies
pip install -r requirements.txt

# Run
python app.py

# Test
curl -X POST http://localhost:7860/convert \
  -F "file=@test.docx" \
  --output converted.pdf

📊 Supported Formats

Input Formats:

  • .docx - Microsoft Word (2007+)
  • .doc - Microsoft Word (97-2003)
  • .odt - OpenDocument Text
  • .rtf - Rich Text Format
  • .txt - Plain Text

Output Format:

  • .pdf - PDF (Portable Document Format)
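
A client can reject unsupported files before uploading them. The following is a small sketch based only on the extension list above; `is_supported` and `pdf_name` are hypothetical helpers, and the service may apply its own checks as well.

```python
from pathlib import Path

# Extensions listed under "Input Formats" above.
SUPPORTED_EXTENSIONS = {".docx", ".doc", ".odt", ".rtf", ".txt"}


def is_supported(filename: str) -> bool:
    """Check the file extension case-insensitively against the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def pdf_name(filename: str) -> str:
    """Derive an output file name by swapping the extension for .pdf."""
    return Path(filename).with_suffix(".pdf").name
```

Checking the extension only catches obvious mistakes. A renamed or corrupted file will still fail during conversion.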

🎯 Why Hugging Face Spaces?

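  1. Free Tier - CPU basic hardware needs no billing and no credit card
  2. No Per-Request Quota - Throughput is bounded only by the Space's hardware
  3. Hosted for You - Free Spaces may sleep after a period of inactivity and wake on the next request
  4. Fast - Served over HTTPS from Hugging Face infrastructure
  5. Easy Deploy - Just upload files
  6. Upgradable - Switch to paid hardware if traffic outgrows CPU basic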

🔒 Security & Privacy

  • Files are processed in memory
  • Automatic cleanup after conversion
  • No data is stored or logged
  • CORS enabled for your domains
  • SSL/HTTPS encryption
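
How the automatic cleanup might look on the server side: app.py is not reproduced in this README, so the following is only a sketch under assumptions, not the actual implementation. It assumes the conversion shells out to LibreOffice's `soffice` command and works inside a temporary directory that is deleted afterwards. `build_command` and `convert_bytes` are hypothetical names.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def build_command(source: Path, out_dir: Path) -> List[str]:
    """LibreOffice headless invocation that writes <stem>.pdf into out_dir."""
    return ["soffice", "--headless", "--convert-to", "pdf", "--outdir", str(out_dir), str(source)]


def convert_bytes(data: bytes, filename: str, timeout: float = 120.0) -> bytes:
    """Convert an uploaded document to PDF inside a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        work_dir = Path(tmp)
        # Keep only the base name so a crafted filename cannot escape work_dir.
        source = work_dir / Path(filename).name
        source.write_bytes(data)
        subprocess.run(
            build_command(source, work_dir),
            check=True,           # raise CalledProcessError on a non-zero exit code
            timeout=timeout,      # raise TimeoutExpired if LibreOffice hangs
            capture_output=True,
        )
        # The directory and both files are deleted when the with-block exits.
        return (work_dir / f"{source.stem}.pdf").read_bytes()
```

Because the temporary directory is removed even when the conversion raises, neither the upload nor the generated PDF outlives the request in this design.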

πŸ› Troubleshooting

Build Failed?

  • Check Dockerfile syntax
  • Ensure all files are uploaded
  • Wait for LibreOffice installation to complete

Conversion Failed?

  • Check file format is supported
  • Verify file is not corrupted
  • Check logs in Hugging Face dashboard

Timeout?

  • Large files (>10MB) may take longer
  • Consider increasing timeout in Dockerfile
  • Split large documents

πŸ“ Notes

  • First conversion may take 5-10 seconds (LibreOffice startup)
  • Subsequent conversions are much faster (~1-2 seconds)
  • Maximum file size: 50MB (configurable)
  • Concurrent requests: Supported with workers
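
These limits can be checked on the client before a file is sent. The sketch below assumes the 50MB limit means 50 × 1024 × 1024 bytes, which the README does not specify. The timeout heuristic in `pick_timeout` is purely hypothetical: it is loosely based on the startup and conversion times noted above and should be tuned against your own deployment.

```python
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumed meaning of the 50MB limit


def check_size(num_bytes: int, limit: int = MAX_FILE_BYTES) -> None:
    """Raise ValueError before uploading a file larger than the limit."""
    if num_bytes > limit:
        raise ValueError(f"file is {num_bytes} bytes; limit is {limit} bytes")


def pick_timeout(num_bytes: int, first_request: bool) -> float:
    """Hypothetical heuristic: a base allowance plus two seconds per megabyte."""
    # The first conversion gets extra time for LibreOffice startup.
    base = 15.0 if first_request else 5.0
    return base + 2.0 * (num_bytes / (1024 * 1024))
```

Failing early on oversized files saves a slow upload that the service would reject anyway.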

🔗 Integration with NexTools

Update your app/api/pdf-convert/route.ts:

// Use Hugging Face API for Word to PDF
async function wordToPdf(fileBuffer: Buffer) {
  const apiUrl = process.env.DOC_CONVERSION_API_URL;
  
  if (!apiUrl) {
    throw new Error('DOC_CONVERSION_API_URL not configured');
  }
  
  const formData = new FormData();
  formData.append('file', new Blob([fileBuffer]), 'document.docx');
  
  const response = await fetch(`${apiUrl}/convert`, {
    method: 'POST',
    body: formData,
  });
  
  if (!response.ok) {
    throw new Error(`Conversion failed with status ${response.status}`);
  }
  
  const pdfBuffer = Buffer.from(await response.arrayBuffer());
  
  return {
    content: pdfBuffer.toString('base64'),
    mimeType: 'application/pdf',
    fileName: 'converted.pdf',
    fileType: 'PDF',
    pages: 1, // Placeholder: the actual page count is not computed here
  };
}

📞 Support

  • Issues: Report on GitHub
  • Questions: Ask in Hugging Face discussions
  • Updates: Watch this repository

📜 License

Apache 2.0 License - Free for commercial and personal use


Made with ❤️ for NexTools - Your All-in-One SaaS Platform