Font Classifier DINOv2 (Server-Side Preprocessing)

A fine-tuned DINOv2 model for font classification with built-in preprocessing.

🎯 Key Feature: No client-side preprocessing required!

Performance

  • Accuracy: ~86% on test set
  • Preprocessing: Automatic server-side pad-to-square + normalization

Usage

Simple API Usage (Recommended)

Clients can send raw images directly to inference endpoints:

import requests
import base64

# Load your image
with open("test_image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

# Send to inference endpoint
response = requests.post(
    "https://your-endpoint.com",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"inputs": image_data}
)

results = response.json()
print(f"Predicted font: {results[0]['label']} ({results[0]['score']:.2%})")

Standard HuggingFace Usage

from transformers import pipeline

# The model automatically handles preprocessing
classifier = pipeline("image-classification", model="dchen0/font-classifier-v4")
results = classifier("your_image.png")
print(f"Predicted font: {results[0]['label']}")

Direct Model Usage

from PIL import Image
import torch
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

# Load model and processor
model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4")
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

# Process image (the model handles pad-to-square automatically)
image = Image.open("test.png")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Decode the top prediction
predicted_id = outputs.logits.argmax(-1).item()
print(f"Predicted font: {model.config.id2label[predicted_id]}")

Model Architecture

  • Base Model: facebook/dinov2-base-imagenet1k-1-layer
  • Fine-tuning: LoRA on the Google Fonts dataset
  • Labels: 394 font families (see the snippet below for inspecting the full list)
  • Preprocessing: Built-in pad-to-square + ImageNet normalization
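The full label set is stored in the model config. A minimal sketch for inspecting it, assuming the standard HuggingFace id2label mapping (add trust_remote_code=True if the repo registers a custom config class):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("dchen0/font-classifier-v4")
print(len(config.id2label))                    # 394 font families
print(sorted(config.id2label.values())[:10])   # first few font names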

Server-Side Preprocessing

This model automatically applies the following preprocessing in its forward pass:

  1. Pad to square preserving aspect ratio
  2. Resize to 224×224
  3. Normalize with ImageNet statistics

No client-side preprocessing required - just send raw images!
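For reference, here is a minimal sketch of what these three steps can look like on a batched pixel tensor. This is an illustrative assumption of the logic, not the code shipped in font_classifier_with_preprocessing.py; the padding color, interpolation mode, and expected input range may differ:

import torch
import torch.nn.functional as F

IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess_images(pixel_values):
    # pixel_values: (batch, 3, H, W), raw values assumed scaled to [0, 1]
    _, _, h, w = pixel_values.shape

    # 1. Pad to square, preserving aspect ratio (white padding assumed here)
    size = max(h, w)
    pad_h, pad_w = size - h, size - w
    pixel_values = F.pad(
        pixel_values,
        (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2),
        value=1.0,
    )

    # 2. Resize to 224×224
    pixel_values = F.interpolate(pixel_values, size=(224, 224), mode="bilinear", align_corners=False)

    # 3. Normalize with ImageNet statistics
    mean = IMAGENET_MEAN.to(pixel_values.device, pixel_values.dtype)
    std = IMAGENET_STD.to(pixel_values.device, pixel_values.dtype)
    return (pixel_values - mean) / std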

Deployment

HuggingFace Inference Endpoints

  1. Deploy this model to an Inference Endpoint
  2. Send raw images directly - preprocessing happens automatically (see the client sketch below)
  3. Achieve ~86% accuracy out of the box
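One convenient way to call a deployed endpoint is huggingface_hub's InferenceClient, which wraps the raw HTTP request shown under Simple API Usage above (the endpoint URL and token are placeholders, and the endpoint is assumed to expose the standard image-classification task):

from huggingface_hub import InferenceClient

# Point the client at your deployed Inference Endpoint
client = InferenceClient(model="https://your-endpoint.com", token="YOUR_TOKEN")

# Accepts a local path, raw bytes, or a URL
results = client.image_classification("test_image.png")
print(results[0])  # top prediction with label and score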

Custom Deployment

The model includes preprocessing in the forward pass, so any deployment stack that serves the PyTorch model (TorchServe, a custom FastAPI or Flask service, etc.) will apply the correct preprocessing automatically.
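As a concrete illustration, a request handler for a custom serving stack might look like the hypothetical sketch below. The base64 JSON contract mirrors the Simple API Usage example; framework-specific wiring (TorchServe handler class, FastAPI route, etc.) is omitted:

import base64
import io

import torch
from PIL import Image
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4").eval()
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

def handle(request_json):
    # Decode the base64-encoded image sent by the client
    image_bytes = base64.b64decode(request_json["inputs"])
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")

    # Convert to tensors (the model's forward pass handles pad-to-square)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(-1)[0]

    top = probs.argmax().item()
    return [{"label": model.config.id2label[top], "score": probs[top].item()}]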

Files

  • font_classifier_with_preprocessing.py: Custom model class with built-in preprocessing
  • Standard HuggingFace model files

Technical Details

The model inherits from Dinov2ForImageClassification but overrides the forward pass to include:

def forward(self, pixel_values=None, labels=None, **kwargs):
    # Automatic preprocessing happens here
    processed_pixel_values = self.preprocess_images(pixel_values)
    return super().forward(pixel_values=processed_pixel_values, labels=labels, **kwargs)

This ensures that whether clients send raw images or pre-processed tensors, the model receives correctly formatted input.
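Under that contract, a raw pixel tensor can be passed straight to forward() without running an image processor first. A hedged sketch: the expected value range of the raw tensor depends on the actual implementation in font_classifier_with_preprocessing.py, and [0, 1] is assumed here.

import numpy as np
import torch
from PIL import Image
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4").eval()

# Raw, un-normalized image tensor of arbitrary size: (1, 3, H, W), values in [0, 1]
image = Image.open("test.png").convert("RGB")
raw = torch.from_numpy(np.array(image)).permute(2, 0, 1).unsqueeze(0).float() / 255.0

with torch.no_grad():
    logits = model(pixel_values=raw).logits
print(model.config.id2label[logits.argmax(-1).item()])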
