Font Classifier DINOv2 (Server-Side Preprocessing)

A fine-tuned DINOv2 model for font classification with built-in preprocessing.

🎯 Key Feature: No client-side preprocessing required!

Performance

  • Accuracy: ~86% on test set
  • Preprocessing: Automatic server-side pad-to-square + normalization

Usage

Simple API Usage (Recommended)

Clients can send raw images directly to inference endpoints:

import requests
import base64

# Load your image
with open("test_image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

# Send to inference endpoint
response = requests.post(
    "https://your-endpoint.com",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"inputs": image_data}
)

results = response.json()
print(f"Predicted font: {results[0]['label']} ({results[0]['score']:.2%})")

Standard HuggingFace Usage

from transformers import pipeline

# The model automatically handles preprocessing
classifier = pipeline("image-classification", model="dchen0/font-classifier-v4")
results = classifier("your_image.png")
print(f"Predicted font: {results[0]['label']}")

Direct Model Usage

from PIL import Image
import torch
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

# Load model and processor
model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4")
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

# Process image (the model handles pad-to-square automatically)
image = Image.open("test.png")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Decode the top prediction
predicted_id = outputs.logits.argmax(-1).item()
print(f"Predicted font: {model.config.id2label[predicted_id]}")

Model Architecture

  • Base Model: facebook/dinov2-base-imagenet1k-1-layer
  • Fine-tuning: LoRA on the Google Fonts dataset
  • Labels: 394 font families (see the snippet below for inspecting the full list)
  • Preprocessing: Built-in pad-to-square + ImageNet normalization
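The full label set is stored in the model config. A minimal sketch for inspecting it, assuming the standard HuggingFace id2label mapping (add trust_remote_code=True if the repo registers a custom config class):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("dchen0/font-classifier-v4")
print(len(config.id2label))                    # 394 font families
print(sorted(config.id2label.values())[:10])   # first few font names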

Server-Side Preprocessing

This model automatically applies the following preprocessing in its forward pass:

  1. Pad to square preserving aspect ratio
  2. Resize to 224×224
  3. Normalize with ImageNet statistics

No client-side preprocessing required - just send raw images!
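For reference, here is a minimal sketch of what these three steps can look like on a batched pixel tensor. This is an illustrative assumption of the logic, not the code shipped in font_classifier_with_preprocessing.py; the padding color, interpolation mode, and expected input range may differ:

import torch
import torch.nn.functional as F

IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess_images(pixel_values):
    # pixel_values: (batch, 3, H, W), raw values assumed scaled to [0, 1]
    _, _, h, w = pixel_values.shape

    # 1. Pad to square, preserving aspect ratio (white padding assumed here)
    size = max(h, w)
    pad_h, pad_w = size - h, size - w
    pixel_values = F.pad(
        pixel_values,
        (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2),
        value=1.0,
    )

    # 2. Resize to 224×224
    pixel_values = F.interpolate(pixel_values, size=(224, 224), mode="bilinear", align_corners=False)

    # 3. Normalize with ImageNet statistics
    mean = IMAGENET_MEAN.to(pixel_values.device, pixel_values.dtype)
    std = IMAGENET_STD.to(pixel_values.device, pixel_values.dtype)
    return (pixel_values - mean) / std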

Deployment

HuggingFace Inference Endpoints

  1. Deploy this model to an Inference Endpoint
  2. Send raw images directly - preprocessing happens automatically (see the client sketch below)
  3. Achieve ~86% accuracy out of the box
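One convenient way to call a deployed endpoint is huggingface_hub's InferenceClient, which wraps the raw HTTP request shown under Simple API Usage above (the endpoint URL and token are placeholders, and the endpoint is assumed to expose the standard image-classification task):

from huggingface_hub import InferenceClient

# Point the client at your deployed Inference Endpoint
client = InferenceClient(model="https://your-endpoint.com", token="YOUR_TOKEN")

# Accepts a local path, raw bytes, or a URL
results = client.image_classification("test_image.png")
print(results[0])  # top prediction with label and score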

Custom Deployment

The model includes preprocessing in the forward pass, so any deployment stack that serves the PyTorch model (TorchServe, a custom FastAPI or Flask service, etc.) will apply the correct preprocessing automatically.
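As a concrete illustration, a request handler for a custom serving stack might look like the hypothetical sketch below. The base64 JSON contract mirrors the Simple API Usage example; framework-specific wiring (TorchServe handler class, FastAPI route, etc.) is omitted:

import base64
import io

import torch
from PIL import Image
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4").eval()
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

def handle(request_json):
    # Decode the base64-encoded image sent by the client
    image_bytes = base64.b64decode(request_json["inputs"])
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")

    # Convert to tensors (the model's forward pass handles pad-to-square)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(-1)[0]

    top = probs.argmax().item()
    return [{"label": model.config.id2label[top], "score": probs[top].item()}]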

Files

  • font_classifier_with_preprocessing.py: Custom model class with built-in preprocessing
  • Standard HuggingFace model files

Technical Details

The model inherits from Dinov2ForImageClassification but overrides the forward pass to include:

def forward(self, pixel_values=None, labels=None, **kwargs):
    # Automatic preprocessing happens here
    processed_pixel_values = self.preprocess_images(pixel_values)
    return super().forward(pixel_values=processed_pixel_values, labels=labels, **kwargs)

This ensures that whether clients send raw images or pre-processed tensors, the model receives correctly formatted input.
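Under that contract, a raw pixel tensor can be passed straight to forward() without running an image processor first. A hedged sketch: the expected value range of the raw tensor depends on the actual implementation in font_classifier_with_preprocessing.py, and [0, 1] is assumed here.

import numpy as np
import torch
from PIL import Image
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4").eval()

# Raw, un-normalized image tensor of arbitrary size: (1, 3, H, W), values in [0, 1]
image = Image.open("test.png").convert("RGB")
raw = torch.from_numpy(np.array(image)).permute(2, 0, 1).unsqueeze(0).float() / 255.0

with torch.no_grad():
    logits = model(pixel_values=raw).logits
print(model.config.id2label[logits.argmax(-1).item()])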
