# Font Classifier DINOv2 (Server-Side Preprocessing)

A fine-tuned DINOv2 model for font classification with built-in preprocessing.

🎯 **Key Feature**: No client-side preprocessing required!
## Performance
- Accuracy: ~86% on test set
- Preprocessing: Automatic server-side pad-to-square + normalization
## Usage

### Simple API Usage (Recommended)
Clients can send raw images directly to inference endpoints:
```python
import requests
import base64

# Load your image
with open("test_image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

# Send to inference endpoint
response = requests.post(
    "https://your-endpoint.com",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"inputs": image_data}
)

results = response.json()
print(f"Predicted font: {results[0]['label']} ({results[0]['score']:.2%})")
```
### Standard HuggingFace Usage
```python
from transformers import pipeline

# The model automatically handles preprocessing
classifier = pipeline("image-classification", model="dchen0/font-classifier-v4")
results = classifier("your_image.png")
print(f"Predicted font: {results[0]['label']}")
```
### Direct Model Usage
```python
from PIL import Image
import torch
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

# Load model and processor
model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4")
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

# Process image (model handles pad_to_square automatically)
image = Image.open("test.png")
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
```
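To turn the raw `outputs` into a readable prediction, the logits can be decoded with the label map stored in the model config (a minimal sketch using the standard `id2label` field):

```python
# Convert logits to probabilities and look up the predicted font family
probs = outputs.logits.softmax(dim=-1)[0]
pred_id = int(probs.argmax())
print(f"Predicted font: {model.config.id2label[pred_id]} ({probs[pred_id].item():.2%})")
```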
## Model Architecture
- Base Model: facebook/dinov2-base-imagenet1k-1-layer
- Fine-tuning: LoRA on Google Fonts dataset
- Labels: 394 font families
- Preprocessing: Built-in pad-to-square + ImageNet normalization
## Server-Side Preprocessing
This model automatically applies the following preprocessing in its forward pass:
- Pad to square preserving aspect ratio
- Resize to 224×224
- Normalize with ImageNet statistics
No client-side preprocessing required - just send raw images!
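For reference, the equivalent client-side transform would look roughly like the sketch below. It is illustrative only; details such as the padding color and resampling filter are assumptions, not taken from the model code.

```python
from PIL import Image
import numpy as np

def pad_to_square_and_normalize(path: str) -> np.ndarray:
    """Approximate the model's built-in preprocessing: pad to square,
    resize to 224x224, normalize with ImageNet mean/std."""
    image = Image.open(path).convert("RGB")
    side = max(image.size)
    # Pad with white and keep the original image centered (assumed padding color)
    canvas = Image.new("RGB", (side, side), (255, 255, 255))
    canvas.paste(image, ((side - image.width) // 2, (side - image.height) // 2))
    canvas = canvas.resize((224, 224))
    pixels = np.asarray(canvas, dtype=np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    return ((pixels - mean) / std).transpose(2, 0, 1)  # CHW layout
```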
## Deployment

### HuggingFace Inference Endpoints
- Deploy this model to an Inference Endpoint
- Send raw images directly - preprocessing happens automatically
- Achieve ~86% accuracy out of the box
### Custom Deployment
The model includes preprocessing in the forward pass, so any deployment (TorchServe, TensorFlow Serving, etc.) will automatically apply correct preprocessing.
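As an illustration, a minimal custom serving handler could look like the sketch below. The framework-agnostic `handle` function and the request shape are assumptions; the point is simply that no separate transform pipeline needs to run before calling the model.

```python
import base64
import io

import torch
from PIL import Image
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4").eval()
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

def handle(request_body: dict) -> dict:
    """Decode a base64-encoded image, run the model, return the top prediction."""
    image = Image.open(io.BytesIO(base64.b64decode(request_body["inputs"]))).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    pred_id = int(logits.argmax(dim=-1))
    return {"label": model.config.id2label[pred_id]}
```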
## Files

- `font_classifier_with_preprocessing.py`: Custom model class with built-in preprocessing
- Standard HuggingFace model files
## Technical Details

The model inherits from `Dinov2ForImageClassification` but overrides the forward pass to include:
```python
def forward(self, pixel_values=None, labels=None, **kwargs):
    # Automatic preprocessing happens here
    processed_pixel_values = self.preprocess_images(pixel_values)
    return super().forward(pixel_values=processed_pixel_values, labels=labels, **kwargs)
```
This ensures that whether clients send raw images or pre-processed tensors, the model receives correctly formatted input.
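The `preprocess_images` helper itself is not reproduced here; the snippet below is a rough, assumed sketch of what such a tensor-level implementation might look like (padding value, centering, and interpolation mode are guesses):

```python
import torch
import torch.nn.functional as F

IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess_images(pixel_values: torch.Tensor) -> torch.Tensor:
    """Illustrative stand-in for the built-in preprocessing:
    pad to square, resize to 224x224, apply ImageNet normalization."""
    _, _, h, w = pixel_values.shape
    side = max(h, w)
    pad_h, pad_w = side - h, side - w
    # Pad symmetrically so the glyphs stay centered (padding value is an assumption)
    x = F.pad(pixel_values, (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2), value=1.0)
    x = F.interpolate(x, size=(224, 224), mode="bilinear", align_corners=False)
    return (x - IMAGENET_MEAN.to(x)) / IMAGENET_STD.to(x)
```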