Example Submission for the SAFE: Image Edit Detection and Localization Challenge 2025
This project provides a starting point for implementing a submission to the SAFE: Image Edit Detection and Localization Challenge 2025. You do not need to use this code to participate in the challenge.
Clone this repository
To use the code and tools in this repository, clone it with git:
git clone https://huggingface.co/safe-challenge-2025/example-submission
How to participate
To participate in the challenge, you need to do three things:
- Visit the challenge home page and sign up using the linked registration form. After verifying your email, you will receive access credentials for the submission platform.
- Implement your detector model. You can use this repository as a starting point, but you don't have to.
- Submit your detector model for evaluation. You can build your submission package yourself and submit it using a CLI tool (preferred), or you can build your submission in a HuggingFace Space and submit the Space using a web form.
How to make a submission
The infrastructure for the challenge runs on DSRI's Dyff platform. Submissions to the challenge must be in the form of a containerized web service that serves a simple JSON HTTP API.
If you're comfortable building a Docker image yourself, the preferred way to make a submission is to upload and submit a built image using the Dyff client.
Alternatively, you can create a Docker HuggingFace Space and make submissions from the Space using a web form. The advantage of using an HF Space is that it builds the Docker image for you. However, HF Spaces also have some limitations that you'll need to account for.
General considerations
- Your submission will run without Internet access during evaluation. All of the files required to run your submission must be packaged along with it. You can either include files in the Docker image, or upload the files as a separate package and mount them in your application container during execution.
Submitting using the Dyff API
If you're able to build a Docker image for your submission yourself, the preferred way to make submissions is via the Dyff API. We provide a command line tool (challenge-cli.py) in this repository to simplify the submission process.
In the terminology of Dyff, the thing that you're submitting is an InferenceService. You can think of an InferenceService as a recipe for spinning up a Docker container that runs an HTTP server that serves an inference API. To create a new submission, you need to upload the Docker image that the service should run, and, optionally, a volume of files such as neural network weights that will be mounted in the container.
Install the Dyff SDK
You need Python 3.10+ (3.12 recommended). We recommend you install into a virtual environment. If you're using this repository, you can install dyff and a few other useful dependencies as described in the Quick Start section:
# After installing the `uv` tool:
make setup
source venv/bin/activate
Or, you can install just dyff into a venv like this:
python3 -m venv venv
source venv/bin/activate
python3 -m pip install --upgrade dyff
Install skopeo
To upload Docker images via the Dyff API, you need to have the skopeo tool in your PATH.
Prepare the submission data
Before creating a submission, you need to build the Docker image that you want to submit on your local machine. For example, running the make docker-build command in this repository will build a Docker image in your local Docker daemon with the name safe-challenge-2025/example-submission:latest. You can check that the image exists using the docker images command:
$ docker images
REPOSITORY                               TAG       IMAGE ID       CREATED       SIZE
safe-challenge-2025/example-submission   latest    b86a46d856f0   3 hours ago   1.86GB
...
If your submission includes large data files such as neural network weights, we recommend that you upload these separately from the Docker image and then arrange for them to be mounted in the running container at run-time. You can upload a local directory recursively to the Dyff platform. Once uploaded, you will get the ID of a Dyff Model resource that you can reference in your InferenceService.
If you're uploading your large files separately as a Model, you'll need to tell Dyff where to mount them in your container. When testing your system locally, you can use the -v/--volume flag with docker run to mount a local directory in the container. Then, just make sure to specify the same mount path when creating your InferenceService in Dyff.
Use the challenge-cli tool
This repository contains a CLI script that simplifies the submission process. Usage is like this:
$ python3 challenge-cli.py
Usage: challenge-cli.py [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
submit
upload-submission
You create a new submission in two steps. First, you upload the submission files and create an InferenceService. Then, you create the actual Submission resource to tell the Dyff platform that you want to submit the InferenceService for the challenge.
Upload submission files
To upload submission files, use the upload-submission command:
DYFF_API_TOKEN=<your token> python3 challenge-cli.py upload-submission [OPTIONS]
Notice that we're providing an access token via an environment variable.
This command creates a Dyff Artifact resource corresponding to your Docker image, an optional Dyff Model resource containing your uploaded model files, and a Dyff InferenceService resource that references the Artifact and the Model.
You need to provide your Account ID with --account, a name for your system with --name, and a Docker image name + tag with --image. The tool will create a Dyff Artifact resource representing the Docker image.
If your system serves the inference endpoint at a route other than /predict, use the --endpoint flag to specify the correct route.
If you are uploading your large files in a separate data volume, use the --volume flag to specify the directory tree to upload, and use the --volume-mount flag to set the path where this directory should be mounted in the running container. When you use these flags, the tool creates a Dyff Model resource representing the uploaded files.
You can also use the --artifact and --model flags to provide the ID of an Artifact or Model that already exists instead of creating a new one. For example, if you always use the same Docker image but you mount different model weights in it for different submissions, you can create the Docker image Artifact once, and then reference its ID with --artifact to avoid uploading it again.
Submit your system for evaluation
After uploading the submission files, you create the actual Submission resource with:
DYFF_API_TOKEN=<your token> python3 challenge-cli.py submit [OPTIONS]
When submitting, you provide the ID of the InferenceService you created in the previous step in --service. You also provide your Account ID in --account, your Team ID for the challenge in --team, and the Task ID that you're submitting to in --task.
Submitting a Docker HuggingFace Space
If you can't build a Docker image yourself, or if the steps above seem too confusing, you can make a submission without interacting with the Dyff API directly by submitting from a HuggingFace Space. HF Spaces that use the Docker SDK will build a Docker image from the contents of the repository associated with the Space when the Space is run. You can then grant the Dyff platform permission to pull this image and use a web form to trigger a new submission.
These are the steps to prepare a HF Space for making submissions to the challenge:
- Create a new HuggingFace Organization (not a user account) for your challenge team.
- Create a new Space within your Organization. The Space must use the Docker SDK. Private Spaces are OK and will work with the submission process. The length of your combined Organization name + Space name must be less than 47 characters due to a limitation of the HuggingFace API.
- Create a file called DYFF_TEAM in the root directory of your HF Space. The contents of the file should be your Team ID (not your Account ID). This file allows our infrastructure to verify that your Team controls this HF Space.
- Create a Dockerfile in your Space that builds your challenge submission image.
- Run the Space; this will build the Docker image.
To make a challenge submission from your Space:
- Add the official SAFE Challenge user account as a Member of your organization with read permissions. Make sure you are adding the correct user account; the account name is safe-challenge-2025-submissions. This grants our infrastructure permission to pull the Docker image built by your Space.
- When you're ready to submit, use the submission web form and enter the URL of your Space and the branch that you want to submit.
Handling large models
There is a size limitation on Space repositories. If your submission contains large files (such as neural network weights), it may be too large to store in the space. In this case, you need to fetch your files from somewhere else during the Docker build process.
This means that your Dockerfile should contain something like this:
COPY download-my-model.sh ./
RUN ./download-my-model.sh
One convenient option is to create a separate HuggingFace Model repository and use git clone in your Dockerfile to fetch the repository files.
Handling private models
If access credentials are required to download your model files, you should provide them using the Secrets feature of HuggingFace Spaces. Do not hard-code credentials in your Dockerfile or anywhere else in your Space or Organization!
Access credentials are necessary if you want to clone a private HuggingFace Model repository during your Docker build process.
Access the secrets as described in the Secrets > Buildtime section. Remember that you can't download files at run-time because your system will not have access to the Internet.
How to implement a detector
To implement a new detector that you can submit to the challenge, you need to implement an HTTP server that serves the required JSON API for inference requests. This repository contains a template that you can use as a starting point for implementing a detector in Python. You should be able to adapt this template easily to support common model formats such as neural networks built with PyTorch.
You are also free to build detectors with any other technologies and software stacks that you want, but you may have to figure out packaging on your own. All that's required of your submission is that it runs in a Docker container and that it supports the required inference API.
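For orientation, here is a bare-bones sketch of the required surface as a standalone FastAPI app. The template in this repository already provides an equivalent, more complete server; the class names below are illustrative, not the template's.
# A minimal sketch of the required inference API, not the repository's template.
# Field names follow the request/response schemas in the API Reference section below.
import base64
import io
from typing import Optional

from fastapi import FastAPI
from PIL import Image
from pydantic import BaseModel


class ImagePayload(BaseModel):
    mediaType: str  # "image/jpeg" or "image/png"
    data: str       # base64-encoded image bytes


class PredictRequest(BaseModel):
    image: ImagePayload


class PredictResponse(BaseModel):
    logprobs: list[float]                            # length 4, one entry per label
    localizationMask: Optional[ImagePayload] = None


app = FastAPI()


@app.post("/predict")
def predict(request: PredictRequest) -> PredictResponse:
    image = Image.open(io.BytesIO(base64.b64decode(request.image.data)))
    # ... run your detector on `image` here ...
    return PredictResponse(logprobs=[-1.386, -1.386, -1.386, -1.386])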
Quick Start
Install uv:
https://docs.astral.sh/uv/getting-started/installation/
Local development:
# Install dependencies
make setup
source venv/bin/activate
# Download the example model
make download
# Run it
make serve
In a second terminal:
# Process an example input
./prompt.sh cat.json
The server runs on http://127.0.0.1:8000. Check /docs for the interactive API documentation.
Docker:
# Build
make docker-build
# Run
make docker-run
The Docker container also runs the server at http://127.0.0.1:8000.
What Happens When You Start the Server
INFO: Starting ML Inference Service...
INFO: Initializing ResNet service: models/microsoft/resnet-18
INFO: Loading model from models/microsoft/resnet-18
INFO: Model loaded: 1000 classes
INFO: Startup completed successfully
INFO: Uvicorn running on http://0.0.0.0:8000
If you see "Model directory not found", check that your model files exist at the expected path with the full org/model structure.
Testing the API
By default, the server serves the inference API at /predict:
# Using curl
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "image": {
      "mediaType": "image/jpeg",
      "data": "<base64-encoded-image-data>"
    }
  }'
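If you prefer Python, an equivalent request can be built with the requests library (cat.jpg here is just an illustrative local file; any JPEG or PNG works):
# Build and send a /predict request from a local image file.
# Assumes the server is running locally on port 8000.
import base64

import requests

with open("cat.jpg", "rb") as f:
    payload = {
        "image": {
            "mediaType": "image/jpeg",
            "data": base64.b64encode(f.read()).decode("ascii"),
        }
    }

response = requests.post("http://localhost:8000/predict", json=payload)
print(response.json()["logprobs"])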
Example response:
{
"logprobs": [-0.859380304813385,-1.2701971530914307,-2.1918208599090576,-1.69235098361969],
"localizationMask": {
"mediaType":"image/png",
"data":"iVBORw0KGgoAAAANSUhEUgAAA8AAAAKDAQAAAAD9Fl5AAAAAu0lEQVR4nO3NsREAMAgDMWD/nZMVKEwn1T5/FQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMCl3g5f+HC24TRhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAj70gwKsTlmdBwAAAABJRU5ErkJggg=="
}
}
Project Structure
example-submission/
├── main.py                          # Entry point
├── app/
│   ├── core/
│   │   ├── app.py                   # <= INSTANTIATE YOUR DETECTOR HERE
│   │   └── logging.py
│   ├── api/
│   │   ├── models.py                # Request/response schemas
│   │   ├── controllers.py           # Business logic
│   │   └── routes/
│   │       └── prediction.py        # POST /predict
│   └── services/
│       ├── base.py                  # <= YOUR DETECTOR IMPLEMENTS THIS INTERFACE
│       └── inference.py             # Example service based on ResNet-18
├── models/
│   └── microsoft/
│       └── resnet-18/               # Model weights and config
├── scripts/
│   ├── model_download.bash          # Downloads resnet-18
│   ├── generate_test_datasets.py    # Creates test datasets
│   └── test_datasets.py             # Runs inference on test datasets
├── Dockerfile
├── .env.example                     # Environment config template
├── cat.json                         # An example /predict request object
├── makefile
├── prompt.sh                        # Script that makes a /predict request
├── requirements.cpu.in
├── requirements.cpu.txt
├── requirements.torch.cpu.in
├── requirements.torch.cpu.txt
└── response.json                    # An example /predict response object
How to Plug In Your Own Model
To integrate your model, implement the InferenceService abstract class defined in app/services/base.py. You can follow the example implementation in app/services/inference.py, which is based on ResNet-18. After implementing the required interface, instantiate your model in the lifespan() function in app/core/app.py, replacing the ResNetInferenceService instance.
Step 1: Create Your Service Class
# app/services/your_model_service.py
from app.services.base import InferenceService
from app.api.models import ImageRequest, PredictionResponse


class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
    def __init__(self, model_name: str):
        self.model_name = model_name
        self.model_path = f"models/{model_name}"
        self.model = None
        self._is_loaded = False

    def load_model(self) -> None:
        """Load your model here. Called once at startup."""
        self.model = load_your_model(self.model_path)
        self._is_loaded = True

    def predict(self, request: ImageRequest) -> PredictionResponse:
        """Actual inference happens here."""
        image = decode_base64_image(request.image.data)
        result = self.model(image)
        logprobs = ...
        mask = ...
        return PredictionResponse(
            logprobs=logprobs,
            localizationMask=mask,
        )

    @property
    def is_loaded(self) -> bool:
        return self._is_loaded
Step 2: Register Your Service
Open app/core/app.py and find the lifespan function:
# Change this line:
service = ResNetInferenceService(model_name="microsoft/resnet-18")
# To this:
service = YourModelService(...)
That's it. The /predict endpoint now serves your model.
Model Files
Put your model files under the models/ directory:
models/
└── your-org/
    └── your-model/
        ├── config.json
        ├── weights.bin
        └── (other files)
GPU inference
The default configuration in this repo runs the model on CPU and does not contain the necessary dependencies for using GPUs.
To enable GPU inference, you need to:
- Base your Docker image on an image that contains the CUDA system packages, such as this one. If you're using the nvidia/cuda images, you probably want one of the -runtime- tags, as the -devel- versions contain dependencies you probably don't need.
- Install the GPU version of PyTorch (or whichever framework you use).
- Use the PyTorch .to() function (or its equivalent in your framework) in the load_model() and predict() functions to move model weights and input data to and from the CUDA device (see the sketch after this list).
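As a rough sketch, assuming a PyTorch model whose output is already the four label scores (load_your_model() and preprocess() are placeholders for your own code, as in Step 1 above):
# GPU-enabled variant of the Step 1 service. load_your_model() and preprocess()
# are placeholders; the model output is assumed to be a (1, 4) tensor of logits.
import torch

from app.api.models import ImageRequest, PredictionResponse
from app.services.your_model_service import YourModelService


class YourGpuModelService(YourModelService):
    def load_model(self) -> None:
        # Use the GPU if one is visible to the container, otherwise fall back to CPU.
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model = load_your_model(self.model_path).to(self.device)
        self.model.eval()
        self._is_loaded = True

    def predict(self, request: ImageRequest) -> PredictionResponse:
        inputs = preprocess(request.image).to(self.device)  # move inputs to the same device
        with torch.no_grad():
            outputs = self.model(inputs)
        logprobs = torch.log_softmax(outputs.squeeze(), dim=-1).tolist()
        return PredictionResponse(logprobs=logprobs, localizationMask=None)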
Configuration
Settings are managed via environment variables or a .env file. See .env.example for all available options.
Default values:
- APP_NAME: "ML Inference Service"
- APP_VERSION: "0.1.0"
- DEBUG: false
- HOST: "0.0.0.0"
- PORT: 8000
- MODEL_NAME: "microsoft/resnet-18"
To customize:
# Copy the example
cp .env.example .env
# Edit values
vim .env
Or set environment variables directly:
export MODEL_NAME="google/vit-base-patch16-224"
uvicorn main:app --reload
API Reference
Endpoint: POST /predict
Request:
{
  "image": {
    "mediaType": "image/jpeg",   // or "image/png"
    "data": "<base64 string>"
  }
}
To decode a request, first convert .image.data from a base64 string to binary data (i.e., a Python bytes string), then interpret the binary data as image data of the type specified in .image.mediaType. The .image.mediaType will be either image/jpeg or image/png.
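For example, using Pillow, the decoding step could look like this (Pillow detects JPEG vs. PNG from the bytes themselves):
# Decode the base64 payload of a /predict request into a PIL image.
import base64
import io

from PIL import Image


def decode_request_image(data_b64: str) -> Image.Image:
    raw = base64.b64decode(data_b64)
    return Image.open(io.BytesIO(raw)).convert("RGB")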
Response:
{
  "logprobs": [float],           // Log-probabilities of each label (length 4)
  "localizationMask": {          // [Optional] binary mask
    "mediaType": "image/png",    // Must be 'image/png'
    "data": "<base64 string>"    // Image data
  }
}
The .logprobs field must contain a list of floats of length 4. Each index in the list corresponds to the log-probability of the associated label. The possible labels are described in the app.api.models.Labels enumeration:
Natural = 0
FullySynthesized = 1
LocallyEdited = 2
LocallySynthesized = 3
The Synthesized labels mean that the image was partially or fully synthesized by a tool such as a generative image model. The LocallyEdited label means that the image was manipulated in some way other than by synthesizing content, such as by copying and pasting content from another image using image editing software.
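For example, if your model outputs one raw score (logit) per label, a minimal NumPy sketch for turning them into log-probabilities in the order above would be:
# Convert 4 raw scores, ordered (Natural, FullySynthesized, LocallyEdited,
# LocallySynthesized), into log-probabilities via a numerically stable log-softmax.
import numpy as np


def logits_to_logprobs(logits: np.ndarray) -> list[float]:
    shifted = logits - logits.max()
    logprobs = shifted - np.log(np.exp(shifted).sum())
    return logprobs.tolist()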
The .localizationMask field is optional, but you should populate it if your detector is capable of localizing its detections. The mask is a binary (0/1) bitmap encoded as a PNG image. A non-zero value for a pixel means that the detector thinks that pixel has been manipulated. A Python function to convert a numpy array to a PNG mask is provided in app.api.models.BinaryMask.from_numpy().
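The provided helper is the easiest route; if you need to roll your own, a roughly equivalent sketch using NumPy and Pillow would be (returning a plain dict that matches the localizationMask schema):
# Encode a binary HxW numpy array (0 = untouched, 1 = manipulated) as a base64 PNG.
import base64
import io

import numpy as np
from PIL import Image


def mask_to_png_base64(mask: np.ndarray) -> dict:
    # Scale 0/1 to 0/255, then convert to a 1-bit image before saving as PNG.
    image = Image.fromarray(mask.astype(np.uint8) * 255, mode="L").convert("1")
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return {
        "mediaType": "image/png",
        "data": base64.b64encode(buffer.getvalue()).decode("ascii"),
    }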
Docs:
The server in this repository serves API docs at the following endpoints:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- OpenAPI JSON: http://localhost:8000/openapi.json
PyArrow Test Datasets
We've included a test dataset system for validating your model. It generates 100 standardized test datasets covering normal inputs, edge cases, performance benchmarks, and model comparisons.
Generate Datasets
python scripts/generate_test_datasets.py
This creates:
- scripts/test_datasets/*.parquet - Test data (images, requests, expected responses)
- scripts/test_datasets/*_metadata.json - Human-readable descriptions
- scripts/test_datasets/datasets_summary.json - Overview of all datasets
Run Tests
# Start your service first
make serve
In another terminal:
# Quick test (5 samples per dataset)
python scripts/test_datasets.py --quick
# Full validation
python scripts/test_datasets.py
# Test specific category
python scripts/test_datasets.py --category edge_case
Dataset Categories (25 datasets each)
1. Standard Tests (standard_test_*.parquet)
- Normal images: random patterns, shapes, gradients
- Common sizes: 224x224, 256x256, 299x299, 384x384
- Formats: JPEG, PNG
- Purpose: Baseline validation
2. Edge Cases (edge_case_*.parquet)
- Tiny images (32x32, 1x1)
- Huge images (2048x2048)
- Extreme aspect ratios (1000x50)
- Corrupted data, malformed requests
- Purpose: Test error handling
3. Performance Benchmarks (performance_test_*.parquet)
- Batch sizes: 1, 5, 10, 25, 50, 100 images
- Latency and throughput tracking
- Purpose: Performance profiling
4. Model Comparisons (model_comparison_*.parquet)
- Same inputs across different architectures
- Models: ResNet-18/50, ViT, ConvNext, Swin
- Purpose: Cross-model benchmarking
Test Output
DATASET TESTING SUMMARY
============================================================
Datasets tested: 100
Successful datasets: 95
Failed datasets: 5
Total samples: 1,247
Overall success rate: 87.3%
Test duration: 45.2s
Performance:
Avg latency: 123.4ms
Median latency: 98.7ms
p95 latency: 342.1ms
Max latency: 2,341.0ms
Requests/sec: 27.6
Category breakdown:
standard: 25 datasets, 94.2% avg success
edge_case: 25 datasets, 76.8% avg success
performance: 25 datasets, 91.1% avg success
model_comparison: 25 datasets, 89.3% avg success
Common Issues
Port 8000 already in use:
# Find what's using it
lsof -i :8000
# Or just use a different port
uvicorn main:app --port 8080
Model not loading:
- Check the path: models should be in models/<org>/<model-name>/
- If you're trying to run the example ResNet-based model, make sure you ran make download to fetch the model weights.
- Check logs for the exact error
Slow inference:
- Inference runs on CPU by default
- For GPU: install CUDA PyTorch and modify service to use GPU device
- Consider using smaller models or quantization
Dyff Web Portal: Quick Start Guide
Signing In
- Obtain a Dyff API Key.
- Go to: https://app.dyff.io/home
- Click the Sign in button in the top-right corner.
- Select Sign in with key.
- Paste in your Dyff API Key.
- Click Verify.
Finding Your Submission
- After signing in, click Operator in the navigation bar.
- In the dropdown menu, click Submissions.
This will take you to the Submissions page, where you can see the status of all submissions associated with your account or team.
Using the Submissions Page
The Submissions page shows the detailed status of your submissions. You can:
- Search by submission ID: Use the Submission ID filter at the top of the table to find a specific submission (1).
- Search by team ID: Use the Team ID filter to find all submissions associated with a particular team (2).
To view details for a specific submission:
- Find the row for your submission.
- Click on the Status value for that submission (3).
This opens a detailed view where you can see information about:
- Inference Service (1)
- Challenge (2)
- Evaluation (3)
- Safety Case (4)
Evaluations
In the Evaluation section of a submission, you can:
- View the raw JSON data in the Raw JSON tab.
Safety Case
In the Safety Case section, you can:
- View logs and details related to the safety case for the given submission ID.
TL;DR
- Use Submissions to see the overall status of your work.
- You can find your submission by:
- Entering your Submission ID, or
- Entering your Team ID.
- Click on the Status of a submission to see detailed information about its Inference Service, Challenge, Evaluation, and Safety Case.
License
Apache 2.0




