Frequently Asked Questions

Which tasks can I run?
All Transformers pipelines are available: ASR, feature extraction, text classification, NER, question answering, translation, summarization, text generation, zero-shot classification, conversational AI, and table question answering.
Do you speak NLP?
With over 10,000 models trained in over 160 languages, Hugging Face offers the largest and most diverse library of state-of-the-art models, and the Inference API makes them all available to you via simple API calls.
What’s the latency?
We accelerate our models on CPU and GPU so your apps work faster. Read up on how we achieved a 100x speedup on Transformers.
Can it scale?
We built our infrastructure to support real-time consumer use cases and scale automatically as usage grows to support up to 1,000 requests per second.
How is my data secure?
All data transfers are encrypted in transit with SSL. Hugging Face protects your inference data: no third parties have access to it. Enterprise plans offer additional layers of security for log-less requests.
What’s your pricing?
Try it free with an account, then pick the plan that works for you, starting as low as $9/mo. We bill usage by inference input characters and offer tiered pricing for high volumes.

Request and we shall serve

State of the Art as easy as HTTP requests

import requests

def query(payload, model_id, api_token):
	headers = {"Authorization": f"Bearer {api_token}"}
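	# NOTE: the API base URL that precedes the model id is missing from the source text.
	API_URL = f"{model_id}"
	# POST the JSON payload to the model endpoint
	response = requests.post(API_URL, headers=headers, json=payload)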
	return response.json()

model_id = "distilbert-base-uncased"
api_token = "api_XXXXXXXX" # get yours at
data = query("The goal of life is [MASK].", model_id, api_token)
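
The example above returns whatever JSON the server sends back, including error payloads. The following is a minimal sketch of a more defensive variant, not an official client. The helper names `build_request` and `query_checked`, the `base_url` parameter, the injectable `post` argument, and the 30-second timeout are all illustrative assumptions that do not appear in the original example.

```python
def build_request(payload, model_id, api_token, base_url):
    """Assemble the URL, headers, and JSON body without sending anything.

    base_url is a hypothetical parameter: supply the Inference API endpoint
    prefix yourself, since it is not given in the text above.
    """
    return {
        "url": f"{base_url.rstrip('/')}/{model_id}",
        "headers": {"Authorization": f"Bearer {api_token}"},
        "json": payload,
    }


def query_checked(payload, model_id, api_token, base_url, post=None, timeout=30):
    """Send the request and raise on HTTP errors instead of returning them.

    post is an assumed hook for swapping in a stub during tests; by default
    it falls back to requests.post.
    """
    if post is None:
        import requests  # imported lazily so a stub can be used without it
        post = requests.post
    req = build_request(payload, model_id, api_token, base_url)
    response = post(req["url"], headers=req["headers"], json=req["json"], timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.json()
```

Separating request construction from sending keeps the URL and header logic checkable without network access, and `raise_for_status()` surfaces failed calls as exceptions rather than as JSON that looks like a result.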

