Infinity is the on-prem containerized solution delivering Transformers accuracy at 1ms latency.
Test and evaluate 🤗 Infinity in your own infrastructure
Plug and Predict
Infinity Container is a hardware-optimized inference solution delivered as a container. It is built specifically to run optimally on a target hardware architecture and exposes an HTTP API to run inference. Currently supported tasks are document embedding, re-ranking, and sequence classification.
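To make the HTTP API idea concrete, here is a minimal sketch of how a client might call a running container. This page does not document the actual API, so the endpoint URL, the port, and the `{"inputs": [...]}` payload shape below are assumptions for illustration only, as are the helper names `build_request` and `predict`. Check the documentation shipped with your container for the real routes and schema.

```python
import json
import urllib.request

# Hypothetical endpoint: the real host, port, and route depend on how the
# container is deployed and are not specified on this page.
INFINITY_URL = "http://localhost:8080/predict"


def build_request(texts, url=INFINITY_URL):
    """Build a JSON POST request carrying a batch of input texts.

    The {"inputs": [...]} payload shape is an assumption, not the
    documented Infinity schema.
    """
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(texts, url=INFINITY_URL, timeout=5.0):
    """Send the request and decode the JSON response body."""
    request = build_request(texts, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a container listening at the assumed address, `predict(["How do I reset my password?"])` would return whatever JSON the service produces for the configured task, for example embeddings or class scores.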
Infinity Multiverse is a model optimization service delivered as a container, so you can optimize your models within your own environment for a compatible target inference hardware. Supported architectures are BERT, BERT-Large, DistilBERT, RoBERTa, RoBERTa-Large, DistilRoBERTa, and MiniLM.
One of the world's largest e-commerce companies
Feature extraction and ranking tasks
2.2 ms per request - 10 times faster than before
Hugging Face helped us solve one of our major challenges: scalable and high-performing transformer models stable enough for production. We reached about 2.2 ms per request for feature extraction and 2.4 ms per request for ranking tasks on one GPU, 10 times faster than our results in the months before! It’s huge fun to cooperate with the people from Hugging Face!
World-leading providers of outsourced business solutions
Call transcript classification
4 ms per request to classify conversations
At Moneypenny we look for practical ways to leverage the latest advances in AI, making our next conversation better than the last. With Infinity, we were able to automate call transcript classification easily, predicting the topic of a call with a high level of accuracy and in just four milliseconds per call! Infinity turned this model into an optimized inference solution, ready to deploy on our infrastructure. The whole process was extremely simple.
We are the creators of Transformers, the leading open source library for data scientists and machine learning engineers to explore state-of-the-art models and build machine learning features. We are on a mission to democratize AI, one commit at a time!