🤗 Infinity Trial Request

Thanks for requesting a trial of Infinity! Our sales team will schedule a technical onboarding session as soon as possible.

Update: Due to high demand, it may take a while for us to get back to you. If you have any questions, please email infinity@huggingface.co.

Plug and Predict

Infinity ships as a single container and can be deployed in any production environment. It can easily be scaled to thousands of requests per second using orchestration services like Kubernetes.

Unmatched Performance

Infinity delivers unmatched performance for state-of-the-art transformer models, achieving 1 ms latency for BERT-like models on GPU and 4 ms on CPU.

Enterprise Ready

Infinity meets the highest security requirements and can be integrated anywhere, from public clouds to air-gapped environments. You control your models, your data, and your traffic.