HandleAtlas-166m-CPU

CPU-optimized ONNX INT8 variant of LumeData/HandleAtlas-166m. ~4× smaller and 4–6× faster than the PyTorch float weights, intended for CPU inference.

What's in this repo

  • model.onnx: fp32 ONNX export
  • model_quantized.onnx: INT8 dynamic-quantized ONNX (load this for the fastest path)
  • Tokenizer + GLiNER config files

Usage (quantized + thread-tuned)

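import os

# Match physical (not logical) cores. 4–8 is a good default on laptops.
N_THREADS = 8
# Set this before importing torch so the OpenMP runtime sees it when it loads.
os.environ["OMP_NUM_THREADS"] = str(N_THREADS)

import torch
import onnxruntime as ort  # not used directly; fails early if onnxruntime is missing
from gliner import GLiNER

torch.set_num_threads(N_THREADS)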

model = GLiNER.from_pretrained(
    "LumeData/HandleAtlas-166m-CPU",
    load_onnx_model=True,
    onnx_model_file="model_quantized.onnx",
)

labels = ['instagram_username', 'snapchat_username', 'youtube_username', 'twitch_username', 'tiktok_username', 'discord_username', 'x_username', 'cashapp_username', 'onlyfans_username', 'tumblr_username', 'github_username', 'kofi_username', 'patreon_username', 'roblox_username', 'generic_username']

text = "Insta: foodgrammer | Snap: chefchef | DC: gamer420 | $cashtag"
for ent in model.predict_entities(text, labels, threshold=0.5):
    print(f"{ent['text']!r} -> {ent['label']} ({ent['score']:.2f})")

To use the unquantized ONNX export instead (smaller accuracy delta, ~2× faster than PyTorch), change onnx_model_file="model_quantized.onnx" to onnx_model_file="model.onnx".
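The speed figures quoted here depend on hardware and thread settings, so it can be worth measuring on your own machine. The helper below is a minimal sketch for doing that. The name median_latency_ms and its warmup and runs parameters are illustrative, not part of GLiNER or ONNX Runtime.

```python
import statistics
import time


def median_latency_ms(fn, *args, warmup=3, runs=20, **kwargs):
    """Return the median wall-clock latency of fn(*args, **kwargs) in milliseconds."""
    # Warm-up calls are discarded so one-time costs (lazy init, caches)
    # do not skew the measurement.
    for _ in range(warmup):
        fn(*args, **kwargs)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to occasional scheduler hiccups than the mean.
    return statistics.median(samples)
```

With the model, text, and labels from the usage snippet, a call would look like median_latency_ms(model.predict_entities, text, labels, threshold=0.5). Repeating it once per ONNX file, and for a few values of N_THREADS, gives a like-for-like comparison.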

Recommended thresholds

  • Default: threshold=0.5
  • For generic_username, bump to 0.65 to reduce false positives.
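predict_entities is called with a single threshold in the usage snippet, so one way to apply a stricter cutoff to generic_username only is to predict at the default and filter afterwards. The sketch below assumes each entity is a dict with "label" and "score" keys, as in the usage snippet. The helper name filter_by_label_threshold and the sample predictions are made up for illustration.

```python
def filter_by_label_threshold(entities, per_label, default=0.5):
    """Keep entities whose score meets the threshold for their label.

    Labels missing from per_label fall back to the default threshold.
    """
    return [
        ent for ent in entities
        if ent["score"] >= per_label.get(ent["label"], default)
    ]


# Made-up predictions, shaped like the dicts printed in the usage snippet.
sample = [
    {"text": "foodgrammer", "label": "instagram_username", "score": 0.58},
    {"text": "gamer420", "label": "generic_username", "score": 0.60},
    {"text": "chefchef", "label": "generic_username", "score": 0.71},
]

kept = filter_by_label_threshold(sample, {"generic_username": 0.65})
for ent in kept:
    print(f"{ent['text']!r} -> {ent['label']} ({ent['score']:.2f})")
```

In real use, pass the output of model.predict_entities(text, labels, threshold=0.5) in place of sample. The model-level threshold must not exceed the lowest per-label value, otherwise candidates are discarded before the filter sees them.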

Notes on quality

INT8 dynamic quantization typically costs <1 F1 point on this kind of task. For applications that require the absolute best precision, use the float variant LumeData/HandleAtlas-166m.
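To check how much quantization costs on your own data, you can score the quantized model's predictions against a reference set, ideally gold annotations, or the float model's output as a rough proxy. The sketch below computes exact-match span F1. It assumes each entity dict carries "start", "end", and "label" keys, which is not shown in the usage snippet, and the function name span_f1 is illustrative.

```python
def span_f1(predicted, reference):
    """Exact-match F1 over (start, end, label) spans for one text."""
    pred = {(e["start"], e["end"], e["label"]) for e in predicted}
    ref = {(e["start"], e["end"], e["label"]) for e in reference}

    # Nothing expected and nothing predicted counts as perfect agreement.
    if not pred and not ref:
        return 1.0

    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

This scores a single text. For a corpus-level number, accumulate true positives, predicted counts, and reference counts across texts before computing precision and recall, rather than averaging per-text F1.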

Labels

  • instagram_username
  • snapchat_username
  • youtube_username
  • twitch_username
  • tiktok_username
  • discord_username
  • x_username
  • cashapp_username
  • onlyfans_username
  • tumblr_username
  • github_username
  • kofi_username
  • patreon_username
  • roblox_username
  • generic_username