Token Classification
GLiNER
ONNX
English
multilingual
ner
social-media
username-extraction
int8
quantized
cpu
HandleAtlas-166m-CPU
CPU-optimized ONNX INT8 variant of LumeData/HandleAtlas-166m. ~4× smaller and 4-6× faster than the PyTorch float weights, intended for CPU inference.
What's in this repo
- model.onnx: fp32 ONNX export
- model_quantized.onnx: INT8 dynamic-quantized ONNX (load this for the fastest path)
- Tokenizer + GLiNER config files
Usage (quantized + thread-tuned)
import os
# Match physical (not logical) cores. 4-8 is a good default on laptops.
N_THREADS = 8
# Set the variable before importing torch so the OpenMP runtime sees it at load time.
os.environ["OMP_NUM_THREADS"] = str(N_THREADS)
import torch
import onnxruntime as ort
from gliner import GLiNER
torch.set_num_threads(N_THREADS)
model = GLiNER.from_pretrained(
    "LumeData/HandleAtlas-166m-CPU",
    load_onnx_model=True,
    onnx_model_file="model_quantized.onnx",
)
labels = ['instagram_username', 'snapchat_username', 'youtube_username', 'twitch_username', 'tiktok_username', 'discord_username', 'x_username', 'cashapp_username', 'onlyfans_username', 'tumblr_username', 'github_username', 'kofi_username', 'patreon_username', 'roblox_username', 'generic_username']
text = "Insta: foodgrammer | Snap: chefchef | DC: gamer420 | $cashtag"
for ent in model.predict_entities(text, labels, threshold=0.5):
    print(f"{ent['text']!r} -> {ent['label']} ({ent['score']:.2f})")
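The N_THREADS = 8 value above is a hand-picked default. A minimal sketch for deriving it at runtime follows. It assumes the optional psutil package for the physical-core count and otherwise halves os.cpu_count(), which is only a guess that the machine exposes two hardware threads per core. The helper name pick_thread_count and the cap of 8 are illustrative and not part of this repo.

```python
import os


def pick_thread_count(cap: int = 8) -> int:
    """Return a thread count close to the number of physical cores, at most `cap`."""
    physical = None
    try:
        import psutil  # optional dependency (assumption: installed separately)
        physical = psutil.cpu_count(logical=False)  # may be None on some platforms
    except ImportError:
        pass
    if not physical:
        # Fallback guess: assume 2 logical CPUs per physical core.
        logical = os.cpu_count() or 1
        physical = max(1, logical // 2)
    return max(1, min(physical, cap))


N_THREADS = pick_thread_count()
print(N_THREADS)
```

The result can replace the hard-coded N_THREADS before the environment variable is set.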
To use the unquantized ONNX (smaller accuracy delta, ~2× faster than PyTorch), replace onnx_model_file="model_quantized.onnx" with onnx_model_file="model.onnx".
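The speed-up figures depend on hardware and input length, so it may be worth timing both ONNX files on the target machine. The following is a rough sketch of a latency harness. The benchmark function, its warm-up and run counts, and the stand-in workload are all illustrative assumptions, not something shipped with this model.

```python
import statistics
import time
from typing import Callable


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload so the sketch runs without downloading a model.
median_ms = benchmark(lambda: sum(range(10_000)))
print(f"median latency: {median_ms:.3f} ms")
```

With a model loaded as in the usage section, pass lambda: model.predict_entities(text, labels, threshold=0.5) instead of the stand-in, once per ONNX file, and compare the two medians.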
Recommended thresholds
- Default: threshold=0.5
- For generic_username, bump to 0.65 to reduce false positives.
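The usage example passes a single threshold to predict_entities, so one way to apply the stricter cutoff only to generic_username is to predict at 0.5 and filter afterwards. This is a minimal sketch under that assumption. filter_entities and LABEL_THRESHOLDS are hypothetical names, and the sample scores are made up.

```python
DEFAULT_THRESHOLD = 0.5
# Per-label overrides; only generic_username is raised, per the recommendation above.
LABEL_THRESHOLDS = {"generic_username": 0.65}


def filter_entities(entities, overrides=LABEL_THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Keep entities whose score meets the threshold for their label."""
    return [
        ent for ent in entities
        if ent["score"] >= overrides.get(ent["label"], default)
    ]


# Hand-written sample in the shape printed by the usage example (scores are made up).
sample = [
    {"text": "foodgrammer", "label": "instagram_username", "score": 0.91},
    {"text": "gamer420", "label": "generic_username", "score": 0.58},
    {"text": "chefchef", "label": "snapchat_username", "score": 0.52},
]
kept = filter_entities(sample)
print([ent["text"] for ent in kept])
```

In practice the input would be the list returned by model.predict_entities(text, labels, threshold=0.5).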
Notes on quality
INT8 dynamic quantization typically costs <1 F1 point on this kind of task. For applications that require the absolute best precision, use the float variant LumeData/HandleAtlas-166m.
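To check how much quantization costs on your own data, both variants can be scored against the same gold annotations. The sketch below computes micro-averaged F1 over exact (text, label) matches. That matching rule is a simplification chosen here, not necessarily how the figure above was measured, and the function name and sample annotations are illustrative.

```python
def micro_f1(gold_docs, pred_docs):
    """Micro-averaged F1 over exact (text, label) matches, one set per document."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        tp += len(gold & pred)   # predicted and annotated
        fp += len(pred - gold)   # predicted but not annotated
        fn += len(gold - pred)   # annotated but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations for illustration only.
gold = [{("foodgrammer", "instagram_username"), ("chefchef", "snapchat_username")}]
pred = [{("foodgrammer", "instagram_username"), ("chefchef", "generic_username")}]
score = micro_f1(gold, pred)
print(f"{score:.2f}")
```

Running it once with predictions from model_quantized.onnx and once with predictions from the float model gives the F1 difference for a given dataset.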
Labels
- instagram_username
- snapchat_username
- youtube_username
- twitch_username
- tiktok_username
- discord_username
- x_username
- cashapp_username
- onlyfans_username
- tumblr_username
- github_username
- kofi_username
- patreon_username
- roblox_username
- generic_username