Lens (pretrained base)

Pretrained checkpoint of Lens, a knowledge-guided foundation model for network traffic, published in Transactions on Machine Learning Research (TMLR). The backbone is T5-v1.1-base (about 0.25B parameters) with a network-specific BBPE tokenizer (vocabulary size 32,112). It was pretrained on network-traffic flows with a knowledge-guided masked-span objective.
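As a rough sanity check on the roughly 0.25B parameter figure, the sketch below counts parameters from the architecture numbers. It assumes the stock T5-v1.1-base hyperparameters that this card does not state (d_ff 2048, d_kv 64, 12 attention heads, 32 relative-position buckets, no bias terms, untied input and output embeddings). The function name is hypothetical, and config.json is the authority if any value differs.

```python
def t5_v1_1_param_count(
    vocab_size=32112,       # from this card
    d_model=768,            # from this card
    num_layers=12,          # encoder layers, from this card
    num_decoder_layers=12,  # decoder layers, from this card
    d_ff=2048,              # ASSUMPTION: stock T5-v1.1-base value
    d_kv=64,                # ASSUMPTION: stock T5-v1.1-base value
    num_heads=12,           # ASSUMPTION: stock T5-v1.1-base value
    rel_buckets=32,         # ASSUMPTION: stock T5-v1.1-base value
):
    """Count parameters of a T5-v1.1-style encoder-decoder (no biases)."""
    inner = num_heads * d_kv
    attn = 4 * d_model * inner   # q, k, v, o projections
    ffn = 3 * d_model * d_ff     # gated-GELU: wi_0, wi_1, wo
    norm = d_model               # one scale vector per layer norm

    enc_layer = attn + ffn + 2 * norm
    dec_layer = 2 * attn + ffn + 3 * norm  # self-attention + cross-attention
    rel_bias = rel_buckets * num_heads     # held by the first layer of each stack

    encoder = num_layers * enc_layer + rel_bias + norm
    decoder = num_decoder_layers * dec_layer + rel_bias + norm
    embeddings = 2 * vocab_size * d_model  # input embedding + untied lm_head
    return embeddings + encoder + decoder


print(t5_v1_1_param_count())  # 247553280
```

Under these assumptions the total is about 0.248B, which is consistent with the figure quoted above.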

Files

pytorch_model.bin — pretrained weights (loads cleanly into T5ForConditionalGeneration).
config.json — model config (T5-v1.1-base, vocab_size=32112).
tokenizer.json, tokenizer_config.json, special_tokens_map.json — the network BBPE tokenizer.

How to load

from transformers import T5ForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Charles59/lens-pretrained")
model = T5ForConditionalGeneration.from_pretrained("Charles59/lens-pretrained")

The released Lens code adds the special tokens <SIP> and <DIP> (anonymized source and destination IP addresses) at fine-tuning time, and it can optionally run an optimized flash-attention variant (attention_type='flash'). For exact reproduction, load this checkpoint with the Lens training scripts and the corresponding downstream data.
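If you fine-tune outside the Lens scripts, the two IP tokens can be registered with standard transformers calls. The following is a minimal sketch, not the Lens implementation: add_ip_tokens and load_with_ip_tokens are hypothetical helper names, and growing the embedding matrix only when the tokenizer outgrows it is an assumption about what is appropriate here.

```python
IP_TOKENS = ["<SIP>", "<DIP>"]  # anonymized source / destination IP markers


def add_ip_tokens(tokenizer, model, tokens=IP_TOKENS):
    """Register `tokens` as special tokens and grow the embeddings if needed.

    Returns the number of tokens that were actually new to the tokenizer.
    """
    num_added = tokenizer.add_tokens(list(tokens), special_tokens=True)
    # Only resize when the tokenizer is larger than the embedding matrix,
    # so an embedding table that is already big enough is left untouched.
    if len(tokenizer) > model.config.vocab_size:
        model.resize_token_embeddings(len(tokenizer))
    return num_added


def load_with_ip_tokens(repo_id="Charles59/lens-pretrained"):
    """Hypothetical convenience wrapper; downloads the checkpoint when called."""
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = T5ForConditionalGeneration.from_pretrained(repo_id)
    add_ip_tokens(tokenizer, model)
    return tokenizer, model
```

Any newly added embedding rows are freshly initialized, so they only become meaningful after fine-tuning. The token IDs assigned this way may not match those used by the Lens scripts, which is one reason to prefer the released code for exact reproduction.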

Pretraining

Architecture: T5-v1.1-base (encoder-decoder, 12+12 layers, d_model 768, gated-GELU).
Objective: knowledge-guided masked-span prediction over packet/flow text.
Context length: up to 1,500 tokens.
Steps: 130,000 (10% warm-up, i.e. 13,000 steps), batch size 48, AdamW.
Pretraining data: the pretraining split of the NetBench source datasets, sampled without any downstream labels to avoid label leakage. The pretraining corpus itself is not released.
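To illustrate what a masked-span objective looks like in a T5-style model, here is a minimal sketch of generic span corruption. It is not the knowledge-guided procedure used for Lens: the spans to mask are passed in by the caller, the <extra_id_N> sentinel names follow the stock T5 convention and are an assumption about this tokenizer, and the example tokens are made-up placeholders rather than real BBPE output.

```python
def corrupt_spans(tokens, spans):
    """Build a (source, target) pair for T5-style masked-span prediction.

    tokens: list of token strings.
    spans:  sorted, non-overlapping (start, end) half-open index pairs to mask.
    """
    source, target = [], []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(spans):
        if start < cursor or end <= start or end > len(tokens):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        sentinel = f"<extra_id_{sentinel_id}>"  # ASSUMPTION: stock T5 sentinel naming
        source.extend(tokens[cursor:start])     # keep the unmasked prefix
        source.append(sentinel)                 # replace the span with one sentinel
        target.append(sentinel)                 # the target replays the sentinel...
        target.extend(tokens[start:end])        # ...followed by the hidden tokens
        cursor = end
    source.extend(tokens[cursor:])
    target.append(f"<extra_id_{len(spans)}>")   # closing sentinel ends the target
    return source, target


# Hypothetical flow fields, for illustration only.
tokens = ["src_port", "443", "dst_port", "51234", "proto", "tcp"]
source, target = corrupt_spans(tokens, [(1, 2), (4, 6)])
print(source)  # ['src_port', '<extra_id_0>', 'dst_port', '51234', '<extra_id_1>']
print(target)  # ['<extra_id_0>', '443', '<extra_id_1>', 'proto', 'tcp', '<extra_id_2>']
```

The encoder sees the source sequence and the decoder is trained to produce the target. A knowledge-guided variant would differ mainly in how the spans are chosen.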

Downstream data

Classification: Charles59/lens-network-traffic
Generation: Charles59/lens-network-traffic-generation

License

CC-BY-NC-4.0. Underlying data comes from academic datasets via NetBench (Qian et al., 2024); their original terms also apply.

Citation

@article{li2026lens,
  title   = {Lens: A Knowledge-Guided Foundation Model for Network Traffic},
  author  = {Li, Xiaochang and Qian, Chen and Wang, Qineng and Kong, Jiangtao and Wang, Yuchen and Yao, Ziyu and Ji, Bo and Cheng, Long and Zhou, Gang and Shao, Huajie},
  journal = {Transactions on Machine Learning Research},
  issn    = {2835-8856},
  year    = {2026},
  url     = {https://openreview.net/forum?id=cGDwTgnJIR},
  note    = {arXiv:2402.03646}
}
