metadata
model-index:
  - name: netFound-640M-base
    results:
      - task:
          type: fill-mask
        metrics:
          - name: Macro MLM F1
            type: f1
            value: 0.4038
          - name: Weighted MLM F1
            type: f1
            value: 0.8451
          - name: MLM Accuracy
            type: accuracy
            value: 0.8514
          - name: Swapped Weighted F1
            type: f1
            value: 0.9605
          - name: Perplexity
            type: perplexity
            value: 6.5842

netFound-640M-base

Description

netFound is a network traffic foundation model built on the transformer architecture. It includes a pretraining phase on unlabeled data, which it relies on to achieve strong performance.

Key features:

  • netFound takes raw PCAP data as input
  • netFound can (and needs to) be pretrained on an unlabeled dataset
  • netFound uses a hierarchical transformer architecture to account for packet burst and flow behavior
  • netFound uses burst metadata (inter-arrival time, number of bytes per burst, etc.)
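To make the burst metadata feature more concrete, here is a minimal sketch of how per-burst statistics such as inter-arrival time and bytes per burst could be derived from a flow's packets. It is not netFound's actual preprocessing: the gap threshold `BURST_GAP_SECONDS`, the `BurstStats` fields, and the `burst_metadata` helper are all hypothetical names chosen for illustration, and the real burst definition may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical threshold: a silence longer than this starts a new burst.
# netFound's real burst definition may differ.
BURST_GAP_SECONDS = 0.01


@dataclass
class BurstStats:
    start_time: float          # timestamp of the first packet in the burst
    packet_count: int          # number of packets in the burst
    total_bytes: int           # sum of packet sizes in the burst
    inter_arrival_time: float  # gap since the previous burst's last packet (0.0 for the first burst)


def burst_metadata(packets: List[Tuple[float, int]],
                   gap: float = BURST_GAP_SECONDS) -> List[BurstStats]:
    """Group (timestamp, size) packets of one flow into bursts and summarize each."""
    bursts: List[BurstStats] = []
    prev_ts: Optional[float] = None
    for ts, size in sorted(packets):
        # Open a new burst on the first packet or after a long enough silence.
        if prev_ts is None or ts - prev_ts > gap:
            iat = 0.0 if prev_ts is None else ts - prev_ts
            bursts.append(BurstStats(ts, 0, 0, iat))
        current = bursts[-1]
        current.packet_count += 1
        current.total_bytes += size
        prev_ts = ts
    return bursts


# Toy flow: three packets close together, then two more after a pause.
flow = [(0.000, 60), (0.001, 1500), (0.002, 1500), (0.500, 60), (0.501, 400)]
for burst in burst_metadata(flow):
    print(burst)
```

With the toy flow above, the packets fall into two bursts separated by roughly half a second of silence. Statistics of this kind are what the burst-level metadata refers to.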

Source code

https://github.com/SNL-UCSB/netfound

Pretraining dataset

For pretraining, we used a private real-world dataset consisting of more than 450 million network flows. The model was pretrained for approximately one epoch (it iterated through ~480 million flows).

Checkpoint

Model: Large (16 heads, 24 hidden layers, 1024 hidden size)
Total params: 643,825,672
Date: January 17, 2025
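As a rough guide to what these numbers mean in practice, the sketch below estimates the memory needed just to hold the weights at common precisions, and the per-head dimension. The helper `weights_size_gib` is a hypothetical name, the head dimension assumes the hidden size is split evenly across heads (standard multi-head attention), and the estimate ignores activations, optimizer state, and framework overhead.

```python
TOTAL_PARAMS = 643_825_672  # from the checkpoint description
HIDDEN_SIZE = 1024
NUM_HEADS = 16

# Storage cost per parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weights_size_gib(num_params: int, dtype: str) -> float:
    """Size of the raw weights in GiB for the given dtype (weights only)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weights_size_gib(TOTAL_PARAMS, dtype):.2f} GiB")

# Assumes an even split of the hidden size across attention heads.
head_dim = HIDDEN_SIZE // NUM_HEADS
print(f"per-head dimension: {head_dim}")
```

In float32 the weights alone come to roughly 2.4 GiB, and half that in 16-bit precision. Pretraining or fine-tuning needs considerably more memory than this.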

Paper

https://arxiv.org/abs/2310.17025