Australian PII Detection Model

A GLiNER model fine-tuned to detect 18 types of Australian personally identifiable information (PII).

Model Performance

  • Training F1: 100%
  • Test Accuracy: 91.7%
  • Parameters: 195,175,936
  • Training Time: 17.9 minutes

Supported Entities

  • PERSON_NAME
  • TAX_FILE_NUMBER
  • MEDICARE_NUMBER
  • ABN
  • ACN
  • DRIVER_LICENSE
  • PASSPORT_NUMBER
  • PHONE_NUMBER
  • EMAIL_ADDRESS
  • PHYSICAL_ADDRESS
  • DATE_OF_BIRTH
  • BSB_ACCOUNT_NUMBER
  • CREDIT_CARD_NUMBER
  • SUPER_FUND_MEMBER_NUMBER
  • SALARY_AMOUNT
  • EMPLOYER_NAME
  • HEALTH_FUND_NUMBER
  • IP_ADDRESS
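
Usage sketch

The labels above can be passed to a GLiNER model at inference time. The following is a minimal sketch, not the author's published usage: it assumes the checkpoint accepts the label strings exactly as listed, and the helpers detect_pii, redact, and load_model are hypothetical names introduced here for illustration. The repository ID for this model is not stated in this card, so load_model takes it as an argument.

```python
from typing import Any

# Label strings copied from the list above. That the checkpoint expects
# exactly this spelling is an assumption.
AU_PII_LABELS = [
    "PERSON_NAME", "TAX_FILE_NUMBER", "MEDICARE_NUMBER", "ABN", "ACN",
    "DRIVER_LICENSE", "PASSPORT_NUMBER", "PHONE_NUMBER", "EMAIL_ADDRESS",
    "PHYSICAL_ADDRESS", "DATE_OF_BIRTH", "BSB_ACCOUNT_NUMBER",
    "CREDIT_CARD_NUMBER", "SUPER_FUND_MEMBER_NUMBER", "SALARY_AMOUNT",
    "EMPLOYER_NAME", "HEALTH_FUND_NUMBER", "IP_ADDRESS",
]


def load_model(repo_id: str) -> Any:
    """Load a GLiNER checkpoint (requires `pip install gliner`)."""
    from gliner import GLiNER  # imported lazily so the helpers work without it

    return GLiNER.from_pretrained(repo_id)


def detect_pii(model: Any, text: str, threshold: float = 0.5) -> list[dict]:
    """Return detected entities sorted by their start offset.

    GLiNER's predict_entities returns dicts with "start", "end", "text",
    "label", and "score" keys.
    """
    entities = model.predict_entities(text, AU_PII_LABELS, threshold=threshold)
    return sorted(entities, key=lambda ent: ent["start"])


def redact(text: str, entities: list[dict]) -> str:
    """Replace each detected span with its label in square brackets."""
    pieces = []
    cursor = 0
    for ent in sorted(entities, key=lambda ent: ent["start"]):
        if ent["start"] < cursor:
            continue  # skip spans that overlap one already redacted
        pieces.append(text[cursor:ent["start"]])
        pieces.append(f"[{ent['label']}]")
        cursor = ent["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)
```

A caller would load the model once with load_model, then run redact(text, detect_pii(model, text)) per document. Lowering threshold trades precision for recall, which may suit a workflow where a human reviews every redaction.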

Document Types Supported

  • Tax Returns
  • Bank Statements
  • Superannuation Statements
  • Insurance Documents
  • Employment Records

Training Details

  • Base model: urchade/gliner_medium-v2.1
  • Data: 2,000 synthetic Australian documents
  • Epochs: 10
  • Batch size: 8
  • Learning rate: 5e-6
  • GPU: Google Colab T4
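
For a rough sense of scale, the hyperparameters above can be collected into a small config object and used to estimate the number of optimizer steps. This is only a sketch: TrainingConfig is a hypothetical name, and the step count assumes all 2,000 documents were used for training with no gradient accumulation. The card does not state the train/test split, so treat the result as an upper bound.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in this card."""

    base_model: str = "urchade/gliner_medium-v2.1"
    num_documents: int = 2000  # assumption: every document is a training example
    epochs: int = 10
    batch_size: int = 8
    learning_rate: float = 5e-6

    def steps_per_epoch(self) -> int:
        # A final partial batch still counts as one step.
        return math.ceil(self.num_documents / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.epochs


config = TrainingConfig()
print(config.steps_per_epoch(), config.total_steps())
```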

Limitations

  • Trained on synthetic data only
  • Real-world accuracy may vary
  • Takes text input, so scanned PDFs need OCR first
  • Works best when combined with regex matching
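
One way to act on the regex recommendation above is to pair a pattern with a checksum, so that a regex hit is only accepted when the number is structurally valid. The sketch below does this for ABNs using the modulus 89 check published by the Australian Business Register. The pattern and the helper names is_valid_abn and find_abns are illustrative assumptions, not part of this model.

```python
import re

# 11 digits, optionally grouped as 2-3-3-3 with single whitespace separators.
ABN_PATTERN = re.compile(r"\b\d{2}\s?\d{3}\s?\d{3}\s?\d{3}\b")

# Weighting factors for the ABN modulus 89 check.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(candidate: str) -> bool:
    """Return True if the candidate contains 11 digits that pass the checksum."""
    digits = [int(ch) for ch in candidate if ch.isascii() and ch.isdigit()]
    if len(digits) != 11:
        return False
    digits[0] -= 1  # the algorithm subtracts 1 from the leading digit
    weighted_sum = sum(d * w for d, w in zip(digits, ABN_WEIGHTS))
    return weighted_sum % 89 == 0


def find_abns(text: str) -> list[str]:
    """Return regex matches that also pass the ABN checksum."""
    return [m.group(0) for m in ABN_PATTERN.finditer(text) if is_valid_abn(m.group(0))]
```

The same check could also be applied to spans the model labels as ABN, to filter out likely false positives before redaction.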

Legal Notice

For research purposes only. Always include human review when using this model for production PII redaction.
