Gemma 2 9B IT Random Bijection

This repository contains the Gemma 2 9B IT Random Bijection weights used in the AlienLM experiments. AlienLM is a client-side text obfuscation approach for black-box LLM APIs: it maps natural text into an alienized token space, adapts the model with AAT, and recovers text on the client side.
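The mapping step can be illustrated with a toy random bijection over token IDs. This is a minimal sketch, not the AlienLM implementation: the vocabulary size, the function names `make_bijection`, `alienize`, and `recover`, and the use of Python's `random` module are all assumptions made for illustration. The permutation it produces will not reproduce the mapping baked into the released weights.

```python
import random

def make_bijection(vocab_size: int, seed: int) -> tuple[list[int], list[int]]:
    """Return (forward, inverse) lookup tables for a seeded random permutation."""
    forward = list(range(vocab_size))
    random.Random(seed).shuffle(forward)  # forward[original_id] -> alien_id
    inverse = [0] * vocab_size
    for original_id, alien_id in enumerate(forward):
        inverse[alien_id] = original_id   # inverse[alien_id] -> original_id
    return forward, inverse

def alienize(token_ids: list[int], forward: list[int]) -> list[int]:
    """Map original token IDs into the alien ID space."""
    return [forward[t] for t in token_ids]

def recover(alien_ids: list[int], inverse: list[int]) -> list[int]:
    """Map alien token IDs back to the original ID space (client side)."""
    return [inverse[t] for t in alien_ids]

VOCAB_SIZE = 1000  # hypothetical toy vocabulary, far smaller than a real one
forward, inverse = make_bijection(VOCAB_SIZE, seed=42)

sample = [3, 14, 15, 92, 65, 15]  # made-up token IDs
alien = alienize(sample, forward)
print(recover(alien, inverse) == sample)
```

Because the permutation is a bijection, repeated original IDs always map to the same alien ID, and the round trip is lossless as long as the client keeps the inverse table.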

Model Table

| Uploaded model | Base model | Description | HF Models |
| --- | --- | --- | --- |
| Gemma 2 9B IT AlienLM Full | Gemma 2 9B IT | Full AlienLM adaptation | dsba-lab/gemma2-9b-it-alienlm-full |
| Gemma 2 9B IT Random Bijection | Gemma 2 9B IT | Random bijection baseline | dsba-lab/gemma2-9b-it-random-bijection |
| Llama 3 8B Instruct AlienLM Full | Llama 3 8B Instruct | Full AlienLM adaptation | dsba-lab/llama3-8b-instruct-alienlm-full |
| Llama 3 8B Instruct AlienLM Ratio 20 | Llama 3 8B Instruct | Partial alienization ratio 20 | dsba-lab/llama3-8b-instruct-alienlm-ratio-20 |
| Llama 3 8B Instruct AlienLM Ratio 40 | Llama 3 8B Instruct | Partial alienization ratio 40 | dsba-lab/llama3-8b-instruct-alienlm-ratio-40 |
| Llama 3 8B Instruct AlienLM Ratio 60 | Llama 3 8B Instruct | Partial alienization ratio 60 | dsba-lab/llama3-8b-instruct-alienlm-ratio-60 |
| Llama 3 8B Instruct AlienLM Ratio 80 | Llama 3 8B Instruct | Partial alienization ratio 80 | dsba-lab/llama3-8b-instruct-alienlm-ratio-80 |
| Llama 3 8B Instruct Random Bijection | Llama 3 8B Instruct | Random bijection baseline | dsba-lab/llama3-8b-instruct-random-bijection |
| Qwen 2.5 14B Instruct AlienLM Full | Qwen2.5 14B Instruct | Full AlienLM adaptation | dsba-lab/qwen25-14b-instruct-alienlm-full |
| Qwen 2.5 14B Instruct Random Bijection | Qwen2.5 14B Instruct | Random bijection baseline | dsba-lab/qwen25-14b-instruct-random-bijection |
| Qwen 2.5 7B Instruct AlienLM Full | Qwen2.5 7B Instruct | Full AlienLM adaptation | dsba-lab/qwen25-7b-instruct-alienlm-full |
| Qwen 2.5 7B Instruct Random Bijection | Qwen2.5 7B Instruct | Random bijection baseline | dsba-lab/qwen25-7b-instruct-random-bijection |

Example

Natural text: All happy families are alike; each unhappy family is unhappy in its own way.
Alien text: Dhaka בגCLS patriot Dude ブラウンanova 교neti estufa 교ಟ FestivalsDocumentation bekanntenroquia

Original token IDs: [2430, 4915, 9160, 708, 28368, 235289, 1853, 42056, 2730, 603, 42056, 575, 1277, 1997, 1703, 235265]
Alien token IDs: [118082, 85241, 174135, 184646, 114599, 58746, 48064, 71689, 147487, 81724, 71689, 163116, 23867, 77693, 75944, 217666]
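The two ID lists in this example can be checked for consistency with a bijection: a repeated original ID (42056 appears twice) should always map to the same alien ID, and no two original IDs should share an alien ID. The sketch below inspects only the 16 pairs shown here. The helper name `build_mapping` is hypothetical, and passing the check on one sentence does not prove that the full vocabulary mapping is a bijection.

```python
original_ids = [2430, 4915, 9160, 708, 28368, 235289, 1853, 42056,
                2730, 603, 42056, 575, 1277, 1997, 1703, 235265]
alien_ids = [118082, 85241, 174135, 184646, 114599, 58746, 48064, 71689,
             147487, 81724, 71689, 163116, 23867, 77693, 75944, 217666]

def build_mapping(src: list[int], dst: list[int]) -> dict[int, int]:
    """Build src -> dst from aligned ID lists, rejecting non-bijective pairs."""
    if len(src) != len(dst):
        raise ValueError("ID lists must have the same length")
    mapping: dict[int, int] = {}
    for s, d in zip(src, dst):
        # setdefault stores d on first sight and returns the stored value after.
        if mapping.setdefault(s, d) != d:
            raise ValueError(f"original ID {s} maps to two alien IDs")
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two original IDs share one alien ID")
    return mapping

mapping = build_mapping(original_ids, alien_ids)

# Invert the observed pairs and recover the original sequence.
inverse = {alien: orig for orig, alien in mapping.items()}
recovered = [inverse[a] for a in alien_ids]
print(len(mapping), recovered == original_ids)
```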

Variant

  • Variant: Random bijection baseline, seed 42
  • Base model: Gemma 2 9B IT
  • Upload source: /data2/AlienLM/outputs/Gemma2-9b-it-random42
  • Tokenizer check: For the test sentence, the local tokenizer produced token IDs that differ from the base tokenizer's. Base tokenizer IDs: [2430, 4915, 9160, 708, 28368, 235289, 1853, 42056, 2730, 603, 42056, 575, 1277, 1997, 1703, 235265]
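
The tokenizer check described in the list above could be reproduced along these lines. This is a sketch under assumptions: it assumes the `transformers` library is installed, that the base tokenizer is available at `google/gemma-2-9b-it` (a repository ID not named in this card, and one that requires accepting a license), and that the tokenizer files in this repository load through `AutoTokenizer`. The helper names `differing_positions` and `compare_tokenizers` are hypothetical.

```python
def differing_positions(base_ids: list[int], local_ids: list[int]) -> list[int]:
    """Indices where the two ID sequences disagree; extra trailing IDs count too."""
    longest = max(len(base_ids), len(local_ids))
    return [
        i for i in range(longest)
        if i >= len(base_ids) or i >= len(local_ids) or base_ids[i] != local_ids[i]
    ]

def compare_tokenizers(
    text: str,
    base_repo: str = "google/gemma-2-9b-it",  # assumed base repository ID
    local_repo: str = "dsba-lab/gemma2-9b-it-random-bijection",
):
    """Encode `text` with both tokenizers and report where the IDs differ."""
    # Imported lazily so the pure helper above works without transformers.
    from transformers import AutoTokenizer

    base = AutoTokenizer.from_pretrained(base_repo)
    local = AutoTokenizer.from_pretrained(local_repo)
    # Skip BOS/EOS so only the sentence's own tokens are compared.
    base_ids = base.encode(text, add_special_tokens=False)
    local_ids = local.encode(text, add_special_tokens=False)
    return base_ids, local_ids, differing_positions(base_ids, local_ids)
```

Calling `compare_tokenizers` downloads both tokenizers, so it needs network access and credentials for the base repository. `differing_positions` can be used on its own with the ID lists printed in this card.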

Notes

  • Served files only: weights, config, tokenizer, and README.
  • Training checkpoints and optimizer artifacts are excluded.
  • Intended for research evaluation; it does not provide production-grade privacy guarantees.

BibTeX

@article{kim2026alienlm,
  title={AlienLM: Alienization of Language for API-Boundary Privacy in Black-Box LLMs},
  author={Kim, Jaehee and Kang, Pilsung},
  journal={arXiv preprint arXiv:2601.22710},
  year={2026}
}