Qwen 2.5 14B Instruct Random Bijection

This repository contains the Qwen 2.5 14B Instruct Random Bijection weights used in the AlienLM experiments. AlienLM is a client-side text obfuscation approach for black-box LLM APIs: it maps natural text into an alienized token space, adapts the model with AAT, and recovers text on the client side.

Model Table

Uploaded model | Base model | Description | HF Models
--- | --- | --- | ---
Gemma 2 9B IT AlienLM Full | Gemma 2 9B IT | Full AlienLM adaptation | dsba-lab/gemma2-9b-it-alienlm-full
Gemma 2 9B IT Random Bijection | Gemma 2 9B IT | Random bijection baseline | dsba-lab/gemma2-9b-it-random-bijection
Llama 3 8B Instruct AlienLM Full | Llama 3 8B Instruct | Full AlienLM adaptation | dsba-lab/llama3-8b-instruct-alienlm-full
Llama 3 8B Instruct AlienLM Ratio 20 | Llama 3 8B Instruct | Partial alienization ratio 20 | dsba-lab/llama3-8b-instruct-alienlm-ratio-20
Llama 3 8B Instruct AlienLM Ratio 40 | Llama 3 8B Instruct | Partial alienization ratio 40 | dsba-lab/llama3-8b-instruct-alienlm-ratio-40
Llama 3 8B Instruct AlienLM Ratio 60 | Llama 3 8B Instruct | Partial alienization ratio 60 | dsba-lab/llama3-8b-instruct-alienlm-ratio-60
Llama 3 8B Instruct AlienLM Ratio 80 | Llama 3 8B Instruct | Partial alienization ratio 80 | dsba-lab/llama3-8b-instruct-alienlm-ratio-80
Llama 3 8B Instruct Random Bijection | Llama 3 8B Instruct | Random bijection baseline | dsba-lab/llama3-8b-instruct-random-bijection
Qwen 2.5 14B Instruct AlienLM Full | Qwen2.5 14B Instruct | Full AlienLM adaptation | dsba-lab/qwen25-14b-instruct-alienlm-full
Qwen 2.5 14B Instruct Random Bijection | Qwen2.5 14B Instruct | Random bijection baseline | dsba-lab/qwen25-14b-instruct-random-bijection
Qwen 2.5 7B Instruct AlienLM Full | Qwen2.5 7B Instruct | Full AlienLM adaptation | dsba-lab/qwen25-7b-instruct-alienlm-full
Qwen 2.5 7B Instruct Random Bijection | Qwen2.5 7B Instruct | Random bijection baseline | dsba-lab/qwen25-7b-instruct-random-bijection

Example

Natural text: All happy families are alike; each unhappy family is unhappy in its own way.
Alien text: (Input理論AccessorType checkpointsingendart-song hg bourbonraft hgท实景]\ дем⋌
Original token IDs: [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13]
Alien token IDs: [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]
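
The two ID lists in this example have the same length, and the repeated original ID 42151 ("unhappy") maps to the same alien ID 27147 both times, which is what a token-level bijection would produce. The sketch below checks that property for this one sentence only; it says nothing about the rest of the vocabulary. The helper name `build_partial_mapping` is hypothetical and is not part of the AlienLM code.

```python
# Token IDs copied from the example in this model card.
original_ids = [2403, 6247, 8521, 525, 25992, 26, 1817, 42151,
                2997, 374, 42151, 304, 1181, 1828, 1616, 13]
alien_ids = [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147,
             15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]


def build_partial_mapping(src, dst):
    """Collect src -> dst pairs; raise if one source ID maps to two targets.

    Hypothetical helper for illustration only.
    """
    if len(src) != len(dst):
        raise ValueError("sequences must have the same length")
    mapping = {}
    for s, d in zip(src, dst):
        if mapping.setdefault(s, d) != d:
            raise ValueError(f"ID {s} maps to both {mapping[s]} and {d}")
    return mapping


# Forward direction: each original ID has exactly one alien ID.
forward = build_partial_mapping(original_ids, alien_ids)
# Backward direction: each alien ID has exactly one original ID.
backward = build_partial_mapping(alien_ids, original_ids)

# Client-side recovery for this sentence: invert the alien IDs.
recovered = [backward[a] for a in alien_ids]
print(recovered == original_ids)
```

Building the mapping in both directions is what distinguishes a bijection from a merely consistent function: the forward pass rules out one ID mapping to two targets, and the backward pass rules out two IDs colliding on one target.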

Variant

  • Variant: Random bijection baseline, seed 42
  • Base model: Qwen2.5 14B Instruct
  • Upload source: /data2/AlienLM/outputs/Qwen25-14b-Instruct-random-42
  • Tokenizer check: For the test sentence, the local tokenizer produced different token IDs from the base tokenizer. Base tokenizer IDs: [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13]
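
As a rough illustration of what a seeded random bijection over token IDs can look like, the sketch below permutes a toy vocabulary with Python's standard `random` module and builds the inverse table needed for client-side recovery. Everything here is an assumption for illustration: the function names, the toy vocabulary size of 1000, and the `fixed_ids` parameter (IDs left unmapped, such as special tokens) are hypothetical, and this is not the procedure used to build this repository's mapping, so it will not reproduce the alien IDs shown in the example.

```python
import random


def make_random_bijection(vocab_size, seed, fixed_ids=()):
    """Return (forward, inverse) lookup lists for a seeded random permutation.

    IDs listed in fixed_ids map to themselves (hypothetical parameter).
    """
    fixed = set(fixed_ids)
    movable = [i for i in range(vocab_size) if i not in fixed]
    shuffled = movable[:]
    # A dedicated Random instance keeps the permutation reproducible per seed.
    random.Random(seed).shuffle(shuffled)

    forward = list(range(vocab_size))
    for src, dst in zip(movable, shuffled):
        forward[src] = dst

    # Invert the permutation so alien IDs can be mapped back.
    inverse = [0] * vocab_size
    for src, dst in enumerate(forward):
        inverse[dst] = src
    return forward, inverse


def alienize(ids, forward):
    """Map natural token IDs to alien token IDs."""
    return [forward[i] for i in ids]


def recover(ids, inverse):
    """Map alien token IDs back to natural token IDs."""
    return [inverse[i] for i in ids]


# Toy usage: vocabulary of 1000 IDs, seed 42, IDs 0 and 1 left unchanged.
forward, inverse = make_random_bijection(vocab_size=1000, seed=42, fixed_ids=(0, 1))
sample = [5, 17, 17, 999, 0]
alien = alienize(sample, forward)
roundtrip = recover(alien, inverse)
```

Because the mapping is a permutation, repeated input IDs always produce repeated alien IDs, and applying `recover` after `alienize` returns the original sequence.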

Notes

  • Served files only: weights, config, tokenizer, and README.
  • Training checkpoints and optimizer artifacts are excluded.
  • Intended for research evaluation, not production privacy guarantees.

BibTeX

@article{kim2026alienlm,
  title={AlienLM: Alienization of Language for API-Boundary Privacy in Black-Box LLMs},
  author={Kim, Jaehee and Kang, Pilsung},
  journal={arXiv preprint arXiv:2601.22710},
  year={2026}
}