File size: 1,567 Bytes
bde0882
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# PassGPT

PassGPT is a causal language model trained on password leaks. It was first introduced in [this paper](https://arxiv.org/abs/2306.01545). This version of the model was trained on passwords from the RockYou leak, which were at most 10 characters long.

### Usage and License Notices
[![License](https://img.shields.io/badge/License-CC%20By%20NC%204.0-yellow)](https://github.com/javirandor/passbert/blob/main/LICENSE)
PassGPT is intended and licensed for research use only. The model and code are CC BY NC 4.0 (allowing only non-commercial use) and should not be used outside of research purposes. This material should never be used to attack real systems.

### Model description

The model inherits the [GPT2LMHeadModel](https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.GPT2LMHeadModel) architecture and implements a custom [BertTokenizer](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertTokenizer) that encodes each character in a password as a single token, avoiding merges. It was trained from a random initialization and the code for training can be found in the [official repository](https://github.com/javirandor/passgpt/).

### Password Generation

Passwords can be sampled from the model using the [built-in generation methods](https://huggingface.co/docs/transformers/v4.30.0/en/main_classes/text_generation#transformers.GenerationMixin.generate) provided by HuggingFace and using the "start of password token" as seed (i.e. `<s>`). This code can be used to generate one password with PassGPT:

```
```