---
license: mit
language:
- en
---

This is a very basic PyTorch transformer model that sorts lists of numbers. It was trained with nanoGPT.

The context window is 256 tokens. The prompt and the sorted output share that window, so the input list can be up to 127 tokens long. Numbers can range from 0 to 99 and are separated by comma tokens.
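
As a rough illustration of the token budget above, the sketch below builds a prompt in the same format as the `--start` argument of the sample command and counts the tokens in the list. It assumes each number and each comma is a single token, which this card implies but does not state outright. `make_prompt` and `list_token_count` are hypothetical helper names, not part of nanoGPT.

```python
MAX_LIST_TOKENS = 127  # input-list limit stated above


def list_token_count(numbers):
    """Tokens in the list body, ASSUMING one token per number and per comma."""
    return max(2 * len(numbers) - 1, 0)


def make_prompt(numbers):
    """Format a list as a prompt string such as "(5,4,3,2,1): [" (hypothetical helper)."""
    if any(not 0 <= n <= 99 for n in numbers):
        raise ValueError("numbers must be in the range 0 to 99")
    if list_token_count(numbers) > MAX_LIST_TOKENS:
        raise ValueError("list is longer than 127 tokens")
    return "(" + ",".join(str(n) for n in numbers) + "): ["


print(make_prompt([5, 4, 3, 2, 1]))  # (5,4,3,2,1): [
```

Under that assumption, 127 tokens works out to 64 numbers and 63 commas.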

It was trained for about one day on a laptop with a single NVIDIA RTX 2070 eGPU, so don't expect anything amazing.
In practice it sorts these lists correctly about 90% of the time, which is good enough to satisfy my curiosity.
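
One way to estimate a success rate like the one above is to compare outputs against Python's built-in `sorted` on random lists. This is only a sketch: `sort_with_model` is a hypothetical callable standing in for whatever wraps the model, the default `max_len=64` assumes one token per number, and this is not necessarily how the 90% figure was measured.

```python
import random


def estimate_accuracy(sort_with_model, trials=1000, max_len=64, seed=0):
    """Fraction of random lists that sort_with_model returns fully sorted."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        length = rng.randint(1, max_len)
        numbers = [rng.randint(0, 99) for _ in range(length)]
        # A list only counts as correct if every element is in the right place.
        if sort_with_model(numbers) == sorted(numbers):
            correct += 1
    return correct / trials


# Stand-in for the model so the sketch runs on its own.
print(estimate_accuracy(sorted, trials=100))  # 1.0
```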

To set up, I recommend cloning nanoGPT (https://github.com/karpathy/nanoGPT) and installing its prerequisites.
Create a new branch and copy these files into the nanoGPT folder, overwriting the included sample.py and train.py.

To run:

> python sample.py --out_dir=out-sort-lists --start="(5,4,3,2,1): [" --num_samples=1 --temperature=0.0001 --max_new_tokens=127
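
The very low `--temperature` in the command above makes sampling close to greedy decoding, which suits a task with exactly one right answer. The sketch below shows the effect on a toy set of logits using the standard temperature-scaled softmax. The logit values are made up for illustration, and `softmax_with_temperature` is a hypothetical helper, not a nanoGPT function.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.5, 0.5]  # made-up values, not taken from the model
print(softmax_with_temperature(logits, 1.0))     # probability spread across all three
print(softmax_with_temperature(logits, 0.0001))  # nearly all mass on the largest logit
```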

To train:

> python train.py config/train_sort.py