sander-wood
/

bgpt

7 datasets

Model card Files Files and versions Community

sander-wood commited on Feb 29

Commit

c9ae4da

•

1 Parent(s): 50c7f7c

Update README.md

Browse files

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -83,7 +83,7 @@ The `config.py` file contains critical settings for training and inference, allo
 Generative modelling with bGPT is a flexible and powerful approach to learning and generating new data across various formats. bGPT segments byte sequences into patches, predicts next patch features with a patch-level decoder, and reconstructs bytes within patches using these features with a byte-level decoder. Here's how to get started:
 1. **Prepare Your Data**: Since bGPT models information at the byte level, it can work with any type of file that exists on a computer, regardless of format. This means your training and evaluation datasets can include text, images, audio, or any other file type.
-2.
 3. **Adjust Configuration Settings**: Modify the `config.py` file to tailor the training process to your needs. At a minimum, you should update the `TRAIN_FOLDERS` and `EVAL_FOLDERS` to point to your actual data directories. Also, specify where to save the trained model weights and logs by setting `WEIGHTS_PATH` and `LOGS_PATH`. You may adjust other parameters based on your specific requirements. For instance, with the default `PATCH_SIZE=16` and `PATCH_LENGTH=512`, bGPT can model byte sequences up to 8KB. If your training files are larger, and you have sufficient computational resources, consider increasing these parameters to accommodate the larger file sizes.
 4. **Leverage Pre-trained Weights (Optional)**: If you wish to fine-tune a pre-trained bGPT model, set `PRE_WEIGHTS_PATH` to the location of the pre-trained weights and ensure `LOAD_FROM_PRE_CHECKPOINT=True`. To train a model from scratch, simply set `LOAD_FROM_PRE_CHECKPOINT=False`.

 Generative modelling with bGPT is a flexible and powerful approach to learning and generating new data across various formats. bGPT segments byte sequences into patches, predicts next patch features with a patch-level decoder, and reconstructs bytes within patches using these features with a byte-level decoder. Here's how to get started:
 1. **Prepare Your Data**: Since bGPT models information at the byte level, it can work with any type of file that exists on a computer, regardless of format. This means your training and evaluation datasets can include text, images, audio, or any other file type.
 3. **Adjust Configuration Settings**: Modify the `config.py` file to tailor the training process to your needs. At a minimum, you should update the `TRAIN_FOLDERS` and `EVAL_FOLDERS` to point to your actual data directories. Also, specify where to save the trained model weights and logs by setting `WEIGHTS_PATH` and `LOGS_PATH`. You may adjust other parameters based on your specific requirements. For instance, with the default `PATCH_SIZE=16` and `PATCH_LENGTH=512`, bGPT can model byte sequences up to 8KB. If your training files are larger, and you have sufficient computational resources, consider increasing these parameters to accommodate the larger file sizes.
 4. **Leverage Pre-trained Weights (Optional)**: If you wish to fine-tune a pre-trained bGPT model, set `PRE_WEIGHTS_PATH` to the location of the pre-trained weights and ensure `LOAD_FROM_PRE_CHECKPOINT=True`. To train a model from scratch, simply set `LOAD_FROM_PRE_CHECKPOINT=False`.