# Instructions for downloading data from the sft-data repository
First, log in to Hugging Face so that you can access the data:
```py
from huggingface_hub import login
login()
```
Then, you can download a single zip file that contains all of the SFT data folders:
```py
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
```
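`hf_hub_download` returns the local path of the cached file, so the archive still has to be unpacked before use. The following is a minimal sketch of that step using the standard `zipfile` module; the helper names `extract_archive` and `download_and_extract` and the default `data` destination directory are illustrative assumptions, not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract a zip archive into dest_dir and return dest_dir as a Path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


def download_and_extract(dest_dir="data"):
    """Download sft-data.zip and unpack it (needs network access and a prior login)."""
    # Imported here so that extract_archive works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    zip_path = hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
    return extract_archive(zip_path, dest_dir)
```

Calling `download_and_extract("data")` would then leave the unpacked files under `data/`.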
The `sft-data.zip` archive has the following structure:
```
sft-data
├── README.md                     # This README file.
├── alf                           # Folder for ALFWORLD.
│   ├── alfworld.json             # The JSON file for ALFWORLD.
│   └── alf_data_folder           # Folder for the ALFWORLD environment.
│       ├── alf_image_id_0        # Folder 0 for ALFWORLD image data.
│       ├── alf_image_id_1        # Folder 1 for ALFWORLD image data.
│       ├── alf_image_id_2        # Folder 2 for ALFWORLD image data.
│       ├── alf_image_id_3        # Folder 3 for ALFWORLD image data.
│       └── alf_image_id_4        # Folder 4 for ALFWORLD image data.
├── blackjack                     # Folder for blackjack environment in the `gym_cards`.
│   ├── blackjack_data_folder     # Folder for blackjack image data.
│   └── blackjack.json            # The JSON file for blackjack.
├── ezpoints                      # Folder for ezpoints environment in the `gym_cards`.
│   ├── ezpoints_data_folder      # Folder for ezpoints image data.
│   └── ezpoints.json             # The JSON file for ezpoints.
├── points24                      # Folder for points24 environment in the `gym_cards`.
│   ├── points24_data_folder      # Folder for points24 image data.
│   └── points24.json             # The JSON file for points24.
└── numberline                    # Folder for numberline environment in the `gym_cards`.
    ├── numberline_data_folder    # Folder for numberline image data.
    └── numberline.json           # The JSON file for numberline.
```
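Once the archive is unpacked, the layout above can be turned into paths programmatically. The sketch below maps each environment folder to its JSON file and image data folder, and reports anything missing. It assumes the archive unpacks to a top-level `sft-data` directory exactly as drawn; `LAYOUT`, `env_paths`, and `missing_entries` are hypothetical helper names.

```python
from pathlib import Path

# Environment folder -> (JSON file name, image data folder name),
# copied from the directory tree in this README.
LAYOUT = {
    "alf": ("alfworld.json", "alf_data_folder"),
    "blackjack": ("blackjack.json", "blackjack_data_folder"),
    "ezpoints": ("ezpoints.json", "ezpoints_data_folder"),
    "points24": ("points24.json", "points24_data_folder"),
    "numberline": ("numberline.json", "numberline_data_folder"),
}


def env_paths(root, env):
    """Return (json_path, image_dir) for one environment under the extracted root."""
    json_name, image_dir = LAYOUT[env]
    base = Path(root) / env
    return base / json_name, base / image_dir


def missing_entries(root):
    """List every expected JSON file or image folder that is absent under root."""
    missing = []
    for env in LAYOUT:
        json_path, image_dir = env_paths(root, env)
        if not json_path.is_file():
            missing.append(json_path)
        if not image_dir.is_dir():
            missing.append(image_dir)
    return missing
```

For example, `missing_entries("data/sft-data")` returns an empty list when the extracted folder matches the tree.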
Alternatively, you can download the files for any one of the five environments. For example, the following code downloads the blackjack data:
```py
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.zip")  # zip file of the image data folder
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.json") # JSON file
```
For ALFWORLD, note that the zip file for the image data folder is named `alf_data_folder.zip`.
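The per-environment naming can be wrapped in a small helper. This is only a sketch under stated assumptions: the README confirms `blackjack.zip`, `blackjack.json`, and `alf_data_folder.zip`, while the `<env>.zip` / `<env>.json` pattern for the other three environments and the `alfworld.json` name for ALFWORLD (taken from the archive tree) are assumed. `repo_filenames` and `download_env` are hypothetical names.

```python
ENVIRONMENTS = ("alf", "blackjack", "ezpoints", "points24", "numberline")


def repo_filenames(env):
    """Return (zip_name, json_name) for one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    if env == "alf":
        # Zip name is stated in this README; the JSON name is assumed from the archive tree.
        return "alf_data_folder.zip", "alfworld.json"
    # Confirmed for blackjack only; assumed to hold for the other environments.
    return f"{env}.zip", f"{env}.json"


def download_env(env):
    """Download both files for one environment (needs network access and a prior login)."""
    # Imported here so that repo_filenames works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id="LEVI-Project/sft-data", filename=name)
        for name in repo_filenames(env)
    ]
```

If a download fails with a missing-file error, check the repository file listing for the actual file names before relying on the assumed pattern.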