Instructions for downloading data from the sft-data repository.
First, log in so that you can access the Hugging Face data:
from huggingface_hub import login
login()
Then you can download a single zip file containing all of the SFT data folders:
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
Note that the sft-data.zip file above has the following structure:
sft-data
├── README.md                        # This README file.
├── alf                              # Folder for ALFWORLD.
│   ├── alfworld.json                # The JSON file for ALFWORLD.
│   └── alf_data_folder              # Folder for the ALFWORLD environment.
│       ├── alf_image_id_0           # Folder 0 for ALFWORLD image data.
│       ├── alf_image_id_1           # Folder 1 for ALFWORLD image data.
│       ├── alf_image_id_2           # Folder 2 for ALFWORLD image data.
│       ├── alf_image_id_3           # Folder 3 for ALFWORLD image data.
│       └── alf_image_id_4           # Folder 4 for ALFWORLD image data.
├── blackjack                        # Folder for blackjack environment in the `gym_cards`.
│   ├── blackjack_data_folder        # Folder for blackjack image data.
│   └── blackjack.json               # The JSON file for blackjack.
├── ezpoints                         # Folder for ezpoints environment in the `gym_cards`.
│   ├── ezpoints_data_folder         # Folder for ezpoints image data.
│   └── ezpoints.json                # The JSON file for ezpoints.
├── points24                         # Folder for points24 environment in the `gym_cards`.
│   ├── points24_data_folder         # Folder for points24 image data.
│   └── points24.json                # The JSON file for points24.
└── numberline                       # Folder for numberline environment in the `gym_cards`.
    ├── numberline_data_folder       # Folder for numberline image data.
    └── numberline.json              # The JSON file for numberline.
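After downloading, the archive still has to be unpacked. The following is a minimal sketch using only the Python standard library. The helper name `extract_archive` and the destination directory are illustrative assumptions and are not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the sorted top-level entry names.

    Hypothetical helper for illustration; not shipped with the sft-data repository.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Collect the first path component of every archive member.
        top_level = sorted({name.split("/")[0] for name in archive.namelist() if name})
    return top_level
```

Since hf_hub_download returns the local path of the downloaded file, that return value can be passed directly as `zip_path`, for example `extract_archive(path, "data")`.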
Alternatively, you can download the files for any one of the five environments. For example, use the following code to download the blackjack data:
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.zip") # zip folder for image data folder
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.json") # JSON file
For ALFWORLD, note that the zip file for the image data folder is alf_data_folder.zip.
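The per-environment naming can be captured in a small helper. This is a sketch under several assumptions: the names `files_for_env` and `download_env` are hypothetical, the `<env>.zip` / `<env>.json` pattern is extrapolated from the blackjack example to ezpoints, points24, and numberline, the ALFWORLD JSON file is assumed to be `alfworld.json` as in the folder structure above, and `repo_type="dataset"` assumes the repository is hosted as a dataset (pass `repo_type=None` to match the calls above exactly).

```python
REPO_ID = "LEVI-Project/sft-data"

# The five environments listed in the folder structure.
ENVIRONMENTS = ("alf", "blackjack", "ezpoints", "points24", "numberline")


def files_for_env(env):
    """Return (zip_filename, json_filename) for one environment.

    Hypothetical helper; filenames other than blackjack's and
    alf_data_folder.zip are assumed by analogy, not confirmed.
    """
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    if env == "alf":
        # ALFWORLD uses a differently named image archive.
        return "alf_data_folder.zip", "alfworld.json"
    return f"{env}.zip", f"{env}.json"


def download_env(env, repo_type="dataset"):
    """Download both files for env and return their local paths."""
    # Imported lazily so files_for_env works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, repo_type=repo_type)
        for name in files_for_env(env)
    ]
```

For instance, `download_env("blackjack")` would request blackjack.zip and blackjack.json, while `download_env("alf")` would request alf_data_folder.zip and alfworld.json.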