# Instructions for downloading data from the sft-data repository

First, log in to Hugging Face so that you can access the data:

```py
from huggingface_hub import login
login()
```

Then you can download a single zip file that contains all of the SFT data folders:

```py
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
```
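
`hf_hub_download` returns the local path of the downloaded file, so the archive still has to be unpacked before use. A minimal sketch of that step follows; the helper name `extract_zip` and the destination folder `data` are illustrative choices, not part of the repository.

```py
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract a zip archive into dest_dir and return its top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # The first path component of each member is a top-level file or folder.
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return top_level


# Usage (requires network access and a prior login()):
# from huggingface_hub import hf_hub_download
# zip_path = hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
# extract_zip(zip_path, "data")  # "data" is a hypothetical destination folder
```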

Notice that the `sft-data.zip` file above has the following structure:

```
sft-data
├── README.md            # This README file.
├── alf                  # Folder for ALFWORLD.
│   ├── alfworld.json    # The JSON file for ALFWORLD.
│   └── alf_data_folder  # Folder for the ALFWORLD environment.
│       ├── alf_image_id_0  # Folder 0 for ALFWORLD image data.
│       ├── alf_image_id_1  # Folder 1 for ALFWORLD image data.
│       ├── alf_image_id_2  # Folder 2 for ALFWORLD image data.
│       ├── alf_image_id_3  # Folder 3 for ALFWORLD image data.
│       └── alf_image_id_4  # Folder 4 for ALFWORLD image data.
├── blackjack            # Folder for the blackjack environment in `gym_cards`.
│   ├── blackjack_data_folder  # Folder for blackjack image data.
│   └── blackjack.json         # The JSON file for blackjack.
├── ezpoints             # Folder for the ezpoints environment in `gym_cards`.
│   ├── ezpoints_data_folder  # Folder for ezpoints image data.
│   └── ezpoints.json         # The JSON file for ezpoints.
├── points24             # Folder for the points24 environment in `gym_cards`.
│   ├── points24_data_folder  # Folder for points24 image data.
│   └── points24.json         # The JSON file for points24.
└── numberline           # Folder for the numberline environment in `gym_cards`.
    ├── numberline_data_folder  # Folder for numberline image data.
    └── numberline.json         # The JSON file for numberline.
```
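
After unpacking, it can be useful to confirm that the extracted folder matches this layout. The sketch below is one way to do that, assuming the archive was extracted so that `root` points at the `sft-data` folder; the names `EXPECTED_LAYOUT` and `missing_entries` are hypothetical helpers introduced here.

```py
from pathlib import Path

# Environment folder -> (JSON file, image data folder), copied from the layout above.
EXPECTED_LAYOUT = {
    "alf": ("alfworld.json", "alf_data_folder"),
    "blackjack": ("blackjack.json", "blackjack_data_folder"),
    "ezpoints": ("ezpoints.json", "ezpoints_data_folder"),
    "points24": ("points24.json", "points24_data_folder"),
    "numberline": ("numberline.json", "numberline_data_folder"),
}


def missing_entries(root):
    """Return the expected paths (relative to root) that do not exist."""
    root = Path(root)
    missing = []
    for env, (json_name, data_folder) in EXPECTED_LAYOUT.items():
        if not (root / env / json_name).is_file():
            missing.append(f"{env}/{json_name}")
        if not (root / env / data_folder).is_dir():
            missing.append(f"{env}/{data_folder}")
    return missing
```

An empty list means every JSON file and image data folder named in the layout was found.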


Alternatively, you can download the files for any one of the five environments. For example, the following code downloads the blackjack data:

```py
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.zip")  # zip archive of the image data folder
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.json")  # JSON file
```

For ALFWORLD, note that the zip file for the image data folder is named `alf_data_folder.zip`.
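
The per-environment file names can be wrapped in a small helper. This is only a sketch under two assumptions that the text above does not confirm: that `ezpoints`, `points24`, and `numberline` follow the same `<name>.zip` / `<name>.json` pattern as blackjack, and that the ALFWORLD JSON file is named `alfworld.json` as in the archive layout. The function names are hypothetical.

```py
ENVIRONMENTS = ("alf", "blackjack", "ezpoints", "points24", "numberline")


def environment_files(env):
    """Return (zip_filename, json_filename) for one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    if env == "alf":
        # The zip name comes from the note above; the JSON name is an
        # assumption based on the archive layout.
        return "alf_data_folder.zip", "alfworld.json"
    # Assumed to follow the blackjack naming pattern.
    return f"{env}.zip", f"{env}.json"


def download_environment(env, repo_id="LEVI-Project/sft-data"):
    """Download both files for one environment and return their local paths."""
    # Imported lazily so environment_files() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [hf_hub_download(repo_id=repo_id, filename=name)
            for name in environment_files(env)]
```

For example, `download_environment("blackjack")` issues the same two `hf_hub_download` calls shown above.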