
	# Install dependencies
	```bash
	pip install -r requirement.txt
	```

	# Download the data set
	```bash
	export WORKDIR_ROOT=<a directory which will hold all working files>

	```
	The downloaded data will be stored in $WORKDIR_ROOT/ML50.

	# Preprocess the data
	Install SentencePiece (SPM) from [here](https://github.com/google/sentencepiece).
	```bash
	export WORKDIR_ROOT=<a directory which will hold all working files>
	export SPM_PATH=<a path pointing to sentencepiece spm_encode.py>
	```
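
	Before running any preprocessing, it can help to confirm that both variables are set and point at real paths. The following is a minimal sketch and not part of the original scripts: check_env is a hypothetical helper name, and it assumes SPM_PATH should be an existing file.

	```bash
	# Illustrative helper (hypothetical, not from the original scripts):
	# returns non-zero if WORKDIR_ROOT or SPM_PATH is unset or does not exist.
	check_env() {
	    if [ -z "${WORKDIR_ROOT:-}" ] || [ ! -d "$WORKDIR_ROOT" ]; then
	        echo "WORKDIR_ROOT must be set to an existing directory" >&2
	        return 1
	    fi
	    if [ -z "${SPM_PATH:-}" ] || [ ! -f "$SPM_PATH" ]; then
	        echo "SPM_PATH must be set to an existing file" >&2
	        return 1
	    fi
	}
	```

	Calling check_env || exit 1 at the top of a script stops it early with a readable message instead of failing later on a missing path.
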
	The data is organized into the following directories:
	* $WORKDIR_ROOT/ML50/raw: extracted raw data
	* $WORKDIR_ROOT/ML50/dedup: deduplicated data
	* $WORKDIR_ROOT/ML50/clean: deduplicated data with the validation and test sentences removed
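
	The dedup and clean steps can be illustrated on a toy file. This is a minimal sketch using standard awk and grep, not the code the ML50 scripts actually run: dedup_lines, remove_heldout, and the demo file names are hypothetical, and it assumes exact whole-line matching.

	```bash
	# Hypothetical helpers for illustration; the real scripts may differ.

	# Keep the first occurrence of each line, preserving the original order.
	dedup_lines() {
	    awk '!seen[$0]++' "$1"
	}

	# Print the lines of $1 that do not exactly match any line in $2
	# (the held-out validation/test sentences).
	remove_heldout() {
	    # grep exits with 1 when nothing is left to print; treat that as success.
	    grep -vxFf "$2" "$1" || [ "$?" -eq 1 ]
	}

	# Toy data: one duplicate line and one line that also appears in the held-out set.
	demo_dir=$(mktemp -d)
	printf 'hello\nworld\nhello\ntest sentence\n' > "$demo_dir/raw.txt"
	printf 'test sentence\n' > "$demo_dir/heldout.txt"

	dedup_lines "$demo_dir/raw.txt" > "$demo_dir/dedup.txt"
	remove_heldout "$demo_dir/dedup.txt" "$demo_dir/heldout.txt" > "$demo_dir/clean.txt"
	cat "$demo_dir/clean.txt"
	```

	On the toy input, dedup.txt keeps hello, world, and test sentence once each, and clean.txt keeps only hello and world.
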