multiple description candidates to facilitate DDR and HDR training

by Arsenever - opened Oct 11, 2024

Oct 11, 2024

Thanks for your good work!

When will the repo release the multiple description candidates that facilitate DDR and HDR training?
In some JSON files I find 5 long captions that seem to differ only slightly. Which one should I use as the GT?

jpWang

Alibaba-PAI org Oct 12, 2024

Hi, thanks for your attention to our work.

As stated in the paper, we train the HDR and DDR tasks by randomly synthesizing data samples online, so there is no fixed dataset to release.
These sentences are different expressions of the same semantic meaning, produced by the Desc. Rewrite step mentioned in the paper. You can randomly choose one of them each time during training.
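
A minimal sketch of that random choice, for illustration only and not the code used in the paper. It assumes each JSON record stores its caption candidates in a list under a key named `captions`; that key, the `pick_caption` helper, and the sample record are all hypothetical.

```python
import random


def pick_caption(record, rng=random):
    """Return one caption drawn uniformly at random from a record.

    `record["captions"]` is an assumed layout, not the released JSON schema.
    """
    captions = record["captions"]
    if not captions:
        raise ValueError("record has no caption candidates")
    return rng.choice(captions)


# Hypothetical record with five rewrites of the same description.
record = {
    "image": "example.jpg",
    "captions": [
        "rewrite 1 of the description",
        "rewrite 2 of the description",
        "rewrite 3 of the description",
        "rewrite 4 of the description",
        "rewrite 5 of the description",
    ],
}

# A seeded generator keeps the draws reproducible across runs.
rng = random.Random(0)
for step in range(3):
    # Each time the sample is visited, a fresh caption is drawn as the GT.
    print(step, pick_caption(record, rng))
```

Calling `pick_caption` every time a sample is loaded, rather than once when the dataset is built, means the same image can be paired with a different rewrite on each visit.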

Oct 12, 2024

Thanks for your reply! Could you tell me how long the training takes and how many A100s it requires?

jpWang

Alibaba-PAI org Oct 12, 2024

We use 16 A100-80G GPUs and it takes about 10 hours for pre-training.

jpWang changed discussion status to closed Oct 12, 2024
