We then show some qualitative examples of images found by the model. **All the code we have written** to run our validation experiments (in combination with
code made available by Nils Reimers and by the authors of the original CLIP) is available.

## Training Details

### Dataset Splits

We tried different combinations of split sizes for training and validation. Eventually, we settled on a 95% training split, with the remaining 5% of the data
going into validation: each dataset is split into training and validation data, and the per-dataset files are then concatenated.
Note that the 5% amounts to 70K validation samples, making this set almost as large as the whole MSCOCO dataset.
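
To make the procedure concrete, here is a minimal Python sketch of the split-then-concatenate step (the file names, the one-record-per-line format, and the `split_records` helper are hypothetical; the actual preprocessing scripts are in the repository):

```python
import random

def split_records(records, train_frac=0.95, seed=42):
    """Shuffle one dataset's image-caption records and split them 95/5."""
    rng = random.Random(seed)
    records = list(records)
    rng.shuffle(records)
    cut = int(len(records) * train_frac)
    return records[:cut], records[cut:]

# Hypothetical per-dataset files: each dataset is split on its own,
# and the per-dataset splits are then concatenated.
train, valid = [], []
for path in ["wit_it.tsv", "mscoco_it.tsv", "conceptual_captions_it.tsv"]:
    with open(path, encoding="utf-8") as f:
        tr, va = split_records(f)
    train += tr
    valid += va
```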

### Hyper-parameters

The hyper-parameters can be found in the [repository](https://github.com/clip-italian/clip-italian/tree/master/hybrid_clip).
We use a maximum sequence length of 95 tokens. To choose this value, we looked at the distribution of caption lengths in the various
datasets and found that 95 tokens is an excellent compromise between training speed and data coverage.
We use a batch size of 128 and a learning rate of 0.00001.
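
A rough sketch of how such a cut-off can be checked is shown below. The tokenizer checkpoint is our assumption of an Italian BERT tokenizer, and `captions` is a placeholder for the real concatenated caption list:

```python
import numpy as np
from transformers import AutoTokenizer

# Assumed Italian BERT tokenizer; swap in the one actually used by the model.
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-italian-xxl-cased")
captions = ["Una foto di un gatto sopra un tavolo di legno."]  # placeholder

lengths = np.array([len(tokenizer.encode(c)) for c in captions])
for q in (50, 90, 95, 99):
    print(f"p{q}: {np.percentile(lengths, q):.0f} tokens")
print(f"captions fully covered at 95 tokens: {(lengths <= 95).mean():.1%}")
```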

### Training

We usually train until we see the validation loss going up, and we then pick the model with the best validation loss. We adjusted the number of training epochs
as the project progressed: at first we ran 100 epochs, but after we replaced the optimizer we were able to reduce this number.
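
A minimal sketch of this stopping rule, where `train_epoch`, `evaluate`, and `save_checkpoint` are hypothetical stand-ins for the actual training utilities:

```python
import math

def train_with_early_stopping(train_epoch, evaluate, save_checkpoint,
                              max_epochs=100, patience=3):
    """Train until validation loss stops improving; keep the best model.

    The three callbacks are hypothetical stand-ins for the real
    training, evaluation, and checkpointing utilities.
    """
    best_val, bad_epochs = math.inf, 0
    for epoch in range(max_epochs):
        train_epoch(epoch)
        val_loss = evaluate()
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            save_checkpoint(epoch)  # remember the best model seen so far
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # validation loss keeps going up: stop here
    return best_val
```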

## Quantitative Evaluation

Showing great images is definitely cool and interesting, but a model is nothing without validation.
Since this is the first CLIP-based model for Italian, we decided to use the multilingual CLIP model as a comparison baseline.