imirandam committed
Commit f6f52fa · verified · 1 Parent(s): 321a3c1

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -4,20 +4,20 @@ datasets:
   - imirandam/TROHN-Img
  ---
 
- # Model Card for CLIP_Detectos
+ # Model Card for CLIP_Detector
  ## Model Description
  - **Homepage:** https://imirandam.github.io/BiVLC_project_page/
  - **Repository:** https://github.com/IMirandaM/BiVLC
  - **Paper:**
  - **Point of Contact:** [Imanol Miranda](mailto:imanol.miranda@ehu.eus)
  ### Model Summary
- CLIP_Detector is a model presented in the [BiVLC](https://github.com/IMirandaM/BiVLC) paper for experimentation. It has been trained with the OpenCLIP framework using the CLIP ViT-B-32 model pre-trained by 'openai' as a basis. The encoders are kept frozen, and a sigmoid neuron is added on top of each encoder (more details in the paper). The objective of the model is to classify text and images as natural or synthetic. Hyperparameters:
+ CLIP_Detector is a model presented in the [BiVLC](https://github.com/IMirandaM/BiVLC) paper for experimentation. It has been trained with the OpenCLIP framework using the CLIP ViT-B-32 model pre-trained by 'openai' as a basis. For binary classification, the encoders are kept frozen. A sigmoid neuron is added over the CLS embedding for the image encoder and over the EOT embedding for the text encoder (more details in the paper). The objective of the model is to classify text and images as natural or synthetic. Hyperparameters:
 
  * Learning rate: 1e-6.
  * Optimizer: Adam optimizer with beta1 = 0.9, beta2 = 0.999, eps = 1e-08 and without weight decay.
  * Loss function: Binary cross-entropy loss (BCELoss).
  * Batch size: We use a batch size of 400.
- * Epochs: We trained the text detector over 10 epochs and the image detectors over 1 epoch. We used validation accuracy as the model selection criterion, i.e. we selected the model with highest accuracy in the corresponding validation set.
+ * Epochs: We trained the text detector over 10 epochs and the image detector over 1 epoch. We used validation accuracy as the model selection criterion, i.e., we selected the model with the highest accuracy in the corresponding validation set.
  * Data: The sigmoid neuron is trained with the [TROHN-Img](https://huggingface.co/datasets/imirandam/TROHN-Img) dataset.
 
  ### Licensing Information
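
For a concrete picture of the setup described in the updated summary, here is a minimal sketch using open_clip and torch. It is a reconstruction under stated assumptions, not the authors' code: the class name `CLIPDetector` is invented, the two detectors are folded into one module for brevity (the card trains the text and image detectors separately, for 10 and 1 epochs respectively), and mapping the synthetic class to label 1 is an assumption.

```python
# Minimal sketch, assuming the open_clip and torch packages.
# NOT the BiVLC authors' code; names and label convention are illustrative.
import torch
import torch.nn as nn
import open_clip

class CLIPDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # CLIP ViT-B-32 with the 'openai' pre-trained weights, as in the card.
        self.clip, _, self.preprocess = open_clip.create_model_and_transforms(
            "ViT-B-32", pretrained="openai"
        )
        # Encoders are kept frozen; only the added sigmoid neurons are trained.
        for p in self.clip.parameters():
            p.requires_grad = False
        dim = self.clip.visual.output_dim  # 512 for ViT-B-32
        # One sigmoid neuron per modality: natural vs. synthetic (label 1 here).
        self.image_head = nn.Linear(dim, 1)
        self.text_head = nn.Linear(dim, 1)

    def forward_image(self, images):
        feats = self.clip.encode_image(images)  # CLS-based image embedding
        return torch.sigmoid(self.image_head(feats)).squeeze(-1)

    def forward_text(self, tokens):
        feats = self.clip.encode_text(tokens)   # EOT-based text embedding
        return torch.sigmoid(self.text_head(feats)).squeeze(-1)

model = CLIPDetector()
# Hyperparameters from the card: Adam with lr 1e-6, betas (0.9, 0.999),
# eps 1e-08, no weight decay; BCELoss on the sigmoid outputs.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-6, betas=(0.9, 0.999), eps=1e-8, weight_decay=0.0,
)
criterion = nn.BCELoss()

# Illustrative text-side scoring:
tokenizer = open_clip.get_tokenizer("ViT-B-32")
tokens = tokenizer(["a photo of a dog standing on grass"])
p_synthetic = model.forward_text(tokens)  # probability the caption is synthetic
```

Because the backbone is frozen, only the two linear heads (a few hundred parameters each) receive gradients, which is consistent with the very small learning rate and the short image-detector training schedule reported above.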