Labbeti/conette

@@ -13,6 +13,8 @@ task_categories:
 # CoNeTTE (ConvNext-Transformer with Task Embedding) for Automated Audio Captioning
 This model generates a short textual description of any audio file.
 ## Installation
@@ -33,8 +35,18 @@ cands = outputs["cands"][0]
 print(cands)
 ```
-## Performance
-TODO
 ## Additional information

 # CoNeTTE (ConvNext-Transformer with Task Embedding) for Automated Audio Captioning
+<font color='red'>This model is currently in development, and not all of the required files are available yet.</font>
 This model generates a short textual description of any audio file.
 ## Installation
 print(cands)
 ```
+## Single model performance
+| Dataset | SPIDEr (%) | SPIDEr-FL (%) | FENSE (%) |
+| ------------- | ------------- | ------------- | ------------- |
+| AudioCaps | 44.14 | 43.98 | 60.81 |
+| Clotho | 30.97 | 30.87 | 51.72 |
+## Citation
+The preprint of the paper describing CoNeTTE is available on arXiv: https://arxiv.org/pdf/2309.00454.pdf
+```
+```
 ## Additional information