Upload README.md with huggingface_hub
README.md
CHANGED
@@ -13,6 +13,8 @@ task_categories:
 
 # CoNeTTE (ConvNext-Transformer with Task Embedding) for Automated Audio Captioning
 
+<font color='red'>This model is currently in development, and all the required files are not yet available.</font>
+
 This model generates a short textual description of any audio file.
 
 ## Installation
@@ -33,8 +35,18 @@ cands = outputs["cands"][0]
 print(cands)
 ```
 
-##
-
+## Single model performance
+| Dataset | SPIDEr (%) | SPIDEr-FL (%) | FENSE (%) |
+| ------------- | ------------- | ------------- | ------------- |
+| AudioCaps | 44.14 | 43.98 | 60.81 |
+| Clotho | 30.97 | 30.87 | 51.72 |
+
+## Citation
+The preprint version of the paper describing CoNeTTE is available on arXiv: https://arxiv.org/pdf/2309.00454.pdf
+
+```
+
+```
 
 ## Additional information
 