Labbeti committed on
Commit
86a524c
1 Parent(s): 52ae700

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +9 -6
README.md CHANGED
@@ -12,14 +12,14 @@ task_categories:
 - audio-captioning
 ---
 
+<div align="center">
+
+# CoNeTTE model source
+
 <a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/-Python 3.10+-blue?style=for-the-badge&logo=python&logoColor=white"></a><a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/-PyTorch 1.10.1+-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white"></a><a href="https://black.readthedocs.io/en/stable/"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-black.svg?style=for-the-badge&labelColor=gray"></a>
 <a href="https://github.com/Labbeti/conette-audio-captioning/actions">
 <img alt="Build" src="https://img.shields.io/github/actions/workflow/status/Labbeti/conette-audio-captioning/python-package-pip.yaml?branch=main&style=for-the-badge&logo=github">
 </a>
-
-<div align="center">
-
-# CoNeTTE model source
 <!-- <a href='https://aac-metrics.readthedocs.io/en/stable/?badge=stable'>
 <img src='https://readthedocs.org/projects/aac-metrics/badge/?version=stable&style=for-the-badge' alt='Documentation Status' />
 </a> -->
@@ -88,11 +88,14 @@ conette-predict --audio "/your/path/to/audio.wav"
 
 | Test data | SPIDEr (%) | SPIDEr-FL (%) | FENSE (%) | Vocab | Outputs | Scores |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
-| AC-test | 44.14 | 43.98 | 60.81 | 309 | [:clipboard:](results/conette/outputs_audiocaps_test.csv) | [:chart_with_upwards_trend:](results/conette/scores_audiocaps_test.yaml) |
-| CL-eval | 30.97 | 30.87 | 51.72 | 636 | [:clipboard:](results/conette/outputs_clotho_eval.csv) | [:chart_with_upwards_trend:](results/conette/scores_clotho_eval.yaml) |
+| AC-test | 44.14 | 43.98 | 60.81 | 309 | [Link](https://github.com/Labbeti/conette-audio-captioning/blob/main/results/conette/outputs_audiocaps_test.csv) | [Link](https://github.com/Labbeti/conette-audio-captioning/blob/main/results/conette/scores_audiocaps_test.yaml) |
+| CL-eval | 30.97 | 30.87 | 51.72 | 636 | [Link](https://github.com/Labbeti/conette-audio-captioning/blob/main/results/conette/outputs_clotho_eval.csv) | [Link](https://github.com/Labbeti/conette-audio-captioning/blob/main/results/conette/scores_clotho_eval.yaml) |
 
 This model checkpoint has been trained for the Clotho dataset, but it can also reach a good performance on AudioCaps with the "audiocaps" task.
 
+## Limitations
+The model has been trained on audio sampled at 32 kHz and lasting from 1 to 30 seconds. It can handle longer audio files, but it might give worse results.
+
 ## Citation
 The preprint version of the paper describing CoNeTTE is available on arxiv: https://arxiv.org/pdf/2309.00454.pdf
 
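
The Limitations section added in this commit gives two concrete numbers: a 32 kHz training sample rate and clip lengths of 1 to 30 seconds. As a minimal sketch, assuming uncompressed PCM WAV input and using only Python's standard `wave` module, a file could be checked against those numbers before captioning. The helpers `check_wav` and `make_silent_wav` are hypothetical names for illustration and are not part of the conette package. The diff does not say whether the tool resamples its input, so a sample-rate mismatch is reported here as a warning rather than an error.

```python
import io
import wave

# Limits quoted from the README's Limitations section.
EXPECTED_SAMPLE_RATE = 32_000  # Hz
MIN_DURATION_S = 1.0
MAX_DURATION_S = 30.0


def check_wav(source) -> list[str]:
    """Return warnings for a PCM WAV file given as a path or binary file object.

    Hypothetical helper: it only inspects the WAV header, never the samples.
    """
    with wave.open(source, "rb") as wav:
        sample_rate = wav.getframerate()
        duration = wav.getnframes() / sample_rate

    warnings = []
    if sample_rate != EXPECTED_SAMPLE_RATE:
        warnings.append(
            f"sample rate is {sample_rate} Hz, training data was {EXPECTED_SAMPLE_RATE} Hz"
        )
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        warnings.append(
            f"duration is {duration:.2f} s, training data lasted "
            f"{MIN_DURATION_S:.0f} to {MAX_DURATION_S:.0f} s"
        )
    return warnings


def make_silent_wav(sample_rate: int, seconds: float) -> io.BytesIO:
    """Build an in-memory mono 16-bit silent WAV, used only to demonstrate check_wav."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for rate, seconds in [(32_000, 2.0), (16_000, 2.0), (32_000, 45.0)]:
        problems = check_wav(make_silent_wav(rate, seconds))
        print(rate, seconds, problems or "within the stated limits")
```

A file that fails the duration check can still be passed to the model, since the README says longer audio is handled; the warning only flags that results may be worse.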