Commit 809be91 by nielsr (HF staff) · verified · 1 parent: 13aa17a

Add pipeline tag, library name and clarify license

This PR adds the missing `pipeline_tag` and `library_name` metadata, making the model easier to discover on the Hugging Face Hub. It also clarifies the license: the code is MIT licensed, while the checkpoints are for non-commercial use only.
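
For reviewers who want to check the new metadata locally, here is a minimal sketch of reading the front matter this PR adds. It assumes the metadata sits in a leading `---`-delimited block of flat `key: value` pairs, as in this diff. `parse_front_matter` is a hypothetical helper written for illustration, not part of any Hugging Face library, and it does not handle nested YAML.

```python
def parse_front_matter(readme_text):
    """Split a README into (metadata dict, body).

    Assumption: the front matter is a flat list of `key: value` lines
    between two `---` delimiters at the very top of the file.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No front matter at all: everything is body.
        return {}, readme_text

    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: the rest of the file is the body.
            body = "\n".join(lines[index + 1:])
            return metadata, body
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()

    # Opening delimiter without a closing one: treat as no front matter.
    return {}, readme_text


# Sample input mirroring the lines added in this PR.
README = """---
pipeline_tag: audio-text-to-text
library_name: transformers
license: mit
---

# PyTorch Implementation of Audio Flamingo 2
"""

metadata, body = parse_front_matter(README)
print(metadata)
```

A real model card may carry lists or nested keys in this block, in which case a proper YAML parser would be the safer choice.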

Files changed (1)
  1. README.md +8 -2
README.md CHANGED
@@ -1,10 +1,16 @@
+ ---
+ pipeline_tag: audio-text-to-text
+ library_name: transformers
+ license: mit
+ ---
+
  # PyTorch Implementation of Audio Flamingo 2
 
  **Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro**
 
  [[paper]](https://arxiv.org/abs/2503.03983) [[Demo website]](https://research.nvidia.com/labs/adlr/AF2/) [[GitHub]](https://github.com/NVIDIA/audio-flamingo)
 
- This repo contains the PyTorch implementation of [Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities](). Audio Flamingo 2 achieves the state-of-the-art performance across over 20 benchmarks, with only a 3B parameter small language model. It is improved from our previous [Audio Flamingo](https://arxiv.org/abs/2402.01831).
+ This repo contains the PyTorch implementation of [Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities](https://arxiv.org/abs/2503.03983). Audio Flamingo 2 achieves state-of-the-art performance across over 20 benchmarks, using only a 3B parameter small language model. It is improved from our previous [Audio Flamingo](https://arxiv.org/abs/2402.01831).
 
  - We introduce two datasets, AudioSkills for expert audio reasoning, and LongAudio for long audio understanding, to advance the field of audio understanding.
 
@@ -34,7 +40,7 @@ Audio Flamingo 2 uses a cross-attention architecture similar to [Audio Flamingo]
 
  ## License
 
- - The checkpoints are for non-commercial use only (see NVIDIA OneWay Noncommercial License). They are also subject to the [Qwen Research license](https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE), the [Terms of Use](https://openai.com/policies/terms-of-use) of the data generated by OpenAI, and the original licenses accompanying each training dataset.
+ The code in this repo is under MIT license. The checkpoints are for non-commercial use only (see NVIDIA OneWay Noncommercial License). They are also subject to the [Qwen Research license](https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE), the [Terms of Use](https://openai.com/policies/terms-of-use) of the data generated by OpenAI, and the original licenses accompanying each training dataset.
  - Notice: Audio Flamingo 2 is built with Qwen-2.5. Qwen is licensed under the Qwen RESEARCH LICENSE AGREEMENT, Copyright (c) Alibaba Cloud. All Rights Reserved.
 
 