WillHeld committed
Commit 861f9b2
1 parent: 7a69005

Update README.md

Files changed (1): README.md +13 -9
README.md CHANGED
@@ -1,20 +1,24 @@
 # Model Card for Diva Llama 3
 
 <!-- Provide a quick summary of what the model is/does. [Optional] -->
-This is an end-to-end Voice Assistant Model which can handle speech and text as inputs. It is trained using distillation loss. More details will be in a paper [COMING SOON]!
+This is an ablation of our Distilled Voice Assistant (DiVA) model which can handle speech and text as inputs. This ablation is trained using only token-alignment loss as described in the ablations here: https://huggingface.co/papers/2410.02678
 
-See the model in action compared to SALMONN and Qwen-Audio at [diva-audio.github.io](https://diva-audio.github.io).
+Weights and Biases Run: https://wandb.ai/i18nlp/DiVA%20Training%20Runs/runs/4t0mvbcd?nw=nwuserheld
+
 ## Citation
-No Publication As of Yet, But If You Use Please Cite the Below
+This is the token-alignment only model from https://huggingface.co/papers/2410.02678
 **BibTeX:**
 
 ```
-@misc{held2024diva,
-author="Held, Will and Zhang, Yanzhe and Ryan, Michael and Shi, Weiyan and Li, Ella and Yang, Diyi",
-title="Distilling an End-to-End Voice Assistant from Speech Recognition Data",
-year="2024",
-publisher="HuggingFace",
-}
+@misc{DiVA,
+title={{D}istilling an {E}nd-to-{E}nd {V}oice {A}ssistant {W}ithout {I}nstruction {T}raining {D}ata},
+author={William Held and Ella Li and Michael Ryan and Weiyan Shi and Yanzhe Zhang and Diyi Yang},
+year={2024},
+eprint={2410.02678},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2410.02678},
+}
 
 ```