pyf98/owsm_ctc_v3.1_1B
Automatic Speech Recognition
ESPnet
owsm_v3.1_ctc
multilingual
audio
speech-translation
language-identification
arxiv: 2402.12654
arxiv: 2401.16658
License: cc-by-4.0
Branch: main
Path: owsm_ctc_v3.1_1B/exp/s2t_train_s2t_multitask-ctc_ebf27_conv2d8_size1024_raw_bpe50000/images
1 contributor
History: 1 commit
Latest commit: pyf98, "add model files" (3b3dddc), 4 months ago
Every file below was last changed by the commit "add model files", 4 months ago.

backward_time.png            35.2 kB
cer_ctc.png                  26.4 kB
cer_interctc_layer12.png     29 kB
cer_interctc_layer15.png     29.4 kB
cer_interctc_layer21.png     29.6 kB
cer_interctc_layer6.png      29.4 kB
clip.png                     14.7 kB
forward_time.png             38.6 kB
gpu_max_cached_mem_GB.png    28.8 kB
grad_norm.png                25.5 kB
iter_time.png                40.3 kB
loss.png                     32.5 kB
loss_ctc.png                 33.9 kB
loss_interctc_layer12.png    36.5 kB
loss_interctc_layer15.png    36.3 kB
loss_interctc_layer21.png    38 kB
loss_interctc_layer6.png     37.1 kB
loss_scale.png               30.5 kB
optim0_lr0.png               33.8 kB
optim_step_time.png          32.3 kB
train_time.png               34.4 kB