JSALT 2026 - CSR Lab: Pre-trained Character ASR Model

Pre-trained weights for the JSALT 2026 Continuous Speech Recognition lab.

Model

  • Encoder: facebook/hubert-base-ls960 (HuBERT-base, frozen for the first 1000 training steps)
  • Output head: linear 768 → 33 (character-level vocabulary)
  • Loss: CTC with blank token at index 0
  • Tokenizer: CharacterTokenizer (33 classes: <blk>, <pad>, <unk>, <bos>, <eos>, A–Z, space, apostrophe)
  • Training data: LibriSpeech train-clean-5 (~5.4 h)
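
The output head and loss listed above can be sketched in a few lines of PyTorch. This is only an illustration under stated assumptions, not the lab's actual model class: random tensors stand in for the HuBERT encoder output, the targets are random ids, and the variable names are hypothetical. The layer sizes (768 → 33) and the blank index (0) are taken from the list above.

```python
import torch
import torch.nn as nn

HIDDEN_DIM = 768   # HuBERT-base hidden size
NUM_CLASSES = 33   # character vocabulary size, blank at index 0

torch.manual_seed(0)

# Stand-in for the encoder output, shaped (batch, frames, hidden).
# In the real model these features come from facebook/hubert-base-ls960.
batch, frames = 2, 50
features = torch.randn(batch, frames, HIDDEN_DIM)

# Linear output head: one score per character class for every frame.
head = nn.Linear(HIDDEN_DIM, NUM_CLASSES)
logits = head(features)                      # (batch, frames, 33)

# nn.CTCLoss expects log-probabilities shaped (frames, batch, classes).
log_probs = logits.log_softmax(dim=-1).transpose(0, 1)

# Illustrative random targets with non-blank ids (1..32), padded to length 10.
targets = torch.randint(1, NUM_CLASSES, (batch, 10))
target_lengths = torch.tensor([10, 7])
input_lengths = torch.full((batch,), frames, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(logits.shape, loss.item())
```

Because it uses random features and its own parameter names, this sketch is not expected to accept the released checkpoint; it only shows how the tensor shapes and the blank index fit together.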

Files

File                     Description
speech_encoder_char.pt   model.state_dict(); load directly with model.load_state_dict(...)

Usage

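from huggingface_hub import hf_hub_download
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Download the checkpoint from the Hub (cached locally after the first call).
weights_path = hf_hub_download(repo_id="Borrison/jsalt26-csr-lab",
                                filename="speech_encoder_char.pt")

# `model` must already be an instance of the lab's model class.
state_dict = torch.load(weights_path, map_location=device)
model.load_state_dict(state_dict)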
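
Once the weights are loaded, the frame-level predictions of a CTC model still have to be collapsed into text. The following is a minimal greedy-decoding sketch in plain Python. It assumes that the class indices follow the order in which the tokens are listed in the Model section (only the blank at index 0 is stated explicitly), and `ctc_greedy_decode` is a hypothetical helper, not part of the lab's CharacterTokenizer.

```python
import string

# Assumed index order: the order given in the Model section.
SPECIALS = ["<blk>", "<pad>", "<unk>", "<bos>", "<eos>"]
VOCAB = SPECIALS + list(string.ascii_uppercase) + [" ", "'"]
BLANK_ID = 0


def ctc_greedy_decode(frame_ids):
    """Collapse repeated ids, then drop the blank and other special tokens."""
    chars = []
    prev = None
    for idx in frame_ids:
        # Emit a character only when the id changes and is not a special token.
        if idx != prev and idx >= len(SPECIALS):
            chars.append(VOCAB[idx])
        prev = idx
    return "".join(chars)


# Frame-level argmax ids for one utterance, e.g. logits.argmax(dim=-1).tolist().
H, I = VOCAB.index("H"), VOCAB.index("I")
frames = [BLANK_ID, H, H, BLANK_ID, I, I, I, BLANK_ID, BLANK_ID]
print(ctc_greedy_decode(frames))  # prints HI
```

A blank between two identical ids keeps both characters, so `[L, 0, L]` decodes to "LL" while `[L, L]` decodes to "L".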