Reza2kn/visualears-fastconformer-fa-depoisoned-phaseB-streaming-torchscript-fp16

Automatic Speech Recognition

VisualEars PhaseB Streaming TorchScript FP16

TorchScript FP16 streaming export of Reza2kn/visualears-fastconformer-fa-depoisoned-phaseB, split into separate encoder+CTC, RNNT predictor, and RNNT joint modules.

Files:

streaming_encoder_ctc_fp16_streaming_ts.pt
rnnt_predictor_fp16_streaming_ts.pt
rnnt_joint_fp16_streaming_ts.pt
streaming_native_fp16_manifest.json
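
The files listed above can be loaded roughly as follows. This is a minimal sketch, not the author's loading code: `missing_files` and `load_modules` are hypothetical helper names, the `device` default is an assumption, and the manifest schema is not documented here, so it is returned as an opaque dictionary. The call signatures of the loaded modules are likewise not described in this card and are left untouched.

```python
import json
from pathlib import Path

# File names exactly as listed in this model card.
MODULE_FILES = {
    "encoder_ctc": "streaming_encoder_ctc_fp16_streaming_ts.pt",
    "predictor": "rnnt_predictor_fp16_streaming_ts.pt",
    "joint": "rnnt_joint_fp16_streaming_ts.pt",
}
MANIFEST_FILE = "streaming_native_fp16_manifest.json"


def missing_files(model_dir):
    """Return the listed file names that are absent from model_dir."""
    root = Path(model_dir)
    names = list(MODULE_FILES.values()) + [MANIFEST_FILE]
    return [name for name in names if not (root / name).is_file()]


def load_modules(model_dir, device="cpu"):
    """Load the three TorchScript modules and the manifest (hypothetical helper).

    The device default is an assumption; FP16 modules may need a GPU
    for some operations.
    """
    import torch  # imported lazily so missing_files() works without PyTorch

    absent = missing_files(model_dir)
    if absent:
        raise FileNotFoundError(f"missing files: {absent}")

    root = Path(model_dir)
    modules = {
        key: torch.jit.load(str(root / name), map_location=device).eval()
        for key, name in MODULE_FILES.items()
    }
    with open(root / MANIFEST_FILE, encoding="utf-8") as fh:
        manifest = json.load(fh)  # schema not documented here; treated as opaque
    return modules, manifest
```

One way to obtain a local `model_dir` is `huggingface_hub.snapshot_download(repo_id=...)` with this repository's id, which returns the path of the downloaded snapshot.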

Streaming contract:

feature chunk: 1 x 80 x 32
chunk/shift config: chunk_size=4, shift_size=3, left_chunks=64
feature shift window: [17, 24]
encoder output per chunk: 1 x 512 x 2
channel cache: 17 x 1 x 256 x 512
time cache: 17 x 1 x 512 x 4
heads: RNNT and CTC
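
The shapes in the contract above can be written down as data and checked before each call. The sketch below is illustrative only: `CONTRACT`, `check_shape`, and `fp16_nbytes` are hypothetical names, and the meaning of each axis is not stated in this card, so the code compares whole shapes without interpreting them. It also computes the FP16 memory footprint of the two caches and notes one observation: `left_chunks * chunk_size` equals the 256 in the channel cache shape, which is presumably not a coincidence but is not confirmed by the source.

```python
from math import prod

# Shapes and config values copied from the streaming contract in this card.
CONTRACT = {
    "feature_chunk": (1, 80, 32),
    "encoder_output": (1, 512, 2),
    "channel_cache": (17, 1, 256, 512),
    "time_cache": (17, 1, 512, 4),
}
CHUNK_SIZE, SHIFT_SIZE, LEFT_CHUNKS = 4, 3, 64
FP16_BYTES = 2  # one half-precision value occupies two bytes


def fp16_nbytes(shape):
    """Bytes needed to store a dense FP16 tensor of the given shape."""
    return prod(shape) * FP16_BYTES


def check_shape(name, shape):
    """Raise ValueError if shape differs from the contract entry for name."""
    expected = CONTRACT[name]
    if tuple(shape) != expected:
        raise ValueError(f"{name}: expected {expected}, got {tuple(shape)}")


if __name__ == "__main__":
    for cache in ("channel_cache", "time_cache"):
        print(cache, fp16_nbytes(CONTRACT[cache]), "bytes")
    # Observation, not a documented guarantee: 64 * 4 == 256.
    print(LEFT_CHUNKS * CHUNK_SIZE == CONTRACT["channel_cache"][2])
```

With these shapes the channel cache takes 4,456,448 bytes (about 4.25 MiB) and the time cache 69,632 bytes per stream in FP16.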

Validation: a runtime smoke test passed for the encoder+CTC, RNNT predictor, and RNNT joint modules, run with live streaming cache inputs and outputs.
