VisualEars PhaseB Streaming TorchScript FP16
TorchScript FP16 streaming split for Reza2kn/visualears-fastconformer-fa-depoisoned-phaseB.
Files:
- streaming_encoder_ctc_fp16_streaming_ts.pt
- rnnt_predictor_fp16_streaming_ts.pt
- rnnt_joint_fp16_streaming_ts.pt
- streaming_native_fp16_manifest.json
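
Before loading anything, it can help to confirm that all four files are present in the download directory. The following is a minimal sketch using only the Python standard library. The helper names `missing_files` and `load_manifest` are hypothetical, and the manifest's schema is not documented here, so it is read as generic JSON without assuming any keys.

```python
import json
from pathlib import Path

# File names copied from the list above.
EXPECTED_FILES = (
    "streaming_encoder_ctc_fp16_streaming_ts.pt",
    "rnnt_predictor_fp16_streaming_ts.pt",
    "rnnt_joint_fp16_streaming_ts.pt",
    "streaming_native_fp16_manifest.json",
)


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def load_manifest(model_dir):
    """Parse the manifest as plain JSON; its structure is not assumed."""
    path = Path(model_dir) / "streaming_native_fp16_manifest.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

If `missing_files(model_dir)` returns an empty list, the `.pt` files would presumably be loaded next with `torch.jit.load`, one module per file.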
Streaming contract:
- feature chunk: 1 x 80 x 32
- chunk/shift config: chunk_size=4,shift_size=3,left_chunks=64
- feature shift window: [17, 24]
- encoder output per chunk: 1 x 512 x 2
- channel cache: 17 x 1 x 256 x 512
- time cache: 17 x 1 x 512 x 4
- heads: RNNT and CTC
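
The shapes in the streaming contract can be written down as a small lookup table, which makes it easy to check tensors before feeding them to the encoder and to estimate how much FP16 memory the caches occupy. This is only a sketch: the dictionary keys and the helpers `fp16_nbytes` and `check_shape` are hypothetical names, and the meaning of each axis is not stated in this card. One observation, not a confirmed fact: the channel cache length of 256 equals left_chunks x chunk_size (64 x 4).

```python
from math import prod

# Shapes copied from the streaming contract; key names are illustrative.
CONTRACT = {
    "feature_chunk": (1, 80, 32),
    "encoder_output": (1, 512, 2),
    "channel_cache": (17, 1, 256, 512),
    "time_cache": (17, 1, 512, 4),
}

FP16_BYTES = 2  # one half-precision value occupies two bytes


def fp16_nbytes(shape):
    """Size in bytes of a dense FP16 tensor with the given shape."""
    return prod(shape) * FP16_BYTES


def check_shape(name, shape):
    """Raise ValueError if shape does not match the contract entry for name."""
    expected = CONTRACT[name]
    if tuple(shape) != expected:
        raise ValueError(f"{name}: expected {expected}, got {tuple(shape)}")


# Memory held by the two streaming caches between chunks.
cache_bytes = fp16_nbytes(CONTRACT["channel_cache"]) + fp16_nbytes(CONTRACT["time_cache"])
print(cache_bytes)  # 4526080 bytes, roughly 4.3 MiB
```

With a tensor library, `check_shape("channel_cache", tensor.shape)` could be called on each cache returned by the encoder before it is passed back in for the next chunk.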
Validation: runtime smoke passed for encoder+CTC, RNNT predictor, and RNNT joint with live streaming cache inputs/outputs.