Missing tokenizer_config.json

by versae - opened Apr 18, 2023

Apr 18, 2023 • edited Apr 18, 2023

It seems newer versions of the transformers library expect a tokenizer_config.json file. Does the model https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish/ use the same tokenizer_config.json? If so, could the file there be copied here?

versae

Apr 18, 2023

I just realized this is a pure acoustic model without any CTC head on top, so I guess it does not make sense for it to have a tokenizer.

Lauler

National Library of Sweden / KBLab org Apr 19, 2023

Correct, this model (KBLab/wav2vec2-large-voxrex) is provided so that people can continue pretraining on the acoustic model, or alternatively do their own finetuning on any downstream task they might be interested in.
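
Since this repo ships no tokenizer, anyone fine-tuning it for CTC-based speech recognition would need to supply their own vocabulary. As a rough sketch of that step (not something taken from this model's training setup), the snippet below builds a character-level vocabulary from transcripts using only the standard library. The function name `build_ctc_vocab` and the sample transcripts are hypothetical; the special tokens `|`, `<unk>` and `<pad>` are assumed to match the defaults of `Wav2Vec2CTCTokenizer` in transformers.

```python
import json


def build_ctc_vocab(transcripts, word_delimiter="|", unk="<unk>", pad="<pad>"):
    """Map every non-space character in the transcripts to an integer id.

    Hypothetical helper for illustration; whitespace is represented by the
    word delimiter token rather than by a literal space.
    """
    # Collect the distinct lowercase characters, skipping whitespace.
    chars = sorted(
        {ch for text in transcripts for ch in text.lower() if not ch.isspace()}
    )
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Append the special tokens after the regular characters.
    for token in (word_delimiter, unk, pad):
        vocab.setdefault(token, len(vocab))
    return vocab


# Made-up sample transcripts, only to show the shape of the result.
transcripts = ["hej världen", "god morgon"]
vocab = build_ctc_vocab(transcripts)
print(json.dumps(vocab, ensure_ascii=False))
```

The resulting dictionary could be saved as vocab.json and passed as the `vocab_file` of a `Wav2Vec2CTCTokenizer`, which in turn would produce a tokenizer_config.json when saved alongside the fine-tuned model.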

versae changed discussion status to closed Apr 20, 2023
