How to fix "TypeError: expected str, bytes or os.PathLike object, not NoneType" when specifying the local whisper model
#65, opened by BenjaminChu
Here is my modified code specifying the local path of whisper files:
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
model_id = r"E:\LLM\whisper-large-v3\models"  # raw string, so backslashes are not read as escape sequences
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)
processor = AutoProcessor.from_pretrained(model_id)
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)
result = pipe("audio.mp3", return_timestamps=True)
print(result["text"])
And it shows:
Traceback (most recent call last):
  File "e:\LLM\whisper-large-v3\main.py", line 15, in <module>
    processor = AutoProcessor.from_pretrained(model_id)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\LLM\chatglm2-6b\.venv\Lib\site-packages\transformers\models\auto\processing_auto.py", line 268, in from_pretrained
    return processor_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\LLM\chatglm2-6b\.venv\Lib\site-packages\transformers\processing_utils.py", line 184, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\LLM\chatglm2-6b\.venv\Lib\site-packages\transformers\processing_utils.py", line 228, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\LLM\chatglm2-6b\.venv\Lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "E:\LLM\chatglm2-6b\.venv\Lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\LLM\chatglm2-6b\.venv\Lib\site-packages\transformers\models\whisper\tokenization_whisper.py", line 293, in __init__
    with open(merges_file, encoding="utf-8") as merges_handle:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not NoneType
How to fix that?
Same here. I already tried everything in the book to get the path recognized as an os.PathLike, but nothing seems to work .-.
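For what it's worth, the last frame of the traceback fails on open(merges_file, ...) with merges_file being None, which suggests the tokenizer could not find a merges.txt in the folder rather than a problem with the path type itself. A quick check along these lines might help narrow it down. This is only a sketch: the helper name missing_files and the list of expected file names are assumptions for illustration, not something transformers provides.

```python
from pathlib import Path

# Assumption: file names a Whisper checkpoint folder usually contains.
# Adjust the list to match the files in the repo you downloaded from.
EXPECTED_FILES = [
    "config.json",
    "preprocessor_config.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
]


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


# Hypothetical usage with the folder from the original post
print(missing_files(r"E:\LLM\whisper-large-v3\models"))
```

If merges.txt or vocab.json shows up in the printed list, the folder probably only holds the model weights and config, and the tokenizer files still need to be copied or saved into it.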
A little update from my side: I tried around a lot more and found something that works. While the script still loads from the online repo, add either model.save_pretrained(path) or pipe.save_pretrained(path) before the result = ... line (I'm not sure which of the two actually did something, so I just did both). This saves everything needed from the online repo so it can be used locally. Afterwards, just set the model id to that path and delete the added lines.
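A minimal sketch of that one-time export, in case it helps. It assumes the files originally came from the openai/whisper-large-v3 repo on the Hub and that the machine is online for this step. The helper name save_whisper_locally is made up for illustration, and it calls processor.save_pretrained instead of pipe.save_pretrained so that the tokenizer and feature extractor files are written explicitly.

```python
def save_whisper_locally(repo_id, target_dir):
    """Download a checkpoint from the Hub and write model and processor files to target_dir."""
    # Imported here so the helper can be defined without loading transformers up front
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    model = AutoModelForSpeechSeq2Seq.from_pretrained(repo_id)
    processor = AutoProcessor.from_pretrained(repo_id)

    model.save_pretrained(target_dir)      # model weights and model config
    processor.save_pretrained(target_dir)  # tokenizer and feature extractor files
    return target_dir


# Hypothetical one-time call (needs network access and downloads the full model):
# save_whisper_locally("openai/whisper-large-v3", r"E:\LLM\whisper-large-v3\models")
```

After it has run once, pointing model_id at the target folder should let both AutoModelForSpeechSeq2Seq.from_pretrained and AutoProcessor.from_pretrained load offline.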