Issue when trying to run Whisper offline from locally saved pretrained model
Hi,
I am trying to run Whisper locally, using the model's downloaded files from a folder.
I downloaded the model for offline use, following the instructions suggested here, see my code below:
import os

from transformers import AutoTokenizer, AutoModelForSpeechSeq2Seq
MODEL_FROM_FILE = os.path.join('models', 'whisper-large-v3')
tokenizer = AutoTokenizer.from_pretrained("openai/whisper-large-v3")
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3")
tokenizer.save_pretrained(MODEL_FROM_FILE)
model.save_pretrained(MODEL_FROM_FILE)
The first problem I encountered was a missing file; I got the following error:
models/whisper-large-v3 does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co./models/whisper-large-v3/main' for available files.
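A possible explanation for the missing file, offered as a sketch rather than a confirmed diagnosis: the save code above stores only the tokenizer and the model, while preprocessor_config.json belongs to the feature extractor. Saving the full processor with AutoProcessor should write it too. The helper names missing_files and save_for_offline, and the EXPECTED_FILES list, are hypothetical and only illustrate the idea:

```python
from pathlib import Path

# Files a Whisper pipeline typically reads from a local folder.
# This list is an assumption for illustration, not an exhaustive spec.
EXPECTED_FILES = (
    "config.json",
    "generation_config.json",
    "preprocessor_config.json",
    "tokenizer_config.json",
)


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


def save_for_offline(model_id, target_dir):
    """Save the model and the full processor (tokenizer + feature extractor)."""
    # Imported lazily so the file check above works without transformers.
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    # The processor bundles the tokenizer and the feature extractor, so its
    # save_pretrained is expected to write preprocessor_config.json as well.
    AutoProcessor.from_pretrained(model_id).save_pretrained(target_dir)
    AutoModelForSpeechSeq2Seq.from_pretrained(model_id).save_pretrained(target_dir)
```

Calling save_for_offline("openai/whisper-large-v3", MODEL_FROM_FILE) once while online, and then missing_files(MODEL_FROM_FILE) before going offline, would show whether anything is still absent from the folder.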
Downloading the file manually (from here) seemed to overcome this problem, but then another error came up, see below:
probability tensor contains either `inf`, `nan` or element < 0
Searching online, I found that it might be related to the device I am running on or to some misconfiguration of my model.
The first thing I tried was to run on cpu instead of cuda:0, which is how I normally run (when not offline). See the original code below:
import os

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
MODEL_FROM_FILE = os.path.join('models', 'whisper-large-v3')

# Load the model weights from the local folder only (no Hub access).
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    MODEL_FROM_FILE,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
    local_files_only=True,
)
model.to(DEVICE)
processor = AutoProcessor.from_pretrained(MODEL_FROM_FILE)

asr = pipeline(
    task="automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    torch_dtype=torch_dtype,
    device=DEVICE,
)

temperature = 0.3
# audio_file is the path to the audio to transcribe (defined elsewhere).
result = asr(
    audio_file,
    chunk_length_s=30,  # 30 seconds
    batch_size=4,
    return_timestamps=True,
    generate_kwargs={"language": "english", "do_sample": True, "temperature": temperature},
)
Changing DEVICE and torch_dtype (as shown below) seems to solve the problem.
DEVICE = "cpu"
torch_dtype = torch.float32
The versions of torch installed on my machine are the following:
torch==1.13.1+cu117
torchvision==0.14.1+cu117
torchaudio==0.13.1+cu117
Even though this works around my problem, it's not an acceptable solution.
Any ideas about what might be the problem?
What are the specifications of your device?
Processor: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz 2.71 GHz
Installed RAM: 32.0 GB (31.8 GB usable)
System type: 64-bit operating system, x64-based processor
Is there something else you might need?
It might be worth saying that the code runs fine when I download the model from Hugging Face. My problem only occurs when I try to load it from local files.
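Since the Hub copy works and the local copy does not, one way to narrow this down might be to compare the JSON config files of the two folders. The sketch below uses only the standard library; diff_json_configs is a hypothetical helper, and it assumes both copies are on disk (the Hub copy usually sits in a snapshot folder inside the Hugging Face cache, whose exact path depends on the setup):

```python
import json
from pathlib import Path


def _load(path):
    """Read a JSON file as a dict; a missing file counts as empty."""
    if not path.is_file():
        return {}
    data = json.loads(path.read_text(encoding="utf-8"))
    # Wrap non-dict JSON (e.g. a top-level list) so it can be compared by key.
    return data if isinstance(data, dict) else {"<root>": data}


def diff_json_configs(dir_a, dir_b):
    """Map each differing 'file:key' to its (value_in_a, value_in_b) pair."""
    a, b = Path(dir_a), Path(dir_b)
    names = {p.name for p in a.glob("*.json")} | {p.name for p in b.glob("*.json")}
    differences = {}
    for name in sorted(names):
        left, right = _load(a / name), _load(b / name)
        # Only top-level keys are compared; nested values are compared whole.
        for key in sorted(set(left) | set(right)):
            if left.get(key) != right.get(key):
                differences[f"{name}:{key}"] = (left.get(key), right.get(key))
    return differences
```

Any key that shows up in the result (for example in generation_config.json or preprocessor_config.json) would be a candidate for what differs between the working and the failing setup. An empty result would point away from the config files and towards the weights or the dtype.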