YAML Metadata
Error:
"language" must only contain lowercase characters
YAML Metadata
Error:
"language" with value "Bengali" is not valid. It must be an ISO 639-1, 639-2 or 639-3 code (two/three letters), or a special value like "code", "multilingual". If you want to use BCP-47 identifiers, you can specify them in language_bcp47.
Wav2Vec2-Large-XLSR-Bengali
Fine-tuned facebook/wav2vec2-large-xlsr-53 Bengali using a subset of 40,000 utterances from Bengali ASR training data set containing ~196K utterances. Tested WER using ~4200 held out from training. When using this model, make sure that your speech input is sampled at 16kHz. Train Script can be Found at : train.py
Data Prep Notebook : https://colab.research.google.com/drive/1JMlZPU-DrezXjZ2t7sOVqn7CJjZhdK2q?usp=sharing
Inference Notebook : https://colab.research.google.com/drive/1uKC2cK9JfUPDTUHbrNdOYqKtNozhxqgZ?usp=sharing
Usage
The model can be used directly (without a language model) as follows:
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
processor = Wav2Vec2Processor.from_pretrained("arijitx/wav2vec2-large-xlsr-bengali")
model = Wav2Vec2ForCTC.from_pretrained("arijitx/wav2vec2-large-xlsr-bengali")
# model = model.to("cuda")
resampler = torchaudio.transforms.Resample(TEST_AUDIO_SR, 16_000)
def speech_file_to_array_fn(batch):
speech_array, sampling_rate = torchaudio.load(batch)
speech = resampler(speech_array).squeeze().numpy()
return speech
speech_array = speech_file_to_array_fn("test_file.wav")
inputs = processor(speech_array, sampling_rate=16_000, return_tensors="pt", padding=True)
with torch.no_grad():
logits = model(inputs.input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
preds = processor.batch_decode(predicted_ids)[0]
print(preds.replace("[PAD]",""))
Test Result: WER on ~4200 utterance : 32.45 %
- Downloads last month
- 119
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.