freds0/distil-whisper-large-v3-ptbr

Automatic Speech Recognition


freds0 committed on Oct 16, 2024

Commit 9d31bcd · verified · 1 Parent(s): 7da48cf

Create README.md

Files changed (1)

README.md +52 -0

README.md ADDED

	@@ -0,0 +1,52 @@

+---
+license: mit
+language:
+- pt
+base_model:
+- distil-whisper/distil-large-v3
+pipeline_tag: automatic-speech-recognition
+tags:
+- asr
+- pt
+- ptbr
+- stt
+- speech-to-text
+- automatic-speech-recognition
+---
+# Distil-Whisper-Large-v3 for Brazilian Portuguese
+This model is a fine-tuned version of distil-whisper/distil-large-v3 for automatic speech recognition (ASR) in Brazilian Portuguese. It was trained on the Common Voice 16 dataset together with a private dataset whose transcripts were generated with Whisper Large v3.
+### Model Description
+The model transcribes Brazilian Portuguese speech. Trained on Common Voice 16 combined with an automatically transcribed private dataset, it achieved a Word Error Rate (WER) of 8.93% on the Common Voice 16 validation set.
+- **Model type:** Speech recognition model based on distil-whisper/distil-large-v3
+- **Language:** Brazilian Portuguese (pt-BR)
+- **License:** MIT
+- **Finetuned from model:** distil-whisper/distil-large-v3
+## How to Get Started with the Model
+You can use the model with the Transformers library:
+```python
+from transformers import WhisperForConditionalGeneration, WhisperProcessor
+processor = WhisperProcessor.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
+model = WhisperForConditionalGeneration.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
+# Load audio (a 1-D array of float samples at 16 kHz) and extract input features
+audio_input = ...  # your audio here
+input_features = processor(audio_input, sampling_rate=16000, return_tensors="pt").input_features
+# Generate transcription
+predicted_ids = model.generate(input_features)
+transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
+print(transcription[0])
+```
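
The README leaves `audio_input` as a placeholder. As a minimal sketch of one way to fill it, assuming the recording is already a mono, 16-bit PCM WAV file sampled at 16 kHz, the helper below uses only the Python standard library. The function name `load_wav_as_floats` is hypothetical and not part of the README or of Transformers; audio in other formats or sample rates would need to be converted or resampled first with a tool of your choice.

```python
import struct
import wave


def load_wav_as_floats(path, expected_rate=16000):
    """Read a mono 16-bit PCM WAV file and return its samples as floats in [-1.0, 1.0).

    Hypothetical helper for illustration: it rejects anything that is not
    mono, 16-bit, and sampled at `expected_rate` instead of converting it.
    """
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        rate = wav_file.getframerate()
        if rate != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz audio, got {rate} Hz")
        raw_bytes = wav_file.readframes(wav_file.getnframes())

    # WAV stores PCM samples as little-endian signed 16-bit integers.
    sample_count = len(raw_bytes) // 2
    samples = struct.unpack(f"<{sample_count}h", raw_bytes)

    # Scale integers to floats so that -32768 maps to -1.0.
    return [sample / 32768.0 for sample in samples]
```

With a file such as `sample.wav` (a placeholder name), the result could be assigned with `audio_input = load_wav_as_floats("sample.wav")` and passed to the `processor(...)` call shown in the README. Checking the sample rate up front matters because that call declares `sampling_rate=16000`, so audio recorded at another rate would be misinterpreted unless it is resampled.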