freds0 committed on
Commit 9d31bcd · verified · 1 Parent(s): 7da48cf

Create README.md

  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
1
+ ---
2
+ license: mit
3
+ language:
4
+ - pt
5
+ base_model:
6
+ - distil-whisper/distil-large-v3
7
+ pipeline_tag: automatic-speech-recognition
8
+ tags:
9
+ - asr
10
+ - pt
11
+ - ptbr
12
+ - stt
13
+ - speech-to-text
14
+ - automatic-speech-recognition
15
+ ---
16
+ # Distil-Whisper-Large-v3 for Brazilian Portuguese
17
+
18
+ <!-- Provide a quick summary of what the model is/does. -->
19
+
20
+ This model is a fine-tuned version of distil-whisper/distil-large-v3 for automatic speech recognition (ASR) in Brazilian Portuguese. It was trained on the Common Voice 16 dataset together with a private dataset whose transcriptions were generated with Whisper Large v3.
21
+
22
+ ## Model Description
23
+
24
+ <!-- Provide a longer summary of what this model is. -->
25
+
26
+ The model targets accurate automatic speech transcription in Brazilian Portuguese. Trained on Common Voice 16 combined with an automatically transcribed private dataset, it achieved a Word Error Rate (WER) of 8.93% on the Common Voice 16 validation set.
27
+
28
+ - **Model type:** Speech recognition model based on distil-whisper-large-v3
29
+ - **Language(s) (NLP):** Brazilian Portuguese (pt-BR)
30
+ - **License:** MIT
31
+ - **Fine-tuned from model:** distil-whisper/distil-large-v3
32
+
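The WER figure quoted above is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The card does not say which tool or text normalization produced the 8.93% value, so the following is only a minimal sketch of the metric itself; the function name `word_error_rate` is hypothetical and no lowercasing or punctuation stripping is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("o gato está no telhado", "o gato esta no telhado"))
```

A corpus-level score is usually obtained by summing the edit distances over all utterances and dividing by the total number of reference words, rather than averaging per-utterance rates.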
33
+ ## How to Get Started with the Model
34
+
35
+ You can use the model with the Transformers library:
36
+
37
+ ```python
38
+ from transformers import WhisperForConditionalGeneration, WhisperProcessor
39
+ processor = WhisperProcessor.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
40
+ model = WhisperForConditionalGeneration.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
41
+
42
+ # Load audio and process
43
+ audio_input = ... # your audio here: a 1-D float waveform sampled at 16 kHz
44
+ input_features = processor(audio_input, sampling_rate=16000, return_tensors="pt").input_features
45
+
46
+ # Generate transcription
47
+ predicted_ids = model.generate(input_features)
48
+ transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
49
+ print(transcription[0])
50
+ ```
51
+
52
+
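The snippet above leaves `audio_input` as a placeholder and passes `sampling_rate=16000` to the processor, so the waveform must already be at 16 kHz. As a rough sketch of one way to fill that placeholder with only the standard library, assuming the input is a 16-bit PCM mono WAV file already sampled at 16 kHz, the hypothetical helper `load_wav_16k_mono` below returns the samples as floats. It does not resample or downmix; files in other formats would need a dedicated audio library.

```python
import array
import sys
import wave


def load_wav_16k_mono(path: str) -> list[float]:
    """Read a 16-bit PCM mono WAV sampled at 16 kHz as floats in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getframerate() != 16000:
            raise ValueError("expected a 16 kHz file; resample the audio first")
        if wav_file.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wav_file.readframes(wav_file.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Scale integer samples to the floating-point range used for raw waveforms.
    return [sample / 32768.0 for sample in samples]
```

With this helper, the placeholder line could become `audio_input = load_wav_16k_mono("sample.wav")`, where `sample.wav` is a stand-in for your own recording.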