cwkeam committed on
Commit
eaa3a8c
1 Parent(s): a48f800

update readme to working code

Browse files
Files changed (1) hide show
  1. README.md +5 -4
README.md CHANGED
@@ -45,18 +45,19 @@ For more information on how the model was trained, please take a look at the [of
45
  To transcribe audio files the model can be used as a standalone acoustic model as follows:
46
 
47
  ```python
48
- import torch
49
  import torchaudio
 
50
  from transformers import MCTCTForCTC, MCTCTProcessor
51
 
52
- model = MCTCTForCTC.from_pretrained("speechbrain/mctct-large")
53
- processor = MCTCTProcessor.from_pretrained("speechbrain/mctct-large")
54
 
55
  # load dummy dataset and read soundfiles
56
  ds = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")
57
 
58
  # tokenize
59
- input_features = processor(ds[0]["audio"]["array"], return_tensors="pt", padding="longest").input_features # Batch size 1
60
 
61
  # retrieve logits
62
  logits = model(input_features).logits
 
45
  To transcribe audio files the model can be used as a standalone acoustic model as follows:
46
 
47
  ```python
48
+ import torch
49
  import torchaudio
50
+ from datasets import load_dataset
51
  from transformers import MCTCTForCTC, MCTCTProcessor
52
 
53
+ model = MCTCTForCTC.from_pretrained("cwkeam/mctct-large")
54
+ processor = MCTCTProcessor.from_pretrained("cwkeam/mctct-large")
55
 
56
  # load dummy dataset and read soundfiles
57
  ds = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")
58
 
59
  # tokenize
60
+ input_features = processor(ds[0]["audio"]["array"], return_tensors="pt").input_features
61
 
62
  # retrieve logits
63
  logits = model(input_features).logits