vegam-whisper-medium-ml-int8_float16 (വേഗം)

This model supports int8_float16 quantization only.

Note: Model file size is 773 MB.

This is a conversion of thennal/whisper-medium-ml to the CTranslate2 model format.

This model can be used in CTranslate2 or projects based on CTranslate2 such as faster-whisper.

Installation

pip install faster-whisper
  • Install git-lfs to use this project. Note that git-lfs is needed only to download the model from Hugging Face.
apt-get install git-lfs
  • Download the model weights
git lfs install
git clone https://huggingface.co/kurianbenoy/vegam-whisper-medium-ml-int8_float16
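
If git-lfs is not set up correctly, a clone can leave a small text pointer in place of the real weights. The sketch below is one way to check that the download worked. It assumes the weights are stored as model.bin (the file name CTranslate2 normally uses for converted models); the helper name weights_look_downloaded and the 1 MB threshold are hypothetical choices for illustration.

```python
import os


def weights_look_downloaded(model_dir, min_bytes=1_000_000):
    """Return True if model_dir holds a model.bin that is plausibly real weights.

    A git-lfs pointer file is only a few hundred bytes of text, while the
    weights of this model are several hundred MB, so a size check is enough
    to tell them apart. The file name model.bin is an assumption based on
    the usual CTranslate2 layout.
    """
    weights = os.path.join(model_dir, "model.bin")
    return os.path.isfile(weights) and os.path.getsize(weights) >= min_bytes


if __name__ == "__main__":
    # Directory name taken from the git clone command above.
    print(weights_look_downloaded("vegam-whisper-medium-ml-int8_float16"))
```

If this prints False after cloning, running git lfs pull inside the cloned directory is the usual way to fetch the actual files.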

Usage

from faster_whisper import WhisperModel

model_path = "vegam-whisper-medium-ml-int8_float16"

model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Example

from faster_whisper import WhisperModel

model_path = "vegam-whisper-medium-ml-int8_float16"

model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")


segments, info = model.transcribe("00b38e80-80b8-4f70-babf-566e848879fc.webm", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Detected language 'ta' with probability 0.353516

[0.00s -> 4.74s] പാലം കടുക്കുവോളം നാരായണ പാലം കടന്നാലൊ കൂരായണ

Note: The audio file 00b38e80-80b8-4f70-babf-566e848879fc.webm is from the Malayalam Speech Corpus and is stored alongside the model weights.
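
In the example output above, language detection reported 'ta' with a probability of about 0.35 even though the audio is Malayalam. One possible workaround, sketched here and not part of the original example, is to skip detection by passing language="ml" to transcribe, which faster-whisper accepts as a language code. The helper names format_segment and transcribe_malayalam are hypothetical.

```python
def format_segment(start, end, text):
    """Format one segment the same way as the print statement in the example."""
    return "[%.2fs -> %.2fs] %s" % (start, end, text.strip())


def transcribe_malayalam(audio_path, model_path="vegam-whisper-medium-ml-int8_float16"):
    """Transcribe audio_path with the language fixed to Malayalam.

    Assumes a CUDA device, as in the usage section. faster_whisper is
    imported here so that format_segment can be used without it installed.
    """
    from faster_whisper import WhisperModel

    model = WhisperModel(model_path, device="cuda", compute_type="int8_float16")
    # language="ml" bypasses automatic language detection.
    segments, info = model.transcribe(audio_path, beam_size=5, language="ml")
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling transcribe_malayalam("00b38e80-80b8-4f70-babf-566e848879fc.webm") would return the formatted lines as a list. Whether the transcript changes compared with auto-detection has not been checked here.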

Conversion Details

This conversion was made possible by the wonderful CTranslate2 library, using its Transformers converter for OpenAI Whisper. The original model was converted with the following command:

ct2-transformers-converter --model thennal/whisper-medium-ml --output_dir vegam-whisper-medium-ml-int8_float16 \
--quantization int8_float16
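
The same conversion can also be driven from Python. The sketch below shows how the command-line flags map onto CTranslate2's TransformersConverter class. It is an illustration under the assumption that ctranslate2 and transformers are installed, not the procedure actually used for this model; both helper names are hypothetical.

```python
def converter_cli_args(model_id, output_dir, quantization):
    """Build the ct2-transformers-converter command as an argument list."""
    return [
        "ct2-transformers-converter",
        "--model", model_id,
        "--output_dir", output_dir,
        "--quantization", quantization,
    ]


def convert_with_python_api(model_id, output_dir, quantization):
    """Run the equivalent conversion through the CTranslate2 Python API."""
    # Imported lazily so converter_cli_args works without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_id)
    # convert() writes the converted model into output_dir.
    return converter.convert(output_dir, quantization=quantization)
```

For this model the arguments would be "thennal/whisper-medium-ml", "vegam-whisper-medium-ml-int8_float16" and "int8_float16".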

Many Thanks to

  • Creators of CTranslate2 and faster-whisper
  • Thennal D K
  • Santhosh Thottingal