Don't get pauses back

#13
by daniel8845 - opened

When I use return_timestamps='word', I only get back word timestamps, and the pauses are all included in the words' spans at the start timestamp. How can I get the pauses back as separate elements?

--- Debug Output for Chunk 0-9 ---
Original Word: 'Das', Start: 0.000 s, End: 0.340 s
Original Word: ' Interessante', Start: 0.340 s, End: 1.040 s
Original Word: ' war,', Start: 1.040 s, End: 1.320 s
Original Word: ' dass', Start: 1.320 s, End: 1.460 s
Original Word: ' ich', Start: 1.460 s, End: 1.740 s
Original Word: ' über', Start: 1.740 s, End: 2.320 s
Original Word: ' eine', Start: 2.320 s, End: 2.960 s
Original Word: ' [UH]', Start: 2.960 s, End: 3.720 s
Original Word: ' über', Start: 3.720 s, End: 4.460 s
Original Word: ' meine', Start: 4.460 s, End: 4.720 s
Original Word: ' Bachelor', Start: 4.720 s, End: 5.100 s
Original Word: ' Arbeit,', Start: 5.100 s, End: 5.460 s
Original Word: ' die', Start: 5.460 s, End: 5.780 s
Original Word: ' ich', Start: 5.780 s, End: 5.980 s
Original Word: ' im', Start: 5.980 s, End: 6.100 s
Original Word: ' Bereich', Start: 6.100 s, End: 6.480 s
Original Word: ' von', Start: 6.480 s, End: 6.840 s
Original Word: ' Raumkognition', Start: 6.840 s, End: 8.200 s
Original Word: ' mit', Start: 8.200 s, End: 8.560 s
Original Word: ' da.', Start: 8.560 s, End: 8.880 s

Which transformers version are you using?
Could you try installing this fork of transformers:

pip install git+https://github.com/nyrahealth/transformers.git@crisper_whisper

and then try running it with that to see if it changes anything?

Thanks so much for your help, now the timestamps are spot on. Out of curiosity, what's different in this version?

nyra health org

I am glad this fixed it. Which transformers version did you use before? I have not looked at the latest transformers version yet, but they must have changed something in the timestamping logic. This fork additionally takes tokens that have no real acoustic representation, such as punctuation, out of the DTW alignment, and it makes sure words and pauses are handled separately, which was originally also the case in transformers, at least in 4.37.2. :)
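If you want the pauses as explicit entries rather than as gaps between consecutive words, a minimal post-processing sketch could look like the one below. It assumes the word chunks arrive as a list of dicts with "text" and "timestamp" (start, end) keys, which is how the transformers ASR pipeline returns them under "chunks". The function name insert_pauses, the min_pause threshold, and the "[PAUSE]" label are hypothetical and not part of the fork, and the sample timestamps are made up for illustration.

```python
from typing import Any, Dict, List


def insert_pauses(chunks: List[Dict[str, Any]], min_pause: float = 0.05) -> List[Dict[str, Any]]:
    """Return a new list in which gaps between consecutive words
    appear as explicit pause entries.

    Assumption: each chunk looks like
    {"text": " word", "timestamp": (start_seconds, end_seconds)}.
    """
    result: List[Dict[str, Any]] = []
    prev_end = None
    for chunk in chunks:
        start, end = chunk["timestamp"]
        # Emit a pause entry when the gap to the previous word is long enough.
        if prev_end is not None and start is not None and start - prev_end >= min_pause:
            # "[PAUSE]" is a hypothetical label chosen for this sketch.
            result.append({"text": "[PAUSE]", "timestamp": (prev_end, start)})
        result.append(chunk)
        # The end of the last chunk can be None; keep the previous end in that case.
        if end is not None:
            prev_end = end
    return result


# Made-up timestamps, only to show the shape of the input and output.
example = [
    {"text": "Das", "timestamp": (0.0, 0.34)},
    {"text": " Interessante", "timestamp": (0.80, 1.04)},
    {"text": " war,", "timestamp": (1.04, 1.32)},
]

for entry in insert_pauses(example):
    print(entry["text"], entry["timestamp"])
```

This only helps when the word timestamps actually leave gaps, as they do with the fork above. When each word starts exactly where the previous one ends, as in the debug output at the top, no pause entries are produced.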
