[Notes] OpenAI Whisper for Video Subtitles

  • Updated:2 years ago

These are just my notes, taken for my own use while trying to figure out how to use OpenAI Whisper to:

  1. Automatically transcribe videos that have no subtitles
  2. Get good results even if the speakers have accents, speak fast, or are hard to hear
  3. Translate videos from other languages to English

See this post for a less-cluttered list of useful tools

OpenAI Whisper via Google Drive > Colaboratory

April 5th, 2023 – I just found this tool through a YouTube video, followed his instructions, and immediately transcribed a podcast while I was watching (confirming his instructions are accurate).

Easily transcribe videos (particularly from sites that have no subtitle files: Telegram/Rumble/Brighteon/BitChute and other censored platforms).
It can also help us automatically translate and transcribe all those awesome non-English truth videos, which opens up the world.

This guy explains exactly step-by-step how to use Google Drive (Colaboratory) to install and use Whisper AI right from your browser without anything to download.

Keep in mind:

  1. Download the text straight away (the session times out and deletes your files after a certain timeframe).
    • So don’t load up and leave the house; just download the text/subtitles as soon as they’re ready.
    • If you time out, you have to re-run the !whisper code (about 30 seconds) and re-upload your file.
  2. Google can decide at any moment to stop allowing GPU use without payment, so I’ll list alternatives to Google Colab below as I come across them.

Alternatives: Other Online versions (alternatives to Google Colab version)

Alternative: for YouTube or really crappy Subtitles:

I’ve been using ChatGPT to clean up crappy YouTube subtitles and make them more accurate.

When using ChatGPT to clean up YouTube’s god-awful auto-generated subtitles, my “go-to” commands are generally one of the following:

  • Please correct punctuation and spelling of the following subtitles: “your crappy subtitle text here”
  • Please summarize the following text: “ ”
  • Please provide key highlights of the following subtitles, in bullet-point form: “ ”
  • Please correct grammar for the following subtitles: “ ”
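If the subtitles come as an .srt file, the timestamps and cue numbers only waste space in the chat box. Here is a minimal Python sketch, assuming SRT-formatted input, that strips them and wraps the remaining text in one of the prompts above. The helper names `srt_to_text` and `build_prompt` are made up for illustration and are not part of any library.

```python
import re

# Prompt templates taken from the go-to list above.
PROMPTS = {
    "punctuation": "Please correct punctuation and spelling of the following subtitles:",
    "summary": "Please summarize the following text:",
    "highlights": "Please provide key highlights of the following subtitles, in bullet-point form:",
    "grammar": "Please correct grammar for the following subtitles:",
}

# Matches SRT timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")


def srt_to_text(srt: str) -> str:
    """Drop cue numbers, timing lines and blank lines; join the rest into one line.

    Caveat: a subtitle line consisting only of digits is treated as a cue number.
    """
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMING.match(line):
            continue
        kept.append(line)
    return " ".join(kept)


def build_prompt(kind: str, subtitles: str) -> str:
    """Wrap the subtitle text in one of the go-to prompts."""
    return f'{PROMPTS[kind]} "{subtitles}"'
```

For example, `build_prompt("punctuation", srt_to_text(open("video.srt", encoding="utf-8").read()))` gives a single block of text ready to paste (`video.srt` is a placeholder filename).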

Alternative: (for Very Large files) VB-CABLE Virtual Audio & Dictation Pro

An alternative to Whisper AI for larger videos (for example, if you don’t want to upload a 3-hour video file): the same guy (Kevin Stratvert) has another video on how to use Windows’ own built-in Dictation Pro software with Virtual Audio Cable (both free). You have to run it on your own computer, but it might be a life-saver for someone. However, it looks to me like it transcribes in real time, i.e. a 3-hour video will take 3 hours of playtime. All the other apps transcribe the video in minutes, but their downside is that you sometimes have to make the file smaller, so this method might just be easier in some circumstances.

Alternative: Computer Apps using Whisper AI to get subtitles

Python & WhisperAI

From what I have learnt, I’ll use these commands the most using Python:

To transcribe a video and output just a text file:

whisper "test file.mp4" --model small.en --output_format txt --task transcribe --length_penalty 0 --device cuda

To convert a Spanish mp4 into mp3, then translate it into English subtitles & text:

ffmpeg -i "deadwhistleblower.mp4" -q:a 0 -map a deadwhistleblower.mp3

whisper "deadwhistleblower.mp3" --model tiny --language Spanish --task translate --patience 2 --device cuda

To transcribe a Spanish video into Spanish text (so a different program can translate it if the Whisper translation is rubbish):

whisper "deadwhistleblower.mp4" --model tiny --language Spanish --output_format txt --task transcribe --patience 2 --device cuda
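The three whisper commands above differ only in a handful of flags, so they can be generated from one small Python helper. This is just a sketch: it assumes the `whisper` command is on the PATH, the function names `whisper_command` and `run_whisper` are my own, and it only covers the flags used in these notes (it leaves out `--length_penalty`).

```python
import subprocess
from typing import List, Optional


def whisper_command(
    media: str,
    model: str = "small.en",
    task: str = "transcribe",
    language: Optional[str] = None,
    output_format: Optional[str] = None,
    patience: Optional[float] = None,
    device: str = "cuda",
) -> List[str]:
    """Build the argument list for the whisper CLI (flags as used in the notes above)."""
    cmd = ["whisper", media, "--model", model, "--task", task, "--device", device]
    if language:
        cmd += ["--language", language]
    if output_format:
        cmd += ["--output_format", output_format]
    if patience is not None:
        cmd += ["--patience", str(patience)]
    return cmd


def run_whisper(media: str, **options) -> None:
    """Run whisper and raise CalledProcessError if it exits with an error."""
    subprocess.run(whisper_command(media, **options), check=True)
```

Passing the arguments as a list means filenames with spaces (like "test file.mp4") need no extra quoting. The Spanish-to-English example would be `run_whisper("deadwhistleblower.mp3", model="tiny", language="Spanish", task="translate", patience=2)`.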

Convert existing mp4 into mp3 in seconds

  • Browse to folder with mp4 file
  • CMD
  • basic use:
    • ffmpeg -i example.mp4 example.mp3

For my use:

ffmpeg -i "testvideo.mp4" -q:a 0 -map a testvideo.mp3
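To convert a whole folder instead of one file at a time, the same ffmpeg call can be looped from Python. A rough sketch, assuming `ffmpeg` is on the PATH; `mp3_command` and `convert_folder` are hypothetical helper names.

```python
import subprocess
from pathlib import Path
from typing import List


def mp3_command(video: Path) -> List[str]:
    """Same ffmpeg call as above: best VBR quality (-q:a 0), audio streams only (-map a)."""
    return ["ffmpeg", "-i", str(video), "-q:a", "0", "-map", "a", str(video.with_suffix(".mp3"))]


def convert_folder(folder: str) -> None:
    """Convert every .mp4 in the folder that does not already have an .mp3 next to it."""
    for video in sorted(Path(folder).glob("*.mp4")):
        if not video.with_suffix(".mp3").exists():
            subprocess.run(mp3_command(video), check=True)
```

Calling `convert_folder(".")` from the folder with the videos skips anything already converted, so it is safe to re-run.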

Split Videos in seconds (to less than 25 MB for use with Whisper)

Might as well learn how to do this too (as a faster option to my video editing software)

Instructions: timestamp 10:20 from first video (YouTube)

  1. Using https://github.com/mifi/lossless-cut/releases
    • For me: https://github.com/mifi/lossless-cut/releases/download/v3.54.0/LosslessCut-win-x64.7z
  2. Open LosslessCut exe
  3. Drag video into app
  4. Change timestamp
  5. Export the cut
    • (WOW, that was so fast… normally this takes a long time with video editing software, it split the video in seconds!)
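LosslessCut is a GUI, but the "how long should each piece be" arithmetic can be sketched in Python, together with a scripted alternative. Note that this swaps in ffmpeg's segment muxer rather than LosslessCut itself. It assumes a roughly constant bitrate and leaves a 10% safety margin, because stream-copy cuts land on keyframes and pieces can come out a little longer than requested. The function names and the `part_%03d.mp4` output pattern are my own choices.

```python
import math
from typing import List


def segment_seconds(file_bytes: int, duration_s: float,
                    limit_mb: float = 25.0, margin: float = 0.9) -> int:
    """Approximate segment length (whole seconds) so each piece stays under limit_mb.

    Assumes the bitrate is roughly constant across the file.
    """
    target_bytes = limit_mb * 1024 * 1024 * margin
    pieces = max(1, math.ceil(file_bytes / target_bytes))
    return max(1, math.floor(duration_s / pieces))


def split_command(video: str, seconds: int) -> List[str]:
    """ffmpeg segment muxer with stream copy (no re-encode), so it runs in seconds."""
    return [
        "ffmpeg", "-i", video, "-c", "copy",
        "-f", "segment", "-segment_time", str(seconds),
        "-reset_timestamps", "1", "part_%03d.mp4",
    ]
```

For a 100 MB, one-hour file this works out to five pieces of about 12 minutes each; the resulting list can be run with `subprocess.run(split_command("video.mp4", 720), check=True)`.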

WhisperAI Wishlist

ChatGPT wrote a Whisper command for me, but it used options that Whisper doesn’t support. I’m keeping them here as a must-have wish-list:

 --punctuate

--filter_outputs "uh, um, like, and and, I I, you know, like, basically, actually, sort of, kind of, hmm, mm, mhm, mmm, oh" 
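Until something like that exists, the filtering can be approximated after the fact on the .txt output. Here is a small Python sketch of the idea, not a Whisper feature. `strip_fillers` is a made-up name, and the default list is only a subset of the wish-list: words such as "like", "basically" and "actually" are left out on purpose because removing them blindly would wreck sentences like "I like it".

```python
import re
from typing import Iterable

FILLERS = ["you know", "sort of", "kind of", "uh", "um", "hmm", "mhm", "mmm", "mm", "oh"]


def strip_fillers(text: str, fillers: Iterable[str] = FILLERS) -> str:
    """Remove filler words/phrases and collapse immediately repeated words ("I I" -> "I")."""
    # Longest phrases first so multi-word fillers are tried before short ones.
    alternatives = "|".join(re.escape(f) for f in sorted(fillers, key=len, reverse=True))
    # A whole-word filler plus an optional trailing comma and following spaces.
    text = re.sub(rf"\b(?:{alternatives})\b,?\s*", "", text, flags=re.IGNORECASE)
    # Collapse doubled words such as "and and" or "I I".
    text = re.sub(r"\b(\w+)(\s+\1\b)+", r"\1", text, flags=re.IGNORECASE)
    # Tidy leftover whitespace.
    return re.sub(r"\s{2,}", " ", text).strip()
```

For example, `strip_fillers("Um, so I I think, you know, it was kind of late and and we uh left.")` returns `"so I think, it was late and we left."`. Capitalisation of the new first word is not repaired.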

Download Videos in seconds

Instructions: timestamp 07:48 on this video (YouTube), or basic instructions at 01:35 (YouTube) and GUI instructions (YouTube). You’ll need FFMPEG as well (listed further up the post in the !whisper instructions).

To use GUI version

  • Move all 3 to same folder
  • Launch yt-dlp-gui
  • Enter url > analyze
    • Select which video resolution you want
    • Select which audio quality you want
    • Select or uncheck thumbnail
    • Select output directory
    • Download

To use CMD version

  • Move all 3 to same folder
  • CMD from folder
  • yt-dlp https://youtu.be/7QPbfKDOkO4
  • Downloads the video in seconds
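Since Whisper only needs the audio, yt-dlp can be asked to keep just that. A minimal Python sketch, assuming `yt-dlp` and `ffmpeg` are on the PATH (or in the same folder); `-x` and `--audio-format` are yt-dlp's own options, while the function names are hypothetical.

```python
import subprocess
from typing import List


def ytdlp_audio_command(url: str, audio_format: str = "mp3") -> List[str]:
    """yt-dlp call that extracts only the audio track (-x), converted via ffmpeg."""
    return ["yt-dlp", "-x", "--audio-format", audio_format, url]


def download_audio(url: str) -> None:
    """Run yt-dlp and raise CalledProcessError if the download fails."""
    subprocess.run(ytdlp_audio_command(url), check=True)
```

`download_audio("https://youtu.be/7QPbfKDOkO4")` should leave an mp3 in the current folder, ready to pass straight to whisper.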

Filtering/adjusting CMD version – go to this post: [CMD] Download videos & snippets in seconds

Must-use resources for researchers are on these posts:

Penny (PennyButler.com)

Truth-seeker, ever-questioning, ever-learning, ever-researching, ever delving further and deeper, ever trying to 'figure it out'. This site is a legacy of sorts, a place to collect thoughts, notes, book summaries, & random points of interests.