(Semi) Automatic Captioning Using YouTube

Automatic captioning (also known as subtitling) of videos to make them accessible to Deaf or hearing-impaired viewers – sounds great, doesn’t it? Google has recently introduced a system on YouTube which claims to auto-caption any English speech using built-in speech recognition.

Captioning a video can be a long and expensive process, so an automatic method should ensure more videos can be captioned. One of BRITE’s assistive technologists, Fil McIntyre, reports his findings and provides tips to get started with captioning…

I tested a few videos but found the results were far from satisfactory and could not be relied upon to give an accurate transcript of the speech. It seems that unless the speaker has very clear and precise speech (Barack Obama is a good example), the auto-captioning is inaccurate. Any background noise can also reduce the accuracy. But all is not lost! There is a semi-automatic method to caption videos.

For any video you upload, YouTube also allows you to upload a transcript of that video. It then uses its built-in speech recognition to match the text of the transcript to the speech in the video. This works very well: you don’t need to work out any timings for the transcript, but you still get an accurate and correctly timed representation of what is being said. Producing the transcript can take some time, but nowhere near as long as entering all the timings yourself as well.
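To give a sense of the timing work that the transcript-matching step saves, here is a minimal sketch of what hand-timed captions involve. It assumes the widely used SubRip (.srt) cue layout, which this post does not otherwise cover. The helper names and the cue times are made up for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT-style timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index, start, end, text):
    """Build one caption cue: a number, a start/end time line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    # Hypothetical timings: every line of speech needs its own start and end.
    print(srt_cue(1, 0.0, 2.5, "We think we know how he did it."))
    print(srt_cue(2, 2.5, 6.0, "Oh, Howie couldn't have done it."))
```

With a plain transcript, none of these start and end times have to be worked out by hand; only the text itself is typed.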

In short – any video you can upload to YouTube, you can also easily provide captions for. If you use videos in education, this is a simple way to provide access for students who are D/deaf or who have a hearing loss. It may also benefit students who have English as a second language.

When someone views the video, they will have the option to click the CC button at the bottom of the player and choose to view the captions.

Some further guidance on the practical aspects of captioning is provided below:

(Updated 14.01.14 due to YouTube interface changes)

  1. You must have uploaded the video to your YouTube account.
  2. Type out the transcription and save it as a .txt file. You can do this in any word processor (e.g. Microsoft Word), but be sure to change the file format to .txt or plain text.
  3. In your YouTube account, go to Video Manager, then click Edit next to your video.
  4. Under the Captions tab, click Add a new track, then Upload a file.
  5. Browse to your transcription file, then click Open.
  6. Once uploaded, click on English (United Kingdom), then under Actions select Activate.
  7. You may also want to disable the Automatic Captions.


  • If there are significant non-verbal sounds, type them in square brackets at the appropriate points in the text, e.g. [laughter], [crash].
  • If there is more than one speaker on (or off) screen, label who is speaking, e.g. Frank: We think we know how he did it. Sally: Oh, Howie couldn’t have done it. He hasn’t been in for weeks.
  • Standard captioning will use colours to differentiate between speakers.  If there is lots of conversation in your video you may want to think about getting it professionally captioned rather than using this method.
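If you already have the dialogue in some structured form, the conventions above can be applied with a short script. The following is a minimal sketch rather than a required workflow. The function names, the (speaker, text) input layout, and the choice of one line per speaker turn are all assumptions made for illustration.

```python
from pathlib import Path


def format_transcript(segments):
    """Turn (speaker, text) pairs into plain transcript text.

    A speaker of None marks a non-verbal sound, written in square brackets.
    """
    lines = []
    for speaker, text in segments:
        text = text.strip()
        if speaker is None:
            lines.append(f"[{text}]")          # e.g. [laughter]
        else:
            lines.append(f"{speaker}: {text}")  # e.g. Frank: ...
    return "\n".join(lines) + "\n"


def save_transcript(text, path):
    """Save the transcript as a plain-text (.txt) file in UTF-8."""
    Path(path).write_text(text, encoding="utf-8")


if __name__ == "__main__":
    segments = [
        ("Frank", "We think we know how he did it."),
        ("Sally", "Oh, Howie couldn't have done it. He hasn't been in for weeks."),
        (None, "laughter"),
    ]
    print(format_transcript(segments), end="")
```

Calling save_transcript(format_transcript(segments), "transcript.txt") would then produce a file ready to upload in step 5.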

Fil McIntyre 

Read Fil’s ICT Tip of the Month in the Equipment Loan Bank area of the BRITE website.
