New YouTube Auto Captioning Aids Search, Usability, and the Hearing Impaired

MOUNTAIN VIEW, CA – YouTube recently added automatic captioning to its videos, a move with far-reaching implications for the deaf and hard-of-hearing, international users, and publishers who want to improve search optimization.

I traveled down to Google’s headquarters yesterday to meet with Ken Harrenstien, a software engineer at Google who helped develop the system.

While YouTube has offered captioning capabilities since 2006, the new features make adding captions easier on some videos and also let creators sync the captions to the spoken words in a video, Harrenstien explained through an interpreter. (His wife Lori served as his interpreter during our visit.)

In addition, the captions can be translated into 51 languages, making many English-language videos accessible around the world.

But there’s also a strong business case to be made for captions because they improve the searchability of videos. “Because captions are text — guess what? You can search them. And at Google we can use that to find out where exactly in a video there is a short snippet,” he explained.
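To illustrate why captions make video searchable, here is a minimal sketch in Python. It is not YouTube's or Google's actual implementation: the `Caption` class, the `find_snippets` function, and the sample caption data are all hypothetical. The point is only that each caption carries a start time, so a text match also gives a position in the video to jump to.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    """One caption cue (hypothetical structure for illustration)."""
    start: float  # seconds from the beginning of the video
    end: float
    text: str


def find_snippets(captions, query):
    """Return the captions whose text contains the query, ignoring case."""
    needle = query.lower()
    return [c for c in captions if needle in c.text.lower()]


# Made-up captions for a short physics lecture.
captions = [
    Caption(0.0, 4.0, "Welcome to the physics lecture."),
    Caption(4.0, 9.5, "Today we cover Newton's second law."),
    Caption(9.5, 14.0, "Force equals mass times acceleration."),
]

for hit in find_snippets(captions, "newton"):
    # A player could seek to hit.start to jump straight to this snippet.
    print(f"{hit.start:.1f}s: {hit.text}")
```

A real search system would index the text rather than scan it linearly, but the mapping from matched text to a timestamp is the same idea.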

Adding captions can help grow views for publishers, he said. “Hopefully it will make it easier to add captions and I think very soon we will start to notice that it really makes a difference in how many people watch their videos.”

The technology works by linking Google’s automatic speech recognition technology with the YouTube caption system, Harrenstien explained. The automatic captioning is only available on a few channels, including PBS, National Geographic, and Duke University. In time, the goal is to expand that capability to all videos, we are told.

But any YouTube creator can use the automatic timing tool, which lets a user upload a text transcript when posting a video. YouTube then matches the spoken words to the written ones to create captions, Harrenstien explained.
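The automatic timing idea can be sketched roughly as follows. This is an illustration only, not Google's method: it assumes a hypothetical speech recognizer has already produced (word, start time) pairs, and it uses Python's standard `difflib.SequenceMatcher` to line the uploaded transcript up with those recognized words. The function name `align_transcript` and the sample data are invented for the example.

```python
from difflib import SequenceMatcher


def align_transcript(transcript_words, recognized):
    """Attach start times to transcript words.

    recognized: list of (word, start_seconds) pairs, assumed to come from
    some speech recognizer (hypothetical input). Transcript words with no
    matching recognized word get None as their time.
    """
    typed = [w.lower() for w in transcript_words]
    heard = [w.lower() for w, _ in recognized]
    matcher = SequenceMatcher(a=typed, b=heard, autojunk=False)

    times = [None] * len(transcript_words)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            # Copy the recognizer's timestamp onto the matching typed word.
            times[block.a + offset] = recognized[block.b + offset][1]
    return list(zip(transcript_words, times))


# Made-up example: the recognizer misheard "and" as "an".
transcript = ["Hello", "and", "welcome", "to", "YouTube"]
recognized = [("hello", 0.0), ("an", 0.4), ("welcome", 0.7),
              ("to", 1.1), ("youtube", 1.3)]

for word, start in align_transcript(transcript, recognized):
    print(word, start)
```

The typed transcript supplies the correct words and the recognizer supplies only the timing, which is why a misheard word does not corrupt the caption text. A fuller version would interpolate times for unmatched words from their neighbors.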

From a personal standpoint, Harrenstien said he has dreamed about this technology for a long time. With three young children, he doesn’t have much time to watch videos himself, but he said he can now understand the ones they are watching.

Here’s a story on the new captioning program by Miguel Helft of The New York Times.

Daisy Whitney, Senior Producer

Update 12/21:  We created the transcript below and uploaded it to YouTube, which generated the automated captions in the YouTube video we have posted here.  It came out really well.

Video Transcript

Daisy Whitney:  Hey there, I’m Daisy Whitney reporting for Beet.TV and NBC Bay Area here at Google’s headquarters. Google recently introduced automatic captioning for its videos on YouTube and we’re going to talk about what that means and what it means for users with a software engineer.

So Ken, you have some interesting updates with what’s happening with YouTube and captions. Can you walk us through some of the new products, updates that you have?

Ken Harrenstien:  Sure. Two things we just announced a few weeks ago: 1) automatic captioning and 2) automatic timing. And to me it’s some of the most important things that we’ve done so far. Obviously the captions themselves are very important for many reasons. Would you like me to explain the reasons?

Daisy Whitney: Yes I would.

Ken Harrenstien:  Sure. Okay. Well obviously we need captions for someone like myself when I watch video. Two, there’s other people who speak a different language but can understand English if they read it or another language, and the captions allow us to translate the language to theirs. So it’s not just access to me, a deaf person, but it’s access for the world. The third, we have people who, for whatever reason, like to have the sound off. Maybe they’re busy, maybe they’re on a train, on a bus, maybe it’s at night and they don’t want to wake up somebody.

And fourth, because captions are text, guess what, you can search on that text. And at Google, we can use that, for example, to find exactly where in the video you have a special snippet and jump straight to that part of the video. And for a short video maybe it doesn’t matter, but for a movie, ooh wow, it really helps a lot. And if you’re searching for a class lecture on physics, maybe you want to skip the boring parts and go straight to what you need to know.

Daisy Whitney:  So it improves searchability on YouTube?

Ken Harrenstien:  Yes. Now, I need to follow up. Two things I want to add. Basically it makes it easier to create captions for videos because we have so much content now. The number is every minute we have 20 hours of more video being uploaded to YouTube. That’s a lot. I mean, a lot of videos are already captioned, but it’s just a very small amount compared to what’s coming in.

So I figured, fortunately the timing is right to hook up a lot of important things at Google and hook them together to make this happen. We linked speech recognition to the technology. Suppose you have a video; it doesn’t yet have captions. The captions are uploaded by the people who own the video. I think it’s easy, but for many people it’s not easy. They have to type the words, they have to find the time, they have to upload the file, so some people don’t bother. But now, if we can recognize the speech in the video, we can generate captions for you. That’s early–it’s only available on a few educational channels and government channels and we’re trying to roll it out as fast as we can.

But the second thing, automatic timing. Suppose okay you’re willing to type in the words for the transcript–that’s all you have to do. You upload that file and we will use speech recognition to find out when those words were spoken and change them into captions. We actually use that technique ourselves at Google for a lot of YouTube videos.

Daisy Whitney:  So if I upload this video to YouTube, what do I need to do as a user? Do I need to create a transcript and upload that transcript so that there can be captions for this video for instance?

Ken Harrenstien:  Okay, there’s two things you can do. First, if you were one of the partners for whom this is enabled, then we will do speech recognition for you. It won’t be perfect, but you can take that, fix it up, and upload it as real captions and it will be perfect. Or the second thing we can do. Often you can write down the transcript of what people say in the video, upload that file, and we will do the rest. And I hope you do!

Daisy Whitney:  Who are your partners right now for whom you do the automatic captioning? And is your goal at some point to do automatic captioning for every video? Is that even possible?

Ken Harrenstien:  Obviously because of our, I mean Google’s, mission statement, we want to make everything accessible to everyone. And we will get there someday, maybe. But for now, the list of partners, we have that on our blog post. We just recently added a lot more. Sorry I forget the list.

Daisy Whitney:  We’ll get it. We’ll look it up. What do you see as the value for publishers and content owners?

Ken Harrenstien:  Hopefully we’ll make it so much easier for them to add captions and then they will. And if they can’t, for whatever reason, we will do the best we can. And I think very soon we will start to notice that it really makes a difference in how many people watch their videos.

Daisy Whitney:  More views. What does it mean for you personally?

Ken Harrenstien: Well I’ve dreamed about this for many years of course. It’s a funny thing, I don’t have a lot of time to watch videos myself because I have three small children, but if they want to watch something, I like to understand what they’re watching too. So it’s great.

Daisy Whitney:  That is great. Thank you so much.

Posted on 12/17/2009 at 10:40 PM by Daisy Whitney
