Here's a screenshot taken while playing the video in the previous post.
Note the menu, which appeared when I clicked on the CC button. And after clicking the Transcribe Audio option, just like that, subtitles started appearing. (Added on the fly, in real time? Or computed once and stored in a separate channel of sorts? Gotta be the latter.)
You can see an error right there in the screen grab, as it happens -- "eighty" should have been "any." [Added: and actually, the number of errors throughout is not acceptable by any standard.] Still, though, be interesting to see where this goes, and how well it does with time. Speech recognition is a really hard nut to crack, but I'm curious to see what the Google crew will come up with. Maybe instead of coming up with the genius AI algorithm, the problem will succumb (with high-90s accuracy, at least) to a massive statistical approach. Who knows? I wonder, also, if they have some plan like reCAPTCHA or Image Labeler to get the online crowd to help?