From: Isaac Koi <isaackoi.nul>
Date: Fri, 9 Nov 2012 13:12:55 +0000
Archived: Fri, 09 Nov 2012 22:33:52 -0500
Subject: Transcribing UFO Podcasts And Documentaries

A while ago I mentioned I was interested in including PDF transcripts of podcasts in my growing library of digitised UFO material.

I thought I'd report back to this List about a bit of experimenting I conducted with a few options. The results were rather disappointing.

After a bit of searching, it seemed that the two most promising pieces of software were Dragon Naturally Speaking and Adobe Soundbooth's Speech Search transcription function.

(1) Dragon Naturally Speaking:

Dragon Naturally Speaking is discussed on various websites, including at:

http://tinyurl.com/b8zoskx
http://www.youtube.com/watch?v=kqEB7ju9HrI

In short, after doing some reading regarding Dragon Naturally Speaking, it seems that the software needs to be trained to the voice of a particular individual, by that individual reading out various passages of text. This is, of course, a considerable problem when seeking a means to transcribe large archives of podcasts featuring various presenters and guests.

(2) Adobe Soundbooth:

I therefore turned to Adobe Soundbooth's "Speech Search" transcription function, which is discussed online at, amongst other places, this link:

http://tinyurl.com/y9an2tk

I used that software to create a transcript of Episode 400 of EBK's Strange Days Indeed... podcast. That sample transcript is at:

http://minus.com/mbqBwo3Ipc/

As can be seen from that sample, the results are basically unusable, and I'm not convinced that correcting that transcript would be quicker than starting from scratch and simply transcribing a podcast manually.

(3) Using Youtube's automated "closed caption" system:

While looking into the above two pieces of software, I came across some material on Youtube's automatic "closed caption" system.
Basically, it is possible to upload any video or podcast to Youtube and use Youtube's automated system to create a transcript of that material. The minus.com link above includes a second sample transcript of Episode 400 of EBK's "Strange Days Indeed" podcast, created using this method. (The relevant episode of Strange Days Indeed was uploaded to a private folder, so it was not made publicly available. Errol is happy for me to share the sample transcripts created by the various methods outlined above.)

While this method has the advantage of being free and very easy to use, the results are - once again - very disappointing and almost unusable, as can be seen from the Youtube sample transcript at the link above.

(4) Downloading Youtube's "Closed Captions":

One of the more interesting incidental discoveries I made during the above exercises in transcribing podcasts is that the closed captions that can be displayed on almost all Youtube videos (by clicking on the "CC" at the bottom right of each video and selecting "Transcribe Auto") can be downloaded quickly and easily using various methods. The simplest and most reliable that I've found so far is the one at this link:

http://tinyurl.com/ahbb4wo

i.e.:

  - Open the video page in the Chrome browser (or any other browser that provides HTTP debugging/Developer Tools) and pause the video.
  - Right click anywhere on the page and click on Inspect Element, OR hit the F12 function key.
  - Click on the Network tab.
  - Under the Network tab, look for an item called timedtext.
  - Right click on it and open that file in a new tab.
  - An XML file containing subtitles with their timestamps (the stuff inside of <>) opens up.

You can then convert the transcript into numerous formats using free software, e.g. "Subtitle Edit" at:

http://www.nikse.dk/

Since many UFO documentaries are already on Youtube, I thought this could be a great way of quickly building up a library of transcripts of documentaries.
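
For anyone who would rather script the conversion of the timedtext file than use a subtitle editor, the following is a rough Python sketch and not a tested tool. It assumes the downloaded XML consists of <text> elements carrying a "start" attribute in seconds, with the caption as the element's text and HTML entities escaped a second time inside it. That layout is an assumption about the file and not something confirmed by the link above, and the function names and sample data are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET
from typing import List


def format_timestamp(seconds: float) -> str:
    """Turn a start time in seconds into m:ss, e.g. 65.0 -> '1:05'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def timedtext_to_lines(xml_source: str) -> List[str]:
    """Flatten a timedtext-style XML string into '[m:ss] caption' lines.

    Assumed layout (hypothetical): <text start="seconds">caption</text>
    elements anywhere under the root element.
    """
    root = ET.fromstring(xml_source)
    lines = []
    for node in root.iter("text"):
        start = float(node.get("start", "0"))
        # Captions are assumed to be entity-escaped twice, so unescape
        # once more after the XML parser has done its own pass.
        caption = html.unescape(node.text or "").replace("\n", " ").strip()
        if caption:
            lines.append(f"[{format_timestamp(start)}] {caption}")
    return lines


# Hand-written sample, not real Youtube output.
SAMPLE = """<transcript>
<text start="1.2" dur="2.0">first caption</text>
<text start="65.0" dur="3.1">it&amp;#39;s the second caption</text>
</transcript>"""

if __name__ == "__main__":
    for line in timedtext_to_lines(SAMPLE):
        print(line)
```

Saving the output as a plain text file gives something that can be searched and corrected by hand. The timestamps are kept so that each line can be checked against the video.
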
Unfortunately, the automated transcripts on Youtube videos are generally _very_ poor and hence not much use. They can, however, be quite amusing. For example, if you turn on the automated captions on Youtube's copy of de Caro's "Special Assignment" on CNN at the link below, you get treated to mis-transcriptions such as "that silly UFO" (4:47) and "crap landing" (5:16), plus my favourite - when Airman Greg Battram ("Airman Greg") states on the video that "I think I saw a UFO, some kind of spaceship from someplace not of this Earth", the automated captions for the final bit say "someplace closer" (4:41). Preparing my own transcript (for ease of future reference and searches), using that automated transcript as a starting point, turned out to be rather time-consuming due to the errors and missing sections in the automated transcript:

http://www.youtube.com/watch?v=sPkoqYXhyik

Many of the comments on Youtube's automated closed captions include the word "FAIL". In fact, the weakness of Youtube's automated transcriptions is the source of some humour in the Youtube video at the link below:

http://www.youtube.com/watch?v=hVNrkXM3TTI

In a small number of cases, the transcripts on UFO documentaries on Youtube are very good because someone has uploaded a human transcription. But a bit of random sampling of UFO videos on Youtube suggests this applies to, oh, less than 1 percent of videos, so it only helps with a very small number of UFO documentaries (as does - so far as I've seen - downloading transcripts of UFO documentaries from Livedash.com and torrent websites).

All-in-all, I found these results very disappointing.

I'll post something based on the above on the AboveTopSecret.com forums shortly, since there are many technically capable members on there who may be aware of some service or software that can improve on the poor results mentioned above.

All the best,

Isaac

Listen to 'Strange Days... Indeed' - The PodCast
At: http://www.virtuallystrange.net/ufo/sdi/program/

These contents above are copyright of the author and UFO UpDates - Toronto. They may not be reproduced without the express permission of both parties and are intended for educational use only.
UFO UpDates - Toronto - Operated by Errol Bruce-Knapp