Lamb, Andrew and Hernandez, Jose and Ullman, Jeffrey and Paepcke, Andreas (2016) Portrait of an Indexer—Computing Pointers Into Instructional Videos. Technical Report. Stanford InfoLab.
This is the latest version of this item.
PDF (Automated creation of indexes into videos via closed caption files)
We examine algorithms for creating indexes into ordered series of instructional lecture video transcripts. The goal is for students and industry practitioners to use the indexes for review or reference. Lecture videos differ from commonly studied document collections, such as newspaper articles, in that the transcript ordering generally reflects pedagogical intent. One challenge is therefore to identify where a concept is primarily introduced, and thus where the resulting index should direct students. The commonly applied TF-IDF approach is misled in this context by artifacts such as worked examples, whose associated vocabulary may dominate a lecture but should not appear in a good index. We contrast the TF-IDF approach with algorithms that consult Wikipedia documents to vouch for term importance. This method helps filter out the harmful artifacts. We measure the algorithms against three human-created indexes over the 90 lecture videos of a popular database course. We find that (i) humans have low inter-rater reliability, whether or not they are experts in the field, and that (ii) in its agreement with the human indexers, one of the examined algorithms approaches the inter-rater reliability among the humans themselves.
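To make the contrast in the abstract concrete, the following is a minimal sketch, not the report's actual implementation. It scores terms per transcript with plain TF-IDF and then optionally keeps only terms found in a vetted vocabulary. The `vetted` set is a hypothetical stand-in for the Wikipedia lookup; the function names and the toy transcripts (including the "sailor" worked example) are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tfidf_scores(transcripts):
    """Return one {term: tf-idf} dict per transcript (each a list of tokens)."""
    n = len(transcripts)
    # Document frequency: number of transcripts in which each term occurs.
    df = Counter(term for tokens in transcripts for term in set(tokens))
    scores = []
    for tokens in transcripts:
        tf = Counter(tokens)
        scores.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


def top_terms(scores, k, vetted=None):
    """Top-k terms per transcript; if `vetted` is given, drop terms outside it."""
    result = []
    for s in scores:
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))
        terms = [t for t, _ in ranked if vetted is None or t in vetted]
        result.append(terms[:k])
    return result


# Toy transcripts (hypothetical): the first lecture is dominated by a
# worked example about sailors and boats.
lectures = [
    "join join relation sailor sailor sailor boat".split(),
    "index index relation btree".split(),
    "transaction lock relation transaction".split(),
]
# Hypothetical stand-in for "terms that Wikipedia vouches for".
vetted = {"join", "index", "btree", "transaction", "lock", "relation"}

scores = tfidf_scores(lectures)
plain = top_terms(scores, 1)             # worked-example word wins lecture 1
filtered = top_terms(scores, 1, vetted)  # vetted vocabulary removes it
```

On this toy input, plain TF-IDF ranks "sailor" highest for the first lecture, while the vetted filter promotes "join" instead. The term "relation" scores zero everywhere because it occurs in every transcript.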
Item Type: Techreport (Technical Report)
Deposited By: Andreas Paepcke
Deposited On: 05 Mar 2016 09:31
Last Modified: 05 Mar 2016 09:31
Available Versions of this Item
- Portrait of an Indexer—Computing Pointers Into Instructional Videos. (deposited 05 Mar 2016 09:25)
- Portrait of an Indexer—Computing Pointers Into Instructional Videos. (deposited 05 Mar 2016 09:31) [Currently Displayed]