Stanford InfoLab Publication Server

Portrait of an Indexer—Computing Pointers Into Instructional Videos

Lamb, Andrew and Hernandez, Jose and Ullman, Jeffrey and Paepcke, Andreas (2016) Portrait of an Indexer—Computing Pointers Into Instructional Videos. Technical Report. Stanford InfoLab.

Warning: There is a more recent version of this item available.

PDF (Automated creation of indexes into videos via closed caption files) - Submitted for Publication


We examine algorithms for creating indexes into ordered series of instructional lecture video transcripts. The goal is for students and industry practitioners to use the indexes for review or reference. Lecture videos differ from often-examined document collections such as newspaper articles in that the transcript ordering generally reflects pedagogical intent. One challenge is therefore to identify where a concept is primarily introduced, and thus where the resulting index should direct students. The typically applied TF-IDF approach is misled in this context by artifacts such as worked examples, whose associated vocabulary may dominate a lecture but should not be included in a good index. We contrast the TF-IDF approach with algorithms that consult Wikipedia documents to vouch for term importance. This method helps filter the harmful artifacts. We measure the algorithms against three human-created indexes over the 90 lecture videos of a popular database course. We found that (i) humans have low inter-rater reliability, whether they are experts in the field or not, and that (ii) one of the examined algorithms approaches the inter-rater reliability with humans.
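The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tf_idf_index` and `filter_by_vocabulary`, the toy transcripts, and the hand-written `vouched` set (a stand-in for a real Wikipedia lookup) are all assumptions made for illustration. The sketch points each term at the lecture where its TF-IDF score is highest, then keeps only terms that an external vocabulary vouches for.

```python
import math
from collections import Counter


def tf_idf_index(transcripts):
    """Map each term to the lecture number where its TF-IDF score peaks."""
    tokenized = [t.lower().split() for t in transcripts]
    n_docs = len(tokenized)

    # Document frequency: in how many lectures does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    best = {}  # term -> (best score so far, lecture number)
    for lecture_no, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            score = tf * idf
            if term not in best or score > best[term][0]:
                best[term] = (score, lecture_no)

    # Terms that occur in every lecture score zero and are dropped.
    return {term: lec for term, (score, lec) in best.items() if score > 0}


def filter_by_vocabulary(index, vouched_terms):
    """Keep only index entries whose term appears in the vouching vocabulary."""
    return {term: lec for term, lec in index.items() if term in vouched_terms}


# Toy transcripts (hypothetical); lecture 1 is dominated by a worked example.
transcripts = [
    "a relation has a key and a key identifies tuples",
    "join the sailors relation with the boats relation on sailors id",
    "a transaction commits or aborts and a transaction is atomic",
]
# Assumption: a hand-written set standing in for terms vouched for by Wikipedia.
vouched = {"relation", "key", "join", "transaction", "tuples"}

raw_index = tf_idf_index(transcripts)
index = filter_by_vocabulary(raw_index, vouched)
```

In this toy data the plain TF-IDF index contains example vocabulary such as "sailors" and "boats", while the filtered index does not. The filter does not by itself decide where a concept is primarily introduced; it only removes terms that the outside vocabulary does not recognize.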

Item Type: Techreport (Technical Report)
Projects: Digital Libraries
ID Code: 1138
Deposited By: Andreas Paepcke
Deposited On: 05 Mar 2016 09:25
Last Modified: 05 Mar 2016 09:31
