Automatic Keyphrase Extraction and Segmentation of Video Lectures
Arun Balagopalan, Lalitha Lakshmi. B, Vidhya Balasubramanian, Nithin Chandrasekharan,
Ashwin Damodar
Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
Abstract—Keyphrases are essential metadata that summarize the contents of an instructional video. In this paper, we present a domain-independent, statistical approach for automatic keyphrase extraction from audio transcripts of video lectures.
We identify new features in audio transcripts that capture key patterns characterizing keyphrases in lecture videos. We design a keyphrase extraction system that uses a supervised machine learning algorithm, based on a Naive Bayes classifier, to extract relevant keyphrases. Our extensive experimental studies show that our system extracts more relevant keyphrases than existing approaches. The paper also examines how our automatic keyphrase extraction method performs better than existing systems across different categories of lectures. In addition, we demonstrate the utility of the extracted keyphrases by using them as features for topic-based segmentation of the lecture. As a result, we are able to provide an automatically annotated, section-wise video lecture in a lecture browser.
I. INTRODUCTION
Universities around the world are increasingly using multimedia-based modes of instruction to augment classroom learning. Several major universities have gone one step further by making portions of their courses available in digital form to the public over the Internet. Video lectures on university channels, hosted on video-sharing websites, have become immensely popular, garnering millions of viewers.
The proliferation of such content has led to growing research interest in lecture browsers that improve the overall learning experience of the student.
To improve the lecture browser utility to students, video lectures are often accompanied by