
SSLectures: Abstractive Summaries and Topic Segments of Lecture Videos

Source: Mendeley Data. Updated 2024-05-10. Indexed 2024-06-28.
Download link:
https://zenodo.org/records/10498680
Description:
SSLectures is a dataset containing abstractive summaries of lecture videos from the AK Lectures website and the MIT OCW repository. It also contains topic segments (chapters) for the MIT lectures. The dataset was scraped from free, publicly available material and is published under a Creative Commons license that allows redistribution and reuse. The dataset is split into three files, explained below.

mit_chapters_summarized.csv: Contains the transcript and other details of 14.8K chapters (segments) from the MIT lectures, along with abstractive summaries generated with GPT-3.5. Each row is one chapter from one lecture video. Suitable for training summarization models to summarize parts of lecture videos (not full lectures).

ak_lectures_summarized.csv: Contains the transcript and other details of 1.8K lecture videos from aklectures.com. Each lecture video comes with the abstractive summary that was published on the website. Most videos in this dataset are short, typically 5 to 15 minutes long. Suitable for training summarization models to summarize full short lecture videos.

mit_videos_all_courses_segmentations.csv: Contains details of the chaptering (segmentation) of each lecture video from MIT. Each row is one lecture video and comes with the timing (end times) and titles of each chapter in the video. Suitable for training and/or evaluating segmentation algorithms and models for both short and long lecture videos.

Please cite this page if you use this dataset in your research or in other projects.

Copyright Notice: All rights to the lecture videos, the transcripts that have been scraped, the chapters and titles, the human-written summaries, and all other related details belong to the respective owners of the MIT OCW or AK Lectures websites. Our work here is for research and educational purposes.
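
The segmentation file is described as storing chapter end times and titles per video. As a minimal sketch of how such a row might be turned into explicit (start, end, title) segments, the Python below assumes the end times are already parsed into seconds and that the first chapter starts at 0. The function name, the input layout, and the sample values are hypothetical; the description does not give the actual column names or time format of the CSV.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]


def chapters_from_end_times(end_times: List[float], titles: List[str]) -> List[Segment]:
    """Convert per-chapter end times into (start, end, title) segments.

    Assumption (not stated in the dataset description): end times are in
    seconds, sorted ascending, and the first chapter starts at 0.
    """
    if len(end_times) != len(titles):
        raise ValueError("end_times and titles must have the same length")

    segments: List[Segment] = []
    start = 0.0
    for end, title in zip(end_times, titles):
        end = float(end)
        if end <= start:
            raise ValueError("end times must be strictly increasing")
        segments.append((start, end, title))
        start = end  # the next chapter begins where this one ends
    return segments


# Made-up values for illustration only; not taken from the dataset.
example = chapters_from_end_times(
    [120.0, 305.5, 600.0],
    ["Intro", "Derivation", "Examples"],
)
```

Segments in this form can be compared directly against the boundaries predicted by a segmentation model, for example by measuring how far each predicted boundary falls from the nearest reference end time.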
Created:
2024-02-08