Top 250 Korean Dramas (KDrama) Dataset
Source: www.kaggle.com | Updated: 2023-01-21 | Indexed: 2025-01-21
Download link:
https://www.kaggle.com/ahbab911/top-250-korean-dramas-kdrama-dataset
Description:
This dataset covers the 250 top-ranked Korean dramas on the MyDramaList website. The data was collected and uploaded as a CSV file and can be used for various data science projects.
The CSV file has 17 columns and 251 rows containing mostly textual data.
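The stated shape can be checked after download with Python's standard `csv` module. The following is a minimal sketch, not part of the original dataset: the function name `csv_shape` and the sample column names are illustrative assumptions. The description does not say whether the 251 rows include the header row, so the sketch counts the header separately from the data rows.

```python
import csv
import io


def csv_shape(fileobj):
    """Return (data_row_count, column_count, header) for an open CSV file."""
    reader = csv.reader(fileobj)
    header = next(reader, [])  # first row is treated as the header
    data_rows = sum(1 for row in reader if row)  # skip blank lines
    return data_rows, len(header), header


if __name__ == "__main__":
    # Tiny in-memory stand-in; these column names are illustrative only.
    sample = io.StringIO('Name,Synopsis\n"Drama A","A story, with a comma"\n')
    print(csv_shape(sample))
```

For the real file, pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded CSV was saved. The `newline=""` argument lets the `csv` module handle quoted fields that span several lines, which is likely for long text fields such as a synopsis.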
Most of the data was collected from the MyDramaList website (https://mydramalist.com), and the names of the production companies were collected from Wikipedia (https://www.wikipedia.org). I wasn't sure how to scrape the data at the time, so I did everything manually, copying and pasting the data with the cursor. (Yes, it was very tedious to manually copy and paste the data!)
I was working on a content-based recommender system for Korean dramas and needed data to work with. The datasets available on Kaggle had at most 100 K-drama titles. Moreover, quite a few of the features deemed essential were missing: synopsis, tags, director's name, cast names, production companies' names, and similar data weren't available in the pre-existing datasets.
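To illustrate why text fields such as synopsis and tags matter for a content-based recommender, here is a minimal sketch that ranks titles by bag-of-words cosine similarity. It is not the author's method, which the description does not specify. The function names, the placeholder titles, and the choice of plain term counts (rather than TF-IDF or embeddings) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two Counter term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


def recommend(title, descriptions, top_n=3):
    """Rank the other titles by text similarity to the given title."""
    vectors = {t: Counter(tokenize(d)) for t, d in descriptions.items()}
    target = vectors[title]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


if __name__ == "__main__":
    # Placeholder titles and text, not rows from the dataset.
    descriptions = {
        "Title A": "doctor hospital romance",
        "Title B": "hospital doctor surgery",
        "Title C": "school friendship comedy",
    }
    print(recommend("Title A", descriptions, top_n=2))
```

In practice, the text for each title could be a concatenation of several descriptive columns, which is one reason a dataset lacking those columns is of limited use for this kind of system.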
Provider:
Kaggle



