jrtec/Superheroes

Name: jrtec/Superheroes
Creator: jrtec
Published: 2023-01-08 06:18:48
License: CC0-1.0

Source: Hugging Face. Updated: 2023-01-08. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/jrtec/Superheroes

Resource overview:

---
license: cc0-1.0
task_categories:
- summarization
language:
- en
tags:
- superheroes
- heroes
- anime
- manga
- marvel
size_categories:
- 1K<n<10K
---

# Dataset Card for Superheroes

## Dataset Description

History and powers descriptions for 1400+ superheroes, for use in text mining and NLP.

[Original source](https://www.kaggle.com/datasets/jonathanbesomi/superheroes-nlp-dataset/code?resource=download)

## Context

The aim of this dataset is to make text analytics and NLP even more fun. All of us have dreamed of being a superhero and saving the world, yet we are still on Kaggle figuring out how Python works. So why not improve our NLP skills by analyzing superheroes' histories and powers?

What sets this dataset apart is that it contains categorical and numerical features such as overall_score, intelligence_score, creator, alignment, gender and eye_color, as well as the text features history_text and powers_text. Combining the two kinds of features can yield a lot of interesting insights.

## Content

We collected all data from superherodb and prepared it in a clean tabular format.

The dataset contains 1447 different superheroes. Each superhero row has:

* overall_score - derived by superherodb from the power stats features. Can you find the relationship?
* history_text - history of the superhero (text feature)
* powers_text - description of the superhero's powers (text feature)
* intelligence_score, strength_score, speed_score, durability_score, power_score and combat_score (power stats features)
* "Origin" (full_name, alter_egos, …)
* "Connections" (occupation, base, teams, …)
* "Appearance" (gender, type_race, height, weight, eye_color, …)

## Acknowledgements

The following [GitHub repository](https://github.com/jbesomi/texthero/tree/master/dataset/Superheroes%20NLP%20Dataset) contains the code used to scrape this dataset.

Provided by:

jrtec

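Summary of original information

Dataset overview

Basic information

License: CC0-1.0
Task category: summarization
Language: English
Tags: superheroes, heroes, anime, manga, marvel
Size category: 1K<n<10K

Dataset description

Name: Superheroes
Description: History and powers descriptions for more than 1,400 superheroes, intended for text mining and natural language processing.
Original source: Kaggle link

Content

Data source: collected from superherodb and organized into a tabular format.
Dataset size: 1447 different superheroes.
Data fields:
- Overall score (overall_score)
- History text (history_text)
- Powers description (powers_text)
- Power stats (intelligence_score, strength_score, speed_score, durability_score, power_score, combat_score)
- Origin information (full_name, alter_egos, …)
- Connections information (occupation, base, teams, …)
- Appearance information (gender, type_race, height, weight, eye_color, …)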
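
The card asks whether the relationship between overall_score and the power stats can be found. A first step might look like the minimal Python sketch below, which assumes the table has been exported to CSV with the column names listed above. The two rows are invented placeholders, not records from the dataset, and `power_stat_mean` is a hypothetical helper introduced only for illustration.

```python
import csv
import io
from statistics import mean

# Invented placeholder rows (NOT real records from the dataset).
# Only the column names are taken from the dataset card.
SAMPLE_CSV = """full_name,overall_score,intelligence_score,strength_score,speed_score,durability_score,power_score,combat_score
Hero A,10,80,60,40,70,90,50
Hero B,5,50,30,20,40,35,45
"""

# The six power-stat columns named in the dataset card.
POWER_STATS = [
    "intelligence_score",
    "strength_score",
    "speed_score",
    "durability_score",
    "power_score",
    "combat_score",
]


def power_stat_mean(row):
    """Average the six power-stat columns of one row (hypothetical helper)."""
    return mean(float(row[col]) for col in POWER_STATS)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    # Print the overall score next to the averaged power stats for comparison.
    print(row["full_name"], row["overall_score"], round(power_stat_mean(row), 2))
```

Comparing the printed average against overall_score row by row is only a starting point; the card does not state how superherodb actually derives the score, and real rows may need cleaning before the values parse as numbers.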

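Dataset purpose

Purpose: to make text analytics and natural language processing more fun, and to improve NLP skills by analyzing superheroes' histories and powers.

Dataset characteristics

Characteristics: combines categorical and numerical features with text features; used together, they can yield rich insights.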
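
One way to combine a categorical feature with a text feature is to group rows by the category and summarize the text per group. The sketch below does this with the standard library only. The records and the alignment values "Good" and "Bad" are invented placeholders rather than dataset content, and `mean_words_by_category` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Invented placeholder records (NOT real data); the keys follow the dataset card.
records = [
    {"alignment": "Good", "history_text": "Raised on a farm and later moved to the city."},
    {"alignment": "Good", "history_text": "A scientist changed by an accident."},
    {"alignment": "Bad", "history_text": "Once a trusted ally."},
]


def mean_words_by_category(rows, category_key, text_key):
    """Group rows by a categorical column and average the
    whitespace-separated token count of a text column."""
    lengths = defaultdict(list)
    for row in rows:
        lengths[row[category_key]].append(len(row[text_key].split()))
    return {category: mean(counts) for category, counts in lengths.items()}


summary = mean_words_by_category(records, "alignment", "history_text")
print(summary)
```

The same grouping works for any other categorical column in the card, such as gender or creator, and for powers_text in place of history_text.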

Acknowledgements

Code source: GitHub link, containing the code used to scrape this dataset.
