h-alice/chat-cooking-master-boy-100k
Source: Hugging Face. Updated 2024-04-24; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/h-alice/chat-cooking-master-boy-100k
Resource description:
---
license: mit
task_categories:
- text-classification
- zero-shot-classification
- text-generation
tags:
- meme
language:
- zh
- en
pretty_name: Cooking Master Boy Chat Records
size_categories:
- 1M<n<10M
configs:
- config_name: default
  data_files:
  - split: train
    path: "cookingmaster_medium.csv"
---
# Cooking Master Boy Chat Records
Chat record dataset from Twitch channel "muse_tw" during the "Cooking Master Boy" (中華一番) marathon event.
# Introduction
This is a chat dataset collected from the Twitch channel "muse_tw" while the channel was hosting a marathon anime event featuring "Cooking Master Boy" (中華一番).
The featured anime, "Cooking Master Boy", is adapted from the Japanese manga series written and illustrated by Etsushi Ogawa. It has had a big impact on meme culture and has a cult following in Taiwan.

# Dataset Description
The dataset is in CSV format, with the following columns:
* `datetime`: The timestamp of the chat message, in the format `YYYY-MM-DD HH:MM:SS` (UTC+8).
* `user_id`: The ID of the user who sent the message.
* `user_name`: The user name of the sender.
* `display_name`: The display name of the sender.
* `channel`: The ID of the chatroom. In this dataset it is always "muse_tw".
* `message`: The content of the chat message.
* `token_len`: The length of the tokenized message, useful for filtering messages by length.
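The columns above can be read with the Python standard library alone. The following is a minimal sketch, not the dataset author's tooling: the helper `read_messages` and its `min_tokens` / `max_tokens` parameters are hypothetical names introduced for illustration, and the in-memory `SAMPLE_CSV` stands in for `cookingmaster_medium.csv`. It assumes the file has a header row whose first column (the row index) is unnamed, as the sample table suggests.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Timestamps in the dataset are documented as UTC+8.
UTC8 = timezone(timedelta(hours=8))

# In-memory stand-in for cookingmaster_medium.csv, built from the sample row.
# Assumption: the first (index) column has an empty header name.
SAMPLE_CSV = (
    ",datetime,user_id,user_name,display_name,channel,message,token_len\n"
    "0,2022-01-20 22:57:16,24255794,jimandal,jimandal,muse_tw,生米煮成熟飯,6\n"
)


def read_messages(fileobj, min_tokens=0, max_tokens=None):
    """Yield rows whose token_len lies within [min_tokens, max_tokens]."""
    for row in csv.DictReader(fileobj):
        token_len = int(row["token_len"])
        if token_len < min_tokens:
            continue
        if max_tokens is not None and token_len > max_tokens:
            continue
        # Attach the documented UTC+8 offset to the naive timestamp.
        row["datetime"] = datetime.strptime(
            row["datetime"], "%Y-%m-%d %H:%M:%S"
        ).replace(tzinfo=UTC8)
        row["token_len"] = token_len
        yield row


rows = list(read_messages(io.StringIO(SAMPLE_CSV), min_tokens=4, max_tokens=32))
for row in rows:
    print(row["datetime"].isoformat(), row["display_name"], row["message"])
```

To run this against the real file, pass an open file handle instead of the `io.StringIO` object, for example `open("cookingmaster_medium.csv", encoding="utf-8", newline="")`. The UTF-8 encoding is an assumption, not something the dataset card states.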
### Sample Data
||datetime|user_id|user_name|display_name|channel|message|token_len|
|---|---|---|---|---|---|---|---|
|0|2022-01-20 22:57:16|24255794|jimandal|jimandal|muse_tw|生米煮成熟飯|6|
# Disclaimer
This dataset is unfiltered, and may contain vulgar, offensive, or inappropriate content. Please use it with caution.
This dataset is for research purposes only, and the dataset provider does not assume any responsibility for any legal or other consequences resulting from the use of this dataset.
Provider:
h-alice
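Summary of Original Information
Cooking Master Boy Chat Records: Dataset Overview
Basic information
- License: MIT
- Task categories:
  - Text classification
  - Zero-shot classification
  - Text generation
- Tag: meme
- Languages: Chinese, English
- Dataset name: Cooking Master Boy Chat Records
- Size category: 1M<n<10M
Configuration
- Config name: default
- Data files:
  - Split: train
  - Path: cookingmaster_medium.csv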
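Dataset description
- Format: CSV
- Columns:
  - `datetime`: Timestamp of the chat message, in the format YYYY-MM-DD HH:MM:SS (UTC+8)
  - `user_id`: User ID of the sender
  - `user_name`: User name of the sender
  - `display_name`: Display name of the sender
  - `channel`: Chatroom ID, always "muse_tw"
  - `message`: Content of the chat message
  - `token_len`: Length of the tokenized message, used to filter messages by length
Sample data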
| datetime | user_id | user_name | display_name | channel | message | token_len |
|---|---|---|---|---|---|---|
| 2022-01-20 22:57:16 | 24255794 | jimandal | jimandal | muse_tw | 生米煮成熟飯 | 6 |
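Disclaimer
This dataset is unfiltered and may contain vulgar, offensive, or inappropriate content. Please use it with caution. This dataset is for research purposes only, and the dataset provider does not assume any responsibility for legal or other consequences resulting from its use.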



