ManBib/Discord-Unveiled-Extracted
Source: Hugging Face. Updated: 2025-08-27. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/ManBib/Discord-Unveiled-Extracted
Description:
The Discord Unveiled - Filtered Dataset contains superficially filtered and processed message data derived from the Discord Unveiled dataset. Processing converted the source JSON to CSV, removed messages sent by bots, dropped messages consisting only of URLs, mentions, channel references, or Discord emojis, and used a FastText language identification model to drop non-English messages. Each record includes a timestamp, user_id, username, and the cleaned and filtered message content. The filtering is imperfect, so some irrelevant or non-English messages may remain.
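The filtering steps described above can be illustrated with a minimal sketch. This is not the dataset author's actual pipeline: the input keys `is_bot` and `content`, the output column name `content`, and the helper names `strip_noise`, `filter_messages`, and `to_csv` are all hypothetical. The regular expressions follow Discord's usual markup for mentions, channel references, and custom emojis. The FastText language filter is left out, since it needs a downloaded model.

```python
import csv
import io
import re

# Tokens that carry no text of their own (Discord markup conventions).
NOISE = re.compile(
    r"https?://\S+"      # URLs
    r"|<@[!&]?\d+>"      # user and role mentions
    r"|<#\d+>"           # channel references
    r"|<a?:\w+:\d+>"     # custom emojis, optionally animated
)

FIELDS = ["timestamp", "user_id", "username", "content"]  # "content" is an assumed name


def strip_noise(text):
    """Remove noise tokens and collapse the remaining whitespace."""
    return " ".join(NOISE.sub(" ", text).split())


def filter_messages(records):
    """Yield records that are not from bots and contain some real text.

    Assumes each record is a dict with the hypothetical keys
    `is_bot` and `content` in addition to the documented fields.
    """
    for rec in records:
        if rec.get("is_bot"):
            continue  # drop bot messages
        text = rec.get("content", "")
        if not strip_noise(text):
            continue  # only URLs, mentions, channels, or emojis
        yield {
            "timestamp": rec["timestamp"],
            "user_id": rec["user_id"],
            "username": rec["username"],
            "content": text.strip(),
        }


def to_csv(rows):
    """Serialize filtered rows to a CSV string with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In this sketch a message is kept unchanged when any text survives the stripping step. Whether the published dataset also removes the noise tokens from kept messages is not stated in the description.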
Provider:
ManBib



