FrancophonIA/TwiSty
Source: Hugging Face. Updated: 2025-03-30. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/FrancophonIA/TwiSty
Description:
TwiSty is a multilingual Twitter stylometry corpus for author profiling in terms of personality (MBTI) and gender. It contains data for 18,168 authors whose native languages are Dutch, Italian, German, French, Portuguese, and Spanish. For each author, the corpus provides the self-assessed MBTI personality type and the annotated gender, along with two lists of tweet IDs: one whose tweets are confirmed to be in the corpus language, and one containing all other tweets that were mined.
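
The per-author layout described above can be pictured as a small record type. The following is only a sketch under stated assumptions: the class name `AuthorRecord`, the field names, and the helper `confirmed_ratio` are hypothetical and are not taken from the dataset files, and the coding of the gender label is not specified here, so it is kept as a free-form string.

```python
from dataclasses import dataclass, field

# The four MBTI dichotomies, in the order the letters appear (e.g. "INFP").
MBTI_AXES = ("EI", "SN", "TF", "JP")


@dataclass
class AuthorRecord:
    """Hypothetical in-memory view of one author; field names are assumptions."""

    user_id: str
    mbti: str    # self-assessed MBTI type, four letters
    gender: str  # annotated gender label; actual coding is not specified here
    confirmed_tweet_ids: list[str] = field(default_factory=list)  # corpus language
    other_tweet_ids: list[str] = field(default_factory=list)      # everything else

    def __post_init__(self) -> None:
        # Normalize case, then check each letter against its dichotomy.
        self.mbti = self.mbti.upper()
        valid = len(self.mbti) == 4 and all(
            letter in axis for letter, axis in zip(self.mbti, MBTI_AXES)
        )
        if not valid:
            raise ValueError(f"not a valid MBTI type: {self.mbti!r}")

    def confirmed_ratio(self) -> float:
        """Share of this author's mined tweets confirmed to be in the corpus language."""
        total = len(self.confirmed_tweet_ids) + len(self.other_tweet_ids)
        return len(self.confirmed_tweet_ids) / total if total else 0.0
```

Keeping the two ID lists separate, rather than merging them, lets a downstream user filter to the corpus language without re-running language identification. Note that the corpus lists tweet IDs only, so the tweet text would have to be retrieved separately.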
Provider:
FrancophonIA



