
atoomic/emoticonnect-sample

Hugging Face | Updated 2023-07-20 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/atoomic/emoticonnect-sample
Resource description:
---
license: artistic-2.0
task_categories:
- text-classification
language:
- fr
---

# Description

Data uses the `.jsonl` format: each line is a self-contained JSON object that can be parsed on its own.

Each row contains a text stored under the key `content` and some ratings split into groups:

* csp
* feeling
* gen
* persona
* sex

At this stage only the `feeling` group is filled.

Note: for now all vectors are filled with the value `0` when missing. This could change over time to save some space.

## Sample

Row example (pretty)

```json
{
  "content": "...some text...",
  "metadata": { "lng": "fr" },
  "rating": {},
  "ratings": {
    "csp": { "c1": 0, "c2": 0, "c3": 0, "c4": 0, "c5": 0, "c6": 0, "c7": 0, "c8": 0 },
    "feeling": { "f1": 0, "f2": 100, "f3": 0, "f4": 0, "f5": 0, "f6": 0, "f7": 0, "f8": 0 },
    "gen": { "g1": 0, "g2": 0, "g3": 0, "g4": 0 },
    "persona": { "p1": 0, "p2": 0, "p3": 0, "p4": 0, "p5": 0, "p6": 0, "p7": 0, "p8": 0 },
    "sex": { "s1": 0, "s2": 0 }
  }
}
```

Note: more than one field can be set within a group

```json
{
  "content": "...some text...",
  "metadata": { "lng": "fr" },
  "rating": {},
  "ratings": {
    "csp": { "c1": 0, "c2": 0, "c3": 0, "c4": 0, "c5": 0, "c6": 0, "c7": 0, "c8": 0 },
    "feeling": { "f1": 0, "f2": 0, "f3": 0, "f4": 0, "f5": 33.33, "f6": 66.67, "f7": 0, "f8": 0 },
    "gen": { "g1": 0, "g2": 0, "g3": 0, "g4": 0 },
    "persona": { "p1": 0, "p2": 0, "p3": 0, "p4": 0, "p5": 0, "p6": 0, "p7": 0, "p8": 0 },
    "sex": { "s1": 0, "s2": 0 }
  }
}
```
Provider:
atoomic
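Summary of original information

Dataset overview

Basic dataset information

  • License: artistic-2.0
  • Task category: text classification
  • Language: French (fr)

Data format and structure

  • File format: .jsonl; each line is a standalone JSON object.
  • Data content:
    • Each record contains a text stored under the key content.
    • There are several rating groups; only the feeling group is currently populated.
    • The rating groups are: csp, feeling, gen, persona, sex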
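
The structure described above can be read with nothing more than the standard `json` module. The following is a minimal sketch, not the dataset author's loader: the helper name `iter_rows` and the inline `sample_lines` are illustrative assumptions, and the sample row only mirrors the documented shape with placeholder text.

```python
import json
from typing import Iterable, Iterator

# Group names as listed in the dataset description.
RATING_GROUPS = ("csp", "feeling", "gen", "persona", "sex")


def iter_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty JSONL line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Inline stand-in for a .jsonl file; the text is a placeholder, not real data.
sample_lines = [
    json.dumps({
        "content": "...some text...",
        "metadata": {"lng": "fr"},
        "rating": {},
        "ratings": {"feeling": {"f1": 0, "f2": 100}},
    }),
]

rows = list(iter_rows(sample_lines))

# A group counts as "filled" here if at least one of its values is non-zero.
filled_groups = [
    g for g in RATING_GROUPS if any(rows[0]["ratings"].get(g, {}).values())
]
print(rows[0]["metadata"]["lng"], filled_groups)
```

With a real file, passing an open file handle to `iter_rows` should behave the same way, since iterating a text file yields one line at a time.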

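Data examples

  • Text content: the examples do not show actual text.
  • Rating data:
    • feeling group: in one example f2 is 100 and all other values are 0; in another, f5 is 33.33 and f6 is 66.67, with all others 0.
    • The other rating groups (csp, gen, persona, sex) are currently all 0.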
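
To make the two feeling examples concrete, the sketch below extracts the non-zero entries of a rating group and orders them from highest to lowest. The function name `active_labels` and the `threshold` parameter are assumptions made for illustration. In both documented examples the non-zero values add up to 100, but the source does not say this is guaranteed, so the code does not rely on it.

```python
def active_labels(group: dict, threshold: float = 0.0) -> dict:
    """Return entries whose value exceeds `threshold`, highest first (hypothetical helper)."""
    kept = ((key, value) for key, value in group.items() if value > threshold)
    return dict(sorted(kept, key=lambda item: item[1], reverse=True))


# The two feeling vectors shown in the dataset card.
single = {"f1": 0, "f2": 100, "f3": 0, "f4": 0, "f5": 0, "f6": 0, "f7": 0, "f8": 0}
mixed = {"f1": 0, "f2": 0, "f3": 0, "f4": 0, "f5": 33.33, "f6": 66.67, "f7": 0, "f8": 0}

print(active_labels(single))  # one label set
print(active_labels(mixed))   # two labels set, strongest first
```

For a single-label classifier one might keep only the first key of the result; for a multi-label or soft-label setup the full mapping is probably more useful.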

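Dataset characteristics

  • Missing values: missing ratings are currently filled with 0; this may change in the future to save space.
  • Multiple values per group: more than one field in a rating group can be non-zero.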
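
Because the zero-filling of missing ratings may be dropped later, a reader could normalize rows defensively so that downstream code always sees full vectors. The sketch below assumes the key sets shown in the sample rows (c1 to c8, f1 to f8, g1 to g4, p1 to p8, s1 to s2) are complete; `GROUP_KEYS` and `densify` are hypothetical names, not part of the dataset.

```python
# Key sets inferred from the sample rows; assumed complete for this sketch.
GROUP_KEYS = {
    "csp": [f"c{i}" for i in range(1, 9)],
    "feeling": [f"f{i}" for i in range(1, 9)],
    "gen": [f"g{i}" for i in range(1, 5)],
    "persona": [f"p{i}" for i in range(1, 9)],
    "sex": [f"s{i}" for i in range(1, 3)],
}


def densify(ratings: dict) -> dict:
    """Return a copy of `ratings` in which every expected key exists, defaulting to 0."""
    return {
        group: {key: ratings.get(group, {}).get(key, 0) for key in keys}
        for group, keys in GROUP_KEYS.items()
    }


# A sparse row, as it might look if zero-filling were removed in a later release.
sparse = {"feeling": {"f5": 33.33, "f6": 66.67}}
dense = densify(sparse)
print(dense["feeling"])
```

Rows that are already zero-filled pass through unchanged, so the same function should work on the current data and on a possible sparse variant.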