davidadamczyk/agree

Name: davidadamczyk/agree
Creator: davidadamczyk
Published: 2024-06-04 08:54:30
License: MIT

Source: Hugging Face. Updated 2024-06-04; indexed 2024-06-12.

Download link:

https://hf-mirror.com/datasets/davidadamczyk/agree


Description:

---
license: mit
dataset_info:
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 919425780
    num_examples: 9900000
  - name: eval_expanded_char
    num_bytes: 3160159
    num_examples: 17940
  - name: valid_q
    num_bytes: 9253945
    num_examples: 99000
  - name: valid
    num_bytes: 9202105
    num_examples: 99000
  - name: eval
    num_bytes: 91270
    num_examples: 996
  - name: eval_q
    num_bytes: 91801
    num_examples: 996
  download_size: 703700973
  dataset_size: 941225060
---

# Dataset Card for Dataset Agree

[link](https://nlp.fi.muni.cz/~xbaisa/agree/)

This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).

## Dataset Details

### Dataset Description

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]

### Dataset Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Dataset Structure

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Data Collection and Processing

[More Information Needed]

#### Who are the source data producers?

[More Information Needed]

### Annotations [optional]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

#### Personal and Sensitive Information

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.

## Citation [optional]

**BibTeX:** [More Information Needed]

**APA:** [More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]


Provider:

davidadamczyk

Summary of original information

Dataset overview

Basic information

Name: Dataset Agree
License: MIT

Dataset features

Feature name: text
Data type: string

Dataset splits

Training split (train):
- Examples: 9900000
- Size: 919425780 bytes
Evaluation split (eval_expanded_char):
- Examples: 17940
- Size: 3160159 bytes
Validation split (valid_q):
- Examples: 99000
- Size: 9253945 bytes
Validation split (valid):
- Examples: 99000
- Size: 9202105 bytes
Evaluation split (eval):
- Examples: 996
- Size: 91270 bytes
Evaluation split (eval_q):
- Examples: 996
- Size: 91801 bytes
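
The split names listed above can be passed to the Hugging Face `datasets` library. The following is a minimal sketch, assuming the third-party `datasets` package is installed and the repository `davidadamczyk/agree` is reachable on the Hub; the helper name `load_split` and the constant `SPLITS` are hypothetical and used only for illustration.

```python
# Split names as listed in the dataset card metadata.
SPLITS = ["train", "eval_expanded_char", "valid_q", "valid", "eval", "eval_q"]


def load_split(split: str = "eval", streaming: bool = False):
    """Load one split of davidadamczyk/agree (hypothetical helper).

    Set streaming=True to iterate over the large `train` split
    without downloading the whole dataset first.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the name check above works without the package.
    from datasets import load_dataset

    return load_dataset("davidadamczyk/agree", split=split, streaming=streaming)
```

Calling `load_split("eval")` should return a dataset whose rows each carry the single string feature `text`. For the 9.9-million-row `train` split, `load_split("train", streaming=True)` is likely the more practical choice.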

Dataset size

Download size: 703700973 bytes
Total dataset size: 941225060 bytes
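
As a quick consistency check on the figures above, the per-split byte counts can be summed and compared with the reported total. This sketch uses only the numbers from the metadata; the variable names are illustrative.

```python
# Per-split sizes and example counts, copied from the dataset metadata.
SPLIT_BYTES = {
    "train": 919425780,
    "eval_expanded_char": 3160159,
    "valid_q": 9253945,
    "valid": 9202105,
    "eval": 91270,
    "eval_q": 91801,
}
SPLIT_EXAMPLES = {
    "train": 9900000,
    "eval_expanded_char": 17940,
    "valid_q": 99000,
    "valid": 99000,
    "eval": 996,
    "eval_q": 996,
}
DOWNLOAD_SIZE = 703700973
DATASET_SIZE = 941225060

total_bytes = sum(SPLIT_BYTES.values())
total_examples = sum(SPLIT_EXAMPLES.values())

# The split sizes add up exactly to the reported dataset size.
assert total_bytes == DATASET_SIZE

# Ratio of the download size to the full dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

# Average bytes per example, per split.
avg_bytes = {name: SPLIT_BYTES[name] / SPLIT_EXAMPLES[name] for name in SPLIT_BYTES}

print(f"total examples: {total_examples}")
print(f"download/dataset ratio: {download_ratio:.3f}")
for name, value in avg_bytes.items():
    print(f"{name}: {value:.1f} bytes per example")
```

The six splits sum to 941225060 bytes, matching `dataset_size`, and the download is roughly three quarters of that size.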
