MPEP_SPANISH
Hosted on the ModelScope community. Updated 2025-12-05; indexed 2025-07-12.
Download link: https://modelscope.cn/datasets/data-is-better-together/MPEP_SPANISH
# Dataset Card for MPEP_SPANISH
This dataset has been created with [Argilla](https://docs.argilla.io).
This dataset can be loaded into Argilla, as explained in [Load with Argilla](#load-with-argilla), or used directly with the `datasets` library, as explained in [Load with `datasets`](#load-with-datasets).
## Dataset Description
- **Homepage:** https://argilla.io
- **Repository:** https://github.com/argilla-io/argilla
- **Paper:**
- **Leaderboard:**
- **Point of Contact:**
### Dataset Summary
This dataset contains:
* A dataset configuration file named `argilla.yaml` that conforms to the Argilla dataset format. Argilla uses this file to configure the dataset when you call the `FeedbackDataset.from_huggingface` method.
* Dataset records in a format compatible with HuggingFace `datasets`. These records will be loaded automatically when using `FeedbackDataset.from_huggingface` and can be loaded independently using the `datasets` library via `load_dataset`.
* The [annotation guidelines](#annotation-guidelines) that have been used for building and curating the dataset, if they've been defined in Argilla.
### Load with Argilla
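To load the dataset with Argilla, install Argilla with `pip install argilla --upgrade` and then run the following code. Note that `FeedbackDataset` belongs to the Argilla 1.x API, so if the call is unavailable in your installed version, a 1.x release may be required.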
```python
import argilla as rg
ds = rg.FeedbackDataset.from_huggingface("DIBT/MPEP_SPANISH")
```
### Load with `datasets`
To load this dataset with `datasets`, install the library with `pip install datasets --upgrade` and then run the following code:
```python
from datasets import load_dataset
ds = load_dataset("DIBT/MPEP_SPANISH")
```
### Supported Tasks and Leaderboards
This dataset can contain [multiple fields, questions and responses](https://docs.argilla.io/en/latest/conceptual_guides/data_model.html#feedback-dataset) so it can be used for different NLP tasks, depending on the configuration. The dataset structure is described in the [Dataset Structure section](#dataset-structure).
There are no leaderboards associated with this dataset.
### Languages
[More Information Needed]
## Dataset Structure
### Data in Argilla
The dataset is created in Argilla with: **fields**, **questions**, **suggestions**, **metadata**, **vectors**, and **guidelines**.
The **fields** are the dataset records themselves; for the moment, only text fields are supported. They hold the content that annotators respond to when answering the questions.
| Field Name | Title | Type | Required | Markdown |
| ---------- | ----- | ---- | -------- | -------- |
| source | Source | text | True | True |
The **questions** are the questions that will be asked to the annotators. They can be of different types, such as rating, text, label_selection, multi_label_selection, or ranking.
| Question Name | Title | Type | Required | Description | Values/Labels |
| ------------- | ----- | ---- | -------- | ----------- | ------------- |
| target | Target | text | True | Translate the text. | N/A |
The **suggestions** are human- or machine-generated recommendations for each question that assist the annotator during the annotation process. They are always linked to an existing question, and their columns are named by appending `-suggestion` and `-suggestion-metadata` to the question name; these columns contain the suggested value(s) and the suggestion's metadata, respectively. The possible values are therefore the same as in the table above.
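The naming convention for suggestion columns can be sketched in a few lines of Python. This is only an illustration of the suffix rule described here; the helper name `suggestion_columns` is hypothetical and is not part of Argilla or `datasets`.

```python
def suggestion_columns(question_name: str) -> dict:
    """Return the column names derived from a question name.

    Hypothetical helper: it only illustrates the "-suggestion" and
    "-suggestion-metadata" suffix rule and is not an Argilla function.
    """
    return {
        "response": question_name,
        "suggestion": f"{question_name}-suggestion",
        "suggestion_metadata": f"{question_name}-suggestion-metadata",
    }


# This dataset defines a single question named "target".
columns = suggestion_columns("target")
print(columns["suggestion"])
print(columns["suggestion_metadata"])
```

For the `target` question, the two derived names match the `target-suggestion` and `target-suggestion-metadata` keys that appear in the `datasets` record under Data Instances.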
The **metadata** is a dictionary that provides additional information about a dataset record, either as extra context for the annotators or as information about the record itself, such as a link to its original source, the author, or the date. The metadata is always optional and can be linked to the `metadata_properties` defined in the dataset configuration file `argilla.yaml`.
| Metadata Name | Title | Type | Values | Visible for Annotators |
| ------------- | ----- | ---- | ------ | ---------------------- |
The **guidelines** are optional as well; they are a plain string that provides instructions to the annotators. Find them in the [annotation guidelines](#annotation-guidelines) section.
### Data Instances
An example of a dataset instance in Argilla looks as follows:
```json
{
    "external_id": "165",
    "fields": {
        "source": "Given the text: An experienced and enthusiastic innovator...you want on your team.\nMargaret Hines is the founder and Principal Consultant of Inspire Marketing, LLC, investing in local businesses, serving the community with business brokerage and marketing consulting. She has an undergraduate degree from Washington University in St. Louis, MO, and an MBA from the University of Wisconsin-Milwaukee.\nMargaret offers consulting in marketing, business sales and turnarounds and franchising. She is also an investor in local businesses.\nPrior to founding Inspire Marketing in 2003, Margaret gained her business acumen, sales and marketing expertise while working at respected Fortune 1000 companies.\nSummarize the background and expertise of Margaret Hines, the founder of Inspire Marketing."
    },
    "metadata": {
        "evolved_from": null,
        "kind": "synthetic",
        "source": "ultrachat"
    },
    "responses": [
        {
            "status": "submitted",
            "user_id": "8581ce44-b17e-40a8-81a0-e20b63074c9d",
            "values": {
                "target": {
                    "value": "Dado el texto: Una innovadora experimentada y entusiasta... que quieres en tu equipo.\nMargaret Hines es la fundadora y Consultora Principal de Inspire Marketing, LLC, que invierte en negocios locales, sirviendo a la comunidad con consultor\u00eda de negocios y marketing. Ella tiene un t\u00edtulo universitario de la Universidad de Washington en St. Louis, MO, y un MBA de la Universidad de Wisconsin-Milwaukee.\nMargaret ofrece consultor\u00eda en marketing, ventas de negocios, transformaciones de negocios y franquicias. Tambi\u00e9n es inversora en negocios locales.\nAntes de fundar Inspire Marketing en 2003, Margaret adquiri\u00f3 su habilidad para los negocios, experiencia en ventas y marketing mientras trabajaba en respetadas empresas de Fortune 1000.\nResume la formaci\u00f3n y experiencia de Margaret Hines, la fundadora de Inspire Marketing."
                }
            }
        }
    ],
    "suggestions": [
        {
            "agent": null,
            "question_name": "target",
            "score": null,
            "type": null,
            "value": "Dado el texto: Una innovadora experimentada y entusiasta... que quieres en tu equipo.\nMargaret Hines es la fundadora y Consultora Principal de Inspire Marketing, LLC, invirtiendo en negocios locales, sirviendo a la comunidad con consultor\u00eda de negocios y marketing. Ella tiene un t\u00edtulo universitario de la Universidad de Washington en St. Louis, MO, y un MBA de la Universidad de Wisconsin-Milwaukee.\nMargaret ofrece consultor\u00eda en marketing, ventas de negocios, transformaciones de negocios y franquicias. Tambi\u00e9n es inversora en negocios locales.\nAntes de fundar Inspire Marketing en 2003, Margaret adquiri\u00f3 su habilidad para los negocios, experiencia en ventas y marketing mientras trabajaba en respetadas empresas Fortune 1000.\nResumen de la formaci\u00f3n y experiencia de Margaret Hines, la fundadora de Inspire Marketing."
        }
    ],
    "vectors": {}
}
```
The same record in HuggingFace `datasets` looks as follows:
```json
{
    "external_id": "165",
    "metadata": "{\"source\": \"ultrachat\", \"kind\": \"synthetic\", \"evolved_from\": null}",
    "source": "Given the text: An experienced and enthusiastic innovator...you want on your team.\nMargaret Hines is the founder and Principal Consultant of Inspire Marketing, LLC, investing in local businesses, serving the community with business brokerage and marketing consulting. She has an undergraduate degree from Washington University in St. Louis, MO, and an MBA from the University of Wisconsin-Milwaukee.\nMargaret offers consulting in marketing, business sales and turnarounds and franchising. She is also an investor in local businesses.\nPrior to founding Inspire Marketing in 2003, Margaret gained her business acumen, sales and marketing expertise while working at respected Fortune 1000 companies.\nSummarize the background and expertise of Margaret Hines, the founder of Inspire Marketing.",
    "target": [
        {
            "status": "submitted",
            "user_id": "8581ce44-b17e-40a8-81a0-e20b63074c9d",
            "value": "Dado el texto: Una innovadora experimentada y entusiasta... que quieres en tu equipo.\nMargaret Hines es la fundadora y Consultora Principal de Inspire Marketing, LLC, que invierte en negocios locales, sirviendo a la comunidad con consultor\u00eda de negocios y marketing. Ella tiene un t\u00edtulo universitario de la Universidad de Washington en St. Louis, MO, y un MBA de la Universidad de Wisconsin-Milwaukee.\nMargaret ofrece consultor\u00eda en marketing, ventas de negocios, transformaciones de negocios y franquicias. Tambi\u00e9n es inversora en negocios locales.\nAntes de fundar Inspire Marketing en 2003, Margaret adquiri\u00f3 su habilidad para los negocios, experiencia en ventas y marketing mientras trabajaba en respetadas empresas de Fortune 1000.\nResume la formaci\u00f3n y experiencia de Margaret Hines, la fundadora de Inspire Marketing."
        }
    ],
    "target-suggestion": "Dado el texto: Una innovadora experimentada y entusiasta... que quieres en tu equipo.\nMargaret Hines es la fundadora y Consultora Principal de Inspire Marketing, LLC, invirtiendo en negocios locales, sirviendo a la comunidad con consultor\u00eda de negocios y marketing. Ella tiene un t\u00edtulo universitario de la Universidad de Washington en St. Louis, MO, y un MBA de la Universidad de Wisconsin-Milwaukee.\nMargaret ofrece consultor\u00eda en marketing, ventas de negocios, transformaciones de negocios y franquicias. Tambi\u00e9n es inversora en negocios locales.\nAntes de fundar Inspire Marketing en 2003, Margaret adquiri\u00f3 su habilidad para los negocios, experiencia en ventas y marketing mientras trabajaba en respetadas empresas Fortune 1000.\nResumen de la formaci\u00f3n y experiencia de Margaret Hines, la fundadora de Inspire Marketing.",
    "target-suggestion-metadata": {
        "agent": null,
        "score": null,
        "type": null
    }
}
```
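The relationship between the two layouts can be sketched with a small conversion function. This is an illustration inferred from the two example records only, not Argilla's own export code; the function name `flatten_record` and the shortened text values are hypothetical. The `vectors` key is left out because the flat example does not show it.

```python
import json


def flatten_record(record: dict) -> dict:
    """Flatten an Argilla-style record into the flat layout shown above.

    Hypothetical helper inferred from the two example records.
    """
    flat = {
        "external_id": record.get("external_id"),
        # In the flat layout the metadata dictionary is stored as a JSON string.
        "metadata": json.dumps(record.get("metadata") or {}),
    }
    # Each field becomes a top-level column.
    flat.update(record.get("fields", {}))
    # Responses are grouped into one list per question name.
    for response in record.get("responses", []):
        for question, answer in response["values"].items():
            flat.setdefault(question, []).append(
                {
                    "status": response["status"],
                    "user_id": response["user_id"],
                    "value": answer["value"],
                }
            )
    # Each suggestion yields two columns named after its question.
    for suggestion in record.get("suggestions", []):
        name = suggestion["question_name"]
        flat[f"{name}-suggestion"] = suggestion["value"]
        flat[f"{name}-suggestion-metadata"] = {
            key: suggestion.get(key) for key in ("agent", "score", "type")
        }
    return flat


# Shortened stand-in for the record above; the text values are placeholders.
example = {
    "external_id": "165",
    "fields": {"source": "Summarize the text."},
    "metadata": {"evolved_from": None, "kind": "synthetic", "source": "ultrachat"},
    "responses": [
        {
            "status": "submitted",
            "user_id": "8581ce44-b17e-40a8-81a0-e20b63074c9d",
            "values": {"target": {"value": "Resume el texto."}},
        }
    ],
    "suggestions": [
        {
            "agent": None,
            "question_name": "target",
            "score": None,
            "type": None,
            "value": "Resumen del texto.",
        }
    ],
    "vectors": {},
}

flat = flatten_record(example)
print(sorted(flat))
```

The main differences this highlights are that `metadata` becomes a JSON string, fields move to the top level, and responses are keyed by question name rather than nested under `values`.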
### Data Fields
Among the dataset fields, we differentiate between the following:
* **Fields:** These are the dataset records themselves; for the moment, only text fields are supported. They hold the content that annotators respond to when answering the questions.
    * **source** is of type `text`.
* **Questions:** These are the questions that will be asked to the annotators. They can be of different types, such as `RatingQuestion`, `TextQuestion`, `LabelQuestion`, `MultiLabelQuestion`, and `RankingQuestion`.
    * **target** is of type `text`, with the description "Translate the text."
* **Suggestions:** As of Argilla 1.13.0, suggestions can be attached to records to assist annotators during the annotation process. Suggestions are linked to existing questions, are always optional, and contain both the suggestion itself and, if applicable, its metadata.
    * (optional) **target-suggestion** is of type `text`.
Two further optional fields are available:
* **metadata:** An optional field that provides additional information about the dataset record, either as extra context for the annotators or as information about the record itself, such as a link to its original source, the author, or the date. It can be linked to the `metadata_properties` defined in the dataset configuration file `argilla.yaml`.
* **external_id:** An optional field that provides an external ID for the dataset record. This is useful for linking the record to an external resource, such as a database or a file.
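As a minimal sketch of how these fields might be consumed, the snippet below pulls translation pairs and metadata out of one row in the `datasets` layout. The helper names, the shortened text values, and the second response (including its `discarded` status and user ID) are assumptions made for illustration; only the key names come from the example record above.

```python
import json


def submitted_translations(row: dict) -> list:
    """Return (source, translation) pairs for responses marked as submitted."""
    return [
        (row["source"], response["value"])
        for response in row.get("target") or []
        if response.get("status") == "submitted"
    ]


def parse_metadata(row: dict) -> dict:
    """Decode the metadata column, which is stored as a JSON string."""
    raw = row.get("metadata")
    return json.loads(raw) if raw else {}


# Placeholder row: key names follow the example record, values are made up.
row = {
    "external_id": "165",
    "metadata": '{"source": "ultrachat", "kind": "synthetic", "evolved_from": null}',
    "source": "Summarize the text.",
    "target": [
        {"status": "submitted", "user_id": "user-a", "value": "Resume el texto."},
        # Hypothetical second response, added to show the status filter.
        {"status": "discarded", "user_id": "user-b", "value": ""},
    ],
    "target-suggestion": "Resumen del texto.",
    "target-suggestion-metadata": {"agent": None, "score": None, "type": None},
}

pairs = submitted_translations(row)
metadata = parse_metadata(row)
print(pairs)
print(metadata["source"])
```

Filtering on `status` keeps only responses that annotators actually submitted, and decoding `metadata` restores the dictionary form used on the Argilla side.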
### Data Splits
The dataset contains a single split, which is `train`.
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation guidelines
This is a translation dataset that contains texts. Please translate the text in the text field.
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
[More Information Needed]
### Citation Information
[More Information Needed]
### Contributions
[More Information Needed]
Provider: maas
Created: 2025-07-10



