danish-foundation-models/global-piqa-da
Source: Hugging Face. Updated 2026-02-23; indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/danish-foundation-models/global-piqa-da
Resource description:
---
tags:
- rlfh
- argilla
- human-feedback
---
# Dataset Card for global-piqa-da
This dataset was created with [Argilla](https://github.com/argilla-io/argilla). It can be loaded into your Argilla server, as explained in [Using this dataset with Argilla](#using-this-dataset-with-argilla), or used directly with the `datasets` library, as explained in [Using this dataset with `datasets`](#using-this-dataset-with-datasets).
## Using this dataset with Argilla
To load the dataset with Argilla, install Argilla with `pip install argilla --upgrade` and then run the following code:
```python
import argilla as rg

# Load the settings and records from the Hub repository into your Argilla server.
ds = rg.Dataset.from_hub("danish-foundation-models/global-piqa-da", settings="auto")
```
This will load the settings and records from the dataset repository and push them to your Argilla server for exploration and annotation.
## Using this dataset with `datasets`
To load the records of this dataset with `datasets`, install the library with `pip install datasets --upgrade` and then run the following code:
```python
from datasets import load_dataset

# Without a `split` argument this returns a DatasetDict keyed by split name.
ds = load_dataset("danish-foundation-models/global-piqa-da")
```
This will only load the records of the dataset, but not the Argilla settings.
## Dataset Structure
This dataset repo contains:
* Dataset records in a format compatible with Hugging Face `datasets`. These records are loaded automatically by `rg.Dataset.from_hub` and can also be loaded independently with `load_dataset` from the `datasets` library.
* The [annotation guidelines](#annotation-guidelines) that have been used for building and curating the dataset, if they've been defined in Argilla.
* A dataset configuration folder conforming to the Argilla dataset format in `.argilla`.
The dataset is created in Argilla with: **fields**, **questions**, **suggestions**, **metadata**, **vectors**, and **guidelines**.
### Fields
The **fields** are the features or text of a dataset's records. For example, the 'text' column of a text classification dataset or the 'prompt' column of an instruction-following dataset.
| Field Name | Title | Type | Required |
| ---------- | ----- | ---- | -------- |
| hash | ID | text | True |
### Questions
The **questions** are the questions that will be asked to the annotators. They can be of different types, such as rating, text, label_selection, multi_label_selection, or ranking.
| Question Name | Title | Type | Required | Description | Values/Labels |
| ------------- | ----- | ---- | -------- | ----------- | ------------- |
| question | Spørgsmål | text | True | N/A | N/A |
| correct_answer | Korrekt svar | text | True | N/A | N/A |
| wrong_answer | Forkert svar | text | True | N/A | N/A |
| category | Kategori | multi_label_selection | True | N/A | ['common sense', 'culturally specific', 'physical reasoning'] |
<!-- check length of metadata properties -->
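As a rough illustration of how the question fields above fit together, the sketch below turns one annotated record into a two-choice item with the answer order randomized. The `TwoChoiceItem` class, the `to_two_choice` helper, and the sample record are hypothetical and not part of the dataset or of Argilla; only the key names `question`, `correct_answer`, `wrong_answer`, and `category` come from the table.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoChoiceItem:
    """Hypothetical container for one evaluation item (not an Argilla type)."""

    prompt: str
    options: tuple[str, str]
    label: int  # index of the correct option in `options`
    categories: tuple[str, ...]


def to_two_choice(record: dict, rng: random.Random) -> TwoChoiceItem:
    """Build a two-choice item, placing the correct answer at a random index."""
    label = rng.randrange(2)
    # Start with the wrong answer in both slots, then overwrite the chosen slot.
    options = [record["wrong_answer"], record["wrong_answer"]]
    options[label] = record["correct_answer"]
    return TwoChoiceItem(
        prompt=record["question"],
        options=(options[0], options[1]),
        label=label,
        categories=tuple(record.get("category", ())),
    )


# Made-up record for illustration only; it is not taken from the dataset.
example = {
    "question": "To keep soup warm for longer, you should",
    "correct_answer": "put a lid on the pot.",
    "wrong_answer": "take the lid off the pot.",
    "category": ["physical reasoning"],
}

item = to_two_choice(example, random.Random(0))
print(item.options[item.label])  # always the correct answer, whatever the seed
```

Passing an explicit `random.Random` instance keeps the shuffling reproducible, which may matter if the items are later used for evaluation.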
### Data Splits
The dataset contains a single split, which is `train`.
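Because there is only one split, it can be requested directly, in which case `load_dataset` returns a `Dataset` rather than a `DatasetDict`. This is a minimal sketch, assuming the `datasets` package is installed and the Hugging Face Hub is reachable; the `load_train_split` helper name is made up for illustration.

```python
REPO_ID = "danish-foundation-models/global-piqa-da"
SPLIT = "train"  # the only split listed in this card


def load_train_split():
    """Download the records and return the `train` split as a `Dataset`."""
    # Imported here so this snippet can be defined even if `datasets`
    # is not installed; the import only runs when the helper is called.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=SPLIT)


# Example usage (requires network access):
# train = load_train_split()
# print(train.num_rows)
# print(train.column_names)
```

The usage lines print the column names rather than assuming them, since the exact column layout of an Argilla export is not documented in this card.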
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation guidelines
Your task is to come up with new questions that almost all Danes would be able to answer.
Each question must have two options that closely resemble each other,
but only one of them is correct. You are very welcome to include things that are
specific to Danish culture, history, and language. The question can either be an
actual question (that is, one ending with a question mark) or the
first part of a sentence that the two options complete. In addition, you must
also categorize your question into one of these categories:
- "common sense" - questions about something that everyone would be able to answer
(including people who are not from Denmark)
- "culturally specific" - questions that specifically concern Danish
culture, history, or language
- "physical reasoning" - questions that are based on an understanding
of the physical world
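Some of the constraints in these guidelines can be approximated with simple mechanical checks. The sketch below is a minimal example under stated assumptions: `validate_record` and the sample records are hypothetical, the record keys are taken from the question names in this card, and judgments such as whether almost all Danes could answer a question cannot be automated and stay with the annotator.

```python
ALLOWED_CATEGORIES = {"common sense", "culturally specific", "physical reasoning"}
REQUIRED_TEXT_KEYS = ("question", "correct_answer", "wrong_answer")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []

    # Every text answer is marked as required in the card.
    for key in REQUIRED_TEXT_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"'{key}' must be a non-empty string")

    # The two options should be similar, but they must not be identical.
    correct = str(record.get("correct_answer") or "").strip().casefold()
    wrong = str(record.get("wrong_answer") or "").strip().casefold()
    if correct and correct == wrong:
        problems.append("the two options must differ")

    # The card defines `category` as multi-label, so one or more labels
    # are accepted here; this sketch assumes a list of strings.
    categories = record.get("category") or []
    if not categories:
        problems.append("at least one category is required")
    unknown = sorted(set(categories) - ALLOWED_CATEGORIES)
    if unknown:
        problems.append(f"unknown categories: {unknown}")

    return problems


# Made-up records for illustration only; they are not taken from the dataset.
good = {
    "question": "To keep soup warm for longer, you should",
    "correct_answer": "put a lid on the pot.",
    "wrong_answer": "take the lid off the pot.",
    "category": ["physical reasoning"],
}
bad = {"question": "", "correct_answer": "A", "wrong_answer": "a", "category": ["trivia"]}

print(validate_record(good))
print(validate_record(bad))
```

Returning a list of messages instead of raising on the first failure lets an annotator see every issue with a record at once.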
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
[More Information Needed]
### Citation Information
[More Information Needed]
### Contributions
[More Information Needed]
Provider:
danish-foundation-models



