kamoo-ai/chameleon-facts
收藏Hugging Face2026-04-10 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/kamoo-ai/chameleon-facts
下载链接
链接失效反馈官方服务:
资源简介:
---
license: cc-by-sa-4.0
task_categories:
- text-generation
- question-answering
- multiple-choice
language:
- en
tags:
- chameleons
- animals
- biology
- reptiles
- zoology
- taxonomy
- fun
- kamoo
pretty_name: Chameleon Facts
size_categories:
- n<1K
configs:
- config_name: facts
data_files:
- split: train
path: facts.jsonl
- config_name: species
data_files:
- split: train
path: species.jsonl
- config_name: qa
data_files:
- split: train
path: qa.jsonl
- config_name: trivia
data_files:
- split: train
path: trivia.jsonl
---
# Chameleon Facts
A curated, source-attributed dataset all about chameleons — the mascot of [Kamoo](https://huggingface.co/kamoo-ai). Built as a lighthearted nod to our logo, but structured and sourced well enough to use as a real playground for retrieval, Q&A, and small-scale fine-tuning experiments.
> **Why does this exist?** Our logo is a chameleon. We thought it would be fun. It also happens to be a nice compact, well-sourced dataset for testing retrieval pipelines, fine-tuning toy models, or demoing a RAG loop without pulling down gigabytes.
> ⚠️ **Data quality warning.** This dataset was assembled with the help of a multi-agent AI research workflow and only partially spot-checked by a human. A random sample of 30 rows during QA found roughly 1 in 4 entries had some kind of error — wrong year, wrong IUCN status, wrong range, or a source that didn't actually back the claim. The confirmed errors from that sample have been fixed, but **similar undetected errors almost certainly remain throughout the dataset**. Treat every `source_url` as a pointer, not a guarantee. For anything that matters — research, education, conservation decisions — verify against the primary source before using a fact. If you spot an error, please open a discussion on the dataset page.
## Dataset contents
The dataset is split into four configs, each covering a different angle on chameleons:
| Config | Rows | What's in it |
|-----------|------|------------------------------------------------------------------------|
| `facts` | 466 | Atomic English facts about chameleons, categorised by topic, with source URLs |
| `species` | 205 | Structured records for individual chameleon species (range, habitat, size, IUCN status, source) |
| `qa` | 100 | Question–answer pairs suitable for Q&A and instruction tuning demos |
| `trivia` | 50 | Multiple-choice trivia questions with answer index and explanation |
**Total: 821 rows.**
Every entry (except a small number of internal branding statements) has a `source_url` pointing to the Wikipedia article, IUCN page, Reptile Database page, or peer-reviewed paper it was derived from.
## Loading
```python
from datasets import load_dataset
facts = load_dataset("kamoo-ai/chameleon-facts", "facts")
species = load_dataset("kamoo-ai/chameleon-facts", "species")
qa = load_dataset("kamoo-ai/chameleon-facts", "qa")
trivia = load_dataset("kamoo-ai/chameleon-facts", "trivia")
print(facts["train"][0])
```
## Schemas
### `facts`
| Field | Type | Description |
|---------------|--------|--------------------------------------------------------------|
| `id` | int | Unique identifier |
| `category` | string | Topic tag (e.g. `color`, `tongue`, `species`, `conservation`, `research`, `folklore`, `media`) |
| `fact` | string | A single self-contained English fact |
| `source_url` | string | URL of the reference used to write or verify the fact |
### `species`
| Field | Type | Description |
|-----------------------|--------|--------------------------------------------------------------|
| `id` | int | Unique identifier |
| `scientific_name` | string | Binomial name, e.g. `Furcifer pardalis` |
| `common_name` | string | English common name (may be empty) |
| `genus` | string | Genus |
| `range` | string | Natural geographic range |
| `habitat` | string | Typical habitat description |
| `max_length_cm` | float | Maximum reported length in centimetres |
| `conservation_status` | string | IUCN Red List category |
| `notes` | string | Distinguishing features or interesting notes |
| `source_url` | string | Reference URL (typically Reptile Database or Wikipedia page) |
### `qa`
| Field | Type | Description |
|--------------|--------|------------------------------------------------------|
| `id` | int | Unique identifier |
| `question` | string | Natural-language question |
| `answer` | string | Self-contained English answer |
| `category` | string | Topic tag (same scheme as facts) |
| `source_url` | string | Reference URL backing the answer (may be empty for older derived entries) |
### `trivia`
| Field | Type | Description |
|----------------|---------------|--------------------------------------------------------|
| `id` | int | Unique identifier |
| `question` | string | Multiple-choice question |
| `choices` | list<string> | Four answer options |
| `answer_index` | int | Zero-based index of the correct choice in `choices` |
| `explanation` | string | Why the correct answer is correct |
| `category` | string | Topic tag |
| `source_url` | string | Reference URL backing the explanation (may be empty for older entries) |
## Example rows
```json
// facts
{"id": 36, "category": "tongue", "fact": "A chameleon's tongue can extend up to twice its body length and reach its prey in about 0.07 seconds, accelerating faster than a fighter jet.", "source_url": "https://www.brown.edu/news/2016-01-04/chameleon"}
// species
{"id": 84, "scientific_name": "Furcifer pardalis", "common_name": "Panther chameleon", "genus": "Furcifer", "range": "Northern and eastern Madagascar", "habitat": "Coastal lowland rainforest and secondary growth", "max_length_cm": 55, "conservation_status": "Least Concern", "notes": "Famous for vividly coloured male morphs named after Malagasy localities such as Ambilobe and Nosy Be.", "source_url": "https://reptile-database.reptarium.cz/species?genus=Furcifer&species=pardalis"}
// qa
{"id": 4, "question": "What is the smallest known chameleon?", "answer": "Brookesia nana, described in 2021, with adult males measuring only about 13.5 mm from snout to vent.", "category": "size", "source_url": ""}
// trivia
{"id": 3, "question": "What is the smallest known chameleon (and smallest reptile)?", "choices": ["Brookesia micra", "Brookesia nana", "Brookesia minima", "Rhampholeon spinosus"], "answer_index": 1, "explanation": "Brookesia nana, described in 2021, is the smallest reptile known to science.", "category": "fun", "source_url": ""}
```
## How the dataset was built
The initial 100 facts and 60 species were handcrafted. The dataset was then expanded using a research workflow of four parallel AI research assistants:
1. **Species enumerator** — enumerated species from the Reptile Database and IUCN Red List across all 12 chameleon genera (~145 new entries).
2. **Fact verifier** — cross-checked every original fact against primary sources, flagging corrections and providing source URLs. Four facts were rewritten with corrections; one unverifiable fact was removed.
3. **Wikipedia miner** — extracted 261 new atomic facts (paraphrased, not copied) from the main chameleon article, all genus articles, and notable species articles.
4. **Culture / history / science researcher** — added 109 facts covering folklore (Zulu, Bantu, Khoikhoi, Yoruba myths), taxonomic history (Aristotle, Pliny, Laurenti, Gray, Rafinesque), peer-reviewed research (Teyssier 2015, Anderson 2016, Prötzel 2018, Karsten 2008, Ott & Schaeffel 1995), and media (Rango, Tangled, Karma Chameleon, Marvel's Chameleon, Chameleon Twist).
All facts were then manually spot-checked against primary sources. A few errors detected in spot-checks were corrected or removed (e.g. a misattribution of Chameleon Twist to Sega, a range error for Nadzikambia mlanjensis, conservation status errors for two species). Duplicates were removed and IDs renumbered.
## Sources and accuracy
All entries are attributable to a real reference. Primary sources include:
- **Wikipedia** (CC BY-SA 4.0) — main chameleon article, genus and species articles
- **The Reptile Database** (`reptile-database.reptarium.cz`) — authoritative reptile taxonomy
- **IUCN Red List** (`iucnredlist.org`) — conservation statuses
- **Peer-reviewed papers** — notably:
- Teyssier et al. 2015, *Nature Communications* — active colour change via nanocrystal rearrangement
- Anderson 2016, *Scientific Reports* — record tongue acceleration in Rhampholeon spinosus
- Prötzel et al. 2018, *Scientific Reports* — bone-based UV fluorescence
- Brau et al. 2016, *Nature Physics* — tongue mucus viscosity
- Karsten et al. 2008, *PNAS* — Furcifer labordi annual life cycle
- Ott & Schaeffel 1995, *Nature* — negatively powered lens
- de Groot & van Leeuwen 2004, *Proc. Royal Society B* — elastic tongue mechanics
- Tolley et al. 2013, *Proc. Royal Society B* — African origin of chameleons
- Da Silva & Tolley 2015, *Molecular Ecology* — Bradypodion cryptic diversity
- Glaw et al. 2021, *Scientific Reports* — Brookesia nana description
- **Tolley & Herrel (eds.) 2014, The Biology of Chameleons** — University of California Press reference volume
This is a small community dataset, not a taxonomic authority. A manual spot-check of 30 randomly sampled rows found roughly a quarter contained at least one factual error (wrong year, IUCN status, geographic range, or a source that didn't actually support the claim). The confirmed errors from that sample were corrected, but the rest of the dataset has not yet been exhaustively re-verified, so assume undetected errors still exist. **For any serious use — scientific, educational, editorial — verify the specific fact against its primary source before relying on it.** If you spot an error, please open a discussion on the dataset page so it can be fixed.
## License
Released under [Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)](https://creativecommons.org/licenses/by-sa/4.0/). Use it, remix it, just credit Kamoo and release derivatives under the same license. This license was chosen so the dataset can freely incorporate facts paraphrased from Wikipedia (also CC BY-SA).
## Maintainer
[Kamoo](https://huggingface.co/kamoo-ai) — the Dutch open-data organisation whose mascot is a chameleon. For serious datasets check out [kamoo-ai/dutch-legislation](https://huggingface.co/kamoo-ai) and friends; this one is our little easter egg.
提供机构:
kamoo-ai



