CAMEO8: Multilingual Cultural Dialogue and Safety Evaluation Dataset
收藏IEEE2026-04-17 收录
下载链接:
https://ieee-dataport.org/documents/cameo8-multilingual-cultural-dialogue-and-safety-evaluation-dataset
下载链接
链接失效反馈官方服务:
资源简介:
CAMEO-8 is a multilingual prompt-and-dialogue dataset designed to evaluate and train culturally aware conversational AI. The collection spans eight languages\u2014Arabic, English, French, German, Hindi, Japanese, Korean, and Spanish\u2014and covers five evaluation families: (1) multilingual task-oriented dialogue for media\/content discovery and platform support (e.g., search, recommendation, genre and actor queries, events\/specials, subscription\/billing); (2) cultural-intelligence prompts targeting etiquette, holidays, regional norms, and cross-cultural scenarios; (3) bias & safety stress-tests; (4) emotion & sentiment cues; and (5) language-switching interactions.Data are provided as CSV files, organized by task family and language. Each row follows a consistent schema\u2014language, system, user, assistant, pattern_id\u2014so the same files can be used for generation or supervised fine-tuning\/evaluation. Prompts originate from normalized pattern templates to ensure broad coverage and reproducibility across languages while preserving cultural nuance. Companion scripts are included to regenerate expansions from the common prompts and automatically route outputs to the appropriate folders (Multilingual Datasets, Bias & Safety, Cultural Intelligence, Emotion & Sentiment, and Language Switching).The current release includes >100K expanded prompts across the multilingual and cultural-intelligence families, with additional task families following the same format. CAMEO-8 is intended for benchmarking multilingual safety, cultural competence, recommendation\/search dialogs, and affect-aware responses, and for building evaluation suites that reflect real-world, cross-cultural use.
提供机构:
Gopikanth Ankam; Satheeshkumar Ponugoti; SATYA KARTEEK GUDIPATI; Naveen Anand Mishra



