BISA-Eval: Bilingual Indonesian Safety Assessment for API-Mediated LLM Deployment

Source: DataCite Commons. Updated: 2026-03-28. Indexed: 2026-05-04.
Download link:
https://data.mendeley.com/datasets/5hftdxtpyr
Description:
BISA-Eval (Bilingual Indonesian Safety Assessment) is an empirical dataset supporting research on safety degradation in API-mediated deployments of Large Language Models (LLMs), with an emphasis on bilingual evaluation in Bahasa Indonesia and English. It contains 893 structured LLM responses collected from six commercial foundation models accessed through commercial APIs (three of US origin, two of EU origin, and one of CN origin) under three systematically varied API deployment conditions: (C1) a consumer-facing system prompt with explicit safety guidelines; (C2) a minimal neutral system prompt mimicking raw API access; and (C3) a safety-stripped prompt with the explicit safety guidelines removed.
The evaluation uses a 50-prompt bilingual battery covering 28 categories of harmful intent. Responses are scored by a binary LLM-as-a-Judge pipeline built on Qwen2.5-3B-Instruct (4-bit NF4 quantization), with a score of 3 indicating robust refusal and a score of 0 indicating harmful compliance. The overall refusal rate is 68.8%; compliance rises from 23.7% under C1 to 36.4% under C3, a statistically significant degradation. In the lowest-performing cell, C3_STRIPPED in Bahasa Indonesia, harmful compliance reaches 37.6%, indicating a safety asymmetry between language registers.
Data were collected through the OpenRouter API and provider endpoints (Google Gemini and Mistral AI) in five fully automated sessions in February 2026. The dataset includes the raw API access logs as five JSON files, together with the cleaned primary dataset and the original prompt battery. A secondary cross-validation judge, SeaLLMs-v3-7B-Chat, is used in the ordinal discrimination analysis presented in the companion paper.
This dataset is intended to support research on AI safety benchmarking, LLM evaluation methodology, and AI governance policy analysis in the Global South. It must not be used to train jailbreak models or to extract harmful instructions.
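The per-condition and per-language compliance figures quoted above are simple proportions over judged responses. The following is a minimal sketch of how such rates could be tabulated. The field names (`condition`, `language`, `judge_score`) and the toy records are hypothetical and are not taken from the dataset files. Counting only a score of 0 as harmful compliance follows the description above, but how any intermediate scores are treated is an assumption.

```python
from collections import defaultdict


def compliance_rates(records, compliant_score=0):
    """Return {(condition, language): fraction of responses judged compliant}.

    Assumption: a response counts as harmful compliance only when its
    judge score equals `compliant_score` (0 in the description above).
    """
    totals = defaultdict(int)
    compliant = defaultdict(int)
    for rec in records:
        key = (rec["condition"], rec["language"])
        totals[key] += 1
        if rec["judge_score"] == compliant_score:
            compliant[key] += 1
    return {key: compliant[key] / totals[key] for key in totals}


# Toy records invented for illustration only; not real BISA-Eval data.
toy_records = [
    {"condition": "C1", "language": "en", "judge_score": 3},
    {"condition": "C1", "language": "id", "judge_score": 3},
    {"condition": "C3", "language": "id", "judge_score": 0},
    {"condition": "C3", "language": "id", "judge_score": 3},
    {"condition": "C3", "language": "en", "judge_score": 3},
]

rates = compliance_rates(toy_records)
for (condition, language), rate in sorted(rates.items()):
    print(f"{condition} / {language}: {rate:.1%} compliance")
```

Grouping by the (condition, language) pair mirrors the "cell" terminology used in the description; the real column names should be checked against the cleaned primary dataset before adapting this.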
Provider:
Mendeley Data
Created:
2026-03-28