TCAB: Text Classification Attack Benchmark Dataset

NIAID Data Ecosystem, indexed 2026-03-14

Download link:

https://zenodo.org/record/6615385


Description:

TCAB is a large collection of successful adversarial attacks on state-of-the-art text classification models trained on multiple sentiment and abuse domain datasets. The dataset is split into 2 files: train.csv and val.csv. The training set contains 1,448,751 instances (552,364 are "clean" unperturbed instances) and the validation set contains 482,914 instances (178,607 are "clean"). Each instance contains the following attributes:

scenario: Domain, either abuse or sentiment.
target_model_dataset: Dataset being attacked.
target_model_train_dataset: Dataset the target model was trained on.
target_model: Type of victim model (e.g., bert, roberta, xlnet).
attack_toolchain: Open-source attack toolchain, either TextAttack or OpenAttack.
attack_name: Name of the attack method.
original_text: Original input text.
original_output: Prediction probabilities of the target model on the original text.
ground_truth: Encoded label for the original task of the domain dataset. For abuse datasets, 1 and 0 mean toxic and non-toxic, respectively. For sentiment datasets, 1 and 0 mean positive and negative sentiment. If there is a neutral sentiment, then 2, 1, and 0 mean positive, neutral, and negative sentiment.
status: Unperturbed example if "clean"; successful adversarial attack if "success".
perturbed_text: Text after it has been perturbed by an attack.
perturbed_output: Prediction probabilities of the target model on the perturbed text.
attack_time: Time taken to execute the attack.
num_queries: Number of queries performed while attacking.
frac_words_changed: Fraction of words changed due to an attack.
test_index: Index of each unique source example (original instance) (LEGACY - necessary for backwards compatibility).
original_text_identifier: Index of each unique source example (original instance).
unique_src_instance_identifier: Primary key to uniquely identify every source instance; composed of (target_model_dataset, test_index, original_text_identifier).
pk: Primary key to uniquely identify every attack instance; comprised of (attack_name, attack_toolchain, original_text_identifier, scenario, target_model, target_model_dataset, test_index).
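
The attribute descriptions above can be put to work with a few lines of code. The following is a minimal sketch, not an official loader: it uses only the Python standard library, runs against a tiny made-up CSV sample that contains a subset of the documented columns, and shows how one might separate clean from attacked instances, count attacks per method, and decode ground_truth. The sample rows, the helper names, and the has_neutral flag are assumptions introduced for illustration; the exact value formats in the real train.csv and val.csv may differ.

```python
import csv
import io
from collections import Counter

# Made-up rows with a subset of the documented columns. The values are
# illustrative only and are NOT taken from the real train.csv; the empty
# attack fields on the clean row are placeholders.
SAMPLE_CSV = """scenario,target_model,attack_toolchain,attack_name,status,ground_truth,frac_words_changed
abuse,bert,TextAttack,textfooler,success,1,0.25
abuse,bert,,,clean,0,0.0
sentiment,roberta,OpenAttack,genetic,success,0,0.5
sentiment,roberta,TextAttack,textfooler,success,1,0.75
"""


def load_rows(handle):
    """Read every CSV row into a dict keyed by column name."""
    return list(csv.DictReader(handle))


def split_by_status(rows):
    """Separate unperturbed ("clean") rows from successful attacks ("success")."""
    clean = [r for r in rows if r["status"] == "clean"]
    attacked = [r for r in rows if r["status"] == "success"]
    return clean, attacked


def attacks_per_method(rows):
    """Count rows per (attack_toolchain, attack_name) pair."""
    return Counter((r["attack_toolchain"], r["attack_name"]) for r in rows)


def mean_frac_words_changed(rows):
    """Average fraction of words changed; 0.0 for an empty list."""
    values = [float(r["frac_words_changed"]) for r in rows]
    return sum(values) / len(values) if values else 0.0


def label_name(scenario, ground_truth, has_neutral=False):
    """Decode ground_truth following the description above.

    has_neutral is a hypothetical flag: the caller must know whether the
    sentiment dataset in question uses the three-class encoding.
    """
    label = int(float(ground_truth))
    if scenario == "abuse":
        return {1: "toxic", 0: "non-toxic"}[label]
    if has_neutral:
        return {2: "positive", 1: "neutral", 0: "negative"}[label]
    return {1: "positive", 0: "negative"}[label]


rows = load_rows(io.StringIO(SAMPLE_CSV))
clean, attacked = split_by_status(rows)
per_method = attacks_per_method(attacked)
mean_changed = mean_frac_words_changed(attacked)
```

To try this on the real files, one could replace the in-memory sample with `open("train.csv", newline="", encoding="utf-8")`. With well over a million rows, iterating over `csv.DictReader` row by row instead of building a full list is likely to be gentler on memory.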

Created:

2022-10-21
