prli/wiki_draft_bf_4-0_0-01_0-02_qwen_falcon_perturb
Source: Hugging Face | Updated: 2026-04-25 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/prli/wiki_draft_bf_4-0_0-01_0-02_qwen_falcon_perturb
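For readers who prefer to fetch the data programmatically rather than through the link above, the following is a minimal sketch using the Hugging Face `datasets` library. It assumes the repository ID matches the path in the URL and that the split is named `validation`, as the description states; the helper name `load_validation_split` is made up for this illustration.

```python
# Repository ID taken from the download URL above.
DATASET_ID = "prli/wiki_draft_bf_4-0_0-01_0-02_qwen_falcon_perturb"


def load_validation_split():
    """Download and return the validation split (needs network access).

    Hypothetical helper for illustration. To go through the mirror host,
    one option is to set the HF_ENDPOINT environment variable to
    https://hf-mirror.com before the Hugging Face libraries are imported.
    """
    # Imported lazily so that defining this helper does not require
    # the `datasets` package to be installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="validation")
```

Calling `load_validation_split()` should return a `Dataset` object whose `column_names` attribute can be checked against the fields listed in the description.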
Description:
This dataset targets natural language processing work on text perturbation and augmentation. Each example pairs the original text (text) with perturbed versions produced by several methods: synonym substitution (synonym_substitution), simulated typing errors (butter_fingers), random deletion (random_deletion), character case changes (change_char_case), whitespace perturbation (whitespace_perturbation), and the underscore trick (underscore_trick). The perturbations are intended for testing model robustness or for data augmentation. The dataset contains only a validation split, with 4,000 examples and a total size of about 64.4 MB.
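To make the perturbation names above concrete, here is a rough sketch of how five of them might be implemented. This is not the dataset authors' code: the probabilities, the tiny keyboard-neighbor map, and the reading of "underscore trick" as replacing spaces with underscores are all assumptions made for illustration. Synonym substitution is left out because it needs an external lexicon.

```python
import random

# Assumed, deliberately tiny QWERTY neighbor map (illustrative only).
NEIGHBORS = {"a": "qwsz", "e": "wrsd", "i": "uojk", "o": "ipkl",
             "s": "awedxz", "t": "ryfg", "n": "bhjm"}


def butter_fingers(text, p=0.05, seed=0):
    """Replace a character with a keyboard neighbor with probability p."""
    rng = random.Random(seed)
    out = []
    for c in text:
        options = NEIGHBORS.get(c.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(c)
    return "".join(out)


def random_deletion(text, p=0.1, seed=0):
    """Drop each whitespace-separated token with probability p."""
    rng = random.Random(seed)
    words = text.split()
    kept = [w for w in words if rng.random() >= p]
    # Keep one token so non-empty input never becomes empty.
    return " ".join(kept) if kept else (words[0] if words else "")


def change_char_case(text, p=0.1, seed=0):
    """Flip the case of each character with probability p."""
    rng = random.Random(seed)
    return "".join(c.swapcase() if rng.random() < p else c for c in text)


def whitespace_perturbation(text, p_remove=0.1, p_add=0.05, seed=0):
    """Remove spaces with p_remove; add a space after other chars with p_add."""
    rng = random.Random(seed)
    out = []
    for c in text:
        if c == " " and rng.random() < p_remove:
            continue
        out.append(c)
        if c != " " and rng.random() < p_add:
            out.append(" ")
    return "".join(out)


def underscore_trick(text):
    """Assumed interpretation: replace every space with an underscore."""
    return text.replace(" ", "_")
```

Each function takes a seed so that a perturbed copy can be regenerated exactly, which helps when comparing model outputs on clean and perturbed inputs.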
Provider:
prli



