
styal/LucioleData-2M

Source: Hugging Face. Updated 2026-03-10; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/styal/LucioleData-2M
Description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 7312277132
    num_examples: 2029517
  download_size: 4291600198
  dataset_size: 7312277132
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---

```
from datasets import concatenate_datasets, load_dataset

faq = load_dataset(
    "styal/filtered-finephrase-faq",
    split="train[:50%]",
)
tutorial = load_dataset(
    "styal/filtered-finephrase-tutorial",
    split="train[:50%]",
)
table = load_dataset(
    "styal/filtered-finephrase-table",
    split="train[:50%]",
)
math = load_dataset(
    "styal/filtered-finephrase-math",
    split="train[:40%]",
)
py = load_dataset(
    "styal/filtered-python-edu.eng",
    split="train[:20%]",
)

faq = faq.remove_columns([col for col in faq.column_names if col != "text"])
math = math.remove_columns([col for col in math.column_names if col != "text"])
table = table.remove_columns([col for col in table.column_names if col != "text"])
tutorial = tutorial.remove_columns([col for col in tutorial.column_names if col != "text"])
py = py.remove_columns([col for col in py.column_names if col != "text"])

merged = concatenate_datasets([faq, math, table, py, tutorial])
```
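As a quick sanity check on the metadata above, the following sketch derives two rough figures from the listed sizes: the average number of bytes per example and the ratio of download size to dataset size. The three constants are copied from the `dataset_info` block; the helper names `bytes_per_example` and `download_ratio` are hypothetical, and reading `dataset_size` as the uncompressed size and `download_size` as the size of the files fetched is an assumption based on the usual Hugging Face conventions.

```python
# Figures copied from the dataset_info metadata of the card.
NUM_BYTES = 7_312_277_132      # num_bytes / dataset_size of the train split
NUM_EXAMPLES = 2_029_517       # num_examples of the train split
DOWNLOAD_SIZE = 4_291_600_198  # download_size


def bytes_per_example(num_bytes: int, num_examples: int) -> float:
    """Average size of one example in bytes (hypothetical helper)."""
    return num_bytes / num_examples


def download_ratio(download_size: int, dataset_size: int) -> float:
    """Download size as a fraction of the dataset size (hypothetical helper)."""
    return download_size / dataset_size


avg = bytes_per_example(NUM_BYTES, NUM_EXAMPLES)
ratio = download_ratio(DOWNLOAD_SIZE, NUM_BYTES)

print(f"average bytes per example: {avg:.0f}")
print(f"download size / dataset size: {ratio:.3f}")
```

Under these assumptions, each `text` entry averages roughly 3.6 KB, and the download is a little under 60% of the dataset size.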
Provider:
styal