
kotoba-speech/wiki40b_lines_ja

Source: Hugging Face. Updated 2025-12-10; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/kotoba-speech/wiki40b_lines_ja
Resource summary:
---
dataset_info:
- config_name: shard_01
  features: &id001
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 212440404
    num_examples: 200000
  download_size: 124570858
  dataset_size: 212440404
- config_name: shard_02
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 211274275
    num_examples: 200000
  download_size: 123945124
  dataset_size: 211274275
- config_name: shard_03
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 211251979
    num_examples: 200000
  download_size: 123902395
  dataset_size: 211251979
- config_name: shard_04
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 213941084
    num_examples: 200000
  download_size: 125227026
  dataset_size: 213941084
- config_name: shard_05
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 211003952
    num_examples: 200000
  download_size: 123663665
  dataset_size: 211003952
- config_name: shard_06
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 212069269
    num_examples: 200000
  download_size: 124377718
  dataset_size: 212069269
- config_name: shard_07
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 210877893
    num_examples: 200000
  download_size: 123789666
  dataset_size: 210877893
- config_name: shard_08
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 210994556
    num_examples: 200000
  download_size: 123705972
  dataset_size: 210994556
- config_name: shard_09
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 211494812
    num_examples: 200000
  download_size: 124078944
  dataset_size: 211494812
- config_name: shard_10
  features:
  - name: text
    dtype: string
  - name: key
    dtype: string
  splits:
  - name: train
    num_bytes: 71696376
    num_examples: 67228
  download_size: 41988287
  dataset_size: 71696376
- config_name: subset_2M
  features: *id001
  splits:
  - name: train
    num_examples: 1867228
configs:
- config_name: shard_01
  data_files:
  - split: train
    path: shard_01/train-*
- config_name: shard_02
  data_files:
  - split: train
    path: shard_02/train-*
- config_name: shard_03
  data_files:
  - split: train
    path: shard_03/train-*
- config_name: shard_04
  data_files:
  - split: train
    path: shard_04/train-*
- config_name: shard_05
  data_files:
  - split: train
    path: shard_05/train-*
- config_name: shard_06
  data_files:
  - split: train
    path: shard_06/train-*
- config_name: shard_07
  data_files:
  - split: train
    path: shard_07/train-*
- config_name: shard_08
  data_files:
  - split: train
    path: shard_08/train-*
- config_name: shard_09
  data_files:
  - split: train
    path: shard_09/train-*
- config_name: shard_10
  data_files:
  - split: train
    path: shard_10/train-*
- config_name: subset_2M
  data_files:
  - split: train
    path:
    - shard_01/train-*
    - shard_02/train-*
    - shard_03/train-*
    - shard_04/train-*
    - shard_05/train-*
    - shard_06/train-*
    - shard_07/train-*
    - shard_08/train-*
    - shard_09/train-*
    - shard_10/train-*
---
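The metadata lists ten shard configs plus a combined `subset_2M` config. The sketch below is illustrative and not part of the original card. It checks that the per-shard example counts add up to the `subset_2M` count, and shows one way a single config might be loaded. The helper names `total_examples` and `load_shard` are hypothetical. The loader assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the id shown in the title.

```python
# Per-shard example counts copied from the dataset card metadata.
SHARD_EXAMPLES = {f"shard_{i:02d}": 200_000 for i in range(1, 10)}
SHARD_EXAMPLES["shard_10"] = 67_228

# Example count the card lists for the combined "subset_2M" config.
SUBSET_2M_EXAMPLES = 1_867_228


def total_examples(counts):
    """Sum the example counts over all shard configs."""
    return sum(counts.values())


def load_shard(config_name="shard_01", streaming=True):
    """Load the train split of one config (hypothetical helper).

    Assumes `pip install datasets` and network access. With
    streaming=True the shard is iterated lazily instead of being
    downloaded in full first.
    """
    # Imported lazily so the count check still runs without the library.
    from datasets import load_dataset

    return load_dataset(
        "kotoba-speech/wiki40b_lines_ja",
        config_name,
        split="train",
        streaming=streaming,
    )


if __name__ == "__main__":
    # The ten shards should account for every example in subset_2M.
    print(total_examples(SHARD_EXAMPLES) == SUBSET_2M_EXAMPLES)
```

Each record is expected to expose the two string fields declared in the metadata, `text` and `key`. Passing `"subset_2M"` as the config name should read all ten shards as one train split, since that config's `data_files` lists every shard path.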
Provider:
kotoba-speech