PELAB-LiU/tlc_interduplication
Source: Hugging Face. Updated: 2023-11-10. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/PELAB-LiU/tlc_interduplication
Resource description:
---
dataset_info:
  features:
  - name: id_within_dataset
    dtype: int64
  - name: snippet
    dtype: string
  - name: tokens
    sequence: string
  - name: nl
    dtype: string
  - name: split_within_dataset
    dtype: string
  - name: is_duplicated
    dtype: bool
  splits:
  - name: train
    num_bytes: 70652063.18677872
    num_examples: 53327
  - name: test
    num_bytes: 8799876.304434607
    num_examples: 6642
  - name: valid
    num_bytes: 8831673.508786675
    num_examples: 6666
  download_size: 33772946
  dataset_size: 88283613.00000001
---
# Dataset Card for "tlc_interduplication"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
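
The card gives no usage instructions, so the following is only a minimal sketch of how the metadata above might be used. The `Example` type mirrors the listed features, and `SPLIT_SIZES` copies the `num_examples` values. The helper names (`split_fractions`, `drop_duplicates`, `load_split`) are hypothetical. The card does not document what `is_duplicated` means; the sketch assumes it flags records you may want to filter out. `load_split` relies on the `datasets` library and network access.

```python
from typing import Iterable, TypedDict


class Example(TypedDict):
    """One record, mirroring the features listed in the card metadata."""
    id_within_dataset: int
    snippet: str
    tokens: list[str]
    nl: str
    split_within_dataset: str
    is_duplicated: bool


# Example counts copied from the `splits` section of the metadata.
SPLIT_SIZES = {"train": 53327, "test": 6642, "valid": 6666}


def split_fractions(sizes: dict[str, int]) -> dict[str, float]:
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def drop_duplicates(examples: Iterable[Example]) -> list[Example]:
    """Keep only records whose `is_duplicated` flag is False.

    Assumption: the flag marks records to exclude; the card does not say so.
    """
    return [ex for ex in examples if not ex["is_duplicated"]]


def load_split(split: str = "train"):
    """Load one split from the Hugging Face Hub (requires network access)."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("PELAB-LiU/tlc_interduplication", split=split)
```

With the counts from the metadata, `split_fractions(SPLIT_SIZES)` works out to roughly 80% train, 10% test, and 10% valid. Once a split is loaded, `drop_duplicates(load_split("test"))` would return the unflagged test records.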
Provided by:
PELAB-LiU



