wulanbhai/subset
Source: Hugging Face. Updated 2025-10-16. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/wulanbhai/subset
Description:
Bucket indices for a LaTeX formula dataset of 552,340 samples. The samples are split at random into an 80% training set and a 20% test set (random seed 42). Formulas are grouped into buckets by length and structure, such as single-line formulas, multi-line formulas, and matrix structures, and each bucket keeps the same proportion in the training set and the test set.
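The description does not say how the split was produced. As a minimal sketch of one way an 80/20 split with a fixed seed can keep each bucket's proportion, the snippet below shuffles and cuts each bucket separately. The function name `stratified_split`, the bucket label strings, and the sample counts are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
import random
from collections import defaultdict


def stratified_split(bucket_of, test_fraction=0.2, seed=42):
    """Split sample indices into train/test, cutting each bucket separately.

    bucket_of: list where bucket_of[i] is the bucket label of sample i.
    Returns (train_indices, test_indices), each sorted ascending.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group sample indices by their bucket label.
    by_bucket = defaultdict(list)
    for index, label in enumerate(bucket_of):
        by_bucket[label].append(index)

    train, test = [], []
    # Visit buckets in sorted label order so the result is deterministic.
    for label in sorted(by_bucket):
        indices = by_bucket[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])

    return sorted(train), sorted(test)


# Hypothetical bucket labels, for illustration only (not the real bucket names).
labels = ["single_line"] * 50 + ["multi_line"] * 30 + ["matrix"] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Because every bucket is cut at the same fraction, the test set holds roughly 20% of each bucket regardless of how unevenly the buckets are sized. A library routine such as scikit-learn's `train_test_split` with its `stratify` and `random_state` parameters would be a common alternative, though nothing in the listing indicates which tool was actually used.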
Provided by:
wulanbhai



