daikin-industries-ltd/ja-fineweb-2-hvac-fastText-scored-v4
Source: Hugging Face. Updated 2025-12-22. Indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/daikin-industries-ltd/ja-fineweb-2-hvac-fastText-scored-v4
Description:
This Japanese-language dataset contains text on HVAC (air conditioning and energy saving), selected by FastText classification and graded by LLM quality scoring. Each record includes the text content, URL, source information, a language score, a FastText classification score (0.0-1.0), an LLM quality score (1-5), and the rationale for that score. Documents with high FastText scores were selected and then evaluated in detail by an LLM. The goal is to provide high-quality technical text on HVAC and energy saving for education and technical research.
Provider:
daikin-industries-ltd
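
The description mentions two scores per record, a FastText classification score (0.0-1.0) and an LLM quality score (1-5). As a rough illustration of how a reader might filter on both, here is a minimal sketch. The field names `fasttext_score` and `llm_score`, the thresholds, and the sample records are all assumptions made for this example; the listing does not give the actual column names or the cutoffs used to build the dataset.

```python
from typing import Any, Dict, List

# Hypothetical field names: the listing does not state the real column names.
FASTTEXT_KEY = "fasttext_score"  # assumed float in the range 0.0-1.0
LLM_KEY = "llm_score"            # assumed integer in the range 1-5


def select_high_quality(
    records: List[Dict[str, Any]],
    min_fasttext: float = 0.9,  # illustrative threshold, not from the source
    min_llm: int = 4,           # illustrative threshold, not from the source
) -> List[Dict[str, Any]]:
    """Keep records whose two scores both meet the given thresholds."""
    return [
        r for r in records
        if r.get(FASTTEXT_KEY, 0.0) >= min_fasttext
        and r.get(LLM_KEY, 0) >= min_llm
    ]


# Made-up sample records, used only to exercise the function.
sample = [
    {"text": "doc A", "url": "https://example.com/a", FASTTEXT_KEY: 0.95, LLM_KEY: 5},
    {"text": "doc B", "url": "https://example.com/b", FASTTEXT_KEY: 0.95, LLM_KEY: 2},
    {"text": "doc C", "url": "https://example.com/c", FASTTEXT_KEY: 0.40, LLM_KEY: 5},
]

kept = select_high_quality(sample)
print([r["text"] for r in kept])
```

With real data, the same function could be applied to rows loaded from the download link above, once the key names are adjusted to match the actual schema.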



