
infinite-dataset-hub/InnovationsPatentClassification

Source: Hugging Face | Updated: 2025-02-16 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/infinite-dataset-hub/InnovationsPatentClassification
Description:
The InnovationsPatentClassification dataset is a collection of patent documents from the technology sector, each labeled with a category reflecting the patent's area of innovation. It is designed for training machine learning models that classify patent documents into distinct categories such as Artificial Intelligence, Renewable Energy, Biotechnology, Robotics, and Quantum Computing. Each document has been preprocessed for natural language processing tasks, including tokenization, stopword removal, and lemmatization. The label assigned to each document indicates the field the patent pertains to, and the labels are designed not to overlap with other technology categories.
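
The preprocessing steps named in the description (tokenization, stopword removal, lemmatization) can be illustrated with a minimal sketch. This is not the dataset's actual pipeline, which the description does not specify. The stopword list is a tiny hypothetical one, and `crude_lemma` is a plural-stripping stand-in for real lemmatization, which would normally come from a library such as NLTK or spaCy. All function names here are illustrative.

```python
import re

# The five categories named in the dataset description.
CATEGORIES = [
    "Artificial Intelligence",
    "Renewable Energy",
    "Biotechnology",
    "Robotics",
    "Quantum Computing",
]

# Hypothetical, deliberately tiny stopword list (not the dataset's own).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in STOPWORDS."""
    return [t for t in tokens if t not in STOPWORDS]


def crude_lemma(token):
    """Stand-in for lemmatization: strip a plural 's' from longer tokens."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(text):
    """Run tokenization, stopword removal, and the crude normalizer in order."""
    return [crude_lemma(t) for t in remove_stopwords(tokenize(text))]


if __name__ == "__main__":
    sample = "A method for training neural networks with the robots"
    print(preprocess(sample))
```

Since the description says the documents are already preprocessed, a sketch like this would mainly be useful for applying comparable normalization to new, raw patent text before passing it to a classifier trained on the dataset.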
Provided by:
infinite-dataset-hub