IRM_chat_all

Name: IRM_chat_all
Creator: maas
Published: 2025-06-10 22:16:03
License: No description available

ModelScope community. Updated 2025-06-10. Indexed 2025-03-29.

Download link:

https://modelscope.cn/datasets/YKDuan/IRM_chat_all

Description:

## Introduction

The **IRM_chat_little** dataset was contributed by **YKDuan** on the ModelScope platform. This dataset is specifically designed for conversational interactions in the field of library and information science (LIS).

## Data Sources

- Concepts related to library and information science from general encyclopedias
- Bibliographic records of library and information science publications indexed in CSSCI (Chinese Social Sciences Citation Index)

These base texts were used to construct a total of **276,083 conversational turns** via large language models (LLMs). This dataset is highly suitable for LLM training and fine-tuning in the domain of academic conversations, particularly within library and information science.

## Dataset Content

The **IRM_chat_little** dataset contains **276,083 conversational turns**. Each entry is presented as a JSON object representing a conversation between a user and an assistant.

### Format

```json
{"messages": [{"role": "user", "content": "user query"}, {"role": "assistant", "content": "assistant reply"}]}
```

### Example

```json
{"messages": [{"role": "user", "content": "什么是信息检索？"}, {"role": "assistant", "content": "信息检索（IR）是根据信息需求从信息资源集合中获取相关信息系统资源的行为。搜索可以基于元数据或全文索引。"}]}
```

## Dataset Statistics

- **Total Conversational Turns**: 276,083 unique entries

## Usage Instructions

This dataset can be accessed and utilized via the ModelScope library.

### Using ModelScope SDK (Python)

```python
from modelscope.msdatasets import MsDataset

# Load the dataset
dataset = MsDataset.load('YKDuan/IRM_chat_little')

# Iterate through the dataset
for item in dataset:
    print(item)
    break  # Print the first entry for demonstration
```

## License Agreement

This dataset is distributed under the **Apache 2.0** license. Please refer to the ModelScope platform for the latest licensing information.

## Citation

If you use this dataset in your research or applications, please cite the following paper:

```bibtex
@inproceedings{Zhu2025LISGPT,
  title={{LISGPT: Boundary Knowledge Enhanced Academic Large Language Modeling and its Scenario Applications}},
  author={Zhu, Y and Duan, Y* and Hu, J and Jin, J and Ye, J},
  booktitle={Proceedings of ASIS\&T 2025},
  year={2025}
}
```

## Contact Information

For any questions or concerns regarding this dataset, please contact **duanyongkang@mail.bnu.edu.cn** or refer to the ModelScope community.
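The record layout shown under Format can be turned into (user, assistant) pairs for fine-tuning. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `parse_record` is hypothetical, it assumes each record is available as one JSON string, and it assumes that messages with roles other than `user` and `assistant` can be skipped.

```python
import json


def parse_record(line: str) -> list[tuple[str, str]]:
    """Parse one JSON record and return its (user, assistant) content pairs.

    Assumption: the record follows the {"messages": [...]} layout from the
    Format section; any other role is ignored.
    """
    record = json.loads(line)
    pairs = []
    pending_user = None  # most recent user message still waiting for a reply
    for message in record["messages"]:
        role, content = message["role"], message["content"]
        if role == "user":
            pending_user = content
        elif role == "assistant" and pending_user is not None:
            pairs.append((pending_user, content))
            pending_user = None
    return pairs


# Placeholder record copied from the Format section
sample = (
    '{"messages": [{"role": "user", "content": "user query"}, '
    '{"role": "assistant", "content": "assistant reply"}]}'
)
print(parse_record(sample))
```

Pairing each assistant reply with the user message immediately before it keeps the sketch usable for multi-turn records as well, should the dataset contain them.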

Provider: maas

Created: 2025-03-22
