
Pile-Enron_Emails

Source: ModelScope community (魔搭社区). Updated 2025-11-11. Indexed 2025-06-14.
Download link:
https://modelscope.cn/datasets/OmniData/Pile-Enron_Emails
Resource description:

# Data Introduction

## Overview

The Pile-Enron Emails dataset is a large-scale email dataset built from the email communication records of Enron Corporation. The emails span multiple years and cover a range of topics, including work matters, meeting scheduling, and business discussions, across a broad set of business areas.

The dataset can be used for corporate communication research, organizational behavior analysis, text mining, and similar applications that help in understanding communication and business activity inside a company.

## Data Content

### Data Description

The Pile-Enron Emails dataset contains about 0.04 GB of data.

### Data Example

```
{
  "id": "134526691",
  "source_id": "",
  "doc_id": "340627",
  "data_type": "text",
  "data_source": "pile",
  "data_url": "enwiki-c4-pile-ccnews",
  "content": "Oops - that was supposed to be AT&T - Lucent is the one that lasts forever \nand deals with the Utility Solutions business.\n---------------------- Forwarded by Mark Taylor/HOU/ECT on 02/22/2000 04:10 \nPM ---------------------------\nTo: Bryan Seyfried/LON/ECT@ECT\ncc: \nSubject: Re: Credit Derivatives Reference Entities \n\nI have read the Lucent CA and it is OK to go ahead and keep them on the list \n- the CA seems to cover a relatively small transaction and expires on March \n25 anyway.\n\nWe are in the process of pulling the CA's for Dow, DuPont, GE, IMC, 3M, \nPitney Bowes, Praxair and Sprint.\n",
  "remark": {
    "pile_set_name": "Enron Emails"
  },
  "sub_path": "enron-emails/test"
}
```

## Citation

```
@misc{conghui2022opendatalab,
  title={OpenDataLab: Empowering General Artificial Intelligence with Open Datasets},
  author={Conghui He, Wei Li, Zhenjiang Jin, Bin Wang, Chao Xu, Dahua Lin},
  journal={https://opendatalab.com/},
  year={2022}
}
```

## Download dataset

:modelscope-code[]{type="git"}
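Once the data has been downloaded, records with the layout shown in the data example can be read with the standard `json` module. The following is a minimal sketch, not the dataset's official loader. It assumes the files store one JSON object per line (JSON Lines), which the page does not state. The function name `iter_enron_records` and the abbreviated sample record are hypothetical and used only for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_enron_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (doc_id, content) for each record tagged as "Enron Emails".

    Assumption: every non-empty line is one JSON object with the fields
    shown in the data example (doc_id, content, remark.pile_set_name).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        remark = record.get("remark") or {}
        if remark.get("pile_set_name") != "Enron Emails":
            continue  # ignore records from other Pile subsets
        yield record["doc_id"], record["content"]


# Abbreviated copy of the example record (content shortened for illustration).
SAMPLE_LINE = json.dumps({
    "id": "134526691",
    "doc_id": "340627",
    "data_type": "text",
    "data_source": "pile",
    "content": "Oops - that was supposed to be AT&T",
    "remark": {"pile_set_name": "Enron Emails"},
    "sub_path": "enron-emails/test",
})

for doc_id, content in iter_enron_records([SAMPLE_LINE]):
    print(doc_id, len(content))
```

Because the function accepts any iterable of strings, an open file handle can be passed directly, for example `iter_enron_records(open(path, encoding="utf-8"))`, where `path` points at whichever data file was downloaded.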

Provider:
maas
Created:
2024-07-05