
Brain Language Metrics on Earnings Calls Transcripts - Live Feed

Snowflake | Updated 2022-07-12 | Indexed 2024-05-01
Download link:
https://app.snowflake.com/marketplace/listing/GZSVZD5BX2
Description:
The Brain Language Metrics on Earnings Calls Transcripts (BLMECT) dataset monitors several language metrics for the quarterly earnings call transcripts of 4500+ US stocks. It contains historical data from January 2012 and live data updated daily by 12pm UTC.

DATASET STRUCTURE AND KEY FIELDS

The dataset consists of a single schema, "LANGUAGE_METRICS_EARNINGS_CALLS", which can be logically divided into two parts. In both parts the metrics are reported separately for the following sections of the earnings call:
a. Management Discussion (MD)
b. Analysts’ Questions (AQ)
c. Management Answers to Analysts’ Questions (MA)

Part one includes several language metrics for each section of the most recent earnings call transcript for each stock. It is saved in the table "METRICS_EARNINGS_CALL". The key metrics of part one are:
1. Financial sentiment; e.g. the field "MD_SENTIMENT" refers to the financial sentiment of section MD.
2. Percentage of words belonging to the financial domain, classified by language type:
- “Constraining” language; e.g. the field "MD_SCORE_CONSTRAINING" refers to the percentage of financial domain constraining language in section MD of the last available transcript.
- “Litigious” language; e.g. the field "MD_SCORE_LITIGIOUS" refers to the percentage of financial domain litigious language in section MD of the last available transcript.
- “Uncertainty” language; e.g. the field "MD_SCORE_UNCERTAINTY" refers to the percentage of financial domain uncertainty language in section MD of the last available transcript.
3. Readability score; e.g. the field "MD_READABILITY" refers to the reading grade level of the MD section of the last available transcript.
4. Lexical metrics such as lexical density and richness of text; e.g. the field "MD_LEXICAL_RICHNESS" refers to the lexical richness of the MD section of the last available transcript.
5. Text statistics such as the transcript length; e.g. the field "MD_N_CHARACTERS" refers to the length of the section “Management Discussion” measured in number of characters.

Part two includes the differences between the most recent earnings call transcript and the previous one. It is saved in the table "DIFFERENCES_EARNINGS_CALLS". The key metrics of part two are:
1. Differences in the various language metrics; e.g. the field "MD_DELTA_SENTIMENT" refers to the difference in financial sentiment between the MD section of the last available transcript and the same section of the previous transcript.
2. Similarity metrics between documents, also with respect to a specific language type such as “litigious” or “uncertainty” language; e.g. the field "MD_SIMILARITY_UNCERTAINTY" refers to the similarity, in terms of financial domain “uncertainty” language, between the MD section of the last available transcript and the same section of the previous transcript.

FACTSHEET

Link to factsheet: https://braincompany.co/assets/files/BLM_ECT_summary.pdf

DISCLAIMER

The content of this dataset is not intended as investment advice. The material is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Brain. Brain makes no guarantees regarding the accuracy and completeness of the information expressed in the dataset.
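
As an illustration of the field naming described above, the following is a minimal Python sketch that assembles the part-one column names for a given section and a matching SELECT statement. It rests on assumptions not confirmed by the description: that the AQ and MA sections use the same suffixes as the documented MD_* fields, and that the table can be addressed as schema.table (in practice it would also need the database name under which the listing is mounted, which is not given here). The helper names `metric_fields` and `build_select` are hypothetical.

```python
# Section prefixes named in the dataset description.
SECTIONS = {
    "MD": "Management Discussion",
    "AQ": "Analysts' Questions",
    "MA": "Management Answers to Analysts' Questions",
}

# Suffixes taken from the MD_* examples in the description.
# ASSUMPTION: AQ_* and MA_* fields follow the same pattern.
METRIC_SUFFIXES = [
    "SENTIMENT",
    "SCORE_CONSTRAINING",
    "SCORE_LITIGIOUS",
    "SCORE_UNCERTAINTY",
    "READABILITY",
    "LEXICAL_RICHNESS",
    "N_CHARACTERS",
]

SCHEMA = "LANGUAGE_METRICS_EARNINGS_CALLS"
TABLE = "METRICS_EARNINGS_CALL"


def metric_fields(section: str) -> list[str]:
    """Return the assumed part-one column names for one call section."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section!r}")
    return [f"{section}_{suffix}" for suffix in METRIC_SUFFIXES]


def build_select(section: str, limit: int = 10) -> str:
    """Build a SELECT over the assumed columns of one section."""
    columns = ",\n    ".join(metric_fields(section))
    # ASSUMPTION: the database qualifier is omitted because it is unknown.
    return f"SELECT\n    {columns}\nFROM {SCHEMA}.{TABLE}\nLIMIT {int(limit)}"


if __name__ == "__main__":
    print(build_select("MD"))
```

The generated statement is only a string; running it would require a Snowflake session with access to the listing, and the column list should be checked against the actual table before use.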
Provider:
Brain
Created:
2022-07-12