soundai2016/EmoAlignBench
Source: Hugging Face · Updated 2026-02-25 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/soundai2016/EmoAlignBench
Resource description:
---
license: apache-2.0
task_categories:
- audio-classification
- text-classification
- time-series-forecasting
language:
- en
tags:
- finance
- earnings-calls
- multimodal
- emotion-recognition
- prosody
- emoalignbench
size_categories:
- 10K<n<100K
---
# EmoAlignBench Raw Dataset: Multimodal Financial Earnings Calls
Welcome to the raw dataset repository for **EmoAlignBench**. This dataset contains foundational multimodal data (audio, transcripts, and dual-track emotion annotations) from corporate earnings calls. It is curated for analyzing **emotion incongruence**, the discrepancy between textual semantics and vocal prosody, in financial discourse.
## 📂 Repository Structure
The dataset is organized into one directory per company, named by its stock **ticker**. For each earnings call, the directory contains the raw audio, a metadata index, an emotion-tagged transcript (JSONL), and historical stock data.
```text
.
├── TICKER/
│ ├── {uid}_Index.json # Metadata index linking all modal files
│ ├── {uid}_Emotion.txt # JSONL transcript with textual & vocal emotion tags
│ ├── {uid}.m4a # Raw audio file (or .mp3 / .wav)
│ └── {uid}_Stockprice.json # Daily historical stock prices and volumes
└── README.md
```
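As a rough illustration of how this layout could be traversed, the sketch below lists every `*_Index.json` file under a local copy of the repository. The function name `find_index_files` and the `root` argument are hypothetical, and the code assumes each ticker directory sits directly under the repository root as drawn above.

```python
from pathlib import Path

INDEX_SUFFIX = "_Index.json"


def find_index_files(root):
    """Return (ticker, uid, index_path) for every index file under root/TICKER/."""
    found = []
    # Only look one level deep: root/<TICKER>/<uid>_Index.json
    for index_path in sorted(Path(root).glob(f"*/*{INDEX_SUFFIX}")):
        uid = index_path.name[: -len(INDEX_SUFFIX)]  # strip the suffix to get the uid
        ticker = index_path.parent.name              # directory name is the ticker
        found.append((ticker, uid, index_path))
    return found
```

Starting from the index files, rather than guessing the other file names from the `uid`, keeps the traversal robust if an audio file uses a different extension or naming pattern.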
## 📄 Data Formats & Fields
### 1. Metadata Index (`*_Index.json`)
Links together the data files that belong to one earnings call event.
```json
{
"uid": "ABBV_20250425_EC", // Unique event identifier
"ticker": "ABBV", // Stock ticker symbol
"event_start_et": "2025-04-25 09:00", // Event start time (US Eastern Time)
"sound_file": "ABBV_20250425.m4a", // Reference to the audio file
"transcript_sound": "...Emotion.txt", // Reference to the emotion transcript
"stock_price": "...Stockprice.json" // Reference to the stock data
}
```
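A minimal sketch of reading an index file and resolving its three file references might look like the following. It assumes the references are file names relative to the index file's own directory and that the files on disk are plain JSON without the `//` annotations shown above; `load_index` and `REFERENCE_KEYS` are hypothetical names.

```python
import json
from pathlib import Path

# Index keys that point at other files of the same event (taken from the example above)
REFERENCE_KEYS = ("sound_file", "transcript_sound", "stock_price")


def load_index(index_path):
    """Load one index file and resolve its file references.

    Returns (meta, paths, missing): the parsed index, a dict of resolved
    paths, and the list of keys whose target file does not exist.
    """
    index_path = Path(index_path)
    with index_path.open(encoding="utf-8") as fh:
        meta = json.load(fh)
    folder = index_path.parent
    # Assumption: references are relative to the directory holding the index
    paths = {key: folder / meta[key] for key in REFERENCE_KEYS if meta.get(key)}
    missing = [key for key, path in paths.items() if not path.exists()]
    return meta, paths, missing
```

Reporting missing files instead of raising lets a caller decide whether to skip an incomplete event or stop.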
### 2. Emotion Transcripts (`*_Emotion.txt`)
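A JSON Lines file in which each line holds one utterance as a JSON object, with a timestamp and two emotion labels (textual and vocal). The example below is expanded across several lines and annotated for readability.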
```json
{
"speaker": "Analyst Name", // Identity of the speaker
"timestamp": "2025-04-25 09:15:20", // Start time for audio alignment
"text": "The margin pressure is...", // Raw transcript text
"text_emotion": "fear", // Textual Emotion Label (fear, joy, none, etc.)
"emotion": "😰", // Vocal Emotion emoji (😰, 😊, none, etc.)
"events": "" // Auxiliary symbolic markers
}
```
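The sketch below shows one way these records could be parsed and checked for emotion incongruence. Several parts are assumptions rather than documented behavior: the emoji-to-label mapping `VOCAL_TO_LABEL` is inferred only from the two examples in this card, the audio is assumed to start at `event_start_et`, and the function names are hypothetical.

```python
import json
from datetime import datetime

# Hypothetical mapping, inferred from the examples in this card only;
# the dataset may use more emoji than these.
VOCAL_TO_LABEL = {"😰": "fear", "😊": "joy", "none": "none"}


def parse_utterances(lines):
    """Parse JSON Lines input (one JSON object per non-empty line)."""
    return [json.loads(line) for line in lines if line.strip()]


def seconds_from_start(utterance, event_start_et):
    """Offset of an utterance from the event start, in seconds.

    Assumes both timestamps are in the same time zone and that the
    recording begins at event_start_et.
    """
    start = datetime.strptime(event_start_et, "%Y-%m-%d %H:%M")
    spoken = datetime.strptime(utterance["timestamp"], "%Y-%m-%d %H:%M:%S")
    return (spoken - start).total_seconds()


def is_incongruent(utterance):
    """True if the vocal and textual labels differ, None if the emoji is unmapped."""
    vocal = VOCAL_TO_LABEL.get(utterance["emotion"])
    if vocal is None:
        return None
    return vocal != utterance["text_emotion"]
```

For instance, an utterance stamped `2025-04-25 09:15:20` in a call that starts at `2025-04-25 09:00` falls 920 seconds into the event under these assumptions.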
### 3. Stock Price Data (`*_Stockprice.json`)
Daily time series data (open, high, low, close, and volume) that provides historical context around the call date. In the example below, numeric values are stored as strings.
```json
{
"Meta Data": { "2. Symbol": "ABBV", ... },
"Time Series (Daily)": {
"2025-04-25": {
"1. open": "165.20",
"2. high": "168.50",
"3. low": "164.10",
"4. close": "167.30",
"5. volume": "5200300"
}
}
}
```
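To make the structure concrete, here is a small sketch that converts the daily series into sorted closing prices and computes the return from the previous trading day's close to the close on a given date. The function names are hypothetical, and the choice of a close-to-close return is only an illustration, not a metric prescribed by the dataset.

```python
def daily_closes(payload):
    """Return [(date, close)] sorted by date, with closes converted to float."""
    series = payload["Time Series (Daily)"]
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings
    return sorted((day, float(bar["4. close"])) for day, bar in series.items())


def close_to_close_return(closes, call_date):
    """Return (close on call_date / previous close) - 1.

    Returns None when call_date is not in the series or has no earlier
    trading day to compare against.
    """
    days = [day for day, _ in closes]
    if call_date not in days:
        return None
    i = days.index(call_date)
    if i == 0:
        return None
    previous_close = closes[i - 1][1]
    return closes[i][1] / previous_close - 1.0
```

The payload passed to `daily_closes` is the parsed content of a `*_Stockprice.json` file, for example the result of `json.load` on that file.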
## 📝 License
This dataset is licensed under the **Apache License 2.0**.
Provider:
soundai2016



