hyperdemocracy/usc-nomic-chunks-v1-s8192-o512
Source: Hugging Face. Updated 2024-02-23. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/hyperdemocracy/usc-nomic-chunks-v1-s8192-o512
Resource description:
---
configs:
- config_name: default
  data_files:
  - path: data/usc-113-nomic-chunks-v1-s8192-o512.parquet
    split: '113'
  - path: data/usc-114-nomic-chunks-v1-s8192-o512.parquet
    split: '114'
  - path: data/usc-115-nomic-chunks-v1-s8192-o512.parquet
    split: '115'
  - path: data/usc-116-nomic-chunks-v1-s8192-o512.parquet
    split: '116'
  - path: data/usc-117-nomic-chunks-v1-s8192-o512.parquet
    split: '117'
  - path: data/usc-118-nomic-chunks-v1-s8192-o512.parquet
    split: '118'
dataset_info:
  features:
  - dtype: string
    name: chunk_id
  - dtype: string
    name: congress_num
  - dtype: string
    name: nomic_topic_depth_1
  - dtype: string
    name: nomic_topic_depth_2
  - dtype: string
    name: nomic_topic_depth_3
  - dtype: float32
    name: nomic_proj_x
  - dtype: float32
    name: nomic_proj_y
  - list:
      dtype: float32
    name: nomic_vec
  - dtype: string
    name: text
  - name: chunk_metadata
    struct:
    - dtype: string
      name: chunk_id
    - dtype: int32
      name: chunk_index
    - dtype: string
      name: congress_num
    - dtype: string
      name: legis_class
    - dtype: string
      name: legis_id
    - dtype: int32
      name: legis_num
    - dtype: string
      name: legis_type
    - dtype: string
      name: legis_version
    - dtype: int32
      name: start_index
    - dtype: string
      name: text_date
    - dtype: string
      name: text_id
  - name: bill_metadata
    struct:
    - dtype: string
      name: introduced_date
    - dtype: string
      name: origin_chamber
    - dtype: string
      name: policy_area
    - list:
        dtype: string
      name: subjects
    - list:
      - dtype: string
        name: bioguide_id
      - dtype: string
        name: district
      - dtype: string
        name: first_name
      - dtype: string
        name: full_name
      - dtype: string
        name: is_by_request
      - dtype: string
        name: last_name
      - dtype: string
        name: middle_name
      - dtype: string
        name: party
      - dtype: string
        name: state
      - name: identifiers
        struct:
        - dtype: string
          name: bioguide_id
        - dtype: string
          name: lis_id
        - dtype: string
          name: gpo_id
      name: sponsors
---
Provider:
hyperdemocracy
Summary of original information
Dataset configuration
- Config name: default
- Data file paths and splits:
  - data/usc-113-nomic-chunks-v1-s8192-o512.parquet: split 113
  - data/usc-114-nomic-chunks-v1-s8192-o512.parquet: split 114
  - data/usc-115-nomic-chunks-v1-s8192-o512.parquet: split 115
  - data/usc-116-nomic-chunks-v1-s8192-o512.parquet: split 116
  - data/usc-117-nomic-chunks-v1-s8192-o512.parquet: split 117
  - data/usc-118-nomic-chunks-v1-s8192-o512.parquet: split 118
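The split-to-file mapping above can be sketched in Python. This is a minimal illustration, not the provider's own loading code: `parquet_path` and `load_split` are hypothetical helper names, and the `load_split` call assumes the Hugging Face `datasets` library is installed, that the repository is reachable over the network, and that the numeric split names can be passed directly as `split=`.

```python
REPO_ID = "hyperdemocracy/usc-nomic-chunks-v1-s8192-o512"
# Split names as listed in the dataset configuration.
CONGRESSES = ("113", "114", "115", "116", "117", "118")


def parquet_path(congress: str) -> str:
    """Return the repo-relative parquet path listed for one split (hypothetical helper)."""
    if congress not in CONGRESSES:
        raise ValueError(f"unknown split {congress!r}; expected one of {CONGRESSES}")
    return f"data/usc-{congress}-nomic-chunks-v1-s8192-o512.parquet"


def load_split(congress: str):
    """Load one split via `datasets.load_dataset` (assumption: network access is available)."""
    # Imported lazily so parquet_path() stays usable without the library installed.
    from datasets import load_dataset

    parquet_path(congress)  # raises ValueError for an unlisted split name
    return load_dataset(REPO_ID, split=congress)
```

Calling `parquet_path("118")` gives the path of the last listed file, while `load_split("118")` would download that split if the assumptions above hold.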
Dataset information
- Features:
  - chunk_id: string
  - congress_num: string
  - nomic_topic_depth_1: string
  - nomic_topic_depth_2: string
  - nomic_topic_depth_3: string
  - nomic_proj_x: float32
  - nomic_proj_y: float32
  - nomic_vec: list of float32
  - text: string
  - chunk_metadata: struct
    - chunk_id: string
    - chunk_index: int32
    - congress_num: string
    - legis_class: string
    - legis_id: string
    - legis_num: int32
    - legis_type: string
    - legis_version: string
    - start_index: int32
    - text_date: string
    - text_id: string
  - bill_metadata: struct
    - introduced_date: string
    - origin_chamber: string
    - policy_area: string
    - subjects: list of string
    - sponsors: list of struct
      - bioguide_id: string
      - district: string
      - first_name: string
      - full_name: string
      - is_by_request: string
      - last_name: string
      - middle_name: string
      - party: string
      - state: string
      - identifiers: struct
        - bioguide_id: string
        - lis_id: string
        - gpo_id: string
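To make the nested feature layout concrete, here is a small Python sketch of reading one record. It assumes each row arrives as a nested dict with `sponsors` as a list of dicts, which is how list-of-struct features are commonly returned but is not confirmed by this listing. The helper names and every value in `SAMPLE_ROW` are placeholders invented for illustration, not data from the dataset.

```python
import math

# Hypothetical record following the feature names above; all values are placeholders.
SAMPLE_ROW = {
    "chunk_id": "example-chunk-0",
    "nomic_vec": [0.6, 0.8, 0.0],
    "text": "placeholder text",
    "bill_metadata": {
        "policy_area": "placeholder",
        "subjects": ["placeholder"],
        "sponsors": [{"full_name": "Jane Q. Example", "party": "X", "state": "ZZ"}],
    },
}


def sponsor_names(row: dict) -> list:
    """Collect full_name from every entry under bill_metadata.sponsors."""
    sponsors = (row.get("bill_metadata") or {}).get("sponsors") or []
    return [s["full_name"] for s in sponsors if s.get("full_name")]


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors such as nomic_vec; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

`sponsor_names(SAMPLE_ROW)` walks the struct and list nesting, and `cosine_similarity` is one plain way to compare two `nomic_vec` embeddings without extra dependencies.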

The content above was collected and summarized by 遇见数据集.



